
Re: [pygame] Promoting cheeseshop/pypi for game releases?



tl;dr:

How about we recommend this API:
.image(file_path) -> pygame.Surface
    .image('images/bla.png'), note no os.path.join needed.
.sound(file_path) -> pygame.Sound
.font(file_path) -> pygame.Font
.music(file_path) -> calls music.load() on the file.

We have these available, but don't recommend:
.get_file(file_path) -> open file
.find_data_dir() -> folder for 'data'.
.find_file(file_path) -> finds the full path to the file.

Something for later:
.preload(images, sounds, fonts, music) preloads files, perhaps over the network, perhaps on multiple cores, or just into a cache.

I think this API is simple, and it allows both a fairly good implementation and a very quick, easy one in time for pyweek.

Perhaps we can have a namespace package pygame.resources, released on pypi as pygame_resources. Or maybe include it in a 'skellington' package? If so, what should that package be called?


----

Ok, I am happy to avoid pkg_resources. Does it give us anything at all?

Here's the pyglet approach to resource loading. Note there's also a config interface, for user game options - something that everyone eventually writes. I think these are all good, but out of scope for now.
http://pyglet.readthedocs.io/en/pyglet-1.2-maintenance/api/pyglet/pyglet.resource.html

Note: for network connections I think gamejs (a JS lib that is fairly close to the pygame API) shows a fairly nice way to do it. You can have a preload('bla.png', 'yes.png') call to tell it to immediately go off and load those images.
It's common for people's resource libraries to make resources.load('bla.png') return the same Surface every time, by storing a weak reference to it in a cache. An advanced implementation could do this, I think.
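A weak-reference cache like that could be as small as this sketch (the loader argument stands in for something like pygame.image.load; the cached value has to support weak references, which pygame.Surface does):

```python
import weakref

# Maps path -> loaded object. Entries vanish automatically once the game
# drops its last strong reference, so the cache never pins memory.
_cache = weakref.WeakValueDictionary()


def load_cached(path, loader):
    """Return the cached object for `path`, calling `loader` on a miss.

    Repeated calls with the same path return the very same object while
    anyone still holds a reference to it.
    """
    obj = _cache.get(path)
    if obj is None:
        obj = loader(path)
        _cache[path] = obj
    return obj
```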

I don't have benchmark numbers for you, but it definitely is faster to load one file than 100. Far fewer seeks (on media like SD cards and magnetic disks this is more important). It's also faster to do one sequential read than many random reads (there are lots of IO benchmarks online showing this). On high-end modern SSDs with the latest kernels and best file systems you can get multi-core disk IO when the moon is aligned right. Many games use 'pack files' for speed, instead of the file system. Note, this is still true even in web browsers - having one big image (a sprite sheet) is faster, and uses less memory, than having many separate images.

For optimal loading from disk, having one big file recording each contained file's start and end position is a good idea. Then you can load each one with mmap (so the image is only in memory once... in the file cache), and use separate python processes to load each image in with mmap (pygame can load images via mmap). Unfortunately some image routines are not multithread safe, so the GIL is held sometimes... meaning processes are faster than threads for loading pygame images in Python. On some graphics cards/OSes you can even mmap the raw files directly from the file cache into a GPU texture (some apple hardware can do this). Another trick for when you have a cold file cache (like the first time you load the file) is to first start a 'cat' process sending the pack file to /dev/null, which uses an optimised kernel path to put the data into the file cache.
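A toy pack-file format along those lines: an index of (offset, length) pairs up front, then the raw file bytes, read back through mmap so the data lives once in the OS file cache. The layout here is invented purely for illustration:

```python
import json
import mmap
import struct


def write_pack(pack_path, files):
    """Write a dict of {name: bytes} into one pack file.

    Layout: 4-byte little-endian header length, a JSON index of
    {name: (offset, length)}, then the concatenated file bodies.
    """
    index = {}
    blobs = b''
    for name, data in files.items():
        index[name] = (len(blobs), len(data))  # offset within the body
        blobs += data
    header = json.dumps(index).encode('utf-8')
    with open(pack_path, 'wb') as f:
        f.write(struct.pack('<I', len(header)))
        f.write(header)
        f.write(blobs)


def read_pack(pack_path):
    """mmap the pack and return {name: memoryview} (zero-copy slices)."""
    with open(pack_path, 'rb') as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    view = memoryview(mm)
    (hlen,) = struct.unpack('<I', bytes(view[:4]))
    index = json.loads(bytes(view[4:4 + hlen]).decode('utf-8'))
    body = 4 + hlen
    return {name: view[body + off:body + off + length]
            for name, (off, length) in index.items()}
```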

Anyway... that was a bit of a digression.







On Thu, Feb 2, 2017 at 2:01 PM, Thomas Kluyver <takowl@xxxxxxxxx> wrote:

On 2 February 2017 at 06:34, René Dudfield <renesd@xxxxxxxxx> wrote:
Whilst naming is important, the name of the package doesn't need to be the name of the game. I've worked on projects where the company name and app name changed at least three times whilst the package name stayed the same. So I don't think people need to worry so much. Also, it's not too hard to change the package name later on.


My vote would be to follow the python sampleproject way of doing things.

That gets my vote as well. If you're putting it on PyPI, sooner or later you have to give the package a good name (at least, not the same name as every other game's package). We can encourage people to use relative imports (from . import foo) which makes it less work to rename.

 
Data folder, and "get_data".

2). The other aspect I'm not sure of is having a data folder outside of the source folder. Having it inside the source folder means you can more easily include it inside a zip file, for example. It also makes packaging slightly easier, since from within a MANIFEST.in you can just use a recursive include of the whole "mygamepackage" folder.

Having data/ separate is the choice of the sampleproject, and the skellington.

I haven't really seen a modern justification for keeping data out of the package folder?

My vote would be for it to go inside the package, and for the skeleton to provide a bit of code like this:

import os.path
from os.path import abspath, dirname

DATA_DIR = os.path.join(dirname(abspath(__file__)), 'data')

def find_data(*path):
    return os.path.join(DATA_DIR, *path)

This is roughly what I recommend in Pynsist's FAQ.
 
I have vague recollections of the reason being: 'because debian does it'. My recollection is that Debian traditionally did it to keep code updates smaller: if you only change 1KB of source code, there's no point in a 20MB update every time.

Linux packagers like to put data files in /usr/share to comply with the filesystem hierarchy spec. However, with the snippet of code I gave above, it's easy for them to patch DATA_DIR = '/usr/share' and move the files out.
 
A bonus of keeping data separate is that it forces you to use relative addressing of file locations. You don't hardcode "data/myfile.png" in all your paths. Do we recommend the python way of finding the data folder? That means the package_data and data_files setup attributes. https://github.com/pypa/sampleproject/blob/master/setup.py

Actually, I think that's more of a risk if it's a separate top-level directory, because then 'data' is going to be in the CWD when you run it as a developer.
 
They (package_data) are a giant pain. One, because they require mentioning every single file, rather than just the whole folder. Two, because they require you to update configuration in both MANIFEST.in and setup.py. Also, different files are included depending on which python packaging option you use.

Agreed that they are a giant pain. They don't quite require mentioning every single file, as you can glob directories, but you do need to mention every subdirectory, at least for package_data.
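To illustrate the subdirectory point, a package_data configuration might look like this (all names here are illustrative, and the same files have to be duplicated in MANIFEST.in for sdists, e.g. `recursive-include mygamepackage/data *`):

```python
from setuptools import setup

setup(
    name='mygame',
    version='0.1',
    packages=['mygamepackage'],
    # Globs match files within one directory only, so every subdirectory
    # of data/ must be listed as its own pattern.
    package_data={
        'mygamepackage': [
            'data/*.png',
            'data/sounds/*.ogg',
            'data/fonts/*.ttf',
        ],
    },
)
```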

This is one of the things that prompted me to write a packaging tool called 'flit', which doesn't require this. I don't know that it's quite ready to be pushed on people in a game skeleton, though.
 
Another issue is that, using the python way, pkg_resources from setuptools needs to be used at runtime. pkg_resources gets you access to the resources in an abstract way, which means you need setuptools at runtime (not just at install time). There is already work going into splitting it out here: https://github.com/pypa/pkg_resources So I'm not sure this will still be a problem in the coming months.

Ugh, I avoid pkg_resources at all costs. If you use package_data (as opposed to data_files), it's not necessary; see my snippet above.
 
I haven't confirmed if pkg_resources works with the various .exe making tools. I've always just used file paths. Thomas, does it work with Pynsist?

I haven't tried, but I'd guess it may well not work.
 
Having game resources inside a .zip (or .egg) makes loading a lot faster on many machines. pkg_resources supports files inside of .egg files. So, I think we should consider this possibility.

I've heard this claim made before, but I haven't seen numbers. Having resources inside a zip file is awkward if you need to pass a file path or a file handle to other libraries: you have to extract it and write it to a temporary file before you can pass it in. If the performance difference is important, I'd favour writing some helper routines using the zipfile module rather than doing anything with pkg_resources.
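A helper routine along those lines could be as simple as this sketch (stdlib only): it hands back a seekable in-memory file object, which pygame.image.load() and pygame.mixer.Sound() both accept, so no temporary file on disk is needed. (For image formats pygame can't sniff from the bytes, a namehint would also be required.)

```python
import io
import zipfile


def zip_file(archive_path, member):
    """Return a seekable in-memory file object for one member of a zip.

    Usage with pygame would look like:
        surf = pygame.image.load(zip_file('data.zip', 'images/bla.png'))
    """
    with zipfile.ZipFile(archive_path) as zf:
        # Read the whole member into memory; fine for typical game assets.
        return io.BytesIO(zf.read(member))
```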