package.mask as a directory and PMS

October 21, 2011

Just a quick note for any repository/overlay maintainers who have profile package.* or use.* nodes as directories (usually package.mask): please add profile-format = portage-1 to the repository’s metadata/layout.conf.

As folk are probably aware, PMS rules are that those nodes must actually be files; portage allows both. The problem here is that package managers that try to enforce strictness for QA reasons wind up blowing up when using those repositories. Pretty much a bad situation: it’s a useful feature, but QA strictness is also useful.

Thus the new repository configuration option; it supports either ‘pms’ or ‘portage-1’ as the setting. If unspecified, it defaults to pms for compatibility reasons. In pms mode, package managers should strictly enforce PMS rules. If set to portage-1, then package.* and use.* can be directories of arbitrary depth, parsed in alphanumeric sorted order (thus given 3 files, “a2”, “a1”, “1”, it would parse them in the order “1”, “a1”, “a2”). Pretty much this is the enhanced profile support portage has now, just explicitly marked as a format so that it’s possible to detect it and vary the QA rules as needed.
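To make the parse order concrete, here is a minimal Python sketch. It is not portage’s actual code: the helper name `iter_profile_node` is made up for illustration, and it assumes a plain string sort of the entries at each directory level with a depth-first walk into subdirectories.

```python
import os


def iter_profile_node(path, profile_format="pms"):
    """Yield the files that make up one profile node, in parse order.

    `path` is a node such as .../profiles/package.mask. The name and the
    depth-first traversal are assumptions made for this illustration.
    """
    if os.path.isdir(path):
        if profile_format != "portage-1":
            # In pms mode the node must be a plain file, so a strict
            # package manager would flag this as a QA failure.
            raise ValueError(
                "%s is a directory, but profile-format is %r"
                % (path, profile_format)
            )
        # Plain string sort of the entries: "1" < "a1" < "a2".
        for name in sorted(os.listdir(path)):
            yield from iter_profile_node(
                os.path.join(path, name), profile_format
            )
    elif os.path.exists(path):
        yield path
```

Given a package.mask directory holding “a2”, “a1” and “1”, `list(iter_profile_node(path, "portage-1"))` returns them in the order “1”, “a1”, “a2”, while the same call in pms mode raises an error.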

I realize this might seem minor to people, but one of the historical issues we’ve had w/ the tree and our formats is the lack of versioning/format markers, which makes things far harder for alternative package managers and raises backwards compatibility issues for portage itself. Profile nodes as directories have been an issue for a long while (basically since a bit after PMS was created), and this functionality is designed to allow them to be supported properly, while also giving us a way to swap in new profile formats down the line as needed.

Thanks in advance for updating your repositories. 😉


repository enhancements

October 19, 2011

While the support is fairly recent, I wanted to highlight some of the recent portage improvements for repository support that were released in/before 2.1.10.29. These are configurable per repository by modifying metadata/layout.conf for that repository.

  • thin-manifests = [ true | false ]
    Defaults to false if unspecified.
    For repositories that are distributed via git, the VCS already guarantees the content via its sha1 internals. This means the non-distfile manifest checksums are redundant; if enabled, this disables non-distfile validation and turns off generation of those checksums when creating manifests. Pretty much, if you’ve got a git vcs repo, you likely want this enabled unless you’re paranoid about someone having a sha1 pre-image attack they’re sitting on.
  • use-manifests = [ false | true | strict ]
    Defaults to strict if unspecified.
    This provides per repository control as to whether manifests should be used/generated; if set to false, manifest usage and generation is disabled for that repository.
    If set to true, this directs the package manager to use manifest data if available, but to not consider it a failure if a manifest file is missing. Additionally, if set to true, the package manager will generate manifests by default. This mode is primarily of use for migrating a repository that lacked manifests, to using/requiring manifests.
    Finally, if set to strict, manifests are generated/required and it’s considered a failure if the data isn’t available. Generally speaking, there rarely is any reason to set this option to anything other than strict.
  • cache-format = [ pms | md5-dict ]
    Defaults to pms.
    As it sounds, this directive states which cache format the repository uses for any pregenerated cache distributed with the tree. Currently there are two formats. The first is pms, the standard metadata/cache format, which has the following restrictions: it should not be used for any repository that has specified masters, and it cannot be used if the repository is distributed in a fashion that doesn’t preserve mtime (git, for example, doesn’t preserve mtime). Not a great cache format frankly, but where it’s usable it suffices and is well supported. Its main limitation is that it has no real validation built in beyond an mtime check of the cache file against the ebuild; as such, it’s impossible to tell whether an eclass has changed since the entry was generated, which precludes using it in an overlay/repository w/ masters setup. As said, it’s historical and works well for the main repository, but has definite flaws.
    The new kid on the block is md5-dict, which is a bit of a hack, but has useful properties. If enabled, it lives at metadata/md5-cache; it’s basically a flat_hash cache (the format used at /var/cache/edb/dep/*), just using md5 rather than mtime for validation. Specifically, this means you can distribute this cache via git, and you can safely use it for overlays/repositories with masters specified; it carries enough validation information to detect if a cache entry is stale, in which case the package manager regenerates it as necessary.
    Down the line, I intend to design a format explicitly optimized for pregenerated usage, both reducing the inode requirements of existing pregenerated caches and speeding things up. In the interim, md5-dict is probably what you’re after unless we’re talking about the literal gentoo repository itself (which must remain pms format).
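To illustrate why checksum validation survives git while mtime validation doesn’t, here is a rough Python sketch. It is a simplified in-memory model, not the real md5-dict on-disk layout: the entry structure and the function names (`make_entry`, `is_stale`) are invented for this example, and it assumes each entry records a checksum for the ebuild and for every eclass the ebuild inherited.

```python
import hashlib


def md5_of(path):
    """Return the md5 hex digest of a file, or None if it can't be read."""
    try:
        with open(path, "rb") as handle:
            return hashlib.md5(handle.read()).hexdigest()
    except OSError:
        return None


def make_entry(ebuild_path, eclass_paths, metadata):
    """Build a cache entry carrying its own validation data (hypothetical layout)."""
    return {
        "metadata": dict(metadata),
        "ebuild": ebuild_path,
        "ebuild_md5": md5_of(ebuild_path),
        "eclass_md5s": {path: md5_of(path) for path in eclass_paths},
    }


def is_stale(entry):
    """An entry is stale if the ebuild or any recorded eclass changed content."""
    if md5_of(entry["ebuild"]) != entry["ebuild_md5"]:
        return True
    return any(
        md5_of(path) != recorded
        for path, recorded in entry["eclass_md5s"].items()
    )
```

A fresh checkout rewrites every mtime but leaves file contents alone, so `is_stale` keeps returning False until an ebuild or eclass actually changes. An eclass edit in a master repository flips it to True, which is exactly the case the mtime check in the pms format can’t see.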

I’d like to note that this functionality was contributed by the chromium-os project, one of the multiple gentoo derivatives; in addition, cookies should be sent to Zac for cleanup/fixing of the cache-format functionality (the refactoring enabling it was a bit touchy getting right).

Beyond those features, sign-manifests was added (unless you have a good reason, leave it enabled), and manifest-hashes was added (again, unless you have a good reason, leave it alone right now). I expect an additional feature to appear in the next week or two that explicitly marks a repository as using a PMS-noncompliant package.mask directory.
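For anyone wiring these options into their own tooling, the defaults described above can be sketched in Python as follows. This is not how portage parses layout.conf: the function name `parse_layout_conf` is invented, the `key = value` line format with `#` comments is an assumption, and rejecting unknown values outright is this sketch’s choice rather than documented behaviour.

```python
# Defaults as described in this post; other keys have no default here.
DEFAULTS = {
    "thin-manifests": "false",
    "use-manifests": "strict",
    "cache-format": "pms",
}

# Values each documented option accepts.
ALLOWED = {
    "thin-manifests": {"true", "false"},
    "use-manifests": {"false", "true", "strict"},
    "cache-format": {"pms", "md5-dict"},
}


def parse_layout_conf(text):
    """Parse layout.conf-style text into a dict, applying the defaults."""
    settings = dict(DEFAULTS)
    for raw_line in text.splitlines():
        # Assumption: "#" starts a comment that runs to the end of the line.
        line = raw_line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in ALLOWED and value not in ALLOWED[key]:
            raise ValueError("invalid value %r for %s" % (value, key))
        # Keys this sketch doesn't know about are kept as-is.
        settings[key] = value
    return settings
```

For a git-hosted overlay, a layout.conf carrying `thin-manifests = true` and `cache-format = md5-dict` would parse to those two values plus the default `use-manifests` of strict.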

As stated, this functionality is in portage 2.1.10.29; for pkgcore, thin-manifests is supported, and the rest will be addressed in the next release or two.


So which python version you want?

June 1, 2010

For pkgcore, we run a pretty comprehensive set of buildslave targets for testing. Specifically:

  • python 2.4
  • python 2.5
  • python 2.6
  • unladen swallow; python 2.6 based
  • python 2.7 snapshot (20100523)
  • python 3.1
  • python 3.2 VCS snapshot (20100523)

Originally, I ran these as separate KVM instances. This is nonoptimal however, since each instance is basically the exact same OS just w/ a differing python version overlaid, and w/ buildbot’s buildslave running within each. So via bastardizing some LXC work from diego, we now run a single kvm instance w/ each python version (and its buildslave) tucked away in its own container. Each container (including the raw parent) is intentionally externally addressable so that developers can reach into the container and tinker as needed, or experiment w/ that particular version of python.

A couple of folk have poked me for access to a copy of the vm image, so the puppy was stripped down (buildslave machinery removed, among other things) and posted to the gentoo mirrors (and here is a direct link to allpython-amd64-qemu-20100531.qcow2.xz); it is still propagating, but hit whatever your favorite local mirror is and raid it from there.

Few things to note about this vm image:

  1. it’s configured for, and expects to get access to, its block device as virtio; this is tweakable, but really not recommended (virtio performance is quite nice)
  2. the containers are currently named buildbot-py$VER; this is a holdover from stripping down pkgcore’s buildslave vm… and my own laziness in not changing the names
  3. each container root is actually an AUFS2 union of the raw parent FS. This was done so that the container could still share dentry cache for its libs w/ the parent, and to keep each container’s footprint minimal.
  4. these are *full* containers, intentionally so to keep them from screwing up the parent/each other in any fashion.
  5. the root password is ‘python’. Strongly suggest you change that if you expose this puppy publicly.
  6. this is a bit of a custom setup: a patched version of lxc (backport adding init shutdown support), plus a nasty little trick in the lxc init scripts to allow the parent to cleanly tell the guest container to shut down (lxc-ps --lxc auxf is a good command to look at; note that init is not the first process in each container).
  7. if you’re just running this for your own usage in a non deployed manner, feel free to remove acpid. acpid runs by default since the buildslave VM this was derived from is run via init scripts, so there needed to be a way to tell it to shut down (monitor shutdown events trigger acpi events, thus acpid).
  8. This was an oversight on my part during the scrub/releasing, but the default python in each container is still set to python-2.6. Feel free to run `eselect python set` to change the default version. In our buildslave usage, we leave the default python as 2.6 and force the target python via the buildbot steps themselves; this was done to avoid having to modify buildbot bits that hardcode the python version into their shebang.
  9. The bugger is running a snapshot of pkgcore/snakeoil; I mainly wanted a couple of unreleased fixes in there. The VM and containers have also been created/maintained via pkgcore (issues/bug reports welcome).
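On the point about forcing the target python rather than changing the container default: the idea is simply to invoke the wanted interpreter explicitly instead of relying on a shebang. A tiny Python sketch of that idea follows. It is not buildbot’s API or our actual step configuration; the helper name `run_with_python` and the interpreter name in the usage note are illustrative only.

```python
import subprocess


def run_with_python(interpreter, args):
    """Run the given arguments under an explicitly chosen python binary.

    Because the interpreter is named on the command line, neither the
    script's shebang nor the system default python comes into play.
    """
    return subprocess.run(
        [interpreter] + list(args),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise CalledProcessError on a non-zero exit
    )
```

Inside a container, something like `run_with_python("python3.1", ["setup.py", "test"])` would exercise that version while the default stays at 2.6, assuming the project has a setup.py with a test command.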

Finally, there is *zero* support for this. I’m interested in bugs mind you, but I’m not supporting this; I’m just putting it out there since people have asked for it. Also, if you’re interested in building your own buildslave setup targeting multiple python versions based on this, feel free to either find me on irc or email me. I’ve been quite happy w/ this setup, including its minimal resource usage.

Hope it’s useful.