package.mask as a directory and PMS

October 21, 2011

Just a quick note, for any repository/overlay maintainers who are have profile package.* or use.* as directories (usually package.mask), please add profile-format = portage-1 to the repositories metadata/layout.conf.

As folk probably are aware, PMS rules are that those nodes must actually be files; portage allows both. The problem here is that package managers that try to enforce strictness for QA reasons, wind up blowing up when using those repositories. Pretty much a bad situation- it’s a useful feature, but QA strictness is also useful.

Thus the new repository configuration option; it supports either ‘pms’ or ‘portage-1’ as the setting. If unspecified, it defaults to PMS for compatibility reasons. If it’s in PMS mode, than PMs should strictly enforce PMS rules. If set to portage-1, than package.* and use.* can be directories of arbitrary depth, parsed in alphanumeric sorted order (thus given 3 files, “a2”, “a1”, “1”, it would parse in order of “1”, “a1”, “a2”). Pretty much this is the enhanced profiles portage supports now, just explicitly marked as a format so that it’s possible to detect it, and vary the QA rules as needed.

Realize this might seem minor to people, but one of the historical issues we’ve had w/ the tree and our formats is the lack of versioning/format markers- thus making it far harder for alternative package managers, and raising backwards compatibility issues for portage itself. Profile nodes as directories has been an issue for a long while (basically a bit after PMS was created), and this functionality is designed to allow it to be supported properly, while also giving us a way to swap in new profile formats down the line as needed.

Thanks in advance for updating your repositories. 😉


repository enhancements

October 19, 2011

While the support is fairly recent, I wanted to highlight some of the recent portage improvements for repository supported that were released in/before These are configurable per repository via modifying metadata/layout.conf for that repository.

  • thin-manifests = [ true | false ]
    Defaults to false if unspecified.
    For repositories that are distributed via git, the VCS provides via it’s sha1 internals guarantees of the content. This means the non-distfile manifest checksums are redundant; if enabled, this disables non-distfile validation and turns off generation of those checksums when creating manifests. Pretty much if you’re got a git vcs repo, you likely want this enabled unless you’re paranoid about someone having a sha1 pre-image attack they’re sitting on.
  • use-manifests = [ false | true | strict ]
    Defaults to strict if unspecified.
    This provides per repository control as to whether manifests should be used/generated; if set to false, manifest usage and generation is disabled for that repository.
    If set to true, this directs the package manager to use manifest data if available, but to not consider it a failure if a manifest file is missing. Additionally, if set to true, the package manager will generate manifests by default. This mode is primarily of use for migrating a repository that lacked manifests, to using/requiring manifests.
    Finally, if set to strict, manifests are generated/required and it’s considered a failure if the data isn’t available. Generally speaking, there rarely is any reason to set this option to anything other than strict.
  • cache-format = [ pms | md5-dict ]
    Defaults to pms.
    As it sounds- this is a directive which cache format this repository uses for any pregenerated cache distributed with the tree. Currently there are two formats; pms, the standard metadata/cache format that has the following restrictions- should not be used for any repository that has specified masters, and cannot be used if the repository is distributed in a fashion that doesn’t preserve mtime (git for example, doesn’t preserve mtime). Not a great cache format frankly, but for where it’s usable it suffices and is well supported; main limitation is that it has no real validation built into it beyond an mtime check of the cache file in comparison to the ebuild; as such, it’s impossible to validate if the eclass has changed since thus precluding from using it in an overlay/repository w/ master setup. As said, it’s historical, works well for the main repository but has definite flaws.
    The new kid on the block is md5-dict, which is a bit of a hack, but has useful properties. It enabled, it lives at metadata/md5-cache; it’s basically a flat_hash cache (the format used at /var/cache/edb/dep/*), just using md5 rather than mtime for validation. Specifically, this means you can distribute this cache via git, and means you can safely use it for overlays/repositories with masters specified; it carries enough validation information to detect if the cache entry is stale, in which case the manager regenerates as necessary.
    Down the line, I intend to design a format explicitly optimized for pregenerated usage- both reducing the inode requirements of existing pregenerated caches, and speeding it up. In the interim, md5-dict is probably what you’re after unless we’re talking about the literal gentoo repository itself (which must remain pms format).

I’d like to note that this functionality was contributed by the chromium-os project, one of the multiple gentoo derivatives; in addition, cookies should be sent to Zac for cleanup/fixing of the cache-format functionality (the refactoring enabling it was a bit touchy getting right).

Beyond those features, sign-manifests was added (unless you have a good reason, leave it enabled), and manifest-hashes was added (again, unless you have a good reason, leave it alone right now). I expect in the next week or two for an additional feature to appear to explicitly mark a repository if it’s using PMS incompliant package.mask as a directory.

At this point, as stated this functionality is in portage; for pkgcore, thin-manifests are supported, and the rest will be addressed in the next release or two.