[devel] [PATCH 2/2] gb: optimize rebuilt srpm if its identity is equal to identity of srpm in the repo

Alexey Tourbin alexey.tourbin at gmail.com
Mon Apr 20 11:36:00 MSK 2020


On Fri, Apr 17, 2020 at 12:51 AM Alexey Tourbin
<alexey.tourbin at gmail.com> wrote:
> So for src.rpm packages, it's a solved problem. For binary packages,
> the identity should specifically exclude disttag. It will no longer
> satisfy the definition of ID for rpm (substitution will break for
> subpackages with strict dependencies). Therefore for binary packages,
> we need to track <ID,disttag> tuples. This is a one-to-many relation:
> for each ID, there may be a few disttags.  So for binary packages we
> need a separate identity-addressable storage which maps ID to
> <disttag,filehash> (while for source packages, a hardlink maps ID to
> filehash).  If implemented naively, this will create many small files,
> one file per ID, most files with just one line. In a more practical
> implementation, you should probably group all those small files by
> package name.  So you'll have:
>
> $ cat id2f/libfoo
> <libfoo-ID1> <disttag1> <libfoo-filehash1>
> <libfoo-ID2> <disttag2> <libfoo-filehash2>
>
> $ cat id2f/foo-data
> <foo-data-ID1> <disttag1> <foo-data-filehash1>
> <foo-data-ID1> <disttag2> <foo-data-filehash2>
>
> Note that for libfoo, the IDs are different, but with foo-data the IDs
> are the same. This indicates that the contents of libfoo have changed
> after a rebuild, while the contents of foo-data have not.
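
A minimal sketch of reading such a file, in Python. The function name
parse_id2f and the whitespace-separated three-column format are
assumptions taken from the example above, not an existing tool:

```python
from collections import defaultdict


def parse_id2f(text):
    """Parse '<ID> <disttag> <filehash>' lines (assumed format).

    Returns {ID: [(disttag, filehash), ...]}, i.e. the one-to-many
    relation from an identity to its known builds.
    """
    index = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        pkg_id, disttag, filehash = line.split()
        index[pkg_id].append((disttag, filehash))
    return dict(index)
```

An ID that maps to more than one <disttag,filehash> pair is a package
whose contents did not change across rebuilds.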

It may even make sense to group the mappings by src.rpm name instead
of package name. At first it seems less intuitive, but in return it
gives you a consistent view of all subpackages, similar to an MVCC
snapshot.  Of course, these files should be updated atomically, with
rename(2). To check a set of subpackages, you first need to copy the
file to a local dir. This rules out the case in which some subpackages
have been added to the file and some have not.
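
The update-and-snapshot pattern can be sketched roughly as follows, in
Python. The helper names atomic_write and snapshot are hypothetical;
the only thing relied upon is that os.replace() is a rename(2) on
POSIX, so a reader sees either the old file or the new one, never a
mix:

```python
import os
import shutil
import tempfile


def atomic_write(path, data):
    """Replace path with data via a temp file plus rename(2)."""
    dirname = os.path.dirname(path) or "."
    # The temp file must live on the same filesystem as the target.
    fd, tmp = tempfile.mkstemp(dir=dirname, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic on POSIX
    except BaseException:
        os.unlink(tmp)
        raise


def snapshot(path, workdir):
    """Copy the mapping file to a local dir to get a stable view."""
    dst = os.path.join(workdir, os.path.basename(path))
    shutil.copyfile(path, dst)
    return dst
```

Because the writer never modifies a file in place, a copy taken at any
moment is a complete earlier version, which is what makes it behave
like a snapshot.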

These files are to be updated during the task-commit stage, under the
exclusive lock. This is also the right moment to detect race
conditions. Suppose you build the same package for sisyphus and p9 in
parallel, and the build result is the same. Before adding new
packages, you recheck if the whole set can be replaced with the
already existing packages.  One of the two tasks should then fail (or
be automatically rescheduled for another iteration).
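
A rough sketch of that recheck, in Python. Everything here is
illustrative: commit_task, the "retry"/"committed" results and the
in-memory index of the form {ID: {disttag: filehash}} are assumptions,
and flock(2) merely stands in for whatever exclusive lock task-commit
already holds:

```python
import fcntl


def commit_task(lock_path, index, disttag, packages):
    """Commit (ID, filehash) pairs built with the given disttag.

    Returns "retry" if every ID is already present in the index, i.e.
    a parallel task has committed an identical build in the meantime.
    """
    with open(lock_path, "a") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # exclusive, blocks other commits
        try:
            # Recheck under the lock: can the whole set be replaced?
            if packages and all(pkg_id in index for pkg_id, _ in packages):
                return "retry"
            for pkg_id, filehash in packages:
                index.setdefault(pkg_id, {})[disttag] = filehash
            return "committed"
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

With two parallel builds of the same package, whichever task takes the
lock second sees all of its IDs already present and goes for another
iteration, where it can reuse the existing packages.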

