[devel] RFC: girar: optimize rebuild

Vladimir D. Seleznev vseleznv at altlinux.org
Sat Apr 11 18:21:01 MSK 2020


On Sat, Apr 11, 2020 at 02:04:25PM +0300, Gleb Fotengauer-Malinovskiy wrote:
> Hi,
> 
> On Sat, Apr 11, 2020 at 02:10:42AM +0300, Vladimir D. Seleznev wrote:
> > 
> > Hi!
> > 
> > This is the first part of the rebuilt-package optimization for girar.
> > It introduces pkg_identity() and a simple optimization of the rebuilt
> > sourcerpm.
> 
> Why do we rebuild source rpm at all when we already have one?  I mean,
> when we use hasher with --query-repackage this new rebuilt source rpm is
> no better than the original one.
> 
> I think we can always save the original source rpm when we rebuild
> a package or copy it from branch to branch (like we actually do for
> packages originally built from src.rpm-s).

I'm sorry, I was not clear. Sure, when a package is built from a
sourcerpm, no optimization is required, since girar saves only the
original sourcerpm. Things are different when a package is built from
gear. When a package is rebuilt from gear, girar produces new source and
binary rpms, and when the rebuild task is done it saves all of these new
source and binary rpms. The proposed optimization is aimed at that case.

> > pkg_identity() takes an RPM package and returns a value called the
> > package identity, a hash of a subset of the RPM package header. That
> > subset is the entire header without some nonessential artifacts like
> > buildhost, buildtime, header hashsum, etc.
> > 
> > Two builds of a package with the same NEVR might have equal or
> > different package identities. Equal identities mean that the build
> > results of these packages are equal too, which allows build
> > optimization. A practical example, a simple optimization of the rebuilt
> > sourcerpm, is also introduced.
> 
> Did you consider adding all this identity logic on the rpm side (as a
> standalone helper, maybe)?  I personally don't like the whole idea
> of tracking the status of rpm tags on the girar side.  Also, this helper
> may be useful outside of girar.

I did, but it's a bit complicated. The RPM community likes the idea, but
there is no consensus on how it should work. Of course, each project can
implement it in its own specific way.

So, should we calculate the package identity on the girar side or on the
rpm side? If it should be on the rpm side, should it support rpm 4.0.4?
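
To make the question more concrete, here is a minimal sketch of the
identity idea as described above. It is only an illustration, not the
proposed girar code: the header is modelled as a plain dict instead of a
real RPM header, the hash (SHA-256) and the tag name "sha1header" for the
header hashsum are assumptions, and pick_srpm() is a hypothetical helper
name.

```python
import hashlib

# Assumption: tags treated as nonessential build artifacts. The thread names
# buildhost, buildtime and the header hashsum; the real list may be longer.
VOLATILE_TAGS = frozenset({"buildhost", "buildtime", "sha1header"})


def pkg_identity(header):
    """Return a hex digest over every header tag except the volatile ones.

    `header` is a plain dict mapping tag names to values; it is a stand-in
    for a real RPM header, not an rpm library object.
    """
    digest = hashlib.sha256()
    for tag in sorted(header):
        if tag.lower() in VOLATILE_TAGS:
            continue
        # repr() gives an unambiguous encoding for str, int and list values.
        digest.update(f"{tag}={header[tag]!r}\n".encode("utf-8"))
    return digest.hexdigest()


def pick_srpm(old_header, new_header):
    """Hypothetical helper: keep the stored sourcerpm if identities match."""
    if pkg_identity(old_header) == pkg_identity(new_header):
        return "old"
    return "new"
```

With this sketch, two headers that differ only in buildhost and buildtime
get the same identity, so the rebuilt sourcerpm could be dropped in favour
of the one already stored; any other difference (a new release, say)
changes the identity.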

> > Future work could cover replacing a sourcerpm "copied" to another
> > branch with the sourcerpm retrieved from the archive, and optimizing
> > binary packages (this case has an issue when binary subpackages have
> > mixed archs, i.e. arch and noarch; it probably could work only with
> > single-arch builds).
> 
> Looks like a good plan.  I think optimization of binary packages is more
> important than optimization which looks for archived packages.
> We may want to take binary packages from the archive too anyway.

Ok.

-- 
   WBR,
   Vladimir D. Seleznev

