Add macro %_duplicate_files_terminate_build to allow terminating build
on duplicate file-list entries. As there are common packaging scenarios
where duplicates are kind of required to achieve a thing reasonably
(eg a directory full of files and just one of them needs different
permissions/flags), default to off until we grow a way to better deal
with that.
Fixes: #1158
Only free the result on the success path. Like most functions of this
type, getaddrinfo() only allocates on success, so on failure we'd end up
freeing an uninitialized pointer.
Reported by Scott Andrews on rpm-list.
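A minimal sketch of the pattern, not rpm's actual code: the helper name canonical_hostname() and its buffer interface are assumptions made for illustration. The result list is released only when getaddrinfo() returned 0.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: copy the canonical name of `host` into `buf`.
 * Returns 0 on success, or the non-zero getaddrinfo() error code. */
static int canonical_hostname(const char *host, char *buf, size_t len)
{
    struct addrinfo hints, *ai = NULL;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_flags = AI_CANONNAME;   /* ask for the canonical name */

    rc = getaddrinfo(host, NULL, &hints, &ai);
    if (rc != 0)
        return rc;                   /* nothing was allocated: do not free */

    /* ai_canonname may be NULL; fall back to the name we were given */
    snprintf(buf, len, "%s", ai->ai_canonname ? ai->ai_canonname : host);
    freeaddrinfo(ai);                /* success path only */
    return 0;
}
```

A caller would pass the local host name and treat a non-zero return as "keep the uncanonicalized name".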
Previously we only checked for unpackaged files and symlinks, completely
ignoring eg extra directories that might be there. Just check for everything
instead. Related to #994.
Directories are a little tricky as some of them are almost always unowned
so we need to allow all path components leading to packaged files. This
would be a *lot* of relatively expensive lookups, but we only need to do
the dance when the directory index changes, and we can stop when no new
paths get added to the pool.
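The early stop can be sketched as follows. This is an illustration under assumptions, not the rpm implementation: struct dirset is a hypothetical fixed-size stand-in for the string pool, and the directory-index check is left out.

```c
#include <stdio.h>
#include <string.h>

#define MAX_DIRS 64
#define MAX_PATH_LEN 256

/* Hypothetical stand-in for a string pool: a small fixed-size set. */
struct dirset {
    char paths[MAX_DIRS][MAX_PATH_LEN];
    int count;
};

/* Return 1 if `path` was newly added, 0 if already present or set is full. */
static int dirset_add(struct dirset *ds, const char *path)
{
    for (int i = 0; i < ds->count; i++)
        if (strcmp(ds->paths[i], path) == 0)
            return 0;
    if (ds->count == MAX_DIRS)
        return 0;
    snprintf(ds->paths[ds->count++], MAX_PATH_LEN, "%s", path);
    return 1;
}

/* Add every parent directory of `file`, stopping as soon as a parent is
 * already known: its own parents were added on an earlier walk. */
static void allow_parents(struct dirset *ds, const char *file)
{
    char buf[MAX_PATH_LEN];
    char *slash;

    snprintf(buf, sizeof(buf), "%s", file);
    while ((slash = strrchr(buf, '/')) != NULL && slash != buf) {
        *slash = '\0';              /* strip the last path component */
        if (!dirset_add(ds, buf))
            break;                  /* no new path: nothing left to do */
    }
}
```

For a second file in the same directory the walk stops after a single lookup, which is what keeps the cost down.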
We only actually used gethostbyname() for canonicalizing buildhost,
convert that to use getaddrinfo() instead, which actually has an
option for retrieving exactly what we want.
The other "use" was to initialize name services, but as we don't need
or use hostnames for any operation, we can just as well drop it. User
and group names are what we care about.
The intent of %exclude was always to merely support sub-packaging with
wildcards in %files sections, not to permit leaving junk in the buildroot.
Enforce this by checking against the actually packaged contents rather
than everything we encountered during collection, and document the
behavior.
This has been widely abused so the change is likely to break quite a few
packages in the wild. As a side-effect this also cures a long-standing bug
where unpackaged excluded files leak their debuginfo into packaged contents,
as such a package will now fail to build (RhBug:878863).
As a nice side-bonus, this also gets rid of the ugly static check_fileList
buffer - besides being ugly, such things are bad for parallelism.
Fixes: #994
Despite ultimately having the same exact options and arguments, parametric
and non-parametric generators receive them in such different manners that
trying to handle them together only makes things more complicated.
Pass the main macro name around and let the runFoo() calls construct
the arguments as fits the case.
This also lets us take advantage of rpmExpandThisMacro() for calling
parametric generators, eliminating the need for the %1 macro argument
hack.
As of Doxygen >= 1.8.20 it started complaining about anything marked
as @retval being undocumented. As this is widely used in rpm...
Mass-replace all @retval uses with @param[out] to silence. Some of
these are actually in/out parameters but combing through all of them
is a bit too much...
Also escape <CR><LF> in rpmpgp.h to shut up yet another new warning.
Not much changes here in practice, although this does fix patch_nums and
source_nums "leaking" after a spec parse, as we forgot to update
*that* code when adding them. More visible when consolidated...
Also store the Lua context in the spec struct. This doesn't make much
of a difference as it is, but it'll be needed someday when we create
a new Lua environment for each spec parse, and at any rate this
gives us a single, easy place to check whether it was initialized or not.
Calling exit() as introduced in commit 8d84878ee0 is only okay from
child processes we might launch, never really from librpm* itself.
Just flag the error instead for later processing.
When built with -fsanitize=address, gcc complains that source_date_epoch
could be used uninitialized. This really cannot be so as override_date
is only ever set if source_date_epoch is set. However, we can simplify
the code by taking the override_date variable out of the picture: we can
just as well initialize source_date_epoch and treat a non-zero value as
meaning enabled.
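A rough sketch of the idea, with assumed names: effective_build_time() is hypothetical, and `sde` stands for the raw SOURCE_DATE_EPOCH string (NULL when unset). The variable is always initialized, and a non-zero value doubles as the "enabled" flag.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: pick the timestamp to use for the build.
 * `sde` is the raw SOURCE_DATE_EPOCH string, or NULL when unset. */
static time_t effective_build_time(const char *sde, time_t now)
{
    time_t source_date_epoch = 0;   /* always initialized; 0 means "not set" */

    if (sde != NULL) {
        char *end = NULL;
        long long v = strtoll(sde, &end, 10);
        /* accept only a fully numeric, positive value */
        if (end != sde && *end == '\0' && v > 0)
            source_date_epoch = (time_t)v;
    }
    /* no separate override flag: non-zero doubles as "enabled" */
    return source_date_epoch ? source_date_epoch : now;
}
```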
%_changelog_trimtime is an absolute timestamp which needs to be constantly
pushed forward to preserve the same relative age, and will start trimming
entries from unchanged packages until none are left, leading to unexpected
and confusing behavior (RhBug:1722806, ...)
It's better to trim by age relative to newest changelog entry. This way the
number of trimmed entries will not change unless the spec changes, and at
least one entry is always preserved. Introduce a new %_changelog_trimage
macro for this and mark the broken by design %_changelog_trimtime as
deprecated, but autoconvert an existing trimtime into relative for now.
As a seemingly unrelated change, move the "time" variable declaration
to a narrower scope to unmask the time() function for use on entry.
Fixes: #1301
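Trimming by relative age can be sketched like this. It is an illustration only: changelog_keep() is a hypothetical helper, entries are assumed to be sorted newest first, and treating a zero age as "no trimming" is an assumption.

```c
#include <stddef.h>
#include <time.h>

/* Entries are assumed sorted newest first, as in a spec %changelog.
 * Returns how many leading entries to keep. */
static size_t changelog_keep(const time_t *when, size_t n, time_t trimage)
{
    size_t keep;

    if (n == 0)
        return 0;
    if (trimage <= 0)
        return n;                   /* assumed: zero disables trimming */
    /* the newest entry always survives: its age relative to itself is 0 */
    for (keep = 1; keep < n; keep++) {
        if (when[0] - when[keep] > trimage)
            break;
    }
    return keep;
}
```

Because the cutoff is measured from the newest entry rather than from the current time, rebuilding an unchanged spec a year later keeps exactly the same entries.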
Otherwise executables that are not proper ELF files leak libelf
handles. This results in the file being left open (mmap'ed) and fails the
build on NFS as those files can't be deleted properly there.
Resolves: rhbz#1840728
See also: https://bugzilla.redhat.com/show_bug.cgi?id=1840728
The symlinks do occupy space so they should be counted, except for
hardlinks to symlinks.
Rpm's own disk-space accounting is not affected, it always looks at
individual file sizes rather than the total size.
Previously only the legacy external dependency generator listened to
exit codes from the generator, and even that only for provides.
Anybody building packages will want to know if generators barf up
for one reason or another. Let them.
Always call rpmfcExec() with failnonzero set, pass errors around, add tests.
Fixes: #1183
The pkg variable used in the parallel loop was declared outside
of the omp parallel construct, so it was shared among tasks. This
had the potential to cause a data race. The gcc openmp implementation
did not hit this problem, but I uncovered it while trying to compile with
clang. My best guess as to what was happening is that after the last
task was launched, all tasks had the same value of pkg and were operating
on the same data at the same time.
This patch declares the variable inside the omp parallel construct, so each
task gets its own copy of the variable.
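A simplified sketch of the fixed shape, with a hypothetical struct pkg standing in for the real package type. The pointer is declared inside the parallel construct, so each task captures its own copy. Built without OpenMP support the pragmas are ignored and the loop runs serially.

```c
#include <stddef.h>

/* Hypothetical stand-in for a package: just a size and a result slot. */
struct pkg {
    int size;
    int built;
};

static void build_all(struct pkg *pkgs, size_t n)
{
    #pragma omp parallel
    #pragma omp single
    for (size_t i = 0; i < n; i++) {
        /* Declared inside the parallel construct, so it is not shared;
         * each task captures its own copy at creation time. */
        struct pkg *pkg = &pkgs[i];

        #pragma omp task firstprivate(pkg)
        pkg->built = pkg->size > 0;
    }
}
```

Had pkg been declared before the parallel construct, every task would read the same shared variable while the generating thread keeps overwriting it.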
RPM build did not fail if rpmlint (%_build_pkgcheck_set) failed when checking binary RPMs
(it did fail correctly when rpmlint failed when checking SRPMs)
If %_nonzero_exit_pkgcheck_terminate_build is true, then the build must fail, otherwise it must not.
This regressed and the build never failed.
Probably fixes regression introduced by 78f61f273 ("Refactor package set checking out of packageBinaries()")
It was already possible to generate Requires, Recommends etc. using external dependency generators.
Add the ability to generate OrderWithRequires as well.
Example use case:
When a package contains a systemd unit, %systemd_* macros are usually used;
it is useful to add "OrderWithRequires: systemd" in this case to ensure
that systemd is installed before that package.
It will help to avoid adding "%systemd_ordering" manually to all packages using systemd.
Having systemd preinstalled before packages with systemd scriptlets is really important,
otherwise those scriptlets fail silently, and the resulting ISO or chroot may be broken.
The same makes sense for e.g. systemd-sysusers, systemd-tmpfiles.
An RPM generator using this functionality was implemented: https://abf.io/import/order-rpm-generators
Rebuilding packages with systemd stuff in rosa2019.05 using this generator has already helped
to improve installation order, e.g. openvpn now gets installed when systemd already exists
in a big transaction with ~3500 packages when building a big ISO image.
Before that openvpn was installed when systemd did not exist yet.
P.S. This patch adds %__find_orderwithrequires, maybe that is legacy code that should not be changed.
Signed-off-by: Mikhail Novosyolov <m.novosyolov@rosalinux.ru>
NSS is a behemoth of a library which drags in a whole runtime subsystem
of its own which is often at odds with normal Unix system behavior
(hello SIGPIPE). Now that we have nicer alternatives available there's
little reason to lug this baggage along. NSS was deprecated in rpm 4.16
(commit 0b9efb93fb).
Binary packages come in different sizes and so their build time can vary
greatly. Dynamic scheduling, which we currently use for parallel
building, is a good strategy to combat such differences and load-balance
the available CPU cores.
That said, knowing that the build time of a package is proportional to
its size, we can reduce the overall time even further by cleverly
ordering the task queue.
As an example, consider a set of 5 packages, 4 of which take 1 unit of
time to build and one takes 4 units. If we were to build these on a
dual-core system, one possible unit distribution would look like this:
    TIME --->
    CPU 1  * * * * * *    # package 1, 3 and 5
    CPU 2  * *            # package 2 and 4
Now, compare that to a different distribution where the largest package
5 gets built early on:
    TIME --->
    CPU 1  * * * *        # package 5
    CPU 2  * * * *        # package 1, 2, 3 and 4
It's obvious that processing the largest packages first gives better
results when dealing with such a mix of small and large packages
(typically a regular package and its debuginfo counterpart,
respectively).
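The two distributions above can be reproduced with a small simulation. This models dynamic scheduling as plain greedy list scheduling (each task goes to whichever CPU frees up first), which is an assumption about the runtime and not something OpenMP guarantees; makespan() is a hypothetical helper.

```c
#include <stddef.h>

#define MAX_CPUS 16

/* Greedy list scheduling: each task, in queue order, goes to the CPU that
 * becomes free first. Returns the time at which the last CPU finishes,
 * or -1 if ncpu is out of range. */
static int makespan(const int *duration, size_t ntasks, size_t ncpu)
{
    int busy_until[MAX_CPUS] = { 0 };
    int end = 0;

    if (ncpu == 0 || ncpu > MAX_CPUS)
        return -1;
    for (size_t t = 0; t < ntasks; t++) {
        size_t first = 0;
        /* find the CPU that is free earliest (lowest index wins ties) */
        for (size_t c = 1; c < ncpu; c++)
            if (busy_until[c] < busy_until[first])
                first = c;
        busy_until[first] += duration[t];
        if (busy_until[first] > end)
            end = busy_until[first];
    }
    return end;
}
```

With two CPUs, the queue order {1, 1, 1, 1, 4} finishes after 6 units, while {4, 1, 1, 1, 1} finishes after 4.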
Now, with dynamic scheduling in OpenMP, we cannot directly control the
task queue; we can only generate the tasks and let the runtime system do
its work. What we can do, however, is to provide a hint to the runtime
system for the desired ordering, using the "priority" clause.
So, in this commit, we use the clause to assign a priority value to each
build task based on the respective package size (the bigger the size,
the higher the priority), to help achieve an optimal execution order.
Indeed, in my testing, the priorities were followed to the letter (but
remember, that's not guaranteed by the specification). Interestingly,
even without the use of priorities, simply generating the tasks in the
desired order resulted in the same execution order for me, but that's,
again, just an implementation detail.
Also note that OpenMP is allowed to stop the thread generating the tasks
at any time, and make it execute some of the tasks instead. If the
chosen task happens to be a long-duration one, we might hit a starvation
scenario where the other threads have exhausted the task queue and
there's nobody to generate new tasks. To counter that, this commit also
adds the "untied" clause which allows other threads to pick up where the
generating thread left off, and continue generating new tasks.
Resolves: #1045
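A reduced sketch of the priority hint, not the code from this commit: build_one() and build_by_size() are hypothetical, the package size is used directly as the priority value, and the untied clause is left out of the sketch. Built without OpenMP support the pragmas are ignored and the loop runs serially.

```c
#include <stddef.h>

/* Hypothetical per-package work: a placeholder computation. */
static long build_one(int size)
{
    long acc = 0;
    for (int i = 1; i <= size; i++)
        acc += i;
    return acc;
}

static void build_by_size(const int *size, long *result, size_t n)
{
    #pragma omp parallel
    #pragma omp single
    for (size_t i = 0; i < n; i++) {
        /* Larger packages get a larger priority value. This is only a
         * hint: the runtime may still run tasks in any order. */
        #pragma omp task priority(size[i]) firstprivate(i)
        result[i] = build_one(size[i]);
    }
}
```

The results do not depend on the execution order, so the hint can only affect how long the whole batch takes.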
In today's "look ma what crawled from under the bed" episode, we
encounter a subpackage whose name is not derived from the main package
name, and a manually specified debuginfo package on that. This particular
combo manages to evade all our checks for duplicate package names, and
in the right phase of the moon actually creates corrupt packages due to
two threads ending up writing to the same output file simultaneously. Which
is what happened in https://pagure.io/fedora-infrastructure/issue/8816
Catch the case and use the spec-defined variant (because getting rid
of it would be harder) but issue a warning because most likely this
is not what they really wanted.
'fn' already contains full path to a file, so no need to prepend it once
more. This is actually breaking things.
Before:
D: Calling %{__pythonname_provides %{?__pythonname_provides_opts}}() on /home/brain/rpmbuild/BUILDROOT/hello-1-1.fc33.x86_64//home/brain/rpmbuild/BUILDROOT/hello-1-1.fc33.x86_64/usr/share/applications/org.gnome.Terminal.desktop
After:
D: Calling %{__pythonname_provides %{?__pythonname_provides_opts}}() on /home/brain/rpmbuild/BUILDROOT/hello-1-1.fc33.x86_64/usr/share/applications/org.gnome.Terminal.desktop
Fixes: https://github.com/rpm-software-management/rpm/issues/1162
Signed-off-by: Igor Raits <i.gnatenko.brain@gmail.com>
Commit cb4e5e755a added flags arguments
to rpmExprBool() and rpmExprStr(), but unfortunately rpm 4.15 sailed
with flagless versions of them. It's extremely unlikely that anything out
there is actually using these, but then you never really know.
Rpm soname bumps are so inconvenient that we really do not want to do
that just for these two, so preserve binary compatibility and restore
flagless variants of both, adjust internal code to use flagged versions
always. If only we had symbol versioning, sigh.
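The compatibility pattern looks roughly like this. The names expr_bool(), expr_bool_flags() and EXPR_FLAG_STRICT are hypothetical and the evaluation logic is a placeholder; only the shape (old symbol kept as a thin wrapper around the flagged one) reflects the description above.

```c
#include <string.h>

/* Hypothetical names and flag; not the rpm API. */
#define EXPR_FLAG_STRICT (1 << 0)

/* New entry point: behavior depends on flags. */
int expr_bool_flags(const char *expr, int flags)
{
    if (expr == NULL || *expr == '\0')
        return (flags & EXPR_FLAG_STRICT) ? -1 : 0;
    return strcmp(expr, "0") != 0;
}

/* Old entry point, kept with its original signature so binaries linked
 * against the previous release still resolve the symbol. */
int expr_bool(const char *expr)
{
    return expr_bool_flags(expr, 0);
}
```

Internal callers use the flagged variant only, so the wrapper exists purely for outside users of the old symbol.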
In addition to "magic" strings, support classifying files by matches
on their MIME types which are far more predictable than the notoriously
volatile magic strings which are intended for human consumption.
This adds optional %__foo_mime and %__foo_exclude_mime patterns
to file attributes. If the mime variant is present, the magic pattern,
if any, is ignored (with a warning).
The testcase is a fine example of the grass not necessarily being
any greener on the other side: the actual output is far more predictable,
but the actual classification is not. Our "script" with an arbitrary
unknown interpreter is considered text/plain by libmagic.
Fixes: #1097
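The precedence rule can be illustrated with POSIX regular expressions. This is a sketch under assumptions: struct file_attr and both helpers are hypothetical, exclude patterns are left out, and the sample strings are made up for the example.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical file attribute: either pattern may be NULL. */
struct file_attr {
    const char *magic;   /* regex on the libmagic description string */
    const char *mime;    /* regex on the MIME type */
};

/* Return 1 if `text` matches the extended regex `pattern`, else 0. */
static int re_match(const char *pattern, const char *text)
{
    regex_t re;
    int hit;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;
    hit = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return hit;
}

/* When a MIME pattern is set it wins and the magic pattern is ignored. */
static int attr_matches(const struct file_attr *a,
                        const char *magic_str, const char *mime_str)
{
    if (a->mime != NULL)
        return re_match(a->mime, mime_str);
    if (a->magic != NULL)
        return re_match(a->magic, magic_str);
    return 0;
}
```

A script classified as text/plain would not match a MIME pattern for shell scripts even though its magic string still says "executable".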
libmagic strings are notoriously unreliable as the details change from
version to version. We link to libelf anyway so we might as well get the
info straight from the horse's mouth.
Besides being more reliable, this detaches the coloring business from
the hardcoded rpmfcTokens struct and informative-only FILECLASS contents,
opening the door for other changes in that area.
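To show what "straight from the horse's mouth" means, here is a self-contained sketch that reads the ELF identification bytes directly. Note the swap: the change itself uses libelf, while this example inspects the header by hand to avoid the library dependency. The COLOR_* names and elf_color() are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical color values for 32-bit and 64-bit ELF files. */
#define COLOR_NONE  0
#define COLOR_ELF32 1
#define COLOR_ELF64 2

/* Inspect the ELF identification bytes at the start of a file image.
 * Byte 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects. */
static int elf_color(const unsigned char *data, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (len < 5 || memcmp(data, magic, sizeof(magic)) != 0)
        return COLOR_NONE;
    switch (data[4]) {
    case 1:  return COLOR_ELF32;
    case 2:  return COLOR_ELF64;
    default: return COLOR_NONE;
    }
}
```

Unlike a human-readable description string, the class byte has a fixed position and meaning, so the result does not vary between libmagic versions.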
Immaterial %ghosts (ie those that only exist in spec, not disk) trip
up ENOENT errors which end up in package headers, so we have gems like
this in CLASSDICT:
cannot open `/builddir/build/BUILDROOT/crypto-policies-20191128-2.gitcd267a5.fc31.noarch/etc/crypto-policies/back-ends/krb5.config' (No such file or directory)
Treat errors in classify as errors, and filter out ENOENT to handle
the above case, others we'll want to have rpm error or at least warn on.
Notably this trips up an error from a symlink loop in two of our test-cases
that has gone unnoticed until now.
Add a new "meta" qualifier for expressing dependencies that are not
concrete install- or run-time dependencies, and thus should not take
part in install ordering.
There are quite a lot of such dependencies in the wild, for example
versioned sub-package cross-dependencies are typically of this type
and a common source of unnecessary dependency loops. Another common
case is dependencies of meta-packages.
In some cases generators are remarkably simple, such as just echoing
back the basename of the file in some namespace such as foo(lib.so),
and forking out a shell to perform such a mundane task is both hideously
slow and plain dumb, when we have quite some string processing facilities
and even a full-blown programming language embedded in rpm itself.
This adds support for using macro functions as generators: if the
generator macro is a parametric macro, then we call that macro
with the file name as the first argument instead of shelling out,
and the expansion of the macro is used as the output. Multiple lines
in output are allowed, and generator styles can be mixed freely
(eg shell out for provides but use macro function for requires etc).
Constructing the expand call with "proper" macro arguments runs into all
sorts of trouble with special character escaping, work around that by
defining the file name as literal macro %1 prior to calling the
generator. Which is a bit dirty but works...
popt always returned malloc'ed memory for POPT_ARG_STRING items, but
for whatever historical reason rpm systematically passed const char *
pointers as targets, making them look non-freeable. Besides changing
just the types and adding free()'s, const-correctness requires extra
tweaks as there's mixed use from string literals and poptGetArg() which
does return const pointers.
Since requires in source rpms are buildrequires, by the same logic
provides in source rpms are buildprovides, and what does a build
provide if not the names of the packages to be generated?
This seems like a natural fit, should be useful for various purposes, and
eliminates the need for a special rpmspecQuery() with RPMQV_SPECBUILTRPMS
to get that information.
Store the file type strings in the classifier, and generate the dictionary
and its ids serially after the parallel section completes to ensure
stable order. Besides making the classifying really run in parallel
again, this also moves the pool- and file-counting related constraints
out of the parallel section for theoretically better parallelization.
Fixes: #934
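The two-phase approach can be sketched as below. Everything here is illustrative: classify() is a hypothetical stand-in for the real classifier, and the dictionary is a plain array that the caller sizes for the worst case of n distinct types. Built without OpenMP support the pragma is ignored and both loops run serially.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classifier: a fixed mapping from a file name suffix. */
static const char *classify(const char *name)
{
    size_t n = strlen(name);
    if (n > 3 && strcmp(name + n - 3, ".so") == 0)
        return "ELF shared object";
    return "ASCII text";
}

/* Fill ftype[] in parallel, then number distinct types serially in file
 * order. dict[] must have room for n entries. Returns the dictionary
 * size; id[i] indexes into dict[]. */
static size_t build_classdict(const char **files, size_t n,
                              const char **ftype, size_t *id,
                              const char **dict)
{
    size_t ndict = 0;

    #pragma omp parallel for
    for (size_t i = 0; i < n; i++)
        ftype[i] = classify(files[i]);   /* each slot written by one thread */

    /* serial pass: ids depend only on file order, not on thread timing */
    for (size_t i = 0; i < n; i++) {
        size_t d;
        for (d = 0; d < ndict; d++)
            if (strcmp(dict[d], ftype[i]) == 0)
                break;
        if (d == ndict)
            dict[ndict++] = ftype[i];
        id[i] = d;
    }
    return ndict;
}
```

Since the parallel part only writes to per-file slots, no ordering or locking is needed there, and the dictionary comes out the same on every run.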
The order directive might be useful in some cases, but for our purposes
it very effectively serializes the whole classification operation.
Which means that we get the speed of serial classification with the
complexity of parallel execution, ugh. Revert, we need a better fix.
This reverts commit 3691d99c8b.
The order of file classification isn't interesting in itself, but arbitrary
order makes contents of RPMTAG_CLASSDICT non-deterministic which is not
nice for reproducible builds. Tell OMP to handle the class dictionary
in order.
Cancellation points are not allowed in ordered construct so we need to
drop that. It doesn't change the actual results, just means that we run
a little longer in case errors are encountered.
Fixes: #934
Commit fa303d5ba6 moved buildhost and
buildtime calculation out of the package generation to early spec
initialization, but this broke reproducible builds: if buildtime is
to be set from changelog, changelog needs to be parsed first.
So either we need to do it twice or we need to do it right, and
besides avoiding duplication, conceptually these values are only
meaningful during a build and not a parse, so this restores that part
of the original code while keeping things thread-safe.
Fixes: #932
Commit e68eb68c4a introduced a regression
on Icon tag causing a crash on source rpm build, due to spec->numSources
being off by one if an icon was present.
A nicer fix would be eliminating numSources entirely but it's not as
easy as it should be due to dynamic buildrequires messing with it,
leaving that for another time.
-P can appear multiple times so a string arg pointer is not the right
thing here in any case. There are other similar and related leaks all
over the codebase but this is especially insulting as the leaked pointer
was never used for anything at all.
Thanks to Peter Jones for pointing this out.