We need to support both fd-based and (dirfd+) path based operations
due to all the lovely mismatches in POSIX, so lotsa half-duplicated
tedious stuff here.
As of this commit, we only use fd based ops for regular files.
Notably cap_set_file() doesn't have a dirfd-based mode, to handle that
safely we'll need to use fd-based operation. Which would be nicer anyhow
but symlinks can't be opened so we'll have to carry the dirfd/path based
mode forever more anyhow (yes Linux has extensions but that's another
story).
fsmUnpack() is the only place in FSM that needs to deal with rpmio FD
types, everywhere else they are nothing but a hindrance that need to
be converted to OS level descriptors for use. Better deal with OS
level descriptors to begin with.
It doesn't make much sense to call plugins for files that wont be
unpacked at all, and in particular it wont make much sense to do the
entire directory dance just to be able to pass meaningful path values
to plugins. So from now we'll only be calling file-pre for things that
we're about to lay down, which it how it used to be before splitting
the stages anyhow.
This isn't ideal from the sense that some files may get a success post
call while something later can still fail, but things get even weirder
with doing it in a separate round where things could fail because of
a vanished directory and then we'd still need to call the plugin hook
with some result. Also, this lets us skip the backwards walk on the
normal case of success, which is nice.
Commit a82251b44e moved metadata setting
to a separate step because there are potential benefits to doing so, but
the current downsides are worse: as long as we operate in potentially
untrusted directories, we'd need to somehow verify the content is what we
initially laid down to avoid possible privilege escalation from non-root
owned directories.
This commit does not fix that vulnerability, only makes the window much
smaller and paves the way for the real fix(es) without introducing a
second round of directory tree validation chase to the picture.
Fix the initial setmeta value to something meaningful: we will never
set metadata on skipped files, and hardlinks are handled with a special
logic during install. They'd need different kind of special logic on
FA_TOUCH so just play it safe and always apply metadata on those.
Harlink metadata setting on install should happen on the *last* entry
of hardlinked set that gets installed (wrt various skip scenarios)
as otherwise creating those additional links affects the timestamp.
Note in particular the "last file of..." case in fsmMkfile() where we
the comment said just that, but set the metadata on the *first* file
which would then be NULL'ed away.
This all gets current masked by the fact that we do the metadata setting on
a separate round, but that is about to change plus this makes the overall
logic clearer anyhow.
Handling this in a separate clause makes the logic much clearer and
(in theory at least) lets us handle hardlinks to any content, not
just regular files.
Any unowned directories will be created inline during processing now
so we can just flush this big pile of code that was insecure anyhow.
As an additional bonus creating the directories inline gives us an
opportunity to track the creation so we can undo too, but that is
not done here.
Whenever directory changes during unpacking, walk the entire tree from
starting from / and validate any symlinks crossed, fail the install
on invalid links.
This is the first of step of many towards securing our file operations
against local tamperers and besides plugging that one CVE, paves the way
for the next step by adding the necessary directory fd tracking.
This also bumps the rpm OS requirements to a whole new level by requiring
the *at() family of calls from POSIX-1.2008.
This necessarily does a whole lot of huffing and puffing we previously
did not do. It should be possible to cache secure (ie root-owned)
directory structures to avoid validating everything a million times
but for now, just keeping things simple.
This fixes how RPM handles packages that contain a header signature, but
neither header+payload signature nor payload digests. Such packages are
obviously not properly signed, but RPM previously accepted them.
This could be used to confuse both ‘rpmkeys -K’ and old versions of DNF.
Both would report that the package has been properly signed even when it
has not. The included regression tests demonstrates the change in
behavior.
If the i18ntable was smaller than the i18nstring entry an out of bounds
read could result. This should not happen in a valid package, but even
if RPM rejected such packages during load, this situation could still
result as a result of usage of the RPM API.
At least ECDSA and RSA signatures can vary in length, but the IMA code
assumes constant lengths and thus may either place invalid signatures on
disk from either truncating or overshooting, and segfault if the stars are
just so.
As we can't assume static lengths and attempts to use maximum length
have proven problematic for other reasons, use a data structure that
can actually handle variable length data properly: store offsets into
the decoded binary blob and use them to calculate lengths when needed,
empty data is simply consequtive identical offsets. This avoids a whole
class of silly overflow issues with multiplying, makes zero-length data
actually presentable in the data structure and saves memory too.
Add tests to show behavior with variable length signatures and missing
signatures.
Additionally update the signing code to store the largest IMA signature
length rather than what happened to be last to be on the safe side.
We can't rely on this value due to invalid packages being out there,
but then we need to calculate the lengths on rpmfiles populate so there's
not a lot to gain anyhow.
Fixes: #1833
Up to now, rpmfiSetFX() has returned the previous file index on success,
and -1 on error. Which seems okay on the outset, but on a just
initialized iterator the file index is at -1 which means the returned
-1 sometimes indicates an error and sometimes success. This is so broken
that none of the callers even try to use it (grep for it). Which is
lucky in the sense that it means we can change it.
Simply return the newly set index on success and -1 on error, it may
not be the greatest return code on earth but at least it's
non-ambiguous.
The directory index must only be changed in sync with file iteration,
otherwise you get garbage out. The NOTYET'ed rpmfiDI() seems to suggest an
idea to have a separate mode of directory-only iteration in which context
rpmfiSetDX() would've perhaps made sense, but as it is this is plain
dangerous. Thankfully these APIs so broken that there can be no
legitimate users, so we can just turn them into a no-op until we have
to bump the soname next time.
--setperms, --setugids and --setcaps were fun demos of alias capabilities
in the nineties, but they can be downright dangerous when used
separately, are blisfully unaware of all state in rpm yet try to
duplicate functionality existing in C, and thus are a constant source
of bugs that are between hard to impossible to fix in the alias space.
Add a new transaction element type for the restore operation, wire
through all the necessary places. In places (like ordering) this is
an overkill but otherwise it seems like a natural thing to be able
to process restore alongside package install/remove. The restore
operation is a cross between install and erase codepath-wise so touches
some funny places, but FA_TOUCH does just the thing, and now all the
regular disablers like --nocontext and --nocaps can be used if
necessary, plugins get to do their work and also timestamps are
restored.
Remove the dangerous shell implementations of things and just make them
aliases to --restore.
Fixes: #965
Commit b3d672a552 got the base reasoning
in the ballpark but the code all wrong, introducing a severe performance
regression without actually fixing what it claimed to.
The missing incredient is actually comparing the current prefix with the
triggers in matched package (trying to describe this makes my head
spin): a package may have multiple triggers on multiple prefixes and
we need to make sure we only execute triggers of this type, from this
prefix.
This stuff really needs more and better testcases.
Fixes: b3d672a552
The rpmdb cookie is not a security feature, but as these existing
hashes are more convenient than coming up with our own... we then
run into the great big wall of FIPS which in its current incarnation
disallows use of SHA1. And so rpmdbCookie() fails under current FIPS.
Just bumping the algorithm to SHA256 seems the path of lowest
resistance, whether that algo makes sense for this purpose or not.
Sometimes you just want to extract the files without touching the
database, just like sometimes you just want the database changed.
Potential use-cases include rpm2cpio style operation and src.rpm
install.
How do we even _know_ user wants to debug malloc in rpmbuild,
maybe user wants to debug it in _the child_?
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Pass relocatable package prefixes as RPM_INSTALL_PREFIX Lua table
to Lua scriptlets, add a test-case.
In Lua, indexes start at 1 rather than 0, so the numbers appear
off by one which is annoying, but consistent within Lua and our other
similar Lua constructs such as scriptlet arg.
Fixes: #1531
Most qualifiers make as much sense to weak dependencies as they do for
normal dependencies, so we'd like to allow them in packages. Rpmbuild
doesn't currently allow them but packages from other implementations may
have them anyway.
To protect rpm's primary means of dependency loop cutting, don't let
weak dependencies inflate the value of pre/post etc dependencies by
simply ignoring those flags.
Commit 13f70e3710 caused minimize_writes
to actually not minimize anything since fsmVerify() only "verifies"
the thing does NOT exist anymore when it exist. Sigh.
FA_TOUCH needs different kind of verification, stat the file instead
to see if it needs creating afterall. This is all soooo broken...
Fixes: #1881
rpm.next_file() is only meaningful at all when there's a backing
function set for the scriptlet, so only expose it at those times.
We shouldn't be touching the rpm namespace at all here, but
it's required for backwards compatibility.
As the nextFileFunc is something that often is not there, this makes
testing for nextFileFunc presence a little more obvious and easy.
No actual functional change.
This screams for a closure, use one. This is not supposed to change
any functionality except, except only make the rpm.next_file() function
available in the context of scriptlets which is the only context it is
meaningful.
This is mainly to place *some* limit on number of files a package may
have. It's unlikely that rpm scales to million files in a package but
that much may actually work, whereas closer to int32 limits we run into
all manner of issues with internal types mismatching and whatnot.
Commit a79d7ae0f0 had two major flaws:
Firstly, it negletted to update the part that copies the data in place to
match the new layout where data lengths are not always equal, and the for
loop would overshoot its bounds on data shorter than maximum.
Secondly, rpmfilesFSignature() would now crash on packages with no IMA
signatures because fi->signaturelengths is not allocated. Take care not
to change API behavior wrt *len return value: set to zero if no
signatures are present.
At least ECDSA and RSA signatures can vary in length, but the IMA code
assumes constant lengths and thus may either place invalid signatures on
disk from either truncating or overshooting, and segfault if the stars are
just so.
Luckily the signatures are stored as strings so we can calculate the
actual lengths at runtime and ignore the stored constant length info.
Extend hex2bin() to optionally calculate the lengths and maximum,
and use these for returning IMA data from the rpmfi(les) API.
Additionally update the signing code to store the largest IMA signature
length rather than what happened to be last to be on the safe side.
We can't rely on this value due to invalid packages being out there,
but then we need to calculate the lengths on rpmfiles populate so there's
not a lot to gain anyhow.
Fixes: #1833
When trying to achieve a fully reproducible build process for an
OS image timestamps are a major source of variance. The RPM
database contains two fields in package header records that
are relevant here:
- RPMTAG_INSTALLTIME which is an explicit timestamp indicating
when the package was installed.
- RPMTAG_INSTALLTID which is an opaque value represending the
transaction ID under which a package was installed but in terms
of internal implementation is also a timestamp.
This change allows the presence of the SOURCE_DATE_EPOCH
environment variable, commonly used to override timestamps in
build systems, to institute an "override time" mode in which
values for RPMTAG_INSTALLTID and RPMTAG_INSTALLTIME become
predicatble.
If a package has multiple %transfiletriggerpostun triggers, any one
of them matching would cause all of them to run, due to disconnect
in the intel gathering stage: we'd gather all the headers with matching
files into a lump, and then add any postun triggers found in them,
but this loses the triggering file information and causes all postuns
to run.
The triggers need to be added while looping over the file matches,
like runFileTriggers() does. Doing so actually simplifies the code.
These should really be unified to use the same code, but leaving
that exercise to another rainy day.
Commit 717a3f7ecf changed Lua scriptlet
arguments from numbers to strings, and claimed that Lua will handle
this automatically so nobody will notice a thing. Not so, Lua only
converts automatically when doing so is unambiguous, such as doing
math. In comparison Lua cannot know which kind of comparison we want,
and rightly refuses to convert. Thus, the commit broke any Lua
scriptlets doing `if arg[2] > 1 then ... end` style comparisons.
As the whole rpmluav variable API is gone now, we can't just revert
the commit. Adding a preamble to the scriptlet to convert the args
to numbers may be hackish, but its a far lesser evil than bringing the
internal variable API back. Kudos to Michael Schroeder for the idea.
Fixes: #1790
RPM packages are binary documents and must be signed as such. To avoid
having to access the fields of pgpDigParams() directly, this adds a new
accessor function, pgpSignatureType(). pgpSignatureType() returns the
type of a signature, or -1 if the PGP data is NULL or is not a
signature.
Prior to commit b5e8bc74b2 this was kind
of checked inside the parser but incorrectly.
There are legitimate reasons (such as rhbz#1940895 or the included test)
for wanting the former behavior where all file states were considered in
file queries prior to commit 9ad57bda4a,
so celebrate the tenth anniversary of that commit by adding a CLI switch
(a new package selector --path), as contemplated back then.
Update the man page for --file to reflect it's current behavior and make
--path that more obvious.
Resolves: rhbz#1940895
We now only invalidate the cache if the cached entry gets written
or deleted. This is needed for the code in rpmdbNextIterator()
which first reads the next header and then writes the modified
old header to the database. Therefore we must not free the cached
entry if a modified header with a different id is written.
We must avoid the "database is locked" errors at every cost because
otherwise the rpmdb gets corrupted and system ends up in inconsistent
state.
Resolves: rhbz#1946046
Even though the actual implementation of rpmGlob() does not allocate the
passed arg list (av) if the return code (rc) is non-zero or arg count
(ac) is 0, it's the responsibility of the caller (rpmInstall() here) to
free that memory, so make sure we do that irrespectively of the above
conditions.
Found by Coverity.
As we allow dependeny flags like pre, post, ... in the spec syntax we
should not remove them when reading in the tags from a package.
Thanks to Michael Schröder for spotting this.
Resolves: #1703
In Hash v8 databases page type differs from newer ones to denote
the difference between sorted and unsorted pages.
Fixes reading rpm databases from older distros like SLES 11 and RHEL 5
(RhBug:1965147)
Commit ddb32b9187 added an
extra check that tests if the provide we are checking really
intersects the dependency from rpmdb. Unfortunately the
rpmdsCompare() call does not understand rich dependencies and
will consider them as not intersecting.
Unbreak the check by not doing the intersection test for
rich dependencies. We'll improve this in a later commit.
Also add test cases for dependency problems with installed
rich dependencies.
Commit d6a86b5e69 introduced far stricter
checks on what tags are allowed in signature and main headers than rpm
had previously seen, and unsurprisingly this introduced some regressions
on less common cases:
- On rpm v3 packages and some newer 3rd party created packages (such as
install4j < 9.0.2), RPMTAG_ARCHIVESIZE resides in the main header
to begin with
- In rpm 4.13 - 4.14, file IMA signatures were incorrectly placed in
the main header.
As a quirk, permit the existence of RPMTAG_ARCHIVESIZE,
RPMTAG_FILESIGNATURES and RPMTAG_FILESIGNATURELENGTH in the main header
too provided that the corresponding signature tag is not there (so
they can reside in either but not both headers).
Initial workaround patch by Demi Marie Obenour.
Fixes: #1635
Since commit 376fef14a6 librpmbuild is
calling Lua directly rather than through the rpmio wrapper layer, so
librpmbuild needs LUA_CFLAGS and LUA_LIBS for building. While librpm
does not currently call Lua directly, make native Lua available there
as well as it's only a matter of time.
The implicit database creation even on read-only access is rooted so
deep in so many places that this needs more thought and saner APIs.
While creating databases on read-only access remains a crazy thing to
do, we need to have a proper solution to point people to when changing
the behavior, and right now we don't.
This effectively reverts 86f593d513,
d601a7b7ae and
afbc2b0783, but adopting 4.16.x version
of the sqlite open error message to avoid accessing freed handle, and
leaving the testcase for query on non-existent db in place (just
changing the expected result) as this is a rather fundamental behavior thing.
Commit c3e04c2ac9 tried to address
a theoretical issue with undefined behavior but cause a regression
which was fixed in commit 1efe530450
which introduced another regression: package reading on 32bit
architectures is broken. Enough is enough. Revisit some day with
an actual plan.
This reverts commits c3e04c2ac9 and
1efe530450.
Disabling implicit database creation on read-only handles in commit
afbc2b0783 broke number of handy
use-cases such as install to an empty chroot directory, both with
rpm itself and dnf/yum at least, probably others too.
This minimally resurrects the desired part of the behavior: if people are
asking us to install something, creating a missing database is probably
okay to create without requiring an explicit --initdb action first.
It'll still spit some ugly errors from trying to load the keyring but
at least it'll work. The harmless errors we can try to deal with
separately later on.
RPM_NULL_TYPE is not a legit tag type, but RPM_MIN_TYPE is defined
and used as if it was. Change RPM_MIN_TYPE to 1 to make the MIN-MAX
range actually relevant, in particular for hdrchkType(). Also fix
the lone user who in query code who clearly had encountered this very
issue before but worked around locally...
Reported and initial patch by Demi Marie Obenour.
This introduced regressions rather than fixing something, len *is*
in range because it's capped to header size already.
This reverts commit a1344cf2a5.
The count can never be larger than header data size, which can never be
larger than 256MB. Most datatypes have further restrictions of course, this
is merely an outer perimeter check to catch impossibly large values that
could otherwise overflow all manner of trivial calculations.
Addresses the point I missed in PR #1493 but with a much tighter limit.
The other backends would want to create the missing index, but as bdb_ro
is read-only it can't do that. As the main purpose of bdb_ro is to support
migrating away from BDB for which only the primary database is needed,
it doesn't make sense to fail it for non-essential data. Let it fail
for secondary indexes - this might affect our ability to query but
that's secondary, literally, and we also do emit a warning here.
Fixes: #1576
open() returns -1 on error, not 0, so this wasn't catching what it
was supposed to. Drop the error message as this case is better handled
in the caller.
Only look for known tags, and ensure correct type and size where known
before copying over. Bump the old arbitrary 16k count limit to 16M limit
though, it's not inconceivable that a package could have that many files.
While at it, ensure none of these tags exist in the main header,
which would confuse us greatly.
This is optimized for backporting ease, upstream can remove redundancies
and further improve checking later.
Reported and initial patches by Demi Marie Obenour.
Fixes: RhBug:1935049, RhBug:1933867, RhBug:1935035, RhBug:1934125, ...
Fixes: CVE-2021-3421, CVE-2021-20271
Using filesystem as keyring is an unfinished development idea from 2008
which has somehow lingered on ever after.
The fs keyring behavior has potential security problems as it allows any
package to drop in files at the keyring path, instantly becoming trusted
keys. In some scenarios this may be a desireable feature though, and
apparently some distros (Yocto at least) actually rely on this feature.
The other issue is that the on-disk and rpmdb variants don't play well
together, in fact they don't play together at all.
The rpm keyring needs a thorough redesign but until then we don't want
to force users to adjust to something that is also destined to go away.
So in the meanwhile, let people choose with a macro which behavior they
want, using a macro seems the lowest bar to cross. Add new %_keyring
macro that accepts either `rpmdb` or `fs` values and falls back to `rpmdb`
unknown/unset values.
Fixes: #1543
headerLoad(), headerCopyLoad() and headerUnload() have been deprecated
as unsafe since rpm 4.10, time to start pushing them out for real.
The first two are unsafe because they don't take argument for size,
the last one is because it doesn't return one so it forces using the
unsafe variants.
Since --eval only has one job - to actually *print* something to stdout,
we should fail completely if that's not possible due to an external
error, instead of just not printing anything. One particular case is
redirecting stdout to a file on a file system that's full (generating
the ENOSPC error).
In order to be able to detect such errors, we need to flush the stream
buffer first; that's because normally, if stdout is not referring to an
interactive device (a terminal), it is block-buffered, as opposed to
line-buffered.
Commit fb58884177 introduced a regression
where on database open failure we end up accessing the already freed
handle when trying to report the error.
Now that we're no longer implicitly creating non-existent databases,
just simplify the whole logic to get rid of it. 4.16.x branch will need
a different fix, maybe use sqlite3_errstr() instead of _errmsg().
Reported and initial patch by Demi M. Obenour.
This is one tricksy piece of code, in particular wrt interactions with
skipped files. Streamline the logic to make it easier to follow, and
consolidate as much as possible inside fsmMkfile() so there's only
one place opening, closing and writing to files. Add bunch of tests
for partially skipped sets.
This actually also fixes a case where metadata does not get properly
set on a partially skipped hardlink file set (the --excludepath=/foo/zzzz
case would fail prior to this)
Add debugging facilities to file open and close routines in fsm, rename
wfd_open/close to fsmOpen/Close to bring the messages in line with
our other stuff.
Slight reorganization needed for the error paths, in particular in
case of file close we haven't even been recording the close result
with can be critical when writing files. Add an error code for close
failure and return it from fsmClose(), but updating the calling code
to take this into account is out of scope in this commit.
With the changed logic, the if-clause can fall through without ever
initializing s. The exit code condition is getting more complicated
now so move it to helper variable, assume failure for a safe default.
Fixes: 165330b7bf
The ‘end’ parameter to ‘strtaglen’ might point past the end of an
allocation. Therefore, if ‘start’ becomes equal to ‘end’, exit the loop
without calling ‘memchr’ on it.
Ensure negation comes *after* division; unary ‘-’ has a higher precedence than binary ‘/’ in C. Found by UBSAN and libfuzzer.
Add comment as hint to warn future code readers.