The previous commit removed all *_Type variables.
This adds them back, but as *pointers* to Python type objects
rather than type objects themselves.
All usages now need an extra pointer dereference (or the removal of an
address-of).
(The compiler will complain about type mismatches if we miss a spot,
so I'm OK using the original names.)
Also, add header declarations for the Spec structs.
In the stable ABI Python's type objects aren't defined in structs
that reference other structs, but by an array of "slots" from which
type objects are created.
(This commit was generated semi-automatically. The automation
is too messy and bespoke to share widely, but if you're interested
in undocumented work in progress, look in
https://github.com/encukou/api-limiter/tree/rpm-mess )
The Python type struct changes between versions, and thus direct access to
its fields is not allowed in the stable ABI.
The members can, however, be retrieved using `PyType_GetSlot`.
Commit generated with this semantic patch (see https://coccinelle.lip6.fr ):
```
@@
identifier s;
type t;
@@
- Py_TYPE(s)->tp_free((t*) s);
+ Py_TYPE(s)->tp_free(s);
@@
@@
- Py_TYPE(s)->tp_free(s);
+ freefunc free = PyType_GetSlot(Py_TYPE(s), Py_tp_free);
+ free(s);
```
applied using:
spatch --sp-file patch.cocci --dir python/ --in-place
Python 2 will be EOL by the time of the next major rpm release, so it is
time to retire the Python 2 bindings. Specifically, we require
Python >= 3.1 for surrogateescape support.
In the almost ten years of rpm sort of supporting Python 3 bindings, quite
obviously nobody has actually tried to use them. There's a major mismatch
between what the header API outputs (bytes) and what all the other APIs
accept (strings), resulting in hysterical TypeErrors all over the place,
including but not limited to labelCompare() (RhBug:1631292). Also a huge
number of other places have been returning strings and silently assuming
utf-8 through use of Py_BuildValue("s", ...), which will just irrevocably
fail when non-utf8 data is encountered.
The politically Python 3-correct solution would be declaring all our data
as bytes with unspecified encoding - that's exactly what it historically is.
However doing so would by definition break every single rpm script people
have developed on Python 2. And when 99% of the rpm content in the world
actually is utf-8 encoded even if it doesn't say so (and in recent times
packages even advertise themselves as utf-8 encoded), the bytes-only route
seems a wee bit too draconian, even to this grumpy old fella.
Instead, route all our string returns through a single helper macro
which on Python 2 just does what we always did, but in Python 3 converts
the data to surrogate-escaped utf-8 strings. This makes stuff "just work"
out of the box pretty much everywhere even with Python 3 (including
our own test-suite!), while still allowing the non-utf8 case to be handled.
Handling the non-utf8 case is a bit uglier but still possible,
which is exactly how you want corner-cases to be. There might be some
uses for retrieving raw byte data from the header, but worrying about
such an API is a case for some other rainy day; for now we mostly only
care that stuff works again.
Also add test-cases for mixed data source labelCompare() and
non-utf8 insert to + retrieve from header.
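As a rough illustration of what this looks like from the Python side (a
sketch only: the helper names below are hypothetical and not part of the
rpm bindings, which do the conversion in C), surrogate-escaped decoding
lets non-utf8 bytes survive a round trip through str:

```python
def header_bytes_to_str(raw: bytes) -> str:
    # Hypothetical helper: undecodable bytes become lone surrogates
    # instead of raising UnicodeDecodeError.
    return raw.decode("utf-8", errors="surrogateescape")


def str_to_header_bytes(text: str) -> bytes:
    # Hypothetical inverse: lone surrogates are turned back into the
    # original bytes.
    return text.encode("utf-8", errors="surrogateescape")


valid = "na\u00efve".encode("utf-8")  # well-formed utf-8
legacy = b"caf\xe9"                   # Latin-1 data, not valid utf-8

# Valid utf-8 decodes as usual.
assert header_bytes_to_str(valid) == "na\u00efve"
# Invalid data is preserved and can be recovered byte-for-byte.
assert str_to_header_bytes(header_bytes_to_str(legacy)) == legacy
```

The common utf-8 case needs no special handling by the caller, while the
corner case costs one explicit encode with the same error handler.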
These lines within python/rpmfd-py.c: rpmfdFromPyObject
are the wrong way around:
    Py_DECREF(fdo);
    PyErr_SetString(PyExc_IOError, Fstrerror(fdo->fd));
If fdo was allocated by the call above to PyObject_CallFunctionObjArgs,
it may have an ob_refcnt == 1, and thus the Py_DECREF() frees it, so
fdo->fd is reading from deallocated memory.
Signed-off-by: Ales Kozumplik <akozumpl@redhat.com>
- Various places within the bindings use PyArg_ParseTuple[AndKeywords] to
extract (char*) string arguments. These are pointers to the internal
representation of a PyStringObject, and shouldn't be modified, hence
it's safest to explicitly mark these values as (const char*), rather
than just (char*).
Signed-off-by: Ales Kozumplik <akozumpl@redhat.com>
- Various functions in the Python bindings have expressions of the form:
    PyObject_Call(callable,
                  Py_BuildValue(fmtstring, ...), NULL);
This leaks memory for the case when Py_BuildValue succeeds (it returns a
new reference, which is never freed; PyObject_Call doesn't steal the
reference): the argument tuple and all of its components will not be
freed (until the process exits).
Signed-off-by: Ales Kozumplik <akozumpl@redhat.com>
- Similarly to Python file objects having o.name, export Fdescr()
as fd.name. Python uses <foo> for non-paths but [foo] seems like
a safer choice wrt accidental redirections.
- Also add a basic testcase for fd.name
- FDs are not really immutable, so move the initialization work into
tp_init and use PyType_GenericNew for tp_new since we're not
doing anything special there.
- Remove half a dozen different unnecessary exit points
- this permits any file-like object implementing .fileno() method
(including rpm.fd) to be dup'ed, not just PyFile subtypes
- this also avoids yet another incompatibility with Python 3 which doesn't
have PyFile_Check() and PyFile_AsFile() at all
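Seen from Python, the duck-typed check amounts to something like the
following sketch. The names dup_fileno and Wrapper are made up for
illustration, and the real conversion is done in C inside the bindings;
this only shows the idea of relying on .fileno() rather than on the
concrete file type:

```python
import os
import tempfile


def dup_fileno(obj):
    """Duplicate the descriptor of any object exposing fileno()."""
    fileno = getattr(obj, "fileno", None)
    if fileno is None:
        raise TypeError("object has no fileno() method")
    return os.dup(fileno())


class Wrapper:
    """Hypothetical file-like object that is not a file subtype."""

    def __init__(self, f):
        self._f = f

    def fileno(self):
        return self._f.fileno()


with tempfile.TemporaryFile() as f:
    fd = dup_fileno(Wrapper(f))  # accepted despite not being a real file
    os.close(fd)                 # the duplicate is ours to close
```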
- In Python 2.6 PyBytes is just an alias for PyString; Python 3.0
removed PyString entirely
- Add compatibility defines for Python < 2.6
- Based on David Malcolm's Python 3.x efforts
The layout of PyVarObject changed between Python 2 and Python 3, and this leads
to the existing code for all of the various PyTypeObject initializers failing to
compile with Python 3.
Change the way we initialize these structs to use PyVarObject_HEAD_INIT directly,
rather than merely PyObject_HEAD_INIT, so that it compiles cleanly with both major
versions of Python.
Python 2's various object structs use macros to implement common fields at the top of each
struct.
Python 3's objects instead embed a PyObject struct as the first member within the more refined
object structs.
Use the Py_TYPE() macro when accessing ob_type in order to encapsulate this difference.
- turn rpmfdFromPyObject() into a python-level object converter, add
a separate C-level getter for the fd pointer itself
- take advantage of python refcounting to handle differences between
native vs converted rpm.fd in callers so we can simply decref the
rpmfdFromPyObject() result without having to worry whether it was
converted or not (i.e. whether we should close it or not)
- attempt to mimic python file object where possible, but nowhere near
all methods are supported, either just not yet done or due to
underlying limitations