23 Commits

Author SHA1 Message Date
Panu Matilainen 84920f8983 In Python 3, return all our string data as surrogate-escaped utf-8 strings
In the almost ten years of rpm sort of supporting Python 3 bindings, quite
obviously nobody has actually tried to use them. There's a major mismatch
between what the header API outputs (bytes) and what all the other APIs
accept (strings), resulting in hysterical TypeErrors all over the place,
including but not limited to labelCompare() (RhBug:1631292). Also a huge
number of other places have been returning strings and silently assuming
utf-8 through use of Py_BuildValue("s", ...), which will just irrevocably
fail when non-utf8 data is encountered.

The politically Python 3-correct solution would be declaring all our data
as bytes with unspecified encoding - that's exactly what it historically is.
However doing so would by definition break every single rpm script people
have developed on Python 2. And when 99% of the rpm content in the world
actually is utf-8 encoded even if it doesn't say so (and in recent times
packages even advertise themselves as utf-8 encoded), the bytes-only route
seems a wee bit too draconian, even to this grumpy old fella.

Instead, route all our string returns through a single helper macro
which on Python 2 just does what we always did, but in Python 3 converts
the data to surrogate-escaped utf-8 strings. This makes stuff "just work"
out of the box pretty much everywhere even with Python 3 (including
our own test-suite!), while still allowing the non-utf8 case to be handled.
Handling the non-utf8 case is a bit uglier but still possible,
which is exactly how you want corner-cases to be. There might be some
uses for retrieving raw byte data from the header, but worrying about
such an API is a case for some other rainy day, for now we mostly only
care that stuff works again.

Also add test-cases for mixed data source labelCompare() and
non-utf8 insert to + retrieve from header.
2019-02-22 20:37:20 +02:00
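
The surrogate-escape behaviour this commit describes can be sketched in plain Python 3. This only illustrates the codec semantics; it is not the helper macro from the bindings, and the names `to_text` and `to_bytes` are hypothetical.

```python
# Minimal sketch of surrogate-escaped utf-8 handling (assumed helper names,
# not part of the rpm bindings).

def to_text(data: bytes) -> str:
    # Valid utf-8 decodes normally; each undecodable byte 0xNN becomes the
    # lone surrogate U+DCNN instead of raising UnicodeDecodeError.
    return data.decode("utf-8", "surrogateescape")

def to_bytes(text: str) -> bytes:
    # Reverses to_text(): escaped surrogates turn back into the original bytes.
    return text.encode("utf-8", "surrogateescape")

utf8_name = "p\xe4kki".encode("utf-8")      # well-formed utf-8
latin1_name = "p\xe4kki".encode("latin-1")  # b"p\xe4kki", not valid utf-8

plain = to_text(utf8_name)
escaped = to_text(latin1_name)
roundtrip = to_bytes(escaped)

# A strict encode refuses the lone surrogate, which is how the non-utf8
# case stays detectable.
try:
    escaped.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

The common utf-8 case yields an ordinary string, while non-utf8 data survives a round trip through the escaped form, at the cost of an explicit `surrogateescape` encode to get the original bytes back.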
Panu Matilainen 71ff0b7e73 Eliminate dead-on-arrival code
This code was disabled in commit 9b94ae3dbc
about seven years ago before making a public appearance in any release.
That nobody has missed it in all this time tells me it's not that
necessary to have a python-level rpmtd object...
2016-10-26 13:38:12 +03:00
Panu Matilainen 3667f34d0e Make rpmtd_ItemAsPyobj() usable within the python bindings
- Needed for the next steps...
2013-03-21 17:00:52 +02:00
Panu Matilainen 1432d53383 Optimize header data python conversion for array tags a bit
- We know the array size beforehand, allocate the entire array
  at once and set the elements instead of appending one by one.
  This is an obvious and well-measurable, if not huge, win.
2012-03-08 10:02:51 +02:00
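
The commit above changes C code, but the same idea has a rough Python-level analogue. The sketch below assumes hypothetical function names and uses `str()` as a stand-in for the per-element conversion. It only contrasts the two construction styles and makes no claim about how large the gain is.

```python
# Rough Python-level analogue of "allocate once, then set" versus
# "append one by one". Function names are illustrative only.

def convert_by_append(values):
    # Grows the list one element at a time.
    result = []
    for v in values:
        result.append(str(v))
    return result

def convert_preallocated(values):
    # The size is known up front, so allocate every slot first
    # and then fill each one by index.
    result = [None] * len(values)
    for i, v in enumerate(values):
        result[i] = str(v)
    return result

sample = [1, 2, 3]
appended = convert_by_append(sample)
preallocated = convert_preallocated(sample)
```

Both functions build the same list; only the allocation pattern differs.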
David Malcolm 3157d6d7b7 handle errors when constructing lists in the Python bindings
- Various functions in the Python bindings construct lists of objects, but
  assume that all calls succeed. Each of these could segfault under
  low-memory conditions: if the PyList_New() call fails,
  PyList_Append(NULL, item) will segfault. Similarly, although
  PyList_Append(list, NULL) is safe, Py_DECREF(NULL) will segfault.

Signed-off-by: Ales Kozumplik <akozumpl@redhat.com>
2011-12-21 08:45:52 +01:00
Panu Matilainen e8f777b69d Switch python bindings to use rpm(Dbi)TagVal as appropriate
- None of these are true enum uses as the value typically originates
  from python integers etc.
2010-10-22 11:57:38 +03:00
Panu Matilainen 7e53dc6ee1 Avoid stepping on toes of relatives, part 2
- Eliminate uses of "class" which is a reserved keyword in C++
2010-09-21 15:02:43 +03:00
Panu Matilainen be488096e0 Use the new tag type/return type getters everywhere
- Instead of masking and bitfiddling all over the place, use the
  new getters to get the exact (enum) type directly. rpmTagGetType()
  is now unused within rpm but is left around for backwards compatibility.
2010-09-21 12:40:33 +03:00
Panu Matilainen 60b66dc7d9 Fix a few list-related memleaks in python bindings
- PyList_Append() bumps the object reference count, callers need to
  explicitly decref them... oops :)
2009-12-09 14:44:18 +02:00
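
The reference-count behaviour behind this leak can be observed from Python itself. This is a small sketch that assumes CPython, where `sys.getrefcount()` is available; absolute counts are an implementation detail, so only the difference is taken.

```python
import sys

item = object()
items = []

before = sys.getrefcount(item)
items.append(item)  # the list takes its own reference to the object
after = sys.getrefcount(item)

# The list now holds one additional reference on top of the name `item`.
delta = after - before
```

In C extension code the caller still owns the reference it created before the append, which is why it has to drop it explicitly once the list holds its own.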
Panu Matilainen 9b94ae3dbc Disable the entire rpm.td type for now
- need to figure out saner semantics & stuff...
2009-12-07 11:32:51 +02:00
Panu Matilainen 1866fc41c8 Replace PyString usage with PyBytes everywhere
- In Python 2.6 PyBytes is just an alias for PyString, Python 3.0
  removed PyString entirely
- Add compatibility defines for Python < 2.6
- Based on David Malcolm's Python 3.x efforts
2009-10-21 13:15:44 +03:00
David Malcolm 4b8e0ebde6 Generalize type object initialization to work with both Python 2.* and Python 3.*
The layout of PyVarObject changed between python 2 and python 3, and this leads
to the existing code for all of the various PyTypeObject initializers failing to
compile with python 3

Change the way we initialize these structs to use PyVarObject_HEAD_INIT directly,
rather than merely PyObject_HEAD_INIT, so that it compiles cleanly with both major
versions of Python
2009-10-19 11:02:13 +03:00
David Malcolm 7b51c4a1eb Generalize access to ob_type so that they work with both Python 2.* and Python 3.*
Python 2's various object structs use macros to implement common fields at the top of each
struct.

Python 3's objects instead embed a PyObject struct as the first member within the more refined
object structs.

Use the Py_TYPE() macro when accessing ob_type in order to encapsulate this difference.
2009-10-19 10:50:24 +03:00
Panu Matilainen 181a3ac6a5 Raise exception in the converter, not caller 2009-10-12 15:15:39 +03:00
Panu Matilainen 597973befe Permit changing rpm.td tag from python 2009-10-12 15:05:50 +03:00
Panu Matilainen c6649c55a8 Add limited support for modifying headers to python
- for now we only support tag deletion and assigning rpmtd objects, limiting
  this to copying data from other headers
2009-10-12 14:43:44 +03:00
Panu Matilainen dc6946e72e Revert explicit PyErr_NoMemory() returns to just returning NULL
- tp_alloc failing is likely to be OOM but we don't know that for a fact,
  and the failing method is responsible for setting the exception
2009-10-09 11:57:46 +03:00
Panu Matilainen f819760dc3 Fix a couple of recently introduced compiler warnings 2009-10-01 14:17:11 +03:00
Panu Matilainen b0d374528a Include structmembers.h centrally from rpmsystem-py.h
- pretty much everything might need this...
2009-10-01 14:16:44 +03:00
Panu Matilainen d59e715c1b Add some flags to rpmtd creation
- permit disabling extension retrieval and "raw" (untranslated i18n) tags
- always use HEADERGET_ALLOC for data availability sanity
2009-09-30 13:05:24 +03:00
Panu Matilainen 1ddee37628 Add beginnings of rpmtd wrappings to python
- unlike other types, store the C-level td structure directly in the
  python object, this lets us selectively expose some members directly,
  avoids having to deal with rpmtd allocation separately and leaves
  the reference counting to python, as rpmtds aren't refcounted at the C level
2009-09-30 12:45:07 +03:00
Panu Matilainen 86527f0d45 Oops, binary data can and should be presented as python strings 2009-09-23 13:12:43 +03:00
Panu Matilainen 76d8d16de0 Add rpmtd to python object converter, change header code to use that
- vastly simpler than the former goo in hdr_subscribe
2009-09-23 12:49:15 +03:00