xidel (tool to extract data from HTML/XML/JSON files or pages)

Xidel is a command-line tool to query data from HTML/XML web pages,
JSON APIs and local files. It implements interpreters for XPath 2,
XPath 3, XQuery 1, XQuery 3, JSONiq, CSS selectors and custom pattern
matching.

XPath and CSS selectors are the most efficient way to select certain
elements from XML/HTML documents. JSONiq (with custom extensions)
is an easy way to select data from JSON. XQuery is a Turing-complete
superset of XPath and allows arbitrary data transformations and the
creation of new documents.

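To give a rough feel for what path-based selection means, here is a
small sketch that does not use Xidel at all. It assumes Python's
standard library as a stand-in: xml.etree.ElementTree supports only a
small XPath subset (not XPath 2/3), and plain dictionary access stands
in for JSONiq. The sample document and JSON data are invented for this
illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
DOC = """<html><body>
<a href="/one">First</a>
<p>text <a href="/two">Second</a></p>
</body></html>"""

root = ET.fromstring(DOC)

# ".//a" selects every <a> element at any depth
# (this is ElementTree's limited XPath subset, not full XPath).
hrefs = [a.get("href") for a in root.findall(".//a")]
print(hrefs)

# Once parsed, selecting from JSON is key/index navigation.
DATA = json.loads('{"items": [{"title": "A"}, {"title": "B"}]}')
titles = [item["title"] for item in DATA["items"]]
print(titles)
```

In both cases a short expression names where the data lives in the
document, and the interpreter collects everything that matches.
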
Pattern matching is for XML/HTML documents what regular expressions
are for plaintext, i.e. pattern matching behaves like a regular
expression over the space of tags, instead of over the space of
characters.

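The idea of matching over tags rather than characters can be sketched
with a toy matcher. This is only an illustration in Python and is not
Xidel's pattern language or algorithm: the {name} placeholder syntax,
the match() helper and the sample page are all hypothetical. A pattern
element matches a document element with the same tag, and its children
must match children of the document element in order, with unrelated
elements allowed in between.

```python
import xml.etree.ElementTree as ET

def match(pattern, node):
    """Return a dict of captured values if node matches pattern, else None."""
    if pattern.tag != node.tag:
        return None
    captured = {}
    text = (pattern.text or "").strip()
    if text.startswith("{") and text.endswith("}"):
        # A placeholder such as {title} captures the node's own text.
        captured[text[1:-1]] = (node.text or "").strip()
    # Each pattern child must match a later child of the node, in order;
    # non-matching children in between are skipped.
    children = iter(node)
    for p_child in pattern:
        for child in children:
            sub = match(p_child, child)
            if sub is not None:
                captured.update(sub)
                break
        else:
            return None  # ran out of children without a match
    return captured

# Hypothetical pattern and page, invented for this illustration.
pattern = ET.fromstring("<div><h1>{title}</h1><a>{link}</a></div>")
page = ET.fromstring(
    "<div><h1>News</h1><p>filler</p><a href='/x'>Read more</a></div>"
)
result = match(pattern, page)
print(result)
```

The pattern looks like a stripped-down copy of the page it describes,
which is what makes this style convenient for scraping.
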
Xidel implements a kind of internal pipe that passes HTTP requests
from one query to the next, so there is no need to distinguish between
selecting links and downloading the data they reference. Therefore,
arbitrarily complex queries spanning arbitrarily many pages can be
executed with a single call of Xidel.

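The chaining of "select links, fetch them, query the results" can be
pictured as a pipeline of generators. The sketch below is a conceptual
model in Python, not how Xidel is implemented: SITE, fetch(), follow()
and extract() are hypothetical names, and an in-memory dictionary
replaces real HTTP requests so that the example runs offline.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory "site": URL -> page source. No network access.
SITE = {
    "/index": "<html><body><a href='/a'>A</a><a href='/b'>B</a></body></html>",
    "/a": "<html><head><title>Page A</title></head><body/></html>",
    "/b": "<html><head><title>Page B</title></head><body/></html>",
}

def fetch(url):
    """Stand-in for an HTTP request: parse the stored page for url."""
    return ET.fromstring(SITE[url])

def follow(pages, link_path):
    """Select links on each page and yield the pages they reference."""
    for page in pages:
        for link in page.findall(link_path):
            yield fetch(link.get("href"))

def extract(pages, path):
    """Yield the text of every element matching path on each page."""
    for page in pages:
        for node in page.findall(path):
            yield node.text

# One expression: start page -> follow all links -> extract titles.
titles = list(extract(follow([fetch("/index")], ".//a"), ".//title"))
print(titles)
```

Because each stage consumes the pages produced by the previous one,
further follow or extract stages can be appended without any separate
download step in between.
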
Xidel is a powerful and complex tool with a steep learning
curve. For examples, see the man page xidel(1) and
/usr/doc/xidel-$VERSION/examples/. The full documentation is available
via "xidel --usage | less".