diff options
author | Peter Eisentraut | 2002-10-22 20:03:09 +0000 |
---|---|---|
committer | Peter Eisentraut | 2002-10-22 20:03:09 +0000 |
commit | 0bd223291f57a126587a0a06ba61f30eac80ea3c (patch) | |
tree | 50590a2a7ecf6de1114c62c42c949560c7984044 /contrib/xml/README.pgxml | |
parent | 1c23cf4371707dee522cff08cc52ea86b99f24f6 (diff) |
Update build system.
Diffstat (limited to 'contrib/xml/README.pgxml')
-rw-r--r-- | contrib/xml/README.pgxml | 118 |
1 files changed, 118 insertions, 0 deletions
diff --git a/contrib/xml/README.pgxml b/contrib/xml/README.pgxml new file mode 100644 index 00000000000..6c714f74e12 --- /dev/null +++ b/contrib/xml/README.pgxml @@ -0,0 +1,118 @@ +This package contains some simple routines for manipulating XML +documents stored in PostgreSQL. This is a work-in-progress and +somewhat basic at the moment (see the file TODO for some outline of +what remains to be done). + +At present, two modules (based on different XML handling libraries) +are provided. + +Prerequisite: + +pgxml.c: +expat parser 1.95.0 or newer (http://expat.sourceforge.net) + +or + +pgxml_dom.c: +libxml2 (http://xmlsoft.org) + +The libxml2 version provides more complete XPath functionality, and +seems like a good way to go. I've left the old versions in there for +comparison. + +Compiling and loading: +---------------------- + +The Makefile only builds the libxml2 version. + +To compile, just type make. + +Then you can use psql to load the two function definitions: +\i pgxml_dom.sql + + +Function documentation and usage: +--------------------------------- + +pgxml_parse(text) returns bool + parses the provided text and returns true or false if it is +well-formed or not. It returns NULL if the parser couldn't be +created for any reason. + +pgxml_xpath (XQuery functions) - differs between the versions: + +pgxml.c (expat version) has: + +pgxml_xpath(text doc, text xpath, int n) returns text + parses doc and returns the cdata of the nth occurence of +the "simple path" entry. + +However, the remainder of this document will cover the pgxml_dom.c version. + +pgxml_xpath(text doc, text xpath, text toptag, text septag) returns text + evaluates xpath on doc, and returns the result wrapped in +<toptag>...</toptag> and each result node wrapped in +<septag></septag>. toptag and septag may be empty strings, in which +case the respective tag will be omitted. + +Example: + +Given a table docstore: + + Attribute | Type | Modifier +-----------+---------+---------- + docid | integer | + document | text | + +containing documents such as (these are archaeological site +descriptions, in case anyone is wondering): + +<?XML version="1.0"?> +<site provider="Foundations" sitecode="ak97" version="1"> + <name>Church Farm, Ashton Keynes</name> + <invtype>watching brief</invtype> + <location scheme="osgb">SU04209424</location> +</site> + +one can type: + +select docid, +pgxml_xpath(document,'//site/name/text()','','') as sitename, +pgxml_xpath(document,'//site/location/text()','','') as location + from docstore; + +and get as output: + + docid | sitename | location +-------+--------------------------------------+------------ + 1 | Church Farm, Ashton Keynes | SU04209424 + 2 | Glebe Farm, Long Itchington | SP41506500 + 3 | The Bungalow, Thames Lane, Cricklade | SU10229362 +(3 rows) + +or, to illustrate the use of the extra tags: + +select docid as id, +pgxml_xpath(document,'//find/type/text()','set','findtype') +from docstore; + + id | pgxml_xpath +----+------------------------------------------------------------------------- + 1 | <set></set> + 2 | <set><findtype>Urn</findtype></set> + 3 | <set><findtype>Pottery</findtype><findtype>Animal bone</findtype></set> +(3 rows) + +Which produces a new, well-formed document. Note that document 1 had +no matching instances, so the set returned contains no +elements. document 2 has 1 matching element and document 3 has 2. + +This is just scratching the surface because XPath allows all sorts of +operations. + +Note: I've only implemented the return of nodeset and string values so +far. This covers (I think) many types of queries, however. + +John Gray <jgray@azuli.co.uk> 16 August 2001 + + |