
Commit 31f4b59

Move new version of contrib/ xml into xml2, keep old version in /xml.
1 parent adca025 commit 31f4b59

11 files changed: +751 -0 lines changed

contrib/README

+4
@@ -217,5 +217,9 @@ vacuumlo -
 by Peter T Mount <peter@retep.org.uk>
 
 xml -
+Storing XML in PostgreSQL (obsolete version)
+by John Gray <jgray@azuli.co.uk>
+
+xml2 -
 Storing XML in PostgreSQL
 by John Gray <jgray@azuli.co.uk>

contrib/xml/TODO

+78
@@ -0,0 +1,78 @@
+PGXML TODO List
+===============
+
+Some of these items still require much more thought! Since the first
+release, the XPath support has improved (because I'm no longer using a
+homemade algorithm!).
+
+1. Performance considerations
+
+At present each document is parsed to produce the DOM tree on every query.
+
+Pros:
+Easy
+No persistent memory or storage allocation for parsed trees
+(libxml docs suggest representation of a document might
+be 4 times the size of the text)
+
+Cons:
+Slow/ CPU intensive to parse.
+Makes it difficult for PLs to apply libxml manipulations to create
+new documents or amend existing ones.
+
+
+2. XQuery
+
+I'm not sure if the addition of XQuery would be best as a function or
+as a new front-end parser. This is one to think about, but with a
+decent implementation of XPath, one of the prerequisites is covered.
+
+3. DOM Interfaces
+
+Expose more aspects of the DOM to user functions/ PLs. This would
+allow a procedure in a PL to run some queries and then use exposed
+interfaces to libxml to create an XML document out of the query
+results. I accept the argument that this might be more properly
+performed on the client side.
+
+4. Returning sets of documents from XPath queries.
+
+Although the current implementation allows you to amalgamate the
+returned results into a single document, it's quite possible that
+you'd like to use the returned set of nodes as a source for FROM.
+
+Is there a good way to optimise/index the results of certain XPath
+operations to make them faster?:
+
+select docid, pgxml_xpath(document,'//site/location/text()','','') as location
+where pgxml_xpath(document,'//site/name/text()','','') = 'Church Farm';
+
+and with multiple element occurences in a document?
+
+select d.docid, pgxml_xpath(d.document,'//site/location/text()','','')
+from docstore d,
+pgxml_xpaths('docstore','document','//feature/type/text()','docid') ft
+where ft.key = d.docid and ft.value ='Limekiln';
+
+pgxml_xpaths params are relname, attrname, xpath, returnkey. It would
+return a set of two-element tuples (key,value) consisting of the value of
+returnkey, and the cdata value of the xpath. The XML document would be
+defined by relname and attrname.
+
+The pgxml_xpaths function could be the basis of a functional index,
+which could speed up the above query very substantially, working
+through the normal query planner mechanism.
+
+5. Return type support.
+
+Better support for returning e.g. numeric or boolean values. I need to
+get to grips with the returned data from libxml first.
+
+
+John Gray <jgray@azuli.co.uk> 16 August 2001
+
+
+
+
+
+
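
Reviewer note on item 4 of the TODO: the proposed pgxml_xpaths function is described as returning (key, value) tuples, one per node matched in each stored document. The following is a minimal Python sketch of that behaviour, not code from this commit and not the contrib module's API. The `xpaths` helper, the `DOCSTORE` rows, and the location strings are all hypothetical illustrations; only the element names and the values 'Church Farm' and 'Limekiln' come from the TODO. Python's `xml.etree.ElementTree` supports only a subset of XPath and has no `text()` step, so the sketch selects elements and reads their text instead.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical stand-in for the docstore table: (docid, document) rows.
# The documents are invented sample data for illustration only.
DOCSTORE = [
    (1, "<site><name>Church Farm</name><location>North field</location>"
        "<feature><type>Limekiln</type></feature>"
        "<feature><type>Barn</type></feature></site>"),
    (2, "<site><name>Mill Lane</name><location>South field</location>"
        "<feature><type>Barn</type></feature></site>"),
]


def xpaths(rows, path):
    """Yield one (key, value) pair per element matched by `path`.

    `path` is an element path such as "//feature/type". Each document is
    parsed on every call, mirroring the behaviour the TODO describes.
    """
    for key, document in rows:
        # Wrap the root so that "//site/..." can also match the root element.
        holder = ET.Element("holder")
        holder.append(ET.fromstring(document))
        for node in holder.findall("." + path):
            yield key, node.text


# One row per matching node, so a document with two features yields two pairs.
pairs = list(xpaths(DOCSTORE, "//feature/type"))

# Rough analogue of the second query: locations of documents with a Limekiln.
limekiln_ids = {key for key, value in pairs if value == "Limekiln"}
locations = [(key, value)
             for key, value in xpaths(DOCSTORE, "//site/location")
             if key in limekiln_ids]

# Rough analogue of the functional-index idea: compute value -> keys once,
# then answer lookups without reparsing any document.
index = defaultdict(set)
for key, value in pairs:
    index[value].add(key)
```

Under these assumptions, looking up `index["Limekiln"]` avoids reparsing the documents, which is the saving the TODO hopes a functional index would provide through the query planner.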
