lecture16-xpath-xquery
CSE 414
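[Figure: XML tree for the bib document. The root (document) node has one child, the root element bib. bib has book children; each book has children such as publisher (Addison-Wesley) and author (Serge Abiteboul).]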
XPath: Simple Expressions
/bib/book/year
/bib/paper/year
Result: empty (there were no papers)
/bib//first-name
Result: <first-name> Rick </first-name>
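The path expressions above can be tried out with Python's standard-library xml.etree.ElementTree, which implements only a subset of XPath. This is a minimal sketch, not the lecture's setup: the sample document is made up for illustration, and because ElementTree evaluates paths relative to the root element bib, /bib/book/year is written as book/year.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, assumed for illustration only.
BIB_XML = """
<bib>
  <book>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <year>1995</year>
  </book>
  <book>
    <author>Jeffrey D. Ullman</author>
    <year>1998</year>
  </book>
</bib>
"""

bib = ET.fromstring(BIB_XML)  # the root element <bib>

# /bib/book/year : the year child of every book
years = [y.text for y in bib.findall("book/year")]

# /bib/paper/year : no paper elements exist, so the result is empty
paper_years = bib.findall("paper/year")

# /bib//first-name : first-name elements at any depth below bib
first_names = [f.text for f in bib.findall(".//first-name")]

print(years, paper_years, first_names)
```

A path that matches nothing is not an error; it simply yields an empty result.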
CSE 414 - Spring 2013 7
XPath: Attribute Nodes
/bib/book/@price
Result: “55”
//author/*
Functions in XPath:
– text() = matches the text value
– node() = matches any node (= * or @* or text())
– name() = returns the name of the current tag
To read a path with predicates, first drop all the predicates:
/bib/book/author/last-name
Then add them one by one:
/bib/book/author[first-name][address]/last-name
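Adding predicates one at a time can be sketched with ElementTree, whose XPath subset supports the [tag] predicate form used here. The sample authors are hypothetical and chosen so that each added predicate narrows the result.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, assumed for illustration only.
bib = ET.fromstring("""
<bib>
  <book>
    <author>
      <first-name>Rick</first-name><last-name>Hull</last-name>
      <address>Seattle</address>
    </author>
    <author>
      <first-name>Victor</first-name><last-name>Vianu</last-name>
    </author>
    <author><last-name>Abiteboul</last-name></author>
  </book>
</bib>
""")

# No predicates: the last-name of every author.
all_last = [e.text for e in bib.findall("book/author/last-name")]

# One predicate: only authors that have a first-name child.
with_first = [e.text for e in bib.findall("book/author[first-name]/last-name")]

# Both predicates: authors that have a first-name AND an address child.
with_both = [
    e.text
    for e in bib.findall("book/author[first-name][address]/last-name")
]

print(all_last, with_first, with_both)
```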
XPath: More Predicates
/bib/book[author/text()]
/bib/book[.//review/../comments]
Same as
/bib/book[.//*[comments][review]]
Hint: don’t use ..
FOR ...      Zero or more
LET ...      Zero or more
WHERE ...    Zero or one
RETURN ...   Exactly one
FOR $x IN doc("bib.xml")/bib/book
WHERE $x/year/text() > 1995
RETURN $x/title
Result:
<title> abc </title>
<title> def </title>
<title> ghi </title>
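For readers who know Python, the FOR-WHERE-RETURN query above reads much like a list comprehension. The sketch below uses made-up sample data and returns the title text rather than the title elements, so it is an analogy rather than an exact equivalent.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, assumed for illustration only.
bib = ET.fromstring("""
<bib>
  <book><title>abc</title><year>1999</year></book>
  <book><title>old</title><year>1980</year></book>
  <book><title>def</title><year>2002</year></book>
</bib>
""")

titles = [
    book.findtext("title")                # RETURN $x/title (text only here)
    for book in bib.findall("book")       # FOR $x IN doc("bib.xml")/bib/book
    if int(book.findtext("year")) > 1995  # WHERE $x/year/text() > 1995
]

print(titles)
```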
FOR-WHERE-RETURN
Equivalently (perhaps more geekish)
Result:
<answer> <title> abc </title> <year> 1995 </year> </answer>
<answer> <title> def </title> <year> 2002 </year> </answer>
<answer> <title> ghk </title> <year> 1980 </year> </answer>
FOR-WHERE-RETURN
• Notice the use of “{” and “}”
• What is the result without them?
FOR $x IN doc("bib.xml")/bib/book
RETURN <answer>
<title> $x/title/text() </title>
<year> $x/year/text() </year>
</answer>
Result: the expressions are copied out as literal text, not evaluated:
<answer> <title> $x/title/text() </title> <year> $x/year/text() </year> </answer>
<answer> <title> $x/title/text() </title> <year> $x/year/text() </year> </answer>
<answer> <title> $x/title/text() </title> <year> $x/year/text() </year> </answer>
Nesting
• For each author of a book by Morgan Kaufmann, list all books he/she published:
FOR $b IN doc("bib.xml")/bib,
$a IN $b/book[publisher/text()="Morgan Kaufmann"]/author
RETURN <result>
{ $a,
FOR $t IN $b/book[author/text()=$a/text()]/title
RETURN $t
}
</result>
Result:
<result>
<author>Jones</author>
<title> abc </title>
<title> def </title>
</result>
<result>
<author> Smith </author>
<title> ghi </title>
</result>
Aggregates
Find all books with more than 3 authors:
FOR $x IN doc("bib.xml")/bib/book
WHERE count($x/author)>3
RETURN $x
Equivalently, with the count in a path predicate:
FOR $x IN doc("bib.xml")/bib/book[count(author)>3]
RETURN $x
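The count test can be mirrored in Python by counting the author children of each book. This is a sketch over hypothetical data; ElementTree has no count() function in its XPath subset, so the counting is done with len() on the Python side.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, assumed for illustration only.
bib = ET.fromstring("""
<bib>
  <book>
    <title>big</title>
    <author>A</author><author>B</author><author>C</author><author>D</author>
  </book>
  <book>
    <title>small</title>
    <author>A</author><author>E</author>
  </book>
</bib>
""")

# WHERE count($x/author) > 3
many_authors = [
    book.findtext("title")
    for book in bib.findall("book")
    if len(book.findall("author")) > 3
]

print(many_authors)
```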
FOR $a IN distinct-values($b/book/author/text())
RETURN <author> { $a } </author>
FOR $b in doc("bib.xml")/bib
LET $a:=avg($b/book/price/text())
FOR $x in $b/book
WHERE $x/price/text() > $a
RETURN $x
LET enables us to declare variables
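The role of LET in the query above, binding the average once and then comparing each book against it, can be sketched in Python as follows. The data and prices are invented for illustration.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Hypothetical sample data, assumed for illustration only.
bib = ET.fromstring("""
<bib>
  <book><title>a</title><price>10</price></book>
  <book><title>b</title><price>20</price></book>
  <book><title>c</title><price>60</price></book>
</bib>
""")

# LET $a := avg($b/book/price/text())  -- computed once, over all books
avg_price = mean(float(p.text) for p in bib.findall("book/price"))

# FOR $x in $b/book WHERE $x/price/text() > $a RETURN $x
above_average = [
    book.findtext("title")
    for book in bib.findall("book")
    if float(book.findtext("price")) > avg_price
]

print(avg_price, above_average)
```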
FOR $b IN doc("bib.xml")/bib,
$x IN $b/book/author/text()
RETURN
<answer>
<author> { $x } </author>
{ FOR $y IN $b/book[author/text()=$x]/title
RETURN $y }
</answer>
Result:
<answer>
<author> efg </author>
<title> abc </title>
<title> klm </title>
....
</answer>
What about duplicate authors?
Re-grouping
Same, but eliminate duplicate authors:
FOR $b IN doc("bib.xml")/bib
LET $a := distinct-values($b/book/author/text())
FOR $x IN $a
RETURN
<answer>
<author> { $x } </author>
{ FOR $y IN $b/book[author/text()=$x]/title
RETURN $y }
</answer>
Re-grouping
Same thing:
FOR $b IN doc("bib.xml")/bib,
$x IN distinct-values($b/book/author/text())
RETURN
<answer>
<author> { $x } </author>
{ FOR $y IN $b/book[author/text()=$x]/title
RETURN $y }
</answer>
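The re-grouping idea, one answer per distinct author with that author's titles nested inside, can be sketched in Python. The data is hypothetical, and dict.fromkeys stands in for distinct-values here (it removes duplicates while keeping first-seen order, which XQuery does not guarantee).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, assumed for illustration only.
bib = ET.fromstring("""
<bib>
  <book><title>abc</title><author>Jones</author><author>Smith</author></book>
  <book><title>def</title><author>Jones</author></book>
</bib>
""")

# distinct-values($b/book/author/text())
authors = list(dict.fromkeys(a.text for a in bib.findall("book/author")))

# For each distinct author $x: $b/book[author/text()=$x]/title
titles_by_author = {
    x: [
        book.findtext("title")
        for book in bib.findall("book")
        if x in [a.text for a in book.findall("author")]
    ]
    for x in authors
}

print(titles_by_author)
```

Without the de-duplication step, Jones would appear once per book and get two identical groups.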
SQL
XQuery
<answer>
<name> abc </name>
<price> 7 </price>
</answer>
<answer>
<name> def </name>
<price> 23 </price>
</answer>
....
Notice: this is NOT a well-formed document! (Why?)
<aQuery>
{ FOR $x in doc("db.xml")/db/Product/row
ORDER BY $x/price/text()
RETURN <answer>
{ $x/name, $x/price }
</answer>
}
</aQuery>
<aQuery>
<answer>
<name> abc </name>
<price> 7 </price>
</answer>
<answer>
<name> def </name>
<price> 23 </price>
</answer>
....
</aQuery>
Now it is well-formed!
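The well-formedness point can be checked with any XML parser: a document must have exactly one root element. The sketch below uses Python's ElementTree; is_well_formed is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET

# Two sibling <answer> elements with no enclosing element.
fragments = (
    "<answer><name>abc</name><price>7</price></answer>"
    "<answer><name>def</name><price>23</price></answer>"
)


def is_well_formed(text):
    """Return True if text parses as a single XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Rejected: content follows the first root element.
bare = is_well_formed(fragments)

# Accepted: <aQuery> is the single root wrapping all answers.
wrapped = is_well_formed("<aQuery>" + fragments + "</aQuery>")

print(bare, wrapped)
```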
SQL and XQuery Side-by-side
Product(pid, name, maker, price)
Company(cid, name, city, revenues)
Find all products made in Seattle
SQL
SELECT x.name
FROM Product x, Company y
WHERE x.maker=y.cid
and y.city='Seattle'
XQuery
FOR $r in doc("db.xml")/db,
$x in $r/Product/row,
$y in $r/Company/row
WHERE
$x/maker/text()=$y/cid/text()
and $y/city/text() = "Seattle"
RETURN { $x/name }
Cool XQuery
FOR $y in /db/Company/row[city/text()="Seattle"],
$x in /db/Product/row[maker/text()=$y/cid/text()]
RETURN { $x/name }
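The join can also be sketched as nested loops in Python, following the shape of the second XQuery formulation: filter companies by city first, then pick the products whose maker matches. The db document below is hypothetical and only assumes the Product/row and Company/row layout used in the queries.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, assumed for illustration only.
db = ET.fromstring("""
<db>
  <Product>
    <row><pid>1</pid><name>gizmo</name><maker>c1</maker><price>20</price></row>
    <row><pid>2</pid><name>widget</name><maker>c2</maker><price>30</price></row>
  </Product>
  <Company>
    <row><cid>c1</cid><name>Acme</name><city>Seattle</city></row>
    <row><cid>c2</cid><name>Globex</name><city>Portland</city></row>
  </Company>
</db>
""")

seattle_products = [
    x.findtext("name")
    for y in db.findall("Company/row")          # FOR $y in /db/Company/row
    if y.findtext("city") == "Seattle"          #   [city/text()="Seattle"]
    for x in db.findall("Product/row")          # $x in /db/Product/row
    if x.findtext("maker") == y.findtext("cid") #   [maker/text()=$y/cid/text()]
]

print(seattle_products)
```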
<product>
<row> <pid> 123 </pid>
<name> abc </name>
<maker> efg </maker>
</row>
<row> …. </row>
…
</product>
<product>
...
</product>
....
FOR $r in doc("db.xml")/db,
$y in $r/Company/row[revenues/text()<1000000]
RETURN
<proudCompany>
<companyName> { $y/name/text() } </companyName>
<numberOfExpensiveProducts>
{ count($r/Product/row[maker/text()=$y/cid/text()][price/text()>100]) }
</numberOfExpensiveProducts>
</proudCompany>
SQL and XQuery Side-by-side
Find companies with more than 30 products, and their average price
SQL
SELECT y.name, avg(x.price)
FROM Product x, Company y
WHERE x.maker=y.cid
GROUP BY y.cid, y.name
HAVING count(*) > 30
XQuery
FOR $r in doc("db.xml")/db,
$y in $r/Company/row
LET $p := $r/Product/row[maker/text()=$y/cid/text()]
WHERE count($p) > 30
RETURN
<theCompany>
<companyName> { $y/name/text() }
</companyName>
<avgPrice> { avg($p/price/text()) } </avgPrice>
</theCompany>
Here $y is bound to an element, and $p to a collection.
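The GROUP BY / HAVING pattern, where a variable is bound to the whole collection of a company's products, can be sketched in Python as below. The data is hypothetical, and MIN_PRODUCTS is an assumed stand-in for the threshold of 30, lowered so that a tiny sample produces a result.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Hypothetical sample data, assumed for illustration only.
db = ET.fromstring("""
<db>
  <Product>
    <row><name>gizmo</name><maker>c1</maker><price>10</price></row>
    <row><name>gadget</name><maker>c1</maker><price>30</price></row>
    <row><name>widget</name><maker>c2</maker><price>50</price></row>
  </Product>
  <Company>
    <row><cid>c1</cid><name>Acme</name></row>
    <row><cid>c2</cid><name>Globex</name></row>
  </Company>
</db>
""")

MIN_PRODUCTS = 1  # assumed threshold; the query above uses 30

result = []
for y in db.findall("Company/row"):        # $y: one element per iteration
    # LET $p := the collection of this company's product rows
    p = [
        row for row in db.findall("Product/row")
        if row.findtext("maker") == y.findtext("cid")
    ]
    if len(p) > MIN_PRODUCTS:              # WHERE count($p) > threshold
        avg_price = mean(float(row.findtext("price")) for row in p)
        result.append((y.findtext("name"), avg_price))

print(result)
```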
XML Summary
• Stands for eXtensible Markup Language
1. Advanced, self-describing file format
2. Based on a flexible, semi-structured data model