Basic XPath and CSS Theory
This chapter will provide you with a basic understanding of XPath, just enough to cover the basic requirements for writing Selenium tests. XPath is the XML Path Language. Since all HTML, once loaded into a browser, becomes well structured and can be viewed as an XML tree, we can use XPath to traverse it.
Note: To help follow this section you might want to visit the web page http://compendiumdev.co.uk/selenium/basic_web_page.html and use the Firefox plugin XPather to try out the XPath statements listed.
I'll include the listing of the XHTML for the basic_web_page.html here so you can follow along:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
  <head>
    <title>Basic Web Page Title</title>
  </head>
  <body>
    <p id="para1" class="main">A paragraph of text</p>
    <p id="para2" class="sub">Another paragraph of text</p>
  </body>
</html>
XPath Expressions
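XPath expressions select 'nodes' or 'node-sets' from an XML document. e.g. the XPath expression //p would select the following node-set from the example in the Basic HTML Theory section:
<p id="para1" class="main">A paragraph of text</p>
<p id="para2" class="sub">Another paragraph of text</p>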
Node Types
XPath has different types of nodes. In the example XHTML these are:
Document node (the root of the XML tree): <html>
Element node e.g.:
<head><title>Basic Web Page Title</title></head>
<title>Basic Web Page Title</title>
<p id="para1">A paragraph of text</p>
Attribute node e.g. id="para1"
Selections
/ starts the selection from the document node, which allows you to create 'absolute' path expressions, e.g. /html/body/p matches all the paragraph element nodes
assertEquals(2,selenium.getXpathCount("/html/body/p"));
// starts the selection matching anywhere in the document, which allows you to create 'relative' path expressions, e.g. //p matches all paragraph element nodes
assertEquals(2,selenium.getXpathCount("//p"));
@ selects attributes, e.g. //@id would select all the id attribute nodes
assertEquals(2,selenium.getXpathCount("//@id"));
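The three selections above can also be tried without a browser. The following is a minimal sketch that uses the JDK's javax.xml.xpath package instead of Selenium; the class name XPathSelections and the count helper are illustrative, and the page is inlined without its DOCTYPE (an assumption made so the parser does not try to fetch the DTD). A browser's XPath engine may treat real pages slightly differently.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSelections {
    // The sample page, inlined without the DOCTYPE (assumption: keeps parsing offline).
    static final String PAGE =
        "<html><head><title>Basic Web Page Title</title></head><body>"
        + "<p id=\"para1\" class=\"main\">A paragraph of text</p>"
        + "<p id=\"para2\" class=\"sub\">Another paragraph of text</p>"
        + "</body></html>";

    // Hypothetical helper: counts the nodes an XPath expression selects in PAGE.
    static int count(String xpath) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, new InputSource(new StringReader(PAGE)),
                          XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("/html/body/p")); // absolute path, prints 2
        System.out.println(count("//p"));          // relative path, prints 2
        System.out.println(count("//@id"));        // id attribute nodes, prints 2
    }
}
```

The counts agree with the getXpathCount assertions shown for each selection.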
Predicates
Predicates help make selections more specific and are put in square brackets. Predicates can be indexes, e.g. //p[2] selects the second p element node in the node-set.
assertEquals("Another paragraph of text",selenium.getText("//p[2]"));
Predicates can be attribute selections e.g. //p[@id='para1'] selects the p element node where the value of the attribute id is 'para1'
assertEquals("A paragraph of text", selenium.getText("//p[@id='para1']"));
//p[@class='main'] selects the p element node where the value of the attribute class is 'main'
assertEquals("A paragraph of text", selenium.getText("//p[@class='main']"));
Predicates can be XPath functions e.g. //p[last()] would select the last paragraph
Predicates can be comparative statements e.g. //p[position()>1] would return all but the first p element
assertEquals("Another paragraph of text", selenium.getText("//p[position()>1]"));
assertEquals("A paragraph of text", selenium.getText("//p[position()>0]"));
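As a rough way to experiment with predicates outside Selenium, the sketch below evaluates each expression with the JDK's javax.xml.xpath package and returns the text of the first matching node, which mirrors how getText uses the first match. The class XPathPredicates and its firstText helper are illustrative names, and the inlined page omits the DOCTYPE by assumption.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathPredicates {
    // The sample page, inlined without the DOCTYPE (assumption: keeps parsing offline).
    static final String PAGE =
        "<html><head><title>Basic Web Page Title</title></head><body>"
        + "<p id=\"para1\" class=\"main\">A paragraph of text</p>"
        + "<p id=\"para2\" class=\"sub\">Another paragraph of text</p>"
        + "</body></html>";

    // Hypothetical helper: the string value of the first node the expression selects.
    static String firstText(String xpath) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(xpath, new InputSource(new StringReader(PAGE)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstText("//p[2]"));             // index predicate
        System.out.println(firstText("//p[@id='para1']"));   // attribute predicate
        System.out.println(firstText("//p[last()]"));        // function predicate
        System.out.println(firstText("//p[position()>1]"));  // comparison predicate
    }
}
```

In XPath 1.0 the string value of a node-set is the string value of its first node in document order, which is why //p[position()>0] yields the first paragraph even though it matches both.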
Advanced
XPath is a full expression language, so you can perform calculations (e.g. last()-1) and use boolean operations (e.g. or, and).
//node() get all the nodes (try it, you may not get the results you expect)
//body/node() get all the nodes in the body (again, try it to see if you get the value you expect)
* matches anything depending on its position, e.g. /html/* matches every element directly under html
assertEquals(2,selenium.getXpathCount("/html/*"));
@* matches any attribute, e.g. //p[@*='para1'] would match the first paragraph
assertEquals(1,selenium.getXpathCount("//p[@*='para1']"));
Boolean Operators
You can setup matches with multiple conditions e.g.
//p[starts-with(@id,'para') and contains(.,'Another')]
find all paragraphs where the id starts with 'para' and the text contains 'Another' i.e. the second paragraph
assertEquals("Another paragraph of text", selenium.getText("//p[starts-with(@id,'para') and contains(.,'Another')]"));
//*[@id='para1' or @id='para2']
find any node where the id is 'para1' or the id is 'para2' i.e. our two paragraphs
assertEquals(2, selenium.getXpathCount("//*[@id='para1' or @id='para2']"));
XPath Functions
XPath has built in functions which we can use in our XPath statements. Some common XPath functions are listed below.
contains()
allows you to match the value of attributes and elements based on text anywhere in the comparison item e.g.
//p[contains(.,'text')] would match any paragraph with text in the main paragraph e.g. both our paragraphs
assertEquals(2, selenium.getXpathCount("//p[contains(.,'text')]"));
//p[contains(.,'Another')] would match any paragraph with Another in the paragraph text, in our example this would match the second paragraph.
//p[contains(@id,'1')] would match any paragraph where the id had '1' in it, in our example this is the first paragraph
assertEquals("A paragraph of text", selenium.getText("//p[contains(@id,'1')]"));
starts-with()
//*[starts-with(.,'Basic')] would match any node where the contents of that node start with 'Basic', in our example this would match the title
assertEquals("Basic Web Page Title", selenium.getText("//*[starts-with(.,'Basic')]"));
//*[starts-with(@id,'p')] would match any node where the id name started with 'p', in our example this would match the paragraphs
assertEquals(2, selenium.getXpathCount("//*[starts-with(@id,'p')]"));
There are many XPath functions available to you, I have just picked a few of the most common ones that I use. I recommend that you visit some of the web sites below to learn more about XPath functions, and experiment with them.
Recommended web sites for function references:
http://www.w3schools.com/XPath/xpath_functions.asp
http://msdn.microsoft.com/en-us/library/ms256115.aspx
http://www.xmlpitstop.com/ListTutorials/DispContentType/XPath/PageNumber/1.aspx
XPath optimisation
For our testing we typically want to get the shortest and least brittle XPath statement to identify elements on a page.
Some XPath optimisation strategies that I have used are:
use the id,
use a combination of attributes to make the XPath more specific,
start at the first unique element
We have to make a trade off between handling change and false positives. So we want the XPath to return the correct item, but don't want the test to break when simple changes are made to the application under test.
Use the ID
//*[@id='para2']
Or you probably want to be even more specific and state the type e.g.
//p[@id='para2']
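The boolean operators, wildcards and functions from the Advanced section can be checked in the same spirit. This is a sketch using the JDK's javax.xml.xpath package, not Selenium; XPathAdvanced and count are illustrative names, and the inlined page has no DOCTYPE and no whitespace between elements (both assumptions), so expressions that depend on text content such as //*[starts-with(.,'Basic')] may count differently here than in a browser.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathAdvanced {
    // The sample page, inlined without the DOCTYPE (assumption: keeps parsing offline).
    static final String PAGE =
        "<html><head><title>Basic Web Page Title</title></head><body>"
        + "<p id=\"para1\" class=\"main\">A paragraph of text</p>"
        + "<p id=\"para2\" class=\"sub\">Another paragraph of text</p>"
        + "</body></html>";

    // Hypothetical helper: counts the nodes an XPath expression selects in PAGE.
    static int count(String xpath) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, new InputSource(new StringReader(PAGE)),
                          XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Wildcards
        System.out.println(count("/html/*"));                          // 2
        System.out.println(count("//p[@*='para1']"));                  // 1
        // Boolean operators
        System.out.println(count("//*[@id='para1' or @id='para2']"));  // 2
        System.out.println(count(
            "//p[starts-with(@id,'para') and contains(.,'Another')]")); // 1
        // Functions and a calculation
        System.out.println(count("//p[contains(.,'text')]"));          // 2
        System.out.println(count("//p[contains(@id,'1')]"));           // 1
        System.out.println(count("//*[starts-with(@id,'p')]"));        // 2
        System.out.println(count("//p[last()-1]"));                    // 1
    }
}
```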
Selenium XPath
Usage
Because only XPath locators start with // it is possible to write XPath locators without adding xpath= on the front. e.g.
selenium.isElementPresent("//p[@id='para1']")
The specific XPath command getXpathCount expects an XPath statement as its argument, so you should not use xpath= in front of the XPath locator. That is possibly a good reason for not using xpath= in any of your locators, but each of us has personal coding styles so you get to make a choice as to which you prefer. e.g.
selenium.getXpathCount("//p"); //return a count of the <p> elements
You can combine the XPath statement in the getAttribute statement to get specific attributes from elements e.g.
assertEquals("para2", selenium.getAttribute("xpath=//p[2]@id"));
assertEquals("para2", selenium.getAttribute("//p[2]@id"));
The @id (or more specifically @<attribute-name>) on the end means that the statement is not valid XPath, but Selenium parses the locator and knows to split the @id off the end before using it.
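To make the splitting concrete, here is a small sketch of how an attribute locator could be separated into its element locator and attribute name. This is not Selenium's actual parsing code; AttributeLocator and split are hypothetical names, and the sketch simply assumes the attribute name follows the last @ in the string.

```java
class AttributeLocator {
    // Hypothetical helper, not part of Selenium: returns {element locator, attribute name}.
    static String[] split(String locator) {
        int at = locator.lastIndexOf('@');
        if (at < 0) {
            throw new IllegalArgumentException("No @attribute suffix in: " + locator);
        }
        String element = locator.substring(0, at);
        // The xpath= prefix is optional for locators starting with //, so drop it if present.
        if (element.startsWith("xpath=")) {
            element = element.substring("xpath=".length());
        }
        return new String[] { element, locator.substring(at + 1) };
    }

    public static void main(String[] args) {
        String[] parts = split("xpath=//p[2]@id");
        System.out.println(parts[0]); // //p[2]
        System.out.println(parts[1]); // id
    }
}
```

Using the last @ matters because the element locator may itself contain predicates such as [@id='para1'].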
The Firefinder plugin for Firebug adds a few additional abilities to Firefox. By right clicking on an element in a page in Firefox, you now have the ability to Firefind Element.
Figure 16.1 : Firefind Element
This opens Firefind, with the HTML of the element displayed.
Figure 16.2 : Display after having found an element with Firefind
By typing a CSS selector as the 'filter' input, and pressing the [Filter] button, you can see the elements in the page which match the CSS selector. This allows you to check if your CSS selector constrains the search results enough.
You can also use the CSS statement in the getAttribute statement to get specific attributes from elements e.g.
assertEquals("para1", selenium.getAttribute("css=p.main@id"));
Selenium does not provide a getCSSCount function, like the getXpathCount function, but we can create a simple getCSSCount function using the getEval command that we will explain later.
// based on http://www.ivaturi.org/home/addgetcsscountcommandtoselenium
private int getCSSCount(String aCSSLocator){
    String jsScript = "var cssMatches = eval_css(\"%s\", window.document);cssMatches.length;";
    return Integer.parseInt(
        selenium.getEval(String.format(jsScript, aCSSLocator)));
}
Note: String.format is a particularly useful Java command for avoiding concatenating strings together. String.format has the following form:
String.format(<a format string>,<list of arguments>);
For the format string, you create a string e.g. "hello there %s, I have %d for sale" and then add the replacement items for the % markers as arguments:
String name = "alan";
int amount = 10;
String.format("hello there %s, I have %d for sale", name, amount);
This allows you to have constants which you can add values into without concatenating lots of variables together.
For more information on String.format visit:
http://java.sun.com/j2se/1.5.0/docs/api/java/util/Formatter.html#summary
If you want to use getCSSCount in the short term then add it as a private method in the test class, and you can use it in your tests as follows.
@Test
public void someCounts(){
    assertEquals(2, getCSSCount("p"));
    assertEquals(6, getCSSCount("*"));
    assertEquals(2, getCSSCount("body > *"));
    assertEquals(1, getCSSCount("p[id='para1']"));
}
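The String.format step inside getCSSCount can be tried on its own, without Selenium. The sketch below only builds the JavaScript string that would be passed to getEval; the class name CssCountScript and the build method are illustrative.

```java
class CssCountScript {
    // Same format string as getCSSCount: %s is replaced by the CSS locator.
    static final String JS_SCRIPT =
        "var cssMatches = eval_css(\"%s\", window.document);cssMatches.length;";

    // Hypothetical helper: returns the script text for a given CSS locator.
    static String build(String cssLocator) {
        return String.format(JS_SCRIPT, cssLocator);
    }

    public static void main(String[] args) {
        System.out.println(build("p[id='para1']"));
        // The example from the note, with %s and %d markers:
        System.out.println(String.format("hello there %s, I have %d for sale", "alan", 10));
    }
}
```

Because the locator is substituted inside a double-quoted JavaScript string, a locator that itself contains double quotes would break the generated script, so single quotes, as used in the tests above, are the safer choice.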
Selections
CSS selectors use a hierarchical definition, much like XPath, to match elements on the page.
p[class='main'] selects the p element where the value of the class attribute is 'main'
assertEquals("A paragraph of text", selenium.getText("css=p[class='main']"));
p[id] selects all p elements with an id attribute. Although this matches more than one element, Selenium will always use the first
assertEquals("A paragraph of text", selenium.getText("css=p[id]"));
Indexed Matching
CSS selectors also support indexed matching. The W3C specification lists all the indexing predicates, which it calls pseudo classes, e.g.
first-child, matches the first child of an element e.g. body *:first-child matches the first child of the body element
assertEquals("A paragraph of text", selenium.getText("css=body *:first-child"));
last-child, matches the last child of an element e.g. body *:last-child matches the last child of the body element
assertEquals("Another paragraph of text", selenium.getText("css=body *:last-child"));
nth-child(), matches the nth child of an element, e.g. to match the 2nd child of any type
nth-last-child(), matches the nth child of an element counting backwards from the last child e.g. body p:nth-last-child(1) returns the last child for the body element
assertEquals("Another paragraph of text", selenium.getText("css=body p:nth-last-child(1)"));
The Selenium documentation describes support for all css1, css2 and css3 selectors, with the following exceptions. There is no support yet for: CSS3 name spaces, or the following pseudo classes (:nth-of-type, :nth-last-of-type, :first-of-type, :last-of-type, :only-of-type, :visited, :hover, :active, :focus, :indeterminate). There is also no support for the pseudo elements (::first-line, ::first-letter, ::selection, ::before, ::after).
body > *, matches all children under body. Since 2 elements get returned, Selenium will use the first one
assertEquals("A paragraph of text",selenium.getText("css=body > *"));
$=, matches a suffix e.g. p[class$='n'] would match any paragraph with a class name ending in n
assertEquals("A paragraph of text", selenium.getText("css=p[class$='n']"));
*=, matches a substring anywhere in the attribute value e.g. p[class*='u'] would match any paragraph with u in the class name
assertEquals("Another paragraph of text", selenium.getText("css=p[class*='u']"));
Boolean Operators
You can setup matches with multiple conditions e.g. p[class='main'][id='para1']
You can negate conditions e.g. p:not([class='main'])[id^='para'] would match any paragraph which does not have the class main and the id starts with para
assertEquals("Another paragraph of text", selenium.getText("css=p:not([class='main'])[id^='para']"));
You can have multiple negations in the selector e.g. p:not([class='main'])[id^='para']:not([class='sub']) would match any paragraph which does not have the class main and the id starts with para and does not have the class sub. In our example this would match no elements
assertEquals(0, getCSSCount("p:not([class='main'])[id^='para']:not([class='sub'])"));
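For comparison with the XPath half of this chapter, the CSS conditions above have rough XPath counterparts: chained attribute conditions correspond to 'and', and :not() corresponds to the not() function. The sketch below checks those counterparts with the JDK's javax.xml.xpath package; the XPath translations are my own assumption of equivalence for this sample page, and CssAsXPath and count are illustrative names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CssAsXPath {
    // The sample page, inlined without the DOCTYPE (assumption: keeps parsing offline).
    static final String PAGE =
        "<html><head><title>Basic Web Page Title</title></head><body>"
        + "<p id=\"para1\" class=\"main\">A paragraph of text</p>"
        + "<p id=\"para2\" class=\"sub\">Another paragraph of text</p>"
        + "</body></html>";

    // Hypothetical helper: counts the nodes an XPath expression selects in PAGE.
    static int count(String xpath) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, new InputSource(new StringReader(PAGE)),
                          XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // css: p[class='main'][id='para1']
        System.out.println(count("//p[@class='main' and @id='para1']"));
        // css: p:not([class='main'])[id^='para']
        System.out.println(count("//p[not(@class='main') and starts-with(@id,'para')]"));
        // css: p:not([class='main'])[id^='para']:not([class='sub'])
        System.out.println(count(
            "//p[not(@class='main') and starts-with(@id,'para') and not(@class='sub')]"));
    }
}
```

The three counts are 1, 1 and 0, matching what the CSS selectors select on the sample page.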
Sibling Combinators
As well as traversing a hierarchy with a space (descendant) and > (child), we can also check for siblings before and after a particular element e.g.
+, match an element immediately following another element e.g. p+p match a paragraph that immediately follows another paragraph
assertEquals("Another paragraph of text", selenium.getText("css=p + p"));
Useful Links
http://kimblim.dk/css-tests/selectors/
A page that lists browser compatibility with various CSS selectors.
http://robertnyman.com/firefinder/
Homepage for Firefinder with links to the support google group and instructions on its use.
http://www.w3.org/TR/css3-selectors/
The official W3C selectors specification. The W3C specification has an excellent summary of the CSS selection patterns.