Full Path Is Not an Option
Let's look at the web page below and think of good XPATH expressions for the UserName input field.
<form>
<div>
<p>
<input name="usrName" id="UserName" class="textbox" autocomplete="off" type="text">
</p>
<p>
<input name="pwd" id="Password" class="textbox" autocomplete="off" type="password">
</p>
</div>
</form>
Obviously
//input[@id='UserName']
is better than
/form/div/p/input
because a full expression that enumerates nodes from the root of the DOM tree to one of its leaves breaks after almost any update of the web page.
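If there is no ID then
(//input)[1]
is still better than
/form/div/p/input
The parentheses matter: without them, //input[1] selects every input that is the first input child of its own parent, which on this page matches both fields.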
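The comparison can be checked with a small sketch. It assumes the third-party lxml package is installed, writes the form as well-formed XML so that form is the document root, and simulates a page update with a hypothetical extra section wrapper; the helper name ids is illustrative only.

```python
from lxml import etree

# The login form from above, written as well-formed XML.
PAGE = """
<form>
  <div>
    <p><input name="usrName" id="UserName" class="textbox" type="text"/></p>
    <p><input name="pwd" id="Password" class="textbox" type="password"/></p>
  </div>
</form>
"""
# Hypothetical redesign: the paragraphs get wrapped in an extra <section>.
UPDATED = PAGE.replace("<div>", "<div><section>").replace("</div>", "</section></div>")

def ids(page, xpath):
    """Return the id attributes of all elements matched by xpath."""
    return [el.get("id") for el in etree.fromstring(page.strip()).xpath(xpath)]

print(ids(PAGE, "/form/div/p/input"))           # -> ['UserName', 'Password']
print(ids(UPDATED, "/form/div/p/input"))        # -> [] (full path broke)
print(ids(UPDATED, "//input[@id='UserName']"))  # -> ['UserName']
print(ids(UPDATED, "(//input)[1]"))             # -> ['UserName']
print(ids(UPDATED, "//input[1]"))               # -> ['UserName', 'Password'] (first input of each parent)
```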
A Tree or Not a Tree
Here is another example.
<div class="mainPanel">
<a>All vendors</a>
<a>General ledger</a>
</div>
<div role="tree">
<div role="group" title="Favorities" aria-expanded="false">Favorities</div>
<div role="group" title="Recent" aria-expanded="true">Recent</div>
<div role="treeitem"><a title="Accounts payable > Vendors">All vendors</a></div>
<div role="treeitem"><a title="Procurement and sourcing > Purchase orders">All purchase orders</a></div>
<div role="group" title="Workspaces" aria-expanded="false">Workspaces</div>
<div role="group" title="Modules" aria-expanded="true">Modules</div>
<a role="treeitem" title="Cash and bank management">Cash and bank management</a>
<a role="treeitem" title="Cost management">Cost management</a>
<a role="treeitem" title="General ledger">General ledger</a>
</div>
This is a three-level tree, but the nodes of the second and third levels are all children of the tree root, so there is no parent-child relationship between second and third level nodes.
The first level of the tree is represented by a single root node with role tree. The second level consists of nodes with role group, and the third level consists of nodes with role treeitem. Also, if a second level node has aria-expanded="false" then its descendants are not present in the DOM tree (Favorities and Workspaces are examples of such nodes).
How to reliably navigate to a third level node?
If the node is visible it can be found as
//a[text()='All vendors']
//a[text()='General ledger']
But on this page such XPATH expressions return more than one result, so instead of a tree item we may end up clicking a link in mainPanel.
It is better to search for the text starting from the tree root:
//div[@role='tree']//a[text()='General ledger']
//div[@role='tree']//a[text()='All vendors']
And if the result must always be a node with role="treeitem", we can use
//*[text()='All vendors']/ancestor-or-self::*[@role='treeitem']
//*[text()='General ledger']/ancestor-or-self::*[@role='treeitem']
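To see how these expressions behave, here is a minimal sketch that assumes the third-party lxml package and uses a trimmed, well-formed copy of the page wrapped in a body element. The helper tags is a hypothetical name used only for this illustration.

```python
from lxml import etree

# Trimmed copy of the page above, wrapped in <body> so it parses as one document.
PAGE = """
<body>
  <div class="mainPanel">
    <a>All vendors</a>
    <a>General ledger</a>
  </div>
  <div role="tree">
    <div role="group" title="Recent" aria-expanded="true">Recent</div>
    <div role="treeitem"><a title="Accounts payable > Vendors">All vendors</a></div>
    <div role="group" title="Modules" aria-expanded="true">Modules</div>
    <a role="treeitem" title="General ledger">General ledger</a>
  </div>
</body>
"""
doc = etree.fromstring(PAGE.strip())

def tags(xpath):
    """Return the tag names of all elements matched by xpath."""
    return [el.tag for el in doc.xpath(xpath)]

# Text alone is ambiguous: the mainPanel link matches too.
print(tags("//a[text()='All vendors']"))                     # -> ['a', 'a']
# Scoping the search to the tree root removes the ambiguity.
print(tags("//div[@role='tree']//a[text()='All vendors']"))  # -> ['a']
# ancestor-or-self returns the treeitem node, whichever element carries the role.
print(tags("//*[text()='All vendors']/ancestor-or-self::*[@role='treeitem']"))     # -> ['div']
print(tags("//*[text()='General ledger']/ancestor-or-self::*[@role='treeitem']"))  # -> ['a']
```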
If the second level node is collapsed (which can be determined from the aria-expanded attribute) then our expressions will not work. So to reliably reach an arbitrary node in the tree we need to know the path to it from the root. In pseudo notation, the algorithm for finding a node looks like this:
- Find second level node by its name.
- If it is collapsed then expand it.
- Find third level node by its name.
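The three steps above can be sketched offline as follows. This assumes the third-party lxml package and replaces the browser with a parsed document: expand is a hypothetical stand-in for clicking a collapsed group (in a real UI test this would be a click followed by a wait), and find_tree_item is an illustrative name, not part of any library.

```python
from lxml import etree

TREE = """
<div role="tree">
  <div role="group" title="Recent" aria-expanded="true">Recent</div>
  <div role="treeitem"><a>All vendors</a></div>
  <div role="group" title="Modules" aria-expanded="false">Modules</div>
</div>
"""
tree = etree.fromstring(TREE.strip())

def expand(group):
    """Hypothetical stand-in for a click: mark the group as expanded and
    insert the item that the page would add to the DOM."""
    group.set("aria-expanded", "true")
    item = etree.Element("a", role="treeitem")
    item.text = "General ledger"
    group.addnext(item)

def find_tree_item(tree, group_title, item_text):
    # Step 1: find the second level node by its name.
    groups = tree.xpath(
        "//div[@role='tree']/div[@role='group' and @title=$title]",
        title=group_title)
    if not groups:
        return None
    # Step 2: if it is collapsed then expand it.
    if groups[0].get("aria-expanded") == "false":
        expand(groups[0])
    # Step 3: find the third level node by its name.
    items = tree.xpath(
        "//div[@role='tree']//*[text()=$text]"
        "/ancestor-or-self::*[@role='treeitem']",
        text=item_text)
    return items[0] if items else None

item = find_tree_item(tree, "Modules", "General ledger")
print(item.tag, item.text)  # -> a General ledger
```

Because second and third level nodes are siblings in this markup, the sketch does not verify that the item found actually belongs to the requested group.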
So knowing the XPATH expression of a node may be insufficient to find and interact with it.
Another example is a node that appears only when the mouse hovers over a UI control; it can be a submenu item or even a simple link.
XPATH Building Blocks
Now that you have an idea of what reliable identification of elements in a Web UI looks like, here is a list of syntactic examples you can use to build resilient XPATH expressions.
Element with ID
Element with id attribute set to a given value.
//input[@id='UserName']
Element by Index
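The second input element on the page.
(//input)[2]
Without the parentheses, //input[2] selects every input that is the second input child of its own parent, which is usually not what is wanted.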
Element with Text
Link with a given text.
//a[text()='Log In']
Sometimes text contains extra spaces or carriage return symbols. To filter them out use the normalize-space function.
//a[normalize-space(text())="Book Management"]
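A short sketch of the difference, assuming the third-party lxml package and a made-up link whose text is padded with whitespace:

```python
from lxml import etree

# Hypothetical markup: the link text has leading, trailing and repeated spaces.
doc = etree.fromstring("<div><a>\n   Book   Management \n</a><a>Log In</a></div>")

exact = doc.xpath("//a[text()='Book Management']")
normalized = doc.xpath("//a[normalize-space(text())='Book Management']")

# normalize-space trims the ends and collapses inner whitespace runs to one space.
print(len(exact), len(normalized))  # -> 0 1
```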
Element with Class
Sometimes an element may belong to several classes.
<div class="column columnHeader sortable"></div>
To match a specific class with XPATH use the contains, concat and normalize-space functions.
//div[contains(concat(' ', normalize-space(@class), ' '), ' columnHeader ')]
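Notice the spaces in ' columnHeader ', the second argument to the contains function. They ensure that only the whole class name matches, not a longer class name that merely contains it.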
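The following sketch, which assumes the third-party lxml package and three made-up div elements, shows why the padded form is safer than a plain contains on the class attribute:

```python
from lxml import etree

# Hypothetical elements: only "a" and "c" really have the class columnHeader.
doc = etree.fromstring("""
<body>
  <div id="a" class="column columnHeader sortable"/>
  <div id="b" class="column columnHeaderWide"/>
  <div id="c" class="columnHeader"/>
</body>
""".strip())

def ids(xpath):
    """Return the id attributes of all elements matched by xpath."""
    return [el.get("id") for el in doc.xpath(xpath)]

naive = "//div[contains(@class, 'columnHeader')]"
exact = "//div[contains(concat(' ', normalize-space(@class), ' '), ' columnHeader ')]"

print(ids(naive))  # -> ['a', 'b', 'c'] (substring match also hits columnHeaderWide)
print(ids(exact))  # -> ['a', 'c']
```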
Element with Attribute
Find an element which has a specific attribute. For example, to find all div elements which have an id attribute:
//div[@id]
Element without Attribute
Find an element which does not have a specific attribute. In this example we need a div element with role set to main and without a style attribute; the expression then selects its descendant div with data-target set to selectionArea.
//div[@role='main' and not(@style)]//div[@data-target="selectionArea"]
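As an illustration of both attribute tests, here is a minimal sketch assuming the third-party lxml package. The markup and the id values hidden and shown are invented for the example.

```python
from lxml import etree

# Hypothetical page with two main panels; only one carries a style attribute.
doc = etree.fromstring("""
<body>
  <div id="hidden" role="main" style="display:none">
    <div data-target="selectionArea"/>
  </div>
  <div id="shown" role="main">
    <div data-target="selectionArea"/>
  </div>
</body>
""".strip())

# Attribute present: only the outer divs have an id.
with_id = [d.get("id") for d in doc.xpath("//div[@id]")]
print(with_id)  # -> ['hidden', 'shown']

# Attribute absent: only the panel without style is searched.
areas = doc.xpath("//div[@role='main' and not(@style)]//div[@data-target='selectionArea']")
print(len(areas), areas[0].getparent().get("id"))  # -> 1 shown
```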
Element with Child Element with Specific Text
Select an element which contains a child element with a given text.
<div role="tab">
<header>
<h1>Vendor</h1>
</header>
<header>
<h1>Details</h1>
</header>
</div>
//div[@role='tab']/header[h1/text()='Vendor']
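A quick check of what this expression returns, sketched with the third-party lxml package (an assumption of this example): the predicate filters on the child h1, but the result is the header element itself.

```python
from lxml import etree

doc = etree.fromstring("""
<div role="tab">
  <header><h1>Vendor</h1></header>
  <header><h1>Details</h1></header>
</div>
""".strip())

# The predicate tests the child, the step selects the parent <header>.
headers = doc.xpath("//div[@role='tab']/header[h1/text()='Vendor']")
# For contrast, this variant selects the <h1> child instead.
titles = doc.xpath("//div[@role='tab']/header/h1[text()='Vendor']")

print([el.tag for el in headers])  # -> ['header']
print([el.tag for el in titles])   # -> ['h1']
```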
Specific Parent of Element
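Select the parent, or a more distant ancestor, of an element with a given text. The ancestor-or-self axis also matches the element itself when it satisfies the condition.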
<div role="treeitem">
<a>All vendors</a>
</div>
//*[text()='All vendors']/ancestor-or-self::*[@role='treeitem']
Find a Cell with Specific Text in a Given Column of a Table
Let's assume there is a table with a standard structure and we want to search column 2 for a cell with the text Window.
//table[@id="data"]//tr/td[2][normalize-space(.//text())='Window']
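The sketch below tries the expression on a made-up two-row table, assuming the third-party lxml package. The word Window appears in column 1 of the first row and, padded with spaces, in column 2 of the second row.

```python
from lxml import etree

# Hypothetical table data for illustration.
doc = etree.fromstring("""
<table id="data">
  <tr><td>Window</td><td>Door</td></tr>
  <tr><td>Door</td><td> Window </td></tr>
</table>
""".strip())

cells = doc.xpath("//table[@id='data']//tr/td[2][normalize-space(.//text())='Window']")

# Only the cell in column 2 matches; column 1 of the first row is ignored.
rows = doc.xpath("//tr")
print(len(cells), rows.index(cells[0].getparent()))  # -> 1 1
```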
Click in Last Row of a Table
To click in a cell of the last row of a table (for example, a row for inserting a new record where you want to click the Insert button) use the last() function.
//table[@id='books']/tbody/tr[last()]/td[7]/div/button[2]
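A small sketch of last() in action, assuming the third-party lxml package. The table is generated for the example: each row has seven cells, and the seventh holds two hypothetical buttons.

```python
from lxml import etree

# Build a row template: six empty cells, then a cell with two buttons.
row = ("<tr>" + "<td/>" * 6 +
       "<td><div><button>Cancel {n}</button><button>Insert {n}</button></div></td></tr>")
table = etree.fromstring(
    '<table id="books"><tbody>' + row.format(n=1) + row.format(n=2) + "</tbody></table>")

# tr[last()] picks the final row regardless of how many rows the table has.
buttons = table.xpath("//table[@id='books']/tbody/tr[last()]/td[7]/div/button[2]")
print([b.text for b in buttons])  # -> ['Insert 2']
```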