How to Search and Locate Elements in Selenium Python

When we create an automation script using Selenium, we have to search and locate elements from HTML/XML documents to perform some action on them.

Sometimes these elements are easy to locate, but often the exact selector is hard to find because of the complexity of the HTML/XML document.

In this article, we show how to locate these elements using the Selenium library. We start with the simple locator strategies and then move on to more advanced ones.

What is Selenium?

Selenium is an open-source library for browser automation. It provides a playback tool for authoring functional tests across most modern web browsers as well as scraping content from websites. To install the Selenium library in Python use the following command in the terminal.

pip install selenium

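Then, import the following modules to use Selenium in a Python script. The code below imports them and opens a browser controlled by Selenium.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Start Chrome; with no driver path given, Selenium locates chromedriver itself
# (on PATH or, in recent releases, through Selenium Manager)
options = Options()
driver = webdriver.Chrome(service=Service(), options=options)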

The Selenium Python library provides eight locator strategies for finding elements on a web page. These are:

By.CLASS_NAME
By.ID
By.TAG_NAME
By.LINK_TEXT
By.PARTIAL_LINK_TEXT
By.NAME
By.CSS_SELECTOR
By.XPATH

We will write code for each strategy. Selenium can locate either a single element or every element that matches the given criteria: find_element returns the first match, and find_elements returns all matches. For simplicity, we focus on find_element and leave find_elements to you. Below is the sample HTML from which we have to locate the required element.

<div class="parentClass">
	<p id="childID">
		<a href="https://www.vpktechnologies.com/" class="someClassName" id="someIDName" name="someName" data-name="some server data">Open Website</a>
	</p>
</div>
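
The two calls differ most when nothing matches: find_element raises NoSuchElementException, while find_elements returns an empty list. The sketch below relies on that to return None instead of raising. find_first_or_none is a hypothetical helper name, not part of Selenium, and driver is assumed to be an already started WebDriver.

```python
def find_first_or_none(driver, by, value):
    """Return the first matching element, or None when nothing matches.

    Hypothetical helper (not part of Selenium). `driver` is assumed to be a
    started WebDriver; `by` is a locator strategy such as By.ID.
    """
    # find_elements returns an empty list instead of raising when nothing matches
    matches = driver.find_elements(by, value)
    return matches[0] if matches else None
```

For example, find_first_or_none(driver, By.ID, "someIDName") would return the link from the sample HTML, and None for an ID that is not on the page.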

1. Find the Element with By.CLASS_NAME

If you know the exact class name then that element can be extracted by passing it as a parameter. The syntax for finding elements by Class is:

driver.find_element(By.CLASS_NAME, "<class name>")

In the given HTML snippet, the class name is someClassName. The Python code to extract this element will be:

driver.find_element(By.CLASS_NAME, "someClassName")

2. Find the Element with By.ID

If you know the exact ID then that element can be extracted by passing it as a parameter. The syntax for finding elements by ID is:

driver.find_element(By.ID, "<id name>")

In the above source code, the ID is someIDName. The Python code to extract this element will be:

driver.find_element(By.ID, "someIDName")

3. Find the Element with By.TAG_NAME

If you know the exact tag name of the element then it can be extracted by passing it as a parameter. The syntax for finding elements by tag name is:

driver.find_element(By.TAG_NAME, "<tag name>")

In the above source code, the tag is <a>. The Python code to extract this element will be:

driver.find_element(By.TAG_NAME, "a")

4. Find the Element with By.LINK_TEXT

If you know the exact text of a link (an <a> element), you can locate that element by passing the exact link text as a parameter. The syntax for finding elements by exact link text is:

driver.find_element(By.LINK_TEXT, "<link text>")

In the above source code, the exact link text is Open Website. The Python code to extract this element will be:

driver.find_element(By.LINK_TEXT, "Open Website")

5. Find the Element with By.PARTIAL_LINK_TEXT

Sometimes you want to locate a link by matching only part of its text. In that case, use this method and pass any part of the text inside the <a> tag. The syntax for finding elements by partial link text is:

driver.find_element(By.PARTIAL_LINK_TEXT, "<part of link text>")

In the sample HTML, the text inside the link is Open Website, but we want to locate the element using only the text Website. The Python code to extract this element will be:

driver.find_element(By.PARTIAL_LINK_TEXT, "Website")

6. Find the Element with By.NAME

The name attribute usually appears on input elements, which are typically inside a <form> tag. We can locate such an element by passing the value of its name attribute as a parameter. For simplicity, the sample HTML puts the name attribute on the <a> tag. The syntax for finding elements by name is:

driver.find_element(By.NAME, "<name>")

Let’s create code to extract this element.

driver.find_element(By.NAME, "someName")

7. Find the Element with By.CSS_SELECTOR

Often the element you want has no common attribute such as class, id, or name. In that case, you can use By.CSS_SELECTOR and pass any custom attribute name to locate it. The syntax for locating an element by a custom/user-defined attribute is:

driver.find_element(By.CSS_SELECTOR, "<tagname[attribute name]>")

For example, in the sample HTML, data-name is a custom attribute with the value some server data. To locate this element, we can use the code below.

driver.find_element(By.CSS_SELECTOR, "a[data-name]")

CSS selectors work with custom attributes as well as predefined ones such as class and id. There are four ways to locate an element by an attribute and its value.

1. Match the exact text of the attribute value with the CSS Selector

driver.find_element(By.CSS_SELECTOR, "a[class='someClassName']")

In the above code, we locate the element whose class attribute has the value someClassName. You can use any attribute name here, such as id, href, src, or a custom attribute.

2. Match the beginning text of the attribute value with the CSS Selector

driver.find_element(By.CSS_SELECTOR, "a[class^='someClass']")

In the above code, the ^= operator matches an element whose class attribute value begins with the text someClass.

3. Match the ending text of the attribute value with the CSS Selector

driver.find_element(By.CSS_SELECTOR, "a[class$='Name']")

In the above code, the $= operator matches an element whose class attribute value ends with the text Name.

4. Match text anywhere in an attribute value with the CSS Selector

driver.find_element(By.CSS_SELECTOR, "a[class*='meClassNa']")

In the above code, the *= operator matches an element whose class attribute value contains the text meClassNa anywhere.
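
To make the four comparisons concrete, they can be modeled in plain Python as string checks on the attribute value. This is only an illustration of the matching rules, not how a browser implements them; css_attr_matches is a hypothetical name.

```python
def css_attr_matches(attr_value: str, operator: str, text: str) -> bool:
    """Model how a CSS attribute selector compares `text` with an attribute value."""
    if operator == "=":
        return attr_value == text            # [attr='text']  exact match
    if operator not in ("^=", "$=", "*="):
        raise ValueError(f"unsupported operator: {operator}")
    if text == "":
        return False                         # the substring operators never match an empty string
    if operator == "^=":
        return attr_value.startswith(text)   # [attr^='text'] value begins with text
    if operator == "$=":
        return attr_value.endswith(text)     # [attr$='text'] value ends with text
    return text in attr_value                # [attr*='text'] value contains text


# The class value of the link in the sample HTML
class_value = "someClassName"
for op, text in [("=", "someClassName"), ("^=", "someClass"), ("$=", "Name"), ("*=", "meClassNa")]:
    print(f"a[class{op}'{text}'] matches: {css_attr_matches(class_value, op, text)}")
```

Because the comparison is against the whole attribute string, a[class='someClassName'] would not match an element written as class="someClassName other", whereas By.CLASS_NAME would still find it.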

Now suppose you want to locate a child element inside a parent element. You can do this by writing the selector for each element, separated by a space.

In the sample HTML, the parent <div> tag has the class name parentClass. Under it, the <p> tag has the id childID. To locate the link inside these two tags, we use the following code.

driver.find_element(By.CSS_SELECTOR, "div.parentClass p#childID a[class='someClassName']")

Please note: In the above code, . selects by class and # selects by id. The selectors are separated by a single space, which means each element is located inside the previous one.
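
A similar narrowing can be done step by step, because a located element also has a find_element method that searches only inside that element. The sketch below is a hypothetical helper (find_nested is not part of Selenium) and assumes root is a WebDriver or a WebElement.

```python
def find_nested(root, steps):
    """Follow `steps`, a list of (by, value) pairs, searching inside each match in turn.

    Hypothetical helper. `root` is assumed to be a WebDriver or WebElement;
    both expose find_element(by, value).
    """
    current = root
    for by, value in steps:
        # Each call searches only the descendants of the previous match
        current = current.find_element(by, value)
    return current


# Usage with a real driver (assumes `driver` and `By` are already set up):
# link = find_nested(driver, [
#     (By.CSS_SELECTOR, "div.parentClass"),
#     (By.ID, "childID"),
#     (By.CLASS_NAME, "someClassName"),
# ])
```

Each step keeps only the first match, so on a page with several div.parentClass elements this can behave differently from the single CSS selector, which considers all of them.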

8. Find the Element with By.XPATH

We can also locate elements using XPath. XPath is a query language that is often the preferred locator when the other locators (ID, class, etc.) cannot identify an element, or when the element has no unique attributes in the XML/HTML document. In Selenium, an XPath expression follows a path through the HTML structure of the web page. The general syntax of XPath is:

//TagName[@AttributeName = "value"]

There are two types of XPATH in Selenium which are:

Absolute path: An absolute XPath expression lists every element on the path from the root node (html) down to the desired element. To find the link with the text Open Website in our sample HTML, assuming the snippet sits directly inside <body>, the Selenium code will be:

driver.find_element(By.XPATH, "/html/body/div/p/a")

Relative path: If the element has a class or ID, we can use a relative path, which starts with // and matches the element anywhere in the document. To find the same link in our sample HTML, the Selenium code will be:

driver.find_element(By.XPATH, "//*[@class='someClassName']")
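
The difference between the two path types can be tried without a browser. The sketch below uses Python's built-in xml.etree.ElementTree as a stand-in; it supports only a subset of XPath, and its paths start at the root element, so the absolute path for the sample link (inside <p>, inside <div>) is written as ./body/div/p/a. The html/body wrapper and the extra wrapper div are assumptions added for the demonstration.

```python
import xml.etree.ElementTree as ET

# The sample HTML, assumed to sit directly inside <body>
PAGE = """
<html><body>
<div class="parentClass">
  <p id="childID">
    <a href="https://www.vpktechnologies.com/" class="someClassName" id="someIDName" name="someName" data-name="some server data">Open Website</a>
  </p>
</div>
</body></html>
""".strip()

root = ET.fromstring(PAGE)  # root is the <html> element

# Absolute style: every step from the root down to the link
absolute = root.find("./body/div/p/a")
# Relative style: any element, at any depth, with the given class
relative = root.find(".//*[@class='someClassName']")
print(absolute.text, absolute is relative)

# Wrap the content in one extra <div> (hypothetical layout change)
moved_page = PAGE.replace("<body>", "<body><div id='wrapper'>").replace("</body>", "</div></body>")
moved_root = ET.fromstring(moved_page)
moved_absolute = moved_root.find("./body/div/p/a")                   # path no longer fits
moved_relative = moved_root.find(".//*[@class='someClassName']")     # still found
print(moved_absolute, moved_relative.text)
```

This illustrates why a relative path is usually less fragile: the absolute path depends on every ancestor staying in place, while the relative path depends only on the attribute.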

Write Dynamic XPath to Find Element

1. Using a single attribute (relative XPath type):

The following examples locate the <a> element.

# Locate by HREF
driver.find_element(By.XPATH, "//a[@href='https://www.vpktechnologies.com/']")

# Locate by ID
driver.find_element(By.XPATH, "//a[@id='someIDName']")

# Locate by Class Name
driver.find_element(By.XPATH, "//a[@class='someClassName']")

# Locate by name
driver.find_element(By.XPATH, "//a[@name='someName']")

# Locate by other/custom attribute
driver.find_element(By.XPATH, "//a[@data-name='some server data']")

2. Using multiple attributes (relative XPath type)

You can combine two or more attributes by writing one predicate in square brackets for each. In the code below, we combine ID and class to locate the element.

driver.find_element(By.XPATH, "//a[@id='someIDName'][@class='someClassName']")

3. Search using the `starts-with` function in XPath

You can locate an element by matching the attribute value starting with the given text.

driver.find_element(By.XPATH,"//a[starts-with(@id, 'someID')]")

4. Search using the `contains` function in XPath

You can locate an element by matching any part of the attribute value with the given text.

driver.find_element(By.XPATH,"//a[contains(@id, 'meIDNa')]")

5. Using `text()` keyword to search in XPath

You can locate elements by tag text with exact matches using the below code.

driver.find_element(By.XPATH,"//a[text()='Open Website']")

6. Using `text()` keyword to search with contains in XPath

You can locate elements by tag text matching any part of the text using the below code.

driver.find_element(By.XPATH,"//a[contains(text(), 'Website')]")
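
When the text or attribute value comes from a variable, the quotes inside the XPath need care: XPath 1.0 string literals have no escape character, so a value containing an apostrophe cannot simply be placed between single quotes. The sketch below builds the expressions shown above from dynamic values. All three function names are hypothetical, and "Today's Deals" is made-up link text used only to show the quoting.

```python
def xpath_literal(text: str) -> str:
    """Quote `text` as an XPath 1.0 string literal (hypothetical helper)."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote types present: join the pieces around each ' with concat()
    pieces = [f"'{part}'" for part in text.split("'")]
    return "concat(" + ", \"'\", ".join(pieces) + ")"


def xpath_text_equals(tag: str, text: str) -> str:
    """Build //tag[text()=...] for an exact text match."""
    return f"//{tag}[text()={xpath_literal(text)}]"


def xpath_attr_contains(tag: str, attribute: str, text: str) -> str:
    """Build //tag[contains(@attribute, ...)] for a partial attribute match."""
    return f"//{tag}[contains(@{attribute}, {xpath_literal(text)})]"


print(xpath_text_equals("a", "Open Website"))
print(xpath_attr_contains("a", "id", "meIDNa"))
print(xpath_text_equals("a", "Today's Deals"))
```

The returned string can be passed straight to driver.find_element(By.XPATH, ...).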

Using get_attribute() to Extract Contents

Once you have located an element, you may need to extract its innerHTML, innerText, href, or any custom attribute value. To do this, call the get_attribute() method with the attribute name as the parameter. Let’s assume we have saved an element in the my_element variable.

1. To get all text inside the current tag and all child tags

my_element.get_attribute('innerText')

2. To get the inner HTML of the current element

my_element.get_attribute('innerHTML')

3. To get the HREF of the current element

my_element.get_attribute('href')

4. Get the value of any custom attribute name

Let’s assume data-name is any custom attribute name.

my_element.get_attribute('data-name')
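
When several values are needed from the same element, the calls above can be gathered into one dictionary. This is a small sketch; read_attributes is a hypothetical helper name, and element is assumed to be a WebElement such as my_element.

```python
def read_attributes(element, names):
    """Return {name: value} for each requested attribute of `element`.

    Hypothetical helper. `element` is assumed to be a WebElement; get_attribute
    returns None for an attribute the element does not have.
    """
    return {name: element.get_attribute(name) for name in names}


# Usage with a real element (assumes `my_element` was located earlier):
# details = read_attributes(my_element, ["innerText", "href", "data-name"])
```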
