Selenium: Web Browser Automation Testing Tool

Apr 19, 2020

If you are a testing professional, especially a QA automation engineer, you are almost certainly familiar with the term 'Selenium'. Selenium is one of the most popular automation tools and frameworks for testing web applications and services. It is an open-source tool that can record and play back test scripts, and it supports test scripts written in multiple programming languages, including Java, C#, Ruby, Groovy and Scala.

An Insight into Selenium's Background:

Selenium was created in 2004 by Jason Huggins at ThoughtWorks to take over the tiresome task of manually and repeatedly executing tests on a web-based application, and to do so far more effectively and efficiently.

Jason Huggins wrote a JavaScript program to automate the testing of a web-based application at ThoughtWorks, Chicago. He initially named the program JavaScript TestRunner and released it as an open-source tool so that it could be used to test other web applications as well. The tool became very popular, and Huggins later renamed it Selenium Core.

Selenium Core was a capable tool for automating web-based testing, but it had certain limitations. One of them was the same-origin policy, under which code loaded from one domain cannot interact with or access resources from another domain. Because JavaScript code originating from one domain could not access resources on another, testers had to install local copies of Selenium Core alongside the web application server on their own machines to satisfy the policy.
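The same-origin check can be sketched in a few lines of Python. This is only an illustration of the rule, not browser or Selenium code: browsers compare the scheme, host and port of two URLs, and the host names and the helper names `origin` and `same_origin` below are made up for the example.

```python
from urllib.parse import urlsplit

# Default ports used when a URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(url_a, url_b):
    """True when both URLs share scheme, host and port."""
    return origin(url_a) == origin(url_b)


# Selenium Core served from the application's own server: same origin.
print(same_origin("http://app.example.test/selenium-core/TestRunner.html",
                  "http://app.example.test/login"))
# Selenium Core loaded from a different host: the check fails.
print(same_origin("http://tools.example.test/selenium-core/TestRunner.html",
                  "http://app.example.test/login"))
```

The first call shows why installing Selenium Core next to the application satisfied the policy; the second shows the situation that Selenium RC was later built to work around.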

To overcome this limitation, Paul Hammant, another ThoughtWorks employee, created Selenium Remote Control (RC), also known as Selenium 1. It consists of a Selenium client and a Selenium server that acts as a proxy, and it supports execution on multiple browsers.

Later, in 2005, Patrick Lightbody came up with a private cloud named HostedQA, and Philippe Hanrigou of ThoughtWorks then created Selenium Grid, which lets testers run multiple tests concurrently on different machines and so shortens the overall execution time.

In 2007, Simon Stewart of ThoughtWorks created WebDriver to meet the need for efficient testing of web applications across multiple browsers. Later, in 2009, Selenium RC and WebDriver were merged to deliver a new and better product named Selenium WebDriver, or Selenium 2.0.

[Figure: Evolution of Selenium]

Selenium IDE grew out of the interest that Shinya Kasatani of Japan took in Selenium. It is a Firefox extension that records and plays back test scripts, and it was later donated to the Selenium project.

[Figure: Selenium suite]

It is worth mentioning that Jason Huggins chose the name Selenium as a joke aimed at the then-popular test automation framework from Mercury Interactive: selenium is known as an antidote to mercury poisoning.

Why Is Selenium So Preferred?

Selenium is not a single tool; rather, it is a package of different applications and frameworks bundled together to meet different business and functional requirements. With Selenium you get the following advantages:

It is an open-source tool, freely available to download under the Apache License 2.0.
It is easy to install, and its record and playback features are easy to use.
It supports test execution on multiple browsers.
It supports different browser extensions.
Apart from its own scripting language (Selenese), Selenium supports test scripts written in various other languages, including Ruby, Java, C#, PHP, Python and many more.
It has built-in test results and reporting features.

Selenium: The Beginning

As stated above, Selenium is not a single application framework. It consists of multiple applications and frameworks. Let's explore each of these components one by one.

Selenium IDE

Selenium Integrated Development Environment (IDE) is a Firefox plug-in used to develop test scripts by recording and playing back actions performed through the graphical user interface. You can also edit the recorded test scripts in it. It is simple to install and does not require proficiency in any programming language, although a sound knowledge of HTML, the DOM and JavaScript helps you use all of its features to the fullest.

Because of its simplicity, the IDE is not suited to developing complex tests; it is better used as a prototyping tool.

Features of Selenium IDE:

The Selenium IDE interface comprises the following components:

Menu Bar:-

[Figure: Selenium IDE menu bar]

File Menu: Lets you create, save and open (existing) test cases, and export test suites and test cases in different languages and formats.
Edit Menu: Along with the usual options such as Undo, Redo, Cut, Copy, Paste and Select All, the Edit menu provides two more: Insert New Command and Insert New Comment.

Insert New Command lets you insert a new command anywhere in the currently opened test case. Similarly, Insert New Comment adds a comment next to a newly added or existing command, to describe a step or record other useful information.

Actions Menu: The Actions menu gives access to the following functions:
- Record
- Play Entire Suite
- Play Current Test Case
- Resume/Pause
- Toggle Break Point
- Set/Clear Start Point
- Stop
Options Menu: This menu provides the five features listed below, which are used mainly to configure the attributes and properties of the IDE:
- Options
- Format
- Clipboard Format
- Reset IDE window
- Clear History
Help Menu: Provides documentation, user manuals and guides for Selenium users.

Base URL Bar:-

This bar stores the URLs of previously accessed websites, which can be viewed through its drop-down list.

[Figure: Selenium IDE Base URL bar]

Tool Bar:-

Just below the Base URL Bar is the toolbar, which offers several controls for test execution:

[Figure: Selenium IDE toolbar]

Playback Speed: Controls the speed of test execution.
Play Test Suite: Executes the test cases of the selected test suite.
Play Test Case: Executes the selected test case.
Pause/Resume: Pauses or resumes the ongoing execution.
Step: Steps through the test one command at a time.
Rollup: Combines multiple steps into a single command.
Record: Starts recording the actions performed by the user, and stops a recording that is in progress.

Editor:-

Next comes the editor, which has two main views: Table and Source. Table displays and lets you edit the commands, parameters and properties of a test case, whereas Source displays and lets you edit the same test case in HTML format.

[Figure: Selenium IDE editor]

Test Case Pane:-

The Test Case Pane displays the currently opened test case, or the test suite with all of its test cases. After execution, passed tests are shown in green and failed tests in red. The pane also shows the number of test cases executed and the number that failed.

[Figure: Selenium IDE Test Case Pane]

Log Pane:-

Logs and displays updates and other information in real time during test execution.

Reference Pane:-

Displays the currently selected Selenese command along with a brief description and the parameters it takes.

UI Element Pane:-

An advanced Selenium feature that lets users access page elements using JSON.

Roll-up Pane:-

A useful feature that combines multiple steps into a single command.

[Figure: Selenium IDE panes]

Selenium IDE Drawbacks and Limitations:

It supports Firefox only.
Advanced test cases cannot be created using Selenium IDE.

Selenium WebDriver:

Selenium WebDriver, or Selenium 2.0, is one of the most widely used tools in the Selenium suite for automating the testing of web-based applications. As discussed above, it is the product of integrating Selenium RC (Selenium 1.0) with WebDriver.

Unlike Selenium IDE, Selenium WebDriver lets you write test scripts in different languages, including Java, Perl, Ruby and C#, and use the full range of programming logic and concepts to ensure the quality of the test script. It does not involve any kind of IDE; tests are created and executed through a programming interface.

Why Selenium WebDriver?

Selenium WebDriver was formed by combining Selenium RC and WebDriver. WebDriver was merged with Selenium RC to overcome certain limitations of the latter. Some of the areas where Selenium RC fell short of WebDriver are:

Selenium RC has a complex architecture because it requires an intermediary, the Selenium RC server, between the client and the web browsers: test scripts written on the client side reach the browser only through the server. The commands passed through the RC server must in turn be handled by a JavaScript program (Selenium Core) injected into each browser, which then drives that browser. WebDriver, in contrast, removes the need for any external server or JavaScript program and communicates directly with the browser.
WebDriver executes tests faster than Selenium RC because it communicates directly with the browser, whereas the multiple components involved in Selenium RC slow down the overall process.
With clear, non-redundant commands, WebDriver's API is much simpler than RC's.
Selenium RC does not support the HtmlUnit browser, whereas WebDriver does.

However, it is worth mentioning that Selenium RC provides built-in functionality for automatically generating a test report file, which WebDriver lacks. Thus, to make the best use of the features of both, Selenium RC and WebDriver were merged to form Selenium WebDriver.

Basic Working of the Selenium WebDriver:

Selenium WebDriver interacts directly with the browsers. Each browser has its own driver, which needs to be imported.

Test scripts written in any of the programming languages compatible with Selenium WebDriver invoke the WebDriver API, which in turn passes the commands to the browser through that browser's driver.

[Figure: How Selenium WebDriver works]

Selenium WebDriver can be used to test different elements of a browser or web application, along with the operations performed on them, such as:

Browser: Opening, closing and refreshing the browser, accessing the intended website using its URL, and many other activities.
Pages: URL of the page, page title.
Buttons: Button name, message displayed on clicking the button.
Images: Image title, image link, image uploading and downloading option.
Links: Clicking the link, link directing to desired location on click, checking broken links, etc.
Radio Buttons & Checkboxes: Selecting and deselecting buttons and checkboxes, and checking their enabled status on click.
Drop Down Box: Opening up the drop down list, clicking & selecting drop-down elements.
Table: Clicking the cells, returning the cell value, row count and column count.

So, how do you locate the different elements of a browser or a web application?

Elements on a web page can be located using the findElement() method in combination with one of 8 locators supplied through the By class, each of which identifies a GUI or web element in its own way. The 8 locators are as follows:

By ID

An element can be given a unique ID through its id attribute, which can be seen by inspecting the web page.

By Name

Locates an element by the value of its name attribute. It is useful for locating similar types of elements, such as input fields.

By Class Name

Locates web elements by the value of their class attribute.

By Tag Name

Locates elements on the web page by their HTML tag.

By Link Text

Locates a link on the page by its exact text.

By Partial Link Text

Similar to "By Link Text", except that only part of the link's text is needed to locate the link.

By CSS

GUI elements can be located on the web page by passing a CSS selector to the findElement() method.

By XPATH

Locating a web element by its XPath is one of the most powerful ways in Selenium WebDriver to locate any element present on the web page. XPath uses the DOM structure, i.e. the XML-like tree of the web page, to locate web elements. An XPath expression is built from HTML tags, attributes and values, optionally combined with conditions (or, and).

An XPath can be written using either of the following approaches:

Absolute path

An absolute path traces the element all the way from the root node down to the specified element, giving its exact location. It starts with a single slash, '/'.

Relative path

A relative path gives you a shorter path to the web element. It does not start from the root node; instead, the element may be located anywhere in the web page. It starts with a double slash, '//'.
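The difference between the two styles can be tried out without a browser. The sketch below uses Python's standard `xml.etree.ElementTree` module, which implements only a subset of XPath and evaluates every path relative to an element, so the absolute path is written from the root element with a leading `./`. The page markup is invented for the example, and this is not how Selenium itself evaluates XPath.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed page used only for this illustration.
PAGE = """
<html>
  <body>
    <div id="header"><a href="/home">Home</a></div>
    <form>
      <input name="q" type="text"/>
      <input name="go" type="submit"/>
    </form>
  </body>
</html>
"""

root = ET.fromstring(PAGE)  # root is the <html> element

# Absolute style: every step from the root is spelled out.
# Corresponds to the XPath /html/body/form/input
absolute_hit = root.find("./body/form/input")

# Relative style: search anywhere below the root, filtered by an attribute.
# Corresponds to the XPath //input[@name='go']
relative_hit = root.find(".//input[@name='go']")

print(absolute_hit.get("name"))
print(relative_hit.get("type"))
```

The absolute path stops matching as soon as a wrapper element is added around the form, while the relative path keeps working, which is why relative paths are usually the more robust choice.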

Commands, Methods and Functions of Selenium WebDriver:

Selenium WebDriver provides a good number of commands, methods and functions for automation activities.

1. Commands:

a) Get Command:

Get commands are used to fetch important information about the web page or about the web elements on it. Some of the Get commands frequently used in Selenium WebDriver are given below:

get(): Opens the specified URL in the browser.
getTitle(): Retrieves the title of the currently opened web page.
getPageSource(): Returns the source of the web page as a string.
getCurrentUrl(): Fetches the URL of the currently opened page.
getText(): Retrieves the text of the specified element.

b) Navigate Command:

Navigate commands are used to move between web pages and perform related activities. Some of the navigate commands are:

navigate().to(): Loads the page specified in the parentheses in the current browser window.
navigate().forward(): Moves one step forward in the browser history, to the page that was visited after the current one.
navigate().back(): Moves one step back in the browser history, to the page that was visited before the current one.
navigate().refresh(): Refreshes (reloads) the current page.
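The history behaviour behind these commands can be sketched as follows. In the Python bindings the equivalent calls are `driver.get(url)`, `driver.current_url`, `driver.back()`, `driver.forward()` and `driver.refresh()`. To keep the example runnable without a browser, it uses `HistoryDriver`, a hypothetical in-memory stand-in written for this illustration; it is not part of Selenium, and the URLs are made up.

```python
class HistoryDriver:
    """Hypothetical in-memory stand-in for a browser session (not part of Selenium)."""

    def __init__(self):
        self._history = []   # visited URLs, oldest first
        self._index = -1     # position of the current page in the history

    def get(self, url):
        # Loading a new page discards any "forward" entries, as a browser does.
        self._history = self._history[: self._index + 1] + [url]
        self._index += 1

    def back(self):
        if self._index > 0:
            self._index -= 1

    def forward(self):
        if self._index < len(self._history) - 1:
            self._index += 1

    def refresh(self):
        pass  # reloading keeps the same URL and history position

    @property
    def current_url(self):
        return self._history[self._index] if self._index >= 0 else None


def visit_and_return(driver, first_url, second_url):
    """Open two pages, step back, step forward, reload; record the URL after each step."""
    trail = []
    driver.get(first_url)
    driver.get(second_url)
    trail.append(driver.current_url)
    driver.back()
    trail.append(driver.current_url)
    driver.forward()
    trail.append(driver.current_url)
    driver.refresh()
    trail.append(driver.current_url)
    return trail


print(visit_and_return(HistoryDriver(),
                       "http://app.example.test/",
                       "http://app.example.test/next"))
```

Because `visit_and_return` only relies on those five members, a real session object from the Selenium Python package (for example one created with `webdriver.Firefox()`) could be passed in place of the stand-in, assuming the package and a browser driver are installed.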

2. Functions:

Close and Quit functions:

The close() method closes the browser window that currently has focus, whereas the quit() method closes every window and tab opened by the session, i.e. it shuts down the browser.

3. Methods:

a) "switchTo().frame()" and "switchTo().alert()" Methods:

The switchTo().frame() method is used to switch between frames, whereas switchTo().alert() is used to switch to pop-up dialog boxes such as an alert message box.

b) sendKeys() Method:

With the sendKeys() method, text or a value can be entered into a specified text box or text field.

c) click() Method:

The click() method is used to click on web elements.

d) Wait

Sometimes the browser cannot keep up with the speed of the automated test execution, which may lead to undesirable results.

Put simply, a browser sometimes takes a considerable amount of time to load all the elements of a web page, and may load different elements at different times. As a result, a test may fail to detect, and so skip, elements that were not yet present at a given moment, or it may throw an ElementNotVisibleException.

Waits are used to handle these situations. Selenium WebDriver offers two different types of wait.

Implicit Wait: An implicit wait sets a time limit for which Selenium WebDriver keeps looking for an element; only if the element is still not found when that period expires is a NoSuchElementException thrown.
Explicit Wait: An explicit wait makes Selenium WebDriver wait until a specified condition is met before it proceeds further.

Overall, an explicit wait offers more control than an implicit wait, because it can target a specific condition on a specific element.
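The idea behind an explicit wait is a polling loop: check a condition repeatedly until it holds or a timeout expires. Selenium ships its own implementation (WebDriverWait); the function below is only a simplified stand-in to show the mechanism, and `wait_until`, `element_is_present` and the simulated "page" are names invented for this sketch.

```python
import time


def wait_until(condition, timeout=5.0, poll_interval=0.1):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Simulated page: the element "appears" only on the third check.
attempts = {"count": 0}


def element_is_present():
    attempts["count"] += 1
    return "submit-button" if attempts["count"] >= 3 else None


print(wait_until(element_is_present, timeout=2.0, poll_interval=0.01))
print(attempts["count"])
```

The test proceeds as soon as the condition is met rather than sleeping for a fixed period, and it fails with a clear timeout error if the element never shows up.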

Selenium Grid:

Selenium Grid is the component of the Selenium suite used to execute tests in parallel across multiple platforms, including different combinations of browsers, operating systems and devices. It supports a distributed test execution environment.

[Figure: Selenium Grid]

The basic architecture of Selenium Grid consists of a hub and a number of nodes. The hub is a central machine, or server, that receives the tests and distributes them, and the nodes are the machines on which those tests are executed. Each node offers a different combination of browsers and operating systems.

Why Selenium Grid?

Selenium Grid comes into the picture primarily for the following reasons:

The large-scale and complex requirement of testing the application on every combination of browsers and operating systems.
The need to execute tests in parallel to save precious time in the testing phase of a project.

Selenium Grid Working:

As stated above, Selenium Grid primarily comprises a hub and nodes. Apart from these, the tool involves some more elements, as described below:

Remote WebDriver API

The Remote WebDriver API is one of the essential components of Selenium Grid. It performs the same functions as WebDriver, the difference being that the remote WebDriver must be configured so that tests run on another machine. With Remote WebDriver, test scripts started from the server can be executed on each of the different remote machines.

Remote WebDriver is used to connect the server with the grid nodes on which test scripts are intended to be executed.

DesiredCapabilities

This object specifies the browser(s) wanted from the grid, so that the test script is executed on the intended node.

When a request to execute a test script on the desired browser is made using DesiredCapabilities, the hub (server) looks for the specified browser in the grid and passes the request to the matching node via Remote WebDriver. This establishes a two-way connection and starts the execution of the test script on that remote machine.
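The matching step performed by the hub can be pictured with a toy model. The sketch below is not Selenium Grid's actual code: the node names, the registry layout and the helpers `matches` and `pick_node` are assumptions made for illustration, and only the capability keys `browserName` and `platformName` follow the names commonly used in capability objects.

```python
def matches(node_caps, desired):
    """A node matches when it offers every capability the test asks for."""
    return all(node_caps.get(key) == value for key, value in desired.items())


def pick_node(nodes, desired):
    """Return the name of the first registered node that satisfies `desired`, else None."""
    for name, caps in nodes.items():
        if matches(caps, desired):
            return name
    return None


# Hypothetical nodes registered with the hub.
NODES = {
    "node-1": {"browserName": "firefox", "platformName": "linux"},
    "node-2": {"browserName": "chrome", "platformName": "windows"},
    "node-3": {"browserName": "chrome", "platformName": "linux"},
}

# Chrome on Linux is only offered by node-3.
print(pick_node(NODES, {"browserName": "chrome", "platformName": "linux"}))
# No node offers Safari, so nothing is selected.
print(pick_node(NODES, {"browserName": "safari"}))
```

A request that names only a browser is satisfied by the first node offering it, which is why a test that also depends on the operating system should state that requirement as well.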

In a nutshell, the Selenium suite, comprising Selenium IDE, Selenium 2.0 (Selenium 3.14 is the latest version) and Selenium Grid, provides a productive automation framework and environment for the speedy and effective execution of large and complex test scripts, thereby saving considerable time and effort in a testing project.
