Exploring Selenium 4: Dive into the Actions API

Selenium 4, the latest iteration of the Selenium WebDriver, introduces several enhancements to streamline web automation. One notable improvement is the revamped Actions API. This low-level interface provides virtualized device input actions to the web browser, offering granular control over keyboard, mouse, pen, touch devices, and even scroll wheel interactions. In this blog, we will delve into the intricacies of the Actions API, exploring its key components and providing practical examples.

Understanding the Actions API

  1. Key Components: Action Builder

The Actions API introduces an intricate but powerful set of low-level building blocks. These blocks include commands for key inputs, pointer inputs, and wheel inputs. The Action Builder allows the construction of individual action commands assigned to specific inputs, which can be chained together. The associated perform method is then called to execute them collectively.
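
The chain-then-perform pattern can be illustrated without a browser. The sketch below is a toy model, not Selenium's implementation: `ToyActions` and its methods are hypothetical names that only mimic the shape of the real `Actions` class. It shows that chained calls merely queue commands and that nothing runs until `perform` is called.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: ToyActions is a hypothetical class, not part of Selenium.
class ToyActions {
    // Commands queued by the builder, in the order they were chained.
    private final List<String> queued = new ArrayList<>();
    // Commands that perform() has "executed" so far.
    private final List<String> executed = new ArrayList<>();

    // Each builder method records a command and returns this to allow chaining.
    ToyActions keyDown(String key) { queued.add("keyDown:" + key); return this; }
    ToyActions keyUp(String key) { queued.add("keyUp:" + key); return this; }
    ToyActions click() { queued.add("click"); return this; }

    // Runs every queued command in order, then empties the queue.
    void perform() {
        executed.addAll(queued);
        queued.clear();
    }

    // Returns a copy so callers cannot modify the internal list.
    List<String> executed() { return new ArrayList<>(executed); }

    public static void main(String[] args) {
        ToyActions actions = new ToyActions().keyDown("SHIFT").click().keyUp("SHIFT");
        System.out.println("Before perform: " + actions.executed());
        actions.perform();
        System.out.println("After perform: " + actions.executed());
    }
}
```

In real Selenium code the same idea applies: forgetting the final `perform()` call means the chain is built but never sent to the browser.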


  2. Pause Command

Pointer movements and wheel scrolls accept a duration, but sometimes you only need a delay between two actions. The pause command inserts that delay, which helps when the page needs a moment to react to one action (for example, a hover menu opening) before the next one runs.

// Assumes the target element is located by its id, "clickable".
WebElement clickable = driver.findElement(By.id("clickable"));

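// Hover over the element, wait one second, press and hold, wait again, then type.
// The pause arguments are java.time.Duration values; "abc" is placeholder text.
new Actions(driver)
        .moveToElement(clickable)
        .pause(Duration.ofSeconds(1))
        .clickAndHold()
        .pause(Duration.ofSeconds(1))
        .sendKeys("abc")
        .perform();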

  3. Release All Actions

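An important consideration is that the driver retains the state of all input items throughout a session: a key or pointer button that is pressed and never released stays pressed for later actions. Releasing all actions resets that state by releasing every currently depressed key and pointer button. In the Java bindings this is done with the resetInputState method on RemoteWebDriver: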

((RemoteWebDriver) driver).resetInputState();
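
Why this matters can be shown with a small browser-free sketch. It is a toy model under stated assumptions, not Selenium code: `ToyInputState` is a hypothetical class that tracks pressed inputs the way a session conceptually does, and its `resetInputState` method only borrows the name of the real call.

```java
import java.util.Set;
import java.util.TreeSet;

// Toy model only: ToyInputState is hypothetical and not part of Selenium.
class ToyInputState {
    // Keys and pointer buttons currently held down in this "session".
    private final Set<String> pressed = new TreeSet<>();

    void keyDown(String key) { pressed.add("key:" + key); }
    void keyUp(String key) { pressed.remove("key:" + key); }
    void pointerDown(String button) { pressed.add("pointer:" + button); }
    void pointerUp(String button) { pressed.remove("pointer:" + button); }

    // Releases everything at once, mirroring the idea behind releasing all actions.
    void resetInputState() { pressed.clear(); }

    // Returns a sorted copy of the inputs that are still held down.
    Set<String> pressed() { return new TreeSet<>(pressed); }

    public static void main(String[] args) {
        ToyInputState state = new ToyInputState();
        state.keyDown("SHIFT");      // pressed, never released
        state.pointerDown("LEFT");   // pressed, never released
        System.out.println("Still held: " + state.pressed());
        state.resetInputState();
        System.out.println("After reset: " + state.pressed());
    }
}
```

A test that leaves Shift held down would otherwise type uppercase text in every later step of the same session, which is the kind of leak a reset between scenarios avoids.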


Keyboard Actions in Selenium 4

  1. Keys Representation

Selenium 4 provides a representation of any key input device for interacting with a web page. Ordinary text characters are sent as themselves, while keys that produce no character, such as Shift, Enter, or Tab, each have a dedicated representation that can be pressed or released in designated sequences. Selenium maps these keys to Unicode code points, which the Java bindings expose through the Keys enum.
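
To make the encoding concrete, here is a small browser-free sketch listing a few of those code points. The values are copied by hand from the W3C WebDriver key table, and the class and method names (`SpecialKeyCodes`, `specialKeys`, `codePoint`) are hypothetical; in real tests you would use `org.openqa.selenium.Keys` rather than a map like this.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a hand-written subset of the WebDriver special-key code points.
class SpecialKeyCodes {
    // Maps a key name to the character WebDriver uses to represent it.
    static Map<String, Character> specialKeys() {
        Map<String, Character> keys = new LinkedHashMap<>();
        keys.put("BACK_SPACE", '\uE003');
        keys.put("TAB", '\uE004');
        keys.put("ENTER", '\uE007');
        keys.put("SHIFT", '\uE008');
        keys.put("CONTROL", '\uE009');
        keys.put("ALT", '\uE00A');
        return keys;
    }

    // Formats the code point of a named key, for example "U+E008" for SHIFT.
    static String codePoint(String name) {
        char c = specialKeys().get(name);
        return String.format("U+%04X", (int) c);
    }

    // True when the character lies in a Unicode private-use range.
    static boolean isPrivateUse(char c) {
        return Character.getType(c) == Character.PRIVATE_USE;
    }

    public static void main(String[] args) {
        for (String name : specialKeys().keySet()) {
            System.out.println(name + " -> " + codePoint(name));
        }
    }
}
```

All of these values sit in Unicode's Private Use Area, so they cannot collide with characters a user might actually type.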


  2. Keyboard Actions Example

Here’s an example demonstrating keyboard actions using the Actions API:

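new Actions(driver)
        .keyDown(Keys.SHIFT)   // press and hold Shift
        .sendKeys("a")         // typed while Shift is held
        .keyUp(Keys.SHIFT)     // release Shift
        .sendKeys("b")         // typed without Shift
        .perform();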

This sequence involves pressing the Shift key, typing ‘a’, releasing the Shift key, and then typing ‘b’.


  3. SendKeys Method

The sendKeys method is a convenience method in the Actions API that combines keyDown and keyUp commands in one action. It’s particularly useful when needing to type multiple characters in the middle of other actions.

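// Types into whichever element currently has focus; "abc" is placeholder text.
new Actions(driver)
        .sendKeys("abc")
        .perform();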

Mouse Actions in Selenium 4

  1. Mouse Actions Overview

Similar to keyboard actions, Selenium 4 provides a representation of any pointer device for interacting with a web page. The Actions API supports various mouse actions, including clicking and holding, clicking and releasing, right-clicking, double-clicking, and moving the mouse.


  2. Mouse Actions Example

Here’s an example demonstrating mouse actions using the Actions API:

// Assumes the target element is located by its id, "clickable".
WebElement clickable = driver.findElement(By.id("clickable"));

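new Actions(driver)
        .clickAndHold(clickable)   // press the left button over the element without releasing it
        .perform();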

This action involves clicking and holding the left mouse button on a clickable element.


  3. Real-Time Usage Scenarios

Mouse actions find applications in scenarios like canvas drawing applications (for handling various pointer events) and testing right-click functionalities (using the contextClick method).



Selenium 4’s enhanced Actions API gives testers and developers precise control over keyboard and mouse interactions. The Action Builder, together with the pause command and the ability to release all held inputs, adds flexibility and depth to test scenarios. In the next part of this series, we will explore advanced mouse and keyboard actions and their applications.


Jeet Palan

Jeet Palan is a Technical Consultant at Perficient. He has experience in manual and automation testing, and he enjoys learning about other types of testing and new technologies.
