Exploring Selenium 4: Dive into the Actions API / Blogs / Perficient

Selenium 4, the latest iteration of the Selenium WebDriver, introduces several enhancements to streamline web automation. One notable improvement is the revamped Actions API. This low-level interface provides virtualized device input actions to the web browser, offering granular control over keyboard, mouse, pen, touch devices, and even scroll wheel interactions. In this blog, we will delve into the intricacies of the Actions API, exploring its key components and providing practical examples.

Understanding the Actions API

Key Components: Action Builder

The Actions API introduces an intricate but powerful set of low-level building blocks. These blocks include commands for key inputs, pointer inputs, and wheel inputs. The Action Builder allows the construction of individual action commands assigned to specific inputs, which can be chained together. The associated perform method is then called to execute them collectively.
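
To make the "chain first, execute on perform" idea concrete, here is a small plain-Java sketch. It is only a toy model and does not reflect Selenium's internals: `ToyActions`, `pending`, and `executed` are hypothetical names invented for this illustration. Each chained call merely records a step, and nothing is run until `perform` is called.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a chained action builder. Hypothetical; NOT Selenium's implementation.
final class ToyActions {
    private final List<String> queued = new ArrayList<>();   // steps recorded by the chain
    private final List<String> executed = new ArrayList<>(); // steps run by perform()

    // Each builder method only records a step and returns this for chaining.
    ToyActions keyDown(String key) { queued.add("keyDown:" + key); return this; }
    ToyActions keyUp(String key)   { queued.add("keyUp:" + key);   return this; }
    ToyActions click()             { queued.add("click");          return this; }

    // Runs every queued step in order, then empties the queue.
    void perform() {
        executed.addAll(queued);
        queued.clear();
    }

    int pending() { return queued.size(); }

    List<String> executed() { return new ArrayList<>(executed); }

    public static void main(String[] args) {
        ToyActions chain = new ToyActions().keyDown("SHIFT").click().keyUp("SHIFT");
        System.out.println("before perform: " + chain.executed());
        chain.perform();
        System.out.println("after perform:  " + chain.executed());
    }
}
```

The real `Actions` class follows the same shape from the caller's point of view: the chained methods return the builder, and `perform()` sends the accumulated sequence to the browser.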

Pause Command

Pointer movements and wheel scrolls accept a duration for the action itself, but sometimes a wait is needed between actions. The pause command inserts that delay, for example to give the page time to react to a hover before the next step runs.

WebElement clickable = driver.findElement(By.id("clickable"));

new Actions(driver)
    .moveToElement(clickable)
    .pause(Duration.ofSeconds(1))
    .clickAndHold()
    .pause(Duration.ofSeconds(1))
    .sendKeys("abc")
    .perform();

Release All Actions

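An important consideration is that the driver retains the state of all input items throughout a session. Even a newly created Actions instance starts with whatever keys and pointer buttons a previously performed action left pressed. The release-all operation resets this state by releasing all currently depressed keys and pointer buttons. It is not part of an action chain and is not executed by perform; in the Java binding it is exposed as the resetInputState method on RemoteWebDriver: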

((RemoteWebDriver) driver).resetInputState();
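
Why a reset matters can be sketched with a plain-Java toy model of session-wide input state. This is an illustration under simplifying assumptions, not Selenium code: `ToyInputState` and its methods are hypothetical, and only the Shift key is modelled.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of input state that outlives individual action chains.
// Hypothetical; NOT Selenium's implementation.
final class ToyInputState {
    private final Set<String> pressedKeys = new HashSet<>();

    void keyDown(String key) { pressedKeys.add(key); }

    void keyUp(String key) { pressedKeys.remove(key); }

    // Types one character; a still-held SHIFT upper-cases it.
    String type(char c) {
        char typed = pressedKeys.contains("SHIFT") ? Character.toUpperCase(c) : c;
        return String.valueOf(typed);
    }

    // Counterpart of the release-all reset: drops every held key at once.
    void reset() { pressedKeys.clear(); }

    public static void main(String[] args) {
        ToyInputState session = new ToyInputState();
        session.keyDown("SHIFT");              // an earlier chain never released Shift
        System.out.println(session.type('a')); // Shift is still held
        session.reset();
        System.out.println(session.type('a')); // state cleared
    }
}
```

In this model the first call yields an upper-case letter because Shift was never released, and the second yields a lower-case letter after the reset. The same kind of leftover state is what resetInputState is meant to clear in a real session.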

Keyboard Actions in Selenium 4

Keys Representation

Selenium 4 provides a representation of any key input device for interacting with a web page. Besides ordinary text characters, each keyboard key has a representation that can be pressed and released in a chosen sequence. Keys that do not produce a character, such as Shift, Control, or Enter, are identified by dedicated Unicode code points, which the Java binding exposes through the Keys enum.
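
As a small illustration of those Unicode values, the sketch below hard-codes three code points from the W3C WebDriver key table so that it runs without Selenium on the classpath. The class name `SpecialKeyCodes` and its helpers are hypothetical; in real tests you would use the `Keys` enum rather than raw characters.

```java
// Code points copied from the W3C WebDriver key table and hard-coded here
// (an assumption of this sketch) so the example needs no Selenium dependency.
final class SpecialKeyCodes {
    static final char ENTER = '\uE007';
    static final char SHIFT = '\uE008';
    static final char CONTROL = '\uE009';

    // True when the character lies in a Unicode private use area.
    static boolean isPrivateUse(char c) {
        return Character.getType(c) == Character.PRIVATE_USE;
    }

    // Formats a key name with its code point, e.g. "SHIFT = U+E008".
    static String describe(String name, char c) {
        return String.format("%s = U+%04X", name, (int) c);
    }

    public static void main(String[] args) {
        System.out.println(describe("ENTER", ENTER));
        System.out.println(describe("SHIFT", SHIFT));
        System.out.println(describe("CONTROL", CONTROL));
        System.out.println("private use: " + isPrivateUse(SHIFT));
    }
}
```

Because these code points sit in the private use area, they cannot collide with any character a user could actually type into a field.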

Keyboard Actions Example

Here’s an example demonstrating keyboard actions using the Actions API:

new Actions(driver)
    .keyDown(Keys.SHIFT)
    .sendKeys("a")
    .keyUp(Keys.SHIFT)
    .sendKeys("b")
    .perform();

This sequence presses the Shift key, types ‘a’ while Shift is held, releases the Shift key, and then types ‘b’, so a focused text field receives “Ab”.

SendKeys Method

The sendKeys method is a convenience method in the Actions API that combines keyDown and keyUp commands in one action. It’s particularly useful when needing to type multiple characters in the middle of other actions.

new Actions(driver)
    .sendKeys("abc")
    .perform();
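
The pairing of keyDown and keyUp behind sendKeys can be pictured with a short plain-Java sketch. It is a simplified model rather than Selenium's code: `SendKeysSketch` and `expand` are hypothetical names, and the events are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// Toy expansion of typed text into low-level key events.
// Hypothetical; NOT Selenium's implementation.
final class SendKeysSketch {
    // Produces one keyDown/keyUp pair per character, in typing order.
    static List<String> expand(String text) {
        List<String> events = new ArrayList<>();
        text.codePoints().forEach(cp -> {
            String key = new String(Character.toChars(cp));
            events.add("keyDown(" + key + ")");
            events.add("keyUp(" + key + ")");
        });
        return events;
    }

    public static void main(String[] args) {
        // Print the event list that typing "abc" would expand to in this model.
        System.out.println(expand("abc"));
    }
}
```

Seen this way, `sendKeys("abc")` stands in for six lower-level key events, which is why it is the more convenient choice for typing text in the middle of a longer chain.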

Mouse Actions in Selenium 4

Mouse Actions Overview

Similar to keyboard actions, Selenium 4 provides a representation of any pointer device for interacting with a web page. The Actions API supports various mouse actions, including clicking and holding, clicking and releasing, right-clicking, double-clicking, and moving the mouse.

Mouse Actions Example

Here’s an example demonstrating mouse actions using the Actions API:

WebElement clickable = driver.findElement(By.id("clickable"));

new Actions(driver)
    .clickAndHold(clickable)
    .perform();

This action moves the pointer to the clickable element and presses the left mouse button without releasing it; the button stays down until a later release action.

Real-Time Usage Scenarios

Mouse actions find applications in scenarios like canvas drawing applications (for handling various pointer events) and testing right-click functionalities (using the contextClick method).

Conclusion

Selenium 4’s enhanced Actions API empowers testers and developers with precise control over keyboard and mouse interactions. The Action Builder, along with the pause command and the release-all reset (resetInputState in the Java binding), adds flexibility and depth to test scenarios. In the next part of this series, we will explore advanced mouse and keyboard actions and their applications. Stay tuned for a comprehensive guide to mastering Selenium 4’s Actions API.

by Jeet Palan on December 5th, 2023
