Exploring Mobile UI Testing - Abstractions and Tests Implementation

This post is part of a series that tries to describe my journey in the Mobile UI Automated Testing world. Today we're going to explore how test suites can be organized to keep testing code clean and maintainable.

Test code is still code. Although those lines will never reach the end users' devices, test code needs the same attention as production code.

During the early phases of development, writing tests doesn't seem that difficult. A common paradigm here is Given/When/Then: Given defines the preconditions and initial state, When contains some kind of interaction performed on one or more entities, and Then verifies the final state, typically with assertions.

With the codebase growing and the project becoming bigger and more complex, the testing structure and strategy must keep up and stay versatile enough to let developers keep writing tests easily and efficiently. Unfortunately, this is often not the case: many projects I have worked on, both in my day-to-day activities and outside of them, didn't have a proper testing architecture. At some point, developers struggled so much to write tests efficiently that they eventually gave up or wrote the bare minimum, sometimes just the happy path. This should not happen at all. Writing code and developing features is a lot of fun, and so is writing tests if done properly.

Structured Testing

Mobile UI Automated Testing is no exception. Interacting with the underlying framework, be it Android or iOS, is certainly a great challenge, as it requires specific knowledge and sometimes even creative solutions, especially if we're testing features developed by someone else.

Over the past few years, I've learned to appreciate the Robot Pattern. I first tried it out after watching Instrumentation Testing Robots - Kotlin Night, a great talk given by Jake Wharton that describes the entire idea behind it. Having this level of separation between screens (or views) allows a tidier test implementation and also makes the code more readable and maintainable.

In the testing strategy I introduced in Telepass for Android and iOS, I try to leverage this paradigm as much as possible to have tests "speak" by themselves, so even someone who has never written a single line of Kotlin or Swift can understand what's going on in a specific step, and maybe even make some changes to the code.

Let's get a bit more specific, though, as working through an example can explain this better. Note that I'll be pretty much Android-oriented in this post, but almost everything here can be applied to iOS as well.

F1 - Main Screen

Suppose the target application only has one screen (F1 - Main Screen), and we need to automate one of the tests from our test book that requires the title field to be The standard Lorem Ipsum passage.

This is roughly the test case definition someone has already prepared for us:

Test Case TC-001

Summary:
Verify that the title is displayed with a specific text when the App is launched.

Steps:
- Open the App

Expected Result:
- The title is displayed, with text: "The standard Lorem Ipsum passage"

From the XML file (let's leave Compose out of the picture for a moment) we know that the title has the id titleText. Using vanilla Espresso rather naively, we could write this:

internal class MyTestSuite {

    // TC-001
    @Test
    fun test_tc001() {
        // GIVEN
        val expectedTitle = "The standard Lorem Ipsum passage"

        // WHEN
        // [Code that launches the app, irrelevant for now]

        // THEN
        onView(withId(R.id.titleText)).check(matches(withText(expectedTitle)))
    }
}
MyTestSuite.kt

One might say that the test is complete. We have defined our precondition, the value of expectedTitle, made the app start, and then verified that the element matching the id R.id.titleText has expectedTitle as its text.

This implementation, however, has many flaws:

  • It does not give a clear understanding of what is being tested. Sure, titleText is fairly self-explanatory, but what if the id were txtLabel or something more cryptic that couldn't easily be changed in the production code?
  • Locators and matchers are hard-coded in the test body. This surely works for one test, but what if we had to perform the same check from multiple test cases? The view-matching code above is quite simple, but if we had to traverse the layout hierarchy with multiple matchers to get our element and perform different checks on that same element in multiple test cases, things would get messy. We would end up with a lot of useless boilerplate, and maintainability would be terrible as just an id change would require many fixes.
    Also, note that the string value we're testing the content of the field against is declared at the beginning of the test case: having it there, hard-coded, might be a code smell in some situations.
  • This test is not solid at all. What if a developer pushed a change that swaps the current main page with another one featuring an element with the same titleText id and content? We might never notice, as the test would keep passing, but end users could land on a screen that maybe wasn't even supposed to be released yet!

Adding more clarity

We need to make some changes at this point, so let's start by looking at the raw test case from our test book once again.

Test Case TC-001

Summary:
Verify that the title is displayed with a specific text when the App is launched.

Steps:
- Open the App

Expected Result:
- The title is displayed, with text: "The standard Lorem Ipsum passage"
  • Assuming the current test suite refers to a specific application, we can accept Open the App as a step, since there's no other application to be tested.
  • The expected result, however, is not clear at all. It doesn't specify what screen we should expect, and this creates uncertainty. A good solution would be, for instance, specifying a few more details in the expected result, maybe with a screenshot or even a mockup image made by the design team.

Another improvement would be linking the JIRA ticket of the feature as a reference in the summary, so one could find all the requirements for a given section of the app. This potentially saves a lot of time when things are uncertain!

After gathering the missing requirements, let's update the test case definition:

Test Case TC-001

Summary:
Verify that the title is displayed with a specific text in the Main Screen of the App.

Steps:
- Open the App

Expected Result:
- The Main Screen of the App is displayed.
- The title label is displayed in the header section, with text matching: "The standard Lorem Ipsum passage".

Attachments:
- Mockup reference
- JIRA Ticket reference

The requirements are now clearer, and we know what to expect when automating the test case.

After taking care of the definition, we need to turn our attention to the implementation code. The current one is very vague and doesn't provide the level of safety and guarantees we expect from a reliable automated test.

Composing our tests

Looking at the screen we're testing, we can split it into multiple sections:

F2 - Mapped Elements
  • Toolbar Section: it contains all the navigation bar elements, namely the navigation button, the title, and even an extra button on the right-hand side
  • Header Section: our section of interest, where we have the label we need to check and an additional button whose behavior is unknown at this time
  • Body Section: an intermediate part of the screen where several other UI elements are present
  • Footer Section: the bottom part of the screen, containing a simple button

Last but not least, in the picture F2 - Mapped Elements we also have the Root Element. This element is the container of the whole view and its subviews, and it helps us make sure, to some extent, that the screen we're currently testing is the correct one. Most of the time it's defined by assigning an id to the root element of the XML file, a testTag Modifier to the main Scaffold in Jetpack Compose screens, or an accessibilityIdentifier to UIKit/SwiftUI views. This view mapping gives us a good guideline when we need to organize our code.
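
For instance, here's a minimal Jetpack Compose sketch of how such a root element could be tagged (the composable and the tag name are hypothetical, just to illustrate the idea):

import androidx.compose.foundation.layout.padding
import androidx.compose.material3.Scaffold
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.ui.Modifier
import androidx.compose.ui.platform.testTag

// Hypothetical screen: the testTag on the Scaffold plays the same role as
// an XML root id, letting tests verify they're on the correct screen.
@Composable
fun MainScreen() {
    Scaffold(modifier = Modifier.testTag("MainScreenRoot")) { padding ->
        Text(
            text = "The standard Lorem Ipsum passage",
            modifier = Modifier.padding(padding)
        )
    }
}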

Hands on the code: we start by creating an entity that will hide the implementation details from the test suite.

internal class MainScreenRobot {
    // `MyUiObjectResolver` implementation is left out for the sake of simplicity.
    private val rootView: UiObject = MyUiObjectResolver.byId(R.id.rootElement)
    
    init {
        assertTrue("MainScreen root view not shown", rootView.waitForExists(5000L)) // 5000L is the timeout.
    }
}
MainScreenRobot.kt

You might notice that here I defined rootView as a UiObject. UiObject is one of the classes provided by UiAutomator, the second well-known library, after Espresso, used for UI testing on Android. The main reason behind this choice lies in the fact that with UiObject, waiting for a view to appear or disappear is quite easy, as waitForExists provides a great mechanism for that (please do not randomly use Thread.sleep(), as it will introduce flakiness in your tests).

With this base MainScreenRobot implementation, its constructor will check for the root view to be actually shown on the screen with a timeout of 5000 milliseconds, as it might take a while to appear. In case it doesn't, waitForExists will return false and the assertion will make our test fail.
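
By the way, the resolver doesn't need to be anything fancy. Here's a rough sketch of what MyUiObjectResolver could look like (my own guess, since the actual implementation is intentionally left out): it simply maps a view id to a UiAutomator selector.

import androidx.test.platform.app.InstrumentationRegistry
import androidx.test.uiautomator.UiDevice
import androidx.test.uiautomator.UiObject
import androidx.test.uiautomator.UiSelector

internal object MyUiObjectResolver {
    private val device: UiDevice =
        UiDevice.getInstance(InstrumentationRegistry.getInstrumentation())

    // Maps an id like R.id.rootElement to its full resource name,
    // e.g. "com.example.app:id/rootElement", and builds a UiObject from it.
    fun byId(id: Int): UiObject {
        val resources = InstrumentationRegistry.getInstrumentation().targetContext.resources
        return device.findObject(UiSelector().resourceId(resources.getResourceName(id)))
    }
}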

Now we need to write a function to check the content of the title field.

internal class MainScreenRobot {
    private val rootView: UiObject = MyUiObjectResolver.byId(R.id.rootElement)
    private val titleText = KTextView { withId(R.id.titleText) }
    
    init {
        assertTrue("MainScreen root view not shown", rootView.waitForExists(5000L)) // 5000L is the timeout.
    }
    
    // Header Interactions
    
    fun verifyTitleText() {
        titleText.hasText("The standard Lorem Ipsum passage")
    }
}
Extracting the hardcoded string is still preferable.

We have defined titleText as a KTextView. UiObjects are good for some things, but since their public API is rather verbose, my advice is to use them as a last resort and only when strictly necessary. Kakao is a great alternative, offering the same capabilities as Espresso but with a more intuitive API.
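
To give a feel for the difference, here's the same assertion in vanilla Espresso and in Kakao (a quick side-by-side sketch; the Kakao import assumes the 3.x KakaoCup coordinates):

import androidx.test.espresso.Espresso.onView
import androidx.test.espresso.assertion.ViewAssertions.matches
import androidx.test.espresso.matcher.ViewMatchers.withId
import androidx.test.espresso.matcher.ViewMatchers.withText
import io.github.kakaocup.kakao.text.KTextView

// Vanilla Espresso: matchers and assertions composed explicitly.
fun espressoCheck() {
    onView(withId(R.id.titleText)).check(matches(withText("The standard Lorem Ipsum passage")))
}

// Kakao: the same check through a typed view wrapper.
fun kakaoCheck() {
    KTextView { withId(R.id.titleText) }.hasText("The standard Lorem Ipsum passage")
}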

The expected string here is also hard-coded. Since it's a static string that supposedly won't change anytime soon, a better solution would be defining an additional resource set for the Android test bundle and referencing the value via its test R.string entry.
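
A sketch of that approach, assuming a hypothetical expected_main_title string defined in the androidTest resource set:

import androidx.test.platform.app.InstrumentationRegistry
import io.github.kakaocup.kakao.text.KTextView

internal class MainScreenRobot {
    // rootView and the init check are unchanged and omitted here for brevity.
    private val titleText = KTextView { withId(R.id.titleText) }

    // Hypothetical resource in androidTest/res/values/strings.xml:
    // <string name="expected_main_title">The standard Lorem Ipsum passage</string>
    fun verifyTitleText() {
        // The string lives in the test APK, so it resolves through the
        // instrumentation context (and the test module's R class).
        val expected = InstrumentationRegistry.getInstrumentation()
            .context.getString(R.string.expected_main_title)
        titleText.hasText(expected)
    }
}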

Back to our test, we can rewrite it as follows:

internal class MyTestSuite {

    // TC-001
    @Test
    fun test_tc001() {
        // GIVEN
        // ...

        // WHEN
        // [Code that launches the app, irrelevant for now]

        // THEN
        MainScreenRobot().verifyTitleText()
    }
}
MyTestSuite.kt

We got rid of the implementation details by introducing MainScreenRobot as an abstraction. This second step is great, but we can do better: what if we introduced a very simple DSL to make it look more natural?

Let's go back to MainScreenRobot:

// NEW! Top level function that takes a `MainScreenRobot` function as argument.
internal fun mainScreenRobot(func: MainScreenRobot.() -> Unit): MainScreenRobot = MainScreenRobot().apply(func)

internal class MainScreenRobot {
    private val rootView: UiObject = MyUiObjectResolver.byId(R.id.rootElement)
    private val titleText = KTextView { withId(R.id.titleText) }
    
    init {
        assertTrue("MainScreen root view not shown", rootView.waitForExists(5000L))
    }
    
    // Header Interactions
    
    fun verifyTitleText() {
        titleText.hasText("The standard Lorem Ipsum passage")
    }
}
MainScreenRobot.kt

And our test suite changes to:

internal class MyTestSuite {

    // TC-001
    @Test
    fun test_tc001() {
        // GIVEN
        // ...

        // WHEN
        // [Code that launches the app, irrelevant for now]

        // THEN
        mainScreenRobot {
            verifyTitleText()
        }
    }
}
MyTestSuite.kt

Being able to write test steps with the help of a custom DSL allows the reader to understand scopes better. We now know where MainScreenRobot starts and ends its checks, and by keeping to the 1 Robot <-> 1 Screen rule, we can easily discern the other pages we might interact with within a single test case.
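
For example, a test that crosses two screens might read like this (detailsScreenRobot and both interactions are hypothetical, purely to show the scoping):

import org.junit.Test

internal class MyNavigationTestSuite {

    @Test
    fun test_navigationToDetails() {
        mainScreenRobot {
            tapDetailsButton() // Hypothetical interaction on the main screen.
        }
        // One robot per screen: a second scope for the checks on the next page.
        detailsScreenRobot {
            verifyDetailsTitle()
        }
    }
}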

Is the verbosity gone though?

After writing dozens of robots by hand with their respective sections, I noticed there was still a lot of boilerplate, and having all the sections in the same file made some robots reach 300-400 lines of code. This is not that surprising: after all, we're binding several sections to a single robot, so verbosity and length inevitably kick in.

To work around this issue, let's introduce SectionRobots. A SectionRobot is just a section that can be attached to a Robot to compose the entire screen. It's defined in a separate file, and one section shall not interact with elements that appear in another section of the same robot. This recalls the Single Responsibility Principle from SOLID: one section should be in charge of its elements and nothing else.

Let's refactor MainScreenRobot by adding a HeaderSection to it. This new class will only interact with the title field and nothing else.

internal class MainScreenHeaderSection : SectionRobot {
    private val titleText = KTextView { withId(R.id.titleText) }

    fun verifyTitleText() {
        titleText.hasText("The standard Lorem Ipsum passage")
    }
}
MainScreenHeaderSection.kt

Now let's add a way to use it inside MainScreenRobot:

internal fun mainScreenRobot(func: MainScreenRobot.() -> Unit): MainScreenRobot = MainScreenRobot().apply(func)

internal class MainScreenRobot {
    private val rootView: UiObject = MyUiObjectResolver.byId(R.id.rootElement)
    
    init {
        assertTrue("MainScreen root view not shown", rootView.waitForExists(5000L)) // 5000L is the timeout.
    }
}

// Another extension to keep using the same DSL.
internal fun MainScreenRobot.headerSection(func: MainScreenHeaderSection.() -> Unit): MainScreenHeaderSection = MainScreenHeaderSection().apply(func)
MainScreenRobot.kt

That's it! Now we can go back to our test case to make some other changes:

internal class MyTestSuite {

    // TC-001
    @Test
    fun test_tc001() {
        // [...]

        // THEN
        mainScreenRobot {
            headerSection {
                verifyTitleText()
            }
        }
    }
}
MyTestSuite.kt

As you can see, the introduction of a few levels of indentation helps us better understand what's going on.

Code generation

We could keep this solution and call it done, but I think of myself as a very lazy person when it comes to writing code. Why would I have to write those extension functions over and over? What if some robots need to interact with a common section (for instance, one that manages Snackbars)? When the project scales and 50+ robots are present, recreating the same extension function every time is not that great.

There's a simple solution to that: code generation.

By using a custom annotation processor, the creation of extension functions for both robots and the respective sections can happen under the hood at compile time, leaving robots and sections tidy and well-structured.

Going back to MainScreenRobot:

@GenerateDsl // New annotation
internal class MainScreenRobot : Robot(R.id.rootElement)

As you can see, MainScreenRobot is now just a one-liner: it inherits from the abstract class Robot, which extracts the initialization logic (it's the same for any robot), and the @GenerateDsl annotation alone takes care of generating the extension function we've seen above.
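
The post doesn't show Robot itself, but a plausible sketch of the base class, assuming it just centralizes the init check we had before, could be:

import androidx.test.uiautomator.UiObject
import org.junit.Assert.assertTrue

internal abstract class Robot(rootElementId: Int) {
    private val rootView: UiObject = MyUiObjectResolver.byId(rootElementId)

    init {
        // Same check every robot used to perform: fail fast if the screen isn't shown.
        assertTrue(
            "${this::class.simpleName} root view not shown",
            rootView.waitForExists(5000L)
        )
    }
}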

The SectionRobot changes as well, but not by much: we just need one additional line.

// New annotation here as well!
@AddTo(targets = [MainScreenRobot::class], identifier = "headerSection")
internal class MainScreenHeaderSection : SectionRobot {
    private val titleText = KTextView { withId(R.id.titleText) }

    fun verifyTitleText() {
        titleText.hasText("The standard Lorem Ipsum passage")
    }
}
MainScreenHeaderSection.kt

The @AddTo annotation takes two arguments: the former, targets, specifies which Robots the processor should generate the extension functions for. The latter, identifier, provides an alias that becomes the name of the generated extension function. This way you can have a very long section class name (MainScreenHeaderSection, for instance) and reference it under a different name from the robot (headerSection, in our case).

In case you're wondering, SectionRobot is just an empty marker interface, used to have a stricter check when validating the annotation.
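
For reference, the two annotations and the marker interface could be declared roughly like this (a sketch, assuming source-level retention since the processor only needs them at compile time):

import kotlin.reflect.KClass

@Target(AnnotationTarget.CLASS)
@Retention(AnnotationRetention.SOURCE)
internal annotation class GenerateDsl

@Target(AnnotationTarget.CLASS)
@Retention(AnnotationRetention.SOURCE)
internal annotation class AddTo(
    val targets: Array<KClass<*>>, // Robots to generate the extension functions for.
    val identifier: String // Name of the generated extension function.
)

// Empty marker interface: lets the processor validate annotated classes.
internal interface SectionRobot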

By processing these two annotations (via KSP in this case; I'll leave the implementation details for another post), we can take a look at the generated code.

import kotlin.Unit

internal fun mainScreenRobot(func: MainScreenRobot.() -> Unit): MainScreenRobot = MainScreenRobot().apply(func)
MainScreenRobot$DslExtensions.kt (@GenerateDsl output)

import kotlin.LazyThreadSafetyMode
import kotlin.Unit

private val mainScreenHeaderSection: MainScreenHeaderSection by lazy(mode = LazyThreadSafetyMode.SYNCHRONIZED) {
    MainScreenHeaderSection()
}

internal fun MainScreenRobot.headerSection(func: MainScreenHeaderSection.() -> Unit): MainScreenHeaderSection =
    mainScreenHeaderSection.apply(func)
MainScreenRobot$MainScreenHeaderSectionExtension.kt (@AddTo output)

And that's it! If we come across a common section in our test suites, we can add it to as many robots as we want! This applies to SnackbarSection, DialogSection, ToolbarSection and many other elements that might appear on several screens in our application.

@AddTo(
    targets = [
        RobotOne::class,
        RobotTwo::class,
        RobotThree::class,
        RobotFour::class
        // ...
    ],
    identifier = "snackbarSection"
)
internal class SnackbarSection : SectionRobot {
    // ...
}
SnackbarSection might be a common section that could be attached to any Robot

With our Robots becoming shorter and shorter, and our SectionRobots having well-defined scopes, reading code like this is not hard at all. Also, thanks to this abstraction, our test suites won't need any changes if some elements are refactored, since the implementation details are handled within the sections!

If we add a few more requirements to our test case, TC-001, we can take a look at how a more complex test can be written in our test suite.

Test Case TC-001

Summary:
Verify the main elements shown in the Main Screen of the App.

Steps:
- Open the App

Expected Result:
Toolbar Section
 - Navigation button: close button
 - Navigation title: "NavBar"
 - Navigation extra button: Button with label "Y"
Header Section
 - Title label is shown with text: "The standard Lorem Ipsum passage"
 - A button is shown on the right of the title label, with text: "BTN"
Body Section
 - A text field is shown with text: "Lorem ipsum dolor sit amet\nconsectetur adipiscing elit, sed\n do eiusmod [...]"
 - Two radio buttons are displayed, side by side, the first with label "Radio 1" and the second with label "Radio 2"
 - An empty editable text field is shown
Footer Section
 - A button is displayed with text: "CTA Button", it is enabled.

Attachments:
- Mockup reference
- JIRA Ticket reference

And this would be the implementation:

internal class MyTestSuite {
    @Test // TC-001
    fun test_tc001() {
        val expectedRadioButtons = arrayOf(
            RadioButton(text = "Radio 1", isChecked = false),
            RadioButton(text = "Radio 2", isChecked = false)
        )

        navigator {
            startApp()
        }

        mainScreenRobot {
            toolbarSection {
                verifyCloseButton()
                verifyNavigationTitle("NavBar")
                verifyExtraNavigationButton(text = "Y")
            }

            headerSection {
                verifyTitleText()
                verifyExtraTitleButton()
            }

            bodySection {
                verifyMainText()
                verifyRadioButtons(*expectedRadioButtons)
                verifyEmptyTextField()
            }

            footerSection {
                verifyNextButtonEnabled()
            }
        }
    }
}

If we put the test case definition and its implementation side by side, we can follow the steps and understand what the test is actually doing. This is extremely valuable in PRs too, as it helps the reviewer follow the steps and verify that the implementation is correct, provided that the underlying section implementations are valid.


With the help of annotation processing, a custom DSL, and a few other shortcuts, writing tests can be easy and efficient. Not only is the implementation faster, but a third party can figure out what is going on without getting lost in tricky implementation details. This approach currently powers ~800 test cases per platform in my daily activities, and I see its value in how quickly freshly-onboarded SDETs catch up and start implementing test suites!

We have finally reached the last stop in our Mobile UI Testing journey. I hope I gave you some insights on the Robot Pattern, network isolation, and abstraction strategies that you can use in your day-to-day activities!

Niccolò Forlini

Senior Mobile Engineer