Automated GUI testing with AutoHotKey

Posted by m.jackson on 2 February 2015 - 2:00pm


By Mike Jackson, Software Architect.

As part of a recent open call collaboration with the Distance project at the University of St. Andrews, I was asked about open source tools to automatically test GUI-based applications on Windows.

By coincidence, an EPCC colleague had recently asked me the same question. So, I hit Google and Wikipedia, tracked down some candidates and decided to try the free open source AutoHotKey toolkit. In this blog post, I describe my experiences with this "scriptable desktop automation" tool.

Freeware and open source GUI test tools

Wikipedia lists a number of GUI test tools and Google revealed a couple of others. However, only a few of these are free or open source:

Name	Licence	About
AutoIt	Closed source freeware	A scripting language with commands to invoke keystrokes, mouse moves and other window interactions, in addition to a rich set of programming language features including variables, conditionals, loops, and subroutines. AutoIt can be used for both repetitive task automation and GUI testing.
AutoHotKey	GNU General Public Licence	A fork of AutoIt dating from 2003; AutoIt closed its source later. The fork arose from a desire to support hot-keys, shortcuts consisting of keystroke combinations. Key differences are listed in an AutoIt / AutoHotkey comparison.
Robot Framework	Apache 2	A generic acceptance test automation framework written in Python. A library provides extensions that allow AutoIt to be used from within Robot Framework.
Sikuli Script	MIT	An automation tool that uses image recognition to identify and interact with display components. Sikuli uses Python as its scripting language. Sikuli can be used from within automated test frameworks, for example, CTest.
Qaliber	GPL	A GUI test automation framework that allows tests to be written via a drag-and-drop interface.

As to selecting a tool, Qaliber seems to have stagnated: its last release was in 2011 and I could not find any user documentation. Sikuli Script is based on image recognition and I was concerned this would make it too tightly coupled to how a GUI is presented as opposed to how it is used (I'll return to this shortly). Robot Framework uses AutoIt for GUI testing, so the choice reduced to AutoIt or AutoHotKey. I decided upon AutoHotKey as it met the original request for a free, open source solution.

AutoHotKey

Using AutoHotKey (1.1.16.0), I developed three test scripts for Distance for Windows 6.2, which runs under Microsoft Windows.

My approach was as follows. I wrote the scripts in a text editor consulting the AutoHotKey command list when required. AutoHotKey's WinWaitActive command was used to pause each script until windows appeared or were active. A Send command was used to invoke shortcuts, select tabbed panels, move up or down lists and delete text in forms. A SendRaw command was used to enter free text, for example project names or file names, into fields.
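As a rough illustration of how these three commands fit together, here is a minimal sketch in AutoHotkey 1.1 syntax. The executable path, window titles and the Ctrl+N shortcut are hypothetical placeholders, not details taken from Distance for Windows or from my scripts.

```autohotkey
; Minimal sketch (AutoHotkey 1.1). All paths, titles and shortcuts
; below are hypothetical placeholders.
#NoEnv
SetTitleMatchMode, 2                  ; match window titles by substring

Run, C:\Path\To\Application.exe       ; hypothetical install path
WinWaitActive, Example Application    ; pause until the main window is active

Send, ^n                              ; Ctrl+N: hypothetical "new project" shortcut
WinWaitActive, New Project            ; hypothetical dialog title

; SendRaw types the text literally, so "#" is not read as the Win key.
SendRaw, My Test Project #1
Send, {Enter}                         ; confirm the dialog
```

The split between Send and SendRaw matters because Send interprets characters such as ^, !, + and # as modifier keys, whereas SendRaw types them as-is.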

If no shortcut existed within Distance for Windows to invoke a command, then the button or menu option was invoked via a generic Click command, which sends a mouse click.

Selecting menus, buttons, tabbed panels and fields can be done using a Click command, which takes coordinates relative to the top-left corner of the active window. However, using coordinates closely couples a script to the position of widgets within the display. To improve the robustness of the scripts, I instead decided to rely on Distance for Windows' tab order. The tab order is the order in which widgets are activated when the tab key is pressed. If one knows which widget currently has focus, one can use AutoHotKey's Send command to send tab keystrokes to move focus to the next widget to be invoked, e.g. to move from a field in which text has been entered to a button that invokes some action using that text.
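The two styles can be contrasted in a short sketch. The coordinates, the file name and the number of tab stops are hypothetical, and pressing Space to activate the focused button assumes standard Windows button behaviour.

```autohotkey
; Coordinate-based: click 120 pixels right of, and 45 pixels below, the
; top-left corner of the active window (hypothetical button position).
Click, 120, 45

; Tab-order-based: assumes focus is already in a text field and that the
; target button is two stops later in the tab order (hypothetical count).
SendRaw, example.txt
Send, {Tab 2}       ; move focus forward by two widgets
Send, {Space}       ; activate the button that now has focus
```

The second form keeps working if the window is resized or the button moves, but breaks if widgets are added to or removed from the tab order.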

In some cases, however, widgets were not part of the tab order and so, for these, relative coordinates had to be used. AutoHotKey ships with a tool, Window Spy, which displays information about the active window, including the relative position of a mouse pointer, which helps to determine the coordinates to use with the Click command.

The three scripts I wrote were as follows (the links take you to the scripts, which are text files):

Start and stop Distance for Windows.
Create a new project. This has an example of checking for project files after invoking the commands to create a new project.
Execute walkthrough Example 1 from the Distance for Windows 6.2 user's guide.
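A check for project files of the kind mentioned in the second script could look something like the following sketch. The file path and the two-second wait are hypothetical, and reporting via MsgBox is just one option.

```autohotkey
; Sketch of a post-condition check (AutoHotkey 1.1).
; The path below is a hypothetical placeholder.
projectFile := "C:\Temp\TestProject.dst"

Sleep, 2000                     ; give the application time to write the file
if (FileExist(projectFile))
    MsgBox, PASS: %projectFile% was created.
else
    MsgBox, FAIL: %projectFile% was not found.
```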

Pros and cons

AutoHotKey was very straightforward to get started with. A simple example that started up and shut down Distance for Windows was written within 15 minutes of first downloading AutoHotKey. The availability of commands both to invoke actions on a GUI via keystrokes or mouse clicks and to execute other operations (e.g. checking to see if a file exists, deleting temporary files etc.) is rich enough that a wide variety of automated scenarios or tests could be implemented. AutoHotKey's support for include files and subroutines allows for modular scripts to be developed.
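To give a flavour of what a modular layout might look like, the sketch below puts a reusable helper function in one file and includes it from a test script. The file names, the function name WaitForWindow and the window title are all hypothetical; my scripts are not necessarily organised this way.

```autohotkey
; ---- File: common.ahk (hypothetical shared helper file) ----
; Wait for a window to become active, or abort the script on timeout.
WaitForWindow(title, timeoutSeconds := 10)
{
    WinWaitActive, %title%, , %timeoutSeconds%
    if (ErrorLevel)             ; ErrorLevel is 1 if the wait timed out
    {
        MsgBox, Timed out waiting for window: %title%
        ExitApp, 1
    }
}

; ---- File: test_start_stop.ahk (hypothetical test script) ----
; #Include common.ahk
; WaitForWindow("Example Application", 30)
; Send, !{F4}                   ; Alt+F4 closes the active window
```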

On the downside, a lot of experimentation was needed to ensure that Distance for Windows was given enough time to respond before invoking the next action in the script. AutoHotKey's SetKeyDelay command can impose a standard delay (in my scripts I set this to 1000ms) but for certain tasks (e.g. waiting for Distance for Windows to complete an analysis task) extra time was needed. This required some trial-and-error to identify suitable timings.
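The sketch below shows a few ways of giving an application time to respond, assuming the standard delay is set with AutoHotKey's SetKeyDelay command. The shortcut, window title and timings are hypothetical and would need tuning for a real application.

```autohotkey
; Sketch of timing control (AutoHotkey 1.1). Shortcut, title and
; timings are hypothetical placeholders.
SetKeyDelay, 1000               ; pause 1000 ms after each keystroke sent by Send

Send, !a                        ; Alt+A: hypothetical "run analysis" shortcut

; Prefer waiting for a visible result over a fixed sleep where possible:
; wait up to 120 seconds for a (hypothetical) results window.
WinWaitActive, Analysis Results, , 120
if (ErrorLevel)
{
    MsgBox, Analysis did not finish within 120 seconds.
    ExitApp, 1
}

Sleep, 2000                     ; extra fixed pause where no window change signals completion
```

Waiting on a window with a timeout tends to be less fragile than a long fixed Sleep, since the script continues as soon as the window appears and fails clearly if it never does.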

AutoHotKey was very sensitive to window sizes when using coordinates for mouse clicks. The scripts ran fine on a Windows XP virtual machine but not on my Windows 7 laptop as the widgets were a slightly different size. Position-based commands for mouse moves and clicks can be avoided if there are shortcuts or support for a tab order, but in Distance for Windows this was not always the case.

Likewise, AutoHotKey was very sensitive to text within windows. If windows don't have unique titles then determining which are active at any time must be done using text that AutoHotKey can scrape from their contents. Care had to be taken to ensure that any text used to identify windows was unique to that window (not just within the application being run but across all windows currently on the desktop). As a result, it is better for AutoHotKey to be run with a minimum of other applications open.
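As a sketch of how window text can be used alongside the title, WinWaitActive accepts a second parameter giving text that must appear within the window. The title, text and executable name below are hypothetical. Restricting the match to one process with ahk_exe is a suggestion of mine rather than something my scripts do.

```autohotkey
; Sketch of identifying a window by title plus contained text
; (AutoHotkey 1.1). Title, text and executable name are hypothetical.
SetTitleMatchMode, 2            ; match titles by substring

; Parameters: WinTitle, WinText, timeout in seconds
WinWaitActive, Example Application, Select an analysis, 10
if (ErrorLevel)
    MsgBox, No active window matched both the title and the text.

; Alternative: match only windows owned by a given executable, so that
; other applications with similar text cannot match.
WinWaitActive, ahk_exe Example.exe, Select an analysis, 10
```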

Sometimes keyboard shortcuts couldn't be used because more than one widget shared the same shortcut or there was no shortcut. Manually counting the number of tab presses required to move between widgets was time consuming.

Designing software with these limitations in mind, e.g. ensuring that windows have unique titles, that buttons and menu options have unique shortcuts and that commonly-occurring activities can be executed solely via the keyboard, would reduce the impact of some of these problems. As an aside, usability guru Jakob Nielsen recommends Keyboard-Only Navigation for Improved Accessibility.

Conclusion

Despite the problems encountered, I was impressed by how quickly I could write AutoHotKey scripts and the rich command set it supported. The key question that anyone considering its use needs to resolve is how to balance the trade-off between the time taken to write each script (implementing Example 1 took approximately 5 hours) and the stability of the interface in terms of widget positions, shortcut names and tab orders - the more frequently these change, the more work needs to be invested in keeping any scripts up to date.

If you have experience in developing automated tests for GUIs, or have views on AutoHotKey or other GUI automation tools then please comment below.