    How to Automate a Windows Application

    Why most automation tools fail on enterprise Windows environments and how computer-use agents handle legacy apps, locked-down builds, and industrial HMI displays.

    Automating a Windows application sounds straightforward. Install a tool, record some clicks, run the script. In practice, the environments where automation matters most are exactly where most tools break down: enterprise desktops, legacy ERP systems, industrial HMI panels, locked-down production builds. More tools claim to solve this now. The underlying problem hasn't changed.

    Why Windows Automation Is Still Hard

    Most automation tools share the same assumption: the application exposes something they can hook into. A code hook, an accessibility tree, an object ID, a stable selector. That assumption holds for modern web apps. It breaks in three places that show up constantly in enterprise Windows environments.

    Legacy applications. Enterprise Windows desktops often run applications built on Win32, WPF, or WinForms, sometimes decades old. These applications may have no accessibility hooks, no stable selectors, and no APIs. Traditional tools simply cannot reach the elements they need to interact with.

    Locked-down production builds. Test builds often include instrumentation hooks that make automation possible. Production builds strip those hooks out. A script that works perfectly in a test environment stops working the moment it's pointed at a production build. That's usually the environment where validation actually matters.

    HMI and embedded displays running on Windows. Many HMI applications, including automotive digital cluster simulators, industrial control interfaces, and medical device UIs, can run on Windows machines. These applications don't expose accessibility hooks or structured selectors. The only interface is the screen itself. These are the environments where most automation tools stop working.

    What Changes With Agentic Automation

    The shift from script-based tools to agentic automation addresses the root cause of these failures. Script-based tools fail because they rely on code structure that isn't always there. Agentic automation adapts by reading the screen directly when that structure is not available.

    AskUI operates at the OS level. When structured signals are available, the agent uses them. When they are not, such as on locked-down builds, legacy applications, or HMI interfaces, it reads the screen directly. The agent perceives the screen the same way a human engineer would and acts on what it sees.
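The "structured signals first, screen second" behaviour described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not AskUI's implementation or API: `Target`, `locate`, the `structured_elements` mapping, and the `screen_regions` tuples are all hypothetical names, and the screen regions stand in for whatever text recognition a real agent would run on a screenshot.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Target:
    x: int
    y: int
    source: str  # "structured" or "screen"


def locate(
    label: str,
    structured_elements: Dict[str, Tuple[int, int]],
    screen_regions: List[Tuple[str, int, int, int, int]],
) -> Optional[Target]:
    """Prefer structured signals; fall back to what is visible on screen."""
    # 1) Structured path (assumed): element name -> centre point, e.g. from an
    #    accessibility tree when the application exposes one.
    if label in structured_elements:
        x, y = structured_elements[label]
        return Target(x, y, "structured")
    # 2) Visual path (assumed): text regions recognised on a screenshot,
    #    given as (text, left, top, width, height).
    wanted = label.strip().lower()
    for text, left, top, width, height in screen_regions:
        if text.strip().lower() == wanted:
            return Target(left + width // 2, top + height // 2, "screen")
    return None


# A legacy app that exposes no structured elements: only rendered text is available.
regions = [("Ready", 100, 40, 60, 20), ("Start", 300, 400, 80, 30)]
print(locate("Start", {}, regions))  # Target(x=340, y=415, source='screen')
```

The point of the sketch is the ordering: the visual path is only consulted when the structured lookup comes back empty, which is the case on locked-down builds and HMI displays.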

    What the Test Project Looks Like

    Everything the agent needs lives in plain text files. The folder structure determines what runs and in what order.

    ├── prompts/
    │   ├── device_information.md   # Windows version + display details
    │   ├── ui_information.md       # app-specific concepts
    │   └── report_format.md
    ├── procedures/
    │   └── launch_app.md
    ├── plans/
    │   └── regression.md
    └── tests/
        └── your_windows_app/
            ├── setup.md
            ├── rules.md
            └── main_flow.md
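If you want to try this layout locally, the folders and empty files can be created with a short script. The sketch below uses only the Python standard library; the `scaffold` function and the `LAYOUT` list are hypothetical helpers written for this article, not part of any AskUI tooling, and the file names are copied from the tree above.

```python
from pathlib import Path

# Relative paths copied from the layout above; "your_windows_app" is the
# placeholder folder name used there.
LAYOUT = [
    "prompts/device_information.md",
    "prompts/ui_information.md",
    "prompts/report_format.md",
    "procedures/launch_app.md",
    "plans/regression.md",
    "tests/your_windows_app/setup.md",
    "tests/your_windows_app/rules.md",
    "tests/your_windows_app/main_flow.md",
]


def scaffold(root):
    """Create the folders and empty Markdown files; existing files are left untouched."""
    created = []
    for relative in LAYOUT:
        path = Path(root) / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(path)
    return created
```

Calling `scaffold("my_project")` returns the list of files it created, so running it a second time on the same folder returns an empty list and overwrites nothing.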

    device_information.md tells the agent what it is running on:

    # Device Information
    Target: Windows 11, Intel Core i7
    Display: 1920x1080
    Input: keyboard + mouse
    Connection: AgentOS host
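Because the file is plain `Key: value` text, it is also easy to sanity-check before a run, for example to confirm that the declared display size matches the machine you are about to test on. The following is a minimal sketch, assuming one `Key: value` pair per line as in the example above; `parse_device_info` and `display_size` are hypothetical helpers, and nothing here reflects how the agent itself reads the file.

```python
DEVICE_INFO = """\
# Device Information
Target: Windows 11, Intel Core i7
Display: 1920x1080
Input: keyboard + mouse
Connection: AgentOS host
"""


def parse_device_info(text):
    """Collect 'Key: value' lines into a dict with lower-cased keys."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the heading
        key, separator, value = line.partition(":")
        if separator:
            info[key.strip().lower()] = value.strip()
    return info


def display_size(info):
    """Return (width, height) from a value such as '1920x1080'."""
    width, height = info["display"].lower().split("x")
    return int(width), int(height)


info = parse_device_info(DEVICE_INFO)
print(info["target"])      # Windows 11, Intel Core i7
print(display_size(info))  # (1920, 1080)
```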

    A test file looks like this:

    # Test: Verify application status on startup

    ## Preconditions
    - Application is installed
    - No active session running

    ## Steps
    1. Open the application from the Start Menu
    2. Wait until the main dashboard loads
    3. Verify the status display shows Ready

    ## Postconditions
    - Status indicator is green
    - No error messages are visible

    QA engineers, domain experts, and testers who know the application can write and maintain tests in plain text. No scripting or automation expertise required.
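One practical consequence of keeping tests in plain text is that they can be linted like any other file. The sketch below splits a test into its title and `##` sections and reports any section that is missing or empty. It assumes the three section names from the example test (Preconditions, Steps, Postconditions); `parse_test` and `missing_sections` are hypothetical helpers for illustration, not an AskUI parser.

```python
import re

REQUIRED_SECTIONS = ("Preconditions", "Steps", "Postconditions")


def parse_test(markdown):
    """Split a plain-text test into its title and a dict of section -> items."""
    title, sections, current = None, {}, None
    for raw in markdown.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("# "):
            title = line[2:].strip()
        elif current is not None:
            # Drop the "- " or "1. " list marker and keep the instruction text.
            sections[current].append(re.sub(r"^(?:-|\d+\.)\s+", "", line))
    return title, sections


def missing_sections(sections):
    """Return required section names that are absent or have no items."""
    return [name for name in REQUIRED_SECTIONS if not sections.get(name)]


SAMPLE_TEST = """\
# Test: Verify application status on startup
## Preconditions
- Application is installed
- No active session running
## Steps
1. Open the application from the Start Menu
2. Wait until the main dashboard loads
3. Verify the status display shows Ready
## Postconditions
- Status indicator is green
- No error messages are visible
"""

title, sections = parse_test(SAMPLE_TEST)
print(title)                       # Test: Verify application status on startup
print(len(sections["Steps"]))      # 3
print(missing_sections(sections))  # []
```

A check like this could run in CI before any agent is started, so a test with a forgotten Steps section fails fast instead of producing a confusing run.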

    Where This Matters Most: Enterprise and Industrial Windows

    For general Windows desktop apps, the difference between agentic and script-based tools is a matter of maintenance overhead. For enterprise and industrial environments, it's the difference between automation being possible or not.

    SIL (software-in-the-loop) environments on Windows. HMI simulation software running on Windows is one of the key environments where traditional tools fail completely. The display is rendered by a proprietary engine with no accessibility layer. AskUI operates at the OS level and interacts with what is rendered on screen, regardless of what's underneath.

    Teams running Windows HMI and industrial applications have used AskUI to automate across multiple machines simultaneously, achieving stable regression runs without modifying the target system.

    VDI and Citrix sessions. Remote desktop environments present the same problem. The application runs inside a virtualized session with no direct element access. AskUI's AgentOS runs locally on the target device and operates at the system input layer, making it compatible with VDI and Citrix without additional configuration.

    Teams running POS and enterprise Windows applications have run VM-based tests without an active RDP session, removing a key infrastructure dependency.

    Cross-variant testing. Enterprise Windows deployments often involve the same application running in multiple configurations: different languages, different feature sets, different hardware. Script-based tools require separate scripts for each variant. Because AskUI reads the screen rather than depending on code structure, the same test logic runs across variants without rebuilding.

    Teams with large existing Windows test suites have evaluated AskUI as a fit for VM-based environments, including WinForms applications with hundreds of existing test cases.

    Getting Started on Windows

    AgentOS installs on Windows machines in service mode, supporting RDP resilience and SYSTEM-level privileges for unattended runs. Full setup instructions are available in the docs.

    FAQ

    Does AskUI work with applications that have no DOM or accessibility hooks?

    Yes. AskUI operates at the OS level and reads the screen directly. It works on any application with a visible screen interface, including legacy software and locked-down builds with no standard automation support.

    What about locked-down production builds?

    AskUI does not require instrumentation hooks or code-level access to the application under test. It works on production builds the same way it works on test builds.

    Does it work in VDI or Citrix environments?

    Yes. AskUI's AgentOS runs locally on the target device and operates at the system input layer, making it compatible with virtualized environments.

    What Windows applications can AskUI automate?

    Any application with a visible screen interface: desktop apps, legacy enterprise software, HMI simulators, VDI sessions, and embedded displays running on Windows. For more on how this applies specifically to HMI and hardware validation environments, see AskUI: Eyes and Hands of AI Agent Explained.
