The Simulang Primer
A hands-on tour of writing scripts in Simulang. By the end you’ll know how to drive a real browser, fill in a form, click through a UI, take screenshots, ground elements with a vision model, and compose larger workflows out of small reusable steps, all in plain TypeScript or JavaScript. This primer is meant to be read in order; each chapter builds on the one before it. The complete API reference for the underlying library lives in node_modules/@simular-ai/simulib-js/index.d.ts once you’ve installed anything; keep that open in another tab when you’re ready to go deeper than what we cover here.
Table of contents
- What is Simulang?
- Your first script
- Opening an app
- Your first automation
- The accessibility tree
- Acting on elements
- Finding the element you want
- Waiting for the UI
- When things don’t work
- A complete example
- When accessibility isn’t enough
- Files, env vars, and shelling out
- Composing scripts
- Common pitfalls
- Where to go next
- Simulang in Claude Code
1. What is Simulang?
Simulang is a small command-line tool for running automation scripts against real desktop applications: your default browser, your editor, a chat app, a native dialog, whatever happens to be running. A Simulang script is an ordinary ES module (a .ts, .mts, or .mjs file) that imports from @simular-ai/simulib-js and drives the OS through its accessibility APIs.
If you’ve used Playwright or Puppeteer, the mental model will feel familiar: open a thing, find an element, interact with it, assert what happened. The differences:
- Simulang isn’t browser-only. The same APIs drive native apps — a messaging client to grab a 2FA code, a finder window, a desktop installer.
- There’s no headless mode. You’re automating the actual UI a person would see, on the actual screen.
- Element discovery uses the OS-level accessibility tree, not CSS selectors. That’s how the same script can find a button in Chrome and a menu item in a native macOS app.
Before you start, you need simulang installed and a desktop you can run scripts against; see the simulang-cli README for install, authentication, and version-pinning details. From here on, we focus on writing and running workflows.
2. Your first script
Create hello.ts:
There’s no async function main() wrapper. A Simulang script is just an ES module, and the top of the file is the entry point. Top-level await works, dynamic imports work, and the full Node standard library is available.
A script is a .ts, .mts, or .mjs file. Run it with simulang run. Use process.env, process.exit, and anything else you’d use in a normal Node script.
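A minimal hello.ts consistent with these points might look like the sketch below. It uses only Node built-ins, so nothing in it is Simulang-specific; the greeting text and the USER environment variable are illustrative choices, not taken from the library.

```typescript
// hello.ts: no main() wrapper; the top of the module is the entry point.
// Assumption: USER is only an illustrative environment variable.
const who: string = process.env.USER ?? "world";
const greeting: string = `Hello, ${who}!`;

console.log(greeting);
```

Because the file is an ordinary module, anything you would normally do at the top of a Node script works the same way here.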
3. Opening an app
The @simular-ai/simulib-js package gives you everything else. Let’s open example.com in the default browser:
- FocusPolicy is Steal or DoNotSteal. “Steal” asks the OS to bring the app to the front; “DoNotSteal” asks it not to. The word “ask” is doing real work in that sentence; see the pitfalls chapter.
- Visibility is Show or Hidden. Same caveat: it’s a request, not a guarantee, and Chromium-based apps in particular ignore Hidden.
- waitForLoadComplete (the trailing boolean) blocks for a short, fixed delay so the app or URL has a chance to become responsive before open() returns. It’s a sleep, not a full network wait, so you still need to give the page time to render before reaching for the DOM.
open() returns an Instance, a handle to the running app:
To target a specific app by name, use App.exactName():
open(null, …) is the canonical way to raise an already-running app to the foreground without changing its state.
4. Your first automation
Time to make something happen. Save this as wikipedia-stats.ts:
Special:Statistics page.
Why n.description and not n.name? Web AX on macOS commonly exposes link text in the description field with name left empty. Other platforms differ — that’s exactly the kind of cross-platform variability chapter 7 addresses with a labelOf helper that checks every field a label might live in.
macOS first-run prompt. The first time a Simulang script touches another app, the OS will prompt you to grant accessibility permission to the terminal you ran it from (System Settings → Privacy & Security → Accessibility). Grant it once and you’re set.
That’s the whole Simulang loop in twenty-odd lines. The shape is the same for every script you’ll write — from this demo to a multi-step purchase flow:
- Open or attach to an app: App.defaultBrowser().open (here), or AccessibilityTree.fromPid to drive something already running.
- Snapshot the accessibility tree: tree.snapshot(true) returns the visible UI as a tree of nodes.
- Walk the tree to find the node you want: any standard array or object operation works, since the snapshot is plain data.
- Act on its refId: tree.activate(refId) for clicks, tree.setValue(refId, text) for typing, and a handful of others.
The examples use .ts so we can show type annotations where they’re informative; .mjs works just as well.
5. The accessibility tree
You just used the accessibility tree without an introduction. Let’s back up and look at what it is. The accessibility tree is a snapshot of every visible widget in a window, the same data screen readers use to read a UI aloud. Every OS exposes one, and @simular-ai/simulib-js gives you a uniform Node API over it.
You bind a tree to a running app through its Instance.pid (chapter 4 used this form):
Each node can carry a refId. It’s an opaque integer that identifies a node for this snapshot only. You pass it to interaction methods like tree.activate(refId) to click. Two crucial properties:
- A refId is valid only as long as the snapshot that produced it. Take a new snapshot, and the old refIds are dead. We’ll see how to cope with that in chapter 8.
- A node only has a refId when the OS exposed one. Some structural wrappers (groups, panels) don’t have refIds; you can still see them in the tree, but you can’t click them. Always check refId != null before acting.
Each node’s role is an AriaRole. The common ones:
AriaRole is a numeric enum. The TypeScript reverse-mapping trick — AriaRole[someNode.role] — does not work for numeric NAPI enums. Use the exported helper instead:
That is the whole loop: take a snapshot, find a node, act on its refId. The rest of this primer elaborates on that loop.
6. Acting on elements
Once you have a refId, the AccessibilityTree object exposes a small family of action methods:
Most scripts need only activate and setValue. The rest are special-purpose helpers for cases where a plain click doesn’t do the right thing.
Sync, not async. Every action method on AccessibilityTree is synchronous — they don’t return Promises, and you don’t await them. The await you saw in chapter 4 was for sleep, not for tree.activate. tree.snapshot() is also synchronous. The only things you await in a typical script are sleeps, polling helpers like withSnapshot (chapter 8), and Node stdlib promises (fetch, node:fs/promises).
You already saw activate in chapter 4. Filling a form looks the same — find the input, set its value, find the submit button, activate it:
Two things to note:
- The predicate checks n.refId != null. Without that, you might match a structural wrapper that has the right name but no way to be acted on.
- Both setValue and activate use refIds from the same snapshot. As long as no new snapshot has happened between them, the refs stay valid. After any navigation or UI mutation, you’ll need a fresh snapshot; chapter 8 shows the pattern.
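To make the find-then-act shape concrete, here is a library-free sketch. The AXNode and TreeActions interfaces are hypothetical stand-ins (in particular, the children field name is an assumption); only setValue(refId, text) and activate(refId) are taken from this primer. Check index.d.ts for the real types before adapting it.

```typescript
// Hypothetical node shape for illustration; the real one lives in index.d.ts.
interface AXNode {
  refId?: number | null;
  name?: string | null;
  children?: AXNode[]; // assumed field name
}

// Only the two action methods this chapter leans on.
interface TreeActions {
  setValue(refId: number, text: string): void;
  activate(refId: number): void;
}

// Depth-first walk that collects every node in document order.
function flatten(root: AXNode, out: AXNode[] = []): AXNode[] {
  out.push(root);
  for (const child of root.children ?? []) flatten(child, out);
  return out;
}

// First node with the given name that can actually be acted on.
function findActionable(nodes: AXNode[], name: string): AXNode & { refId: number } {
  const hit = nodes.find((n) => n.refId != null && n.name === name);
  if (!hit) throw new Error(`No actionable node named "${name}"`);
  return hit as AXNode & { refId: number };
}

// Both refIds come from the same snapshot, so both stay valid for the two calls.
function fillAndSubmit(
  tree: TreeActions,
  root: AXNode,
  field: string,
  text: string,
  button: string,
): void {
  const nodes = flatten(root);
  const input = findActionable(nodes, field);
  const submit = findActionable(nodes, button);
  tree.setValue(input.refId, text);
  tree.activate(submit.refId);
}
```

Resolving both nodes before acting is a deliberate choice: if the button is missing, the script fails before it has typed anything.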
7. Finding the element you want
Real UIs make element discovery harder than “find by name.” Here are a few practical patterns we use again and again.
A note on the helpers in this chapter. labelOf, stripPUA, flattenDFS, and pageNodes below, and withSnapshot in the next chapter, are not exported from @simular-ai/simulib-js. They’re a handful of lines each, and copying them into each script is faster than reaching for a dependency. The upside of script-local helpers is that you can tweak them when a project needs something different (a labelOf that includes automationId, a pageNodes scoped to a specific app). If you find yourself maintaining a project-wide variant, factor it into a local lib/ directory and import it normally.
7.1 The label can live in any of four fields
Different platforms put a control’s accessible text in different places:
- Native macOS apps usually put it in name.
- Text inputs put the current value in value (not name).
- Web AX on macOS often puts link text in description.
- Tooltips show up in helpText.
labelOf(node) is the answer to “what would a screen reader say about this element?”, regardless of which field the OS chose.
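Since labelOf is a script-local helper, here is one way it might be written. The LabelledNode interface is a hypothetical shape holding only the four fields named above, and the priority order (name, description, value, helpText) is an assumption; reorder it if your target app calls for something else.

```typescript
// Hypothetical node shape: just the four text fields this section names.
interface LabelledNode {
  name?: string | null;
  value?: string | null;
  description?: string | null;
  helpText?: string | null;
}

// Return the first non-empty text field, trimmed, or "" when there is none.
// The priority order is an assumption made for this sketch.
function labelOf(node: LabelledNode): string {
  const candidates = [node.name, node.description, node.value, node.helpText];
  for (const text of candidates) {
    if (text != null && text.trim() !== "") return text.trim();
  }
  return "";
}
```

Returning an empty string rather than undefined keeps predicates simple: labelOf(n) === "Continue" and regex tests work without a null check.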
7.2 Strip Private-Use-Area glyphs
Web pages frequently put icon-font glyphs (Font Awesome, Material Icons) into link labels. These come through as characters in Unicode’s Private Use Area (U+E000–U+F8FF), and they break anchored regexes:
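A stripPUA helper for this can be very small. The sketch below is one possible version: it removes every character in the U+E000 to U+F8FF range and trims the leftover whitespace. The sample label and its glyph are made up for illustration.

```typescript
// Remove characters in the Private Use Area (U+E000 to U+F8FF),
// then trim the whitespace the glyph leaves behind.
function stripPUA(text: string): string {
  return text.replace(/[\uE000-\uF8FF]/g, "").trim();
}

// An icon-font glyph in front of the text defeats an anchored regex.
const rawLabel = "\uF07A Add to cart"; // illustrative label with a leading glyph
const anchoredPattern = /^Add to cart$/;

console.log(anchoredPattern.test(rawLabel));           // false
console.log(anchoredPattern.test(stripPUA(rawLabel))); // true
```

Applying stripPUA to a label before matching lets you keep anchored patterns instead of loosening them to substring matches.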
7.3 Scope to the page, not the chrome
A browser snapshot includes the URL bar, tab strip, bookmarks bar, and dev-tools panes, none of which you usually want when you’re looking for an “Add to cart” button. The convention is to flatten only the last AriaRole.Document subtree, which is the active tab’s web content:
Use pageNodes(root) instead of flattenDFS(root) when you’re hunting for something inside the page.
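Neither helper is exported by the library, so here is a sketch of how they might look. The TreeNode interface and its children field are assumptions, and this version of pageNodes takes the Document role as a second argument so the example stays library-free; in a real script you would pass AriaRole.Document. Falling back to the whole tree when no Document node exists is a choice made for this sketch.

```typescript
// Hypothetical node shape; the `children` field name is an assumption.
interface TreeNode {
  role: number;
  name?: string | null;
  children?: TreeNode[];
}

// Depth-first, pre-order flattening of a subtree.
function flattenDFS(root: TreeNode, out: TreeNode[] = []): TreeNode[] {
  out.push(root);
  for (const child of root.children ?? []) flattenDFS(child, out);
  return out;
}

// Flatten only the last Document subtree (in DFS order).
// Falls back to the whole tree when there is no Document node.
function pageNodes(root: TreeNode, documentRole: number): TreeNode[] {
  const documents = flattenDFS(root).filter((n) => n.role === documentRole);
  const last = documents[documents.length - 1];
  return last ? flattenDFS(last) : flattenDFS(root);
}
```

If you prefer the one-argument pageNodes(root) form used in this chapter, close over the role in a small wrapper at the top of your script.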
7.4 Use the built-in search when it fits
AccessibilityTree also exposes a search method that mirrors the common case:
Use find for “fast path, one role, one name match.” Drop back to manual DFS when you need spatial reasoning, for example “the second textbox that appears between the heading ‘Sign in’ and the button ‘Continue’.”
8. Waiting for the UI
Pages mutate. Modals appear. The element you want shows up half a second after a click. Because refIds invalidate every time you call snapshot(), the safe pattern is a polling loop that re-snapshots each tick and looks for what you want:
A polling helper such as withSnapshot covers three cases:
- The element might not be there yet: the page is still loading, an animation is mid-flight, or a modal hasn’t opened.
- The element might be there but its refId is from a now-stale snapshot. Polling re-issues the snapshot for you.
- The named label gives you a useful error message when the wait genuinely times out: “Timed out waiting for the Allow button in the blocked-download panel” is much nicer to read at 3am than “Timed out waiting for node.”
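Because withSnapshot is a script-local helper rather than a library export, its exact signature is up to you. The sketch below is one hypothetical shape: it takes a label, a function that produces a fresh snapshot, and a picker that returns the thing you are waiting for (or nothing). The default timeout and interval are arbitrary.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

interface WaitOptions {
  timeoutMs?: number;
  intervalMs?: number;
}

// Poll until `pick` returns a value. `takeSnapshot` runs on every tick, so
// whatever `pick` returns always comes from the newest snapshot.
// Hypothetical signature, written for this sketch only.
async function withSnapshot<S, T>(
  label: string,
  takeSnapshot: () => S,
  pick: (snapshot: S) => T | undefined | null,
  { timeoutMs = 10000, intervalMs = 250 }: WaitOptions = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const found = pick(takeSnapshot());
    if (found != null) return found;
    if (Date.now() >= deadline) {
      throw new Error(`Timed out waiting for ${label}`);
    }
    await sleep(intervalMs);
  }
}
```

In a real script, takeSnapshot would be something like () => tree.snapshot(true), and pick would run your predicate over the resulting nodes. Act on the returned node straight away, before anything triggers another snapshot.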
9. When things don’t work
Sooner or later you’ll get Timed out waiting for "Continue" button, or your find will silently return undefined. Here’s the troubleshooting playbook.
9.1 Print the tree
The fastest way to debug a missing element is to dump what is there. The output usually points to one of two cases:
- The element isn’t in the list. The page hasn’t rendered it yet. Sleep longer, or (better) wrap the lookup in withSnapshot so it polls.
- It is in the list, but your predicate isn’t matching. Almost always: the label is in description or value instead of name, or there’s a stray icon-font glyph you didn’t strip. See chapter 7.
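A tree dump can be a few lines of plain TypeScript. The sketch below prints one line per node, indented by depth, with each label field shown separately so you can see which one actually holds the text. The DumpNode shape, including children, is a hypothetical stand-in for the real node type.

```typescript
// Hypothetical node shape built from the fields this primer mentions.
interface DumpNode {
  refId?: number | null;
  role?: number;
  name?: string | null;
  value?: string | null;
  description?: string | null;
  children?: DumpNode[]; // assumed field name
}

// One line per node, indented two spaces per level of depth.
function dumpTree(node: DumpNode, depth = 0, lines: string[] = []): string[] {
  const ref = node.refId != null ? `#${node.refId}` : "(no refId)";
  const fields = [
    `role=${node.role ?? "?"}`,
    `name=${JSON.stringify(node.name ?? "")}`,
    `value=${JSON.stringify(node.value ?? "")}`,
    `description=${JSON.stringify(node.description ?? "")}`,
  ].join(" ");
  lines.push(`${"  ".repeat(depth)}${ref} ${fields}`);
  for (const child of node.children ?? []) dumpTree(child, depth + 1, lines);
  return lines;
}
```

Print it with console.log(dumpTree(root).join("\n")). JSON.stringify makes stray whitespace and empty strings visible, which is usually where a mismatched predicate hides.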
9.2 Common root causes
A handful of issues account for most “where did my element go” problems:
- The label is in description, value, or helpText, not name. Use labelOf (chapter 7) instead of node.name directly.
- The element has no refId. Structural wrappers (groups, panes) often share a name with their actionable children but don’t get a refId of their own. Always filter with n.refId != null in your predicate.
- You’re looking inside the wrong subtree. Browser chrome (URL bar, bookmarks bar) lives at the top of the snapshot; the active page is inside the last AriaRole.Document. Use pageNodes (chapter 7) when you only want page content.
- The page renders after your snapshot. Wrap the find in withSnapshot rather than bumping a brittle sleep.
- The element appears, then disappears. SPAs sometimes re-mount nodes mid-update. withSnapshot retries on a fresh snapshot each tick; bare find doesn’t.
- You’re bound to the wrong window. AccessibilityTree.fromForeground() binds to whichever app is frontmost at the moment of the call. If a notification stole focus during your sleep, you’ll get its tree instead. fromPid(instance.pid) is the safer choice when you have a pid.
9.3 Iterate in the REPL
When you’re stuck on a predicate, simulang run -i is much faster than editing-and-rerunning the whole script. Bring the target app to the front first, then:
fromForeground() binds to whatever’s active at the moment of the call — that’s why the app needs to be focused before you run it.
10. A complete example
Let’s tie everything together. The script below opens automationexercise.com (a public test site), confirms we’re logged out, signs in with credentials from environment variables, and signs out again. We verify each state transition through the accessibility tree.
A few things to notice about its structure:
- Imports at the top, secrets from process.env.
- A handful of small helpers (sleep, labelOf, flattenDFS, withSnapshot); the chapter-7 note about script-local helpers applies here.
- One big try/catch with process.exit(1) on failure.
- Numbered phases, each one a withSnapshot that names what it’s waiting for. The names double as breadcrumbs in the logs.
11. When accessibility isn’t enough
The accessibility tree is the right tool nine times out of ten, but not always. Two escape hatches:
11.1 Hardware mouse and keyboard
When a control has no refId (a canvas, an embedded video player, an OS-level dialog the app doesn’t expose), drop to real coordinates and key events:
Coordinate.Abs is absolute pixels; Coordinate.Rel is movement relative to the current cursor position. For drag-and-drop, send Direction.Press, move, then Direction.Release.
11.2 Vision grounding
When you can see the element but neither the OS nor your regex can find it, take a screenshot and ask a vision model where it is:
shot.ground(model, query) returns absolute screen coordinates for whatever the model thinks best matches your natural-language query.
A few practical notes:
- Pass true to screenshotFull to hide the cursor; otherwise the model may describe the cursor as part of the UI.
- Grounding is network-bound: it needs OPENROUTER_API_KEY set.
- Vision grounding is your last resort, not your first move. It’s slower, costs money, and is less reliable than the accessibility tree. Use it for the 1% of cases where nothing else works.
12. Files, env vars, and shelling out
There are no Simulang-specific I/O wrappers. Use Node’s standard library directly:
import.meta.url + fileURLToPath is the ES-module equivalent of the CommonJS __dirname. Use it any time you need to resolve a path relative to the script itself.
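As a small illustration of the import.meta.url pattern, the sketch below wraps it in a helper. The function name resolveFromScript and the example file names are hypothetical; fileURLToPath, dirname, and join are standard Node APIs.

```typescript
import { fileURLToPath } from "node:url";
import { dirname, join } from "node:path";

// Resolve a path relative to the module whose import.meta.url is passed in.
function resolveFromScript(metaUrl: string, ...segments: string[]): string {
  return join(dirname(fileURLToPath(metaUrl)), ...segments);
}

// In a script you would call it with the module's own URL, for example:
//   const fixture = resolveFromScript(import.meta.url, "fixtures", "order.json");
```

Taking the URL as an argument lets the helper live in a shared lib/ file while still resolving paths relative to whichever script calls it.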
The @simular-ai/simulib-js package also ships convenience classes (File, Directory, Clipboard, System) for cross-platform helpers such as an OS clipboard read or a temp directory with an explicit lifecycle. They’re worth using for clipboard and audio work; for plain file reads and writes, staying in node:fs keeps the concept count low.
13. Composing scripts
As workflows grow, you’ll want to split them up. There are two natural ways.
13.1 Import helpers from sibling modules
The simplest: extract reusable functions into their own files and import them. Pass shared state (the tree, the instance) as arguments rather than reaching for module-level singletons.
13.2 Run other scripts as subprocesses
When you want full isolation between phases (fresh process, fresh state, no shared imports), orchestrate from a parent script. If place-order.ts fails, you can iterate on it directly without re-running sign-in every time.
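A parent script along those lines could be sketched as follows. It uses Node’s spawnSync to run each phase as its own process and stops at the first failure. Two assumptions are baked in: that a phase is launched as simulang run followed by the script path, and that the sign-in phase lives in a file called sign-in.ts (a hypothetical name).

```typescript
import { spawnSync } from "node:child_process";

// Run one command to completion, streaming its output, and throw on failure.
function runStep(command: string, args: string[]): void {
  const result = spawnSync(command, args, { stdio: "inherit" });
  if (result.error) throw result.error;
  if (result.status !== 0) {
    throw new Error(`${command} ${args.join(" ")} exited with status ${result.status}`);
  }
}

// Run phases in order; the first failure stops the chain.
// Assumption: `simulang run <file>` is how a single phase is launched.
function runWorkflow(scripts: string[], command = "simulang"): void {
  for (const script of scripts) runStep(command, ["run", script]);
}

// Example call, with sign-in.ts as a hypothetical sibling script:
//   runWorkflow(["sign-in.ts", "place-order.ts"]);
```

Each phase starts from a clean process, so state has to travel through files or environment variables rather than shared imports.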
14. Common pitfalls
A short list of things that bite first-time users (and second- and third-time users, honestly).
- Temp directories don’t auto-clean. If you use Directory.temp(), wrap the work in try { … } finally { dir.remove() }.
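If you create temp directories in more than one place, the try/finally can be folded into a small wrapper. The sketch below is library-free: the Removable interface only assumes that the directory object has a remove() method, as the pitfall above describes, and withTempDir is a hypothetical helper name.

```typescript
// Minimal shape needed here: anything with a remove() method.
interface Removable {
  remove(): void;
}

// Run `work` with a freshly created directory and always remove it afterwards,
// whether `work` returns normally or throws.
function withTempDir<D extends Removable, T>(create: () => D, work: (dir: D) => T): T {
  const dir = create();
  try {
    return work(dir);
  } finally {
    dir.remove();
  }
}
```

In a script the first argument would be () => Directory.temp(). This version is synchronous; if work returns a Promise, write an async variant that awaits it inside the try so that cleanup happens after the work finishes.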
15. Where to go next
You now know enough to write real Simulang scripts. From here:
- Read index.d.ts in the resolved @simular-ai/simulib-js install. It’s the typed surface of the library, roughly 1500 lines, organised by class. It is the source of truth for signatures and enums, and the entry point to everything we didn’t cover here (audio capture, speech-to-text, screen recording).
- Read the library’s CLAUDE.md next to index.d.ts. It covers idioms, lifecycle preconditions, and platform quirks: things types can’t express.
- Browse the example scripts in examples/ inside the install. They’re the best place to see larger patterns: a full purchase flow, file downloads with rename, screenshot-and-ground, module composition.
- Build your reflexes at the REPL. When you’re not sure how an API behaves, simulang run -i is faster than scripting it. Snapshots are cheap; experiment.
16. Simulang in Claude Code
If you use Claude Code in the editor, you can drive the same desktop APIs without hand-writing every script: install the skill and use /simulang so Claude generates and runs automation for you. See Simulang with Claude Code for setup (simulang init-claude), example prompts, permissions, and tips.
