# Simulang Primer

> Hands-on tour of writing Simulang scripts — browser automation, forms, screenshots, and workflows.

A hands-on tour of writing scripts in **Simulang**. By the end you'll know how to drive a real browser, fill in a form, click through a UI, take screenshots, ground elements with a vision model, and compose larger workflows out of small reusable steps — all in plain TypeScript or JavaScript.

This primer is meant to be read in order; each chapter builds on the one before it. The complete API reference for the underlying library is hosted at [simulang-js](https://docs.simular.ai/simulang-js/api/latest) — keep that open in another tab when you're ready to go deeper than what we cover here. Once a chapter clicks, the **[Simulang Recipes](https://github.com/simular-ai/simulang-recipes)** repo is the easiest way to see the same ideas applied end-to-end: every recipe is a small, runnable script you can clone, run, and adapt.

***

## Table of contents

1. [What is Simulang?](#1-what-is-simulang)
2. [Your first script](#2-your-first-script)
3. [Opening an app](#3-opening-an-app)
4. [Your first automation](#4-your-first-automation)
5. [The accessibility tree](#5-the-accessibility-tree)
6. [Acting on elements](#6-acting-on-elements)
7. [Finding the element you want](#7-finding-the-element-you-want)
8. [Waiting for the UI](#8-waiting-for-the-ui)
9. [When things don't work](#9-when-things-dont-work)
10. [A complete example](#10-a-complete-example)
11. [When accessibility isn't enough](#11-when-accessibility-isnt-enough)
12. [Files, env vars, and shelling out](#12-files-env-vars-and-shelling-out)
13. [Composing scripts](#13-composing-scripts)
14. [Common pitfalls](#14-common-pitfalls)
15. [Where to go next](#15-where-to-go-next)
16. [Simulang in Claude Code](#16-simulang-in-claude-code)

***

## 1. What is Simulang?

**Simulang is a small command-line tool for running automation scripts against real desktop applications** — your default browser, your editor, a chat app, a native dialog, whatever happens to be running. A Simulang script is an ordinary ES module (a `.ts`, `.mts`, or `.mjs` file) that imports from `@simular-ai/simulang-js` and drives the OS through its accessibility APIs.

If you've used Playwright or Puppeteer, the mental model will feel familiar: open a thing, find an element, interact with it, assert what happened. The differences:

* Simulang isn't browser-only. The same APIs drive **native apps**: a messaging client to grab a 2FA code, a Finder window, a desktop installer.
* There's **no headless mode**. You're automating the actual UI a person would see, on the actual screen.
* Element discovery uses the OS-level **accessibility tree**, not CSS selectors. That's how the same script can find a button in Chrome and a menu item in a native macOS app.

This primer assumes you have `simulang` installed and a desktop you can run scripts against — see the [simulang-cli README](https://github.com/simular-ai/simulang-cli) for install, authentication, and version-pinning details. From here on, we focus on writing and running workflows.

***

## 2. Your first script

Create `hello.ts`:

```ts theme={null}
// hello.ts
const { platform, version } = process
console.log(`Hello from simulang on ${platform}, Node ${version}.`)
```

Run it:

```
$ simulang run hello.ts
Hello from simulang on darwin, Node v22.18.0.
```

That's it — no boilerplate, no `async function main()` wrapper. A Simulang script is just an ES module, and the top of the file *is* the entry point. Top-level `await` works, dynamic imports work, and the full Node standard library is available.

```ts theme={null}
// hello-async.ts
// Top-level await is fine.
await new Promise((r) => setTimeout(r, 50))
console.log('50 ms later…')

// And so is dynamic import — useful for branching by platform.
const fs = await import('node:fs/promises')
console.log('cwd =', await fs.realpath(process.cwd()))
```

If you've worked with newer Node tooling this will feel completely ordinary. That's the point.

**Recap.** A Simulang script is just an ES module. Save a `.ts`, `.mts`, or `.mjs` file. Run it with `simulang run`. Use `process.env`, `process.exit`, and anything else you'd use in a normal Node script.

***

## 3. Opening an app

The `@simular-ai/simulang-js` package gives you everything else. Let's open `example.com` in the default browser:

```ts theme={null}
// open-example.ts
import { App, FocusPolicy, Visibility } from '@simular-ai/simulang-js'

const instance = App.defaultBrowser().open(
  'https://example.com',
  FocusPolicy.Steal,        // ① bring the browser to the front
  Visibility.Show,          // ② don't try to hide it
  true,                     // ③ block briefly for the page to start
)

console.log('opened:', instance.pid)
```

Three things to call out:

1. **`FocusPolicy`** is `Steal` or `DoNotSteal`. *"Steal"* asks the OS to bring the app to the front; *"DoNotSteal"* asks it not to. The word "ask" is doing real work in that sentence — see the pitfalls chapter.
2. **`Visibility`** is `Show` or `Hidden`. Same caveat: it's a request, not a guarantee, and Chromium-based apps in particular ignore `Hidden`.
3. **`waitForLoadComplete`** (the trailing boolean) blocks for a short, fixed delay so the app or URL has a chance to become responsive before `open()` returns. It's a sleep, not a full network wait, so you still need to give the page time to render before you snapshot its accessibility tree.

The call returns an **`Instance`** — a handle to the running app:

```ts theme={null}
instance.pid              // process ID
instance.isFocused()      // boolean
instance.focus()          // bring it forward
instance.hide()           // minimise
instance.isAccessible()   // does the OS expose this app's accessibility tree?
instance.enableAccessibility()
```
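
Because focus is a request rather than a guarantee, a script may want to check that the request took effect. The sketch below is one hedged way to do that using only the `isFocused()` and `focus()` methods listed above. `Focusable`, `ensureFocused`, and `makeStubbornApp` are hypothetical names, and the fake app stands in for a real `Instance` so the example runs without a desktop. A real script would probably also sleep between attempts, since focus changes may not be instantaneous.

```typescript
// Structural stand-in for the two Instance methods used here (hypothetical type).
interface Focusable {
  isFocused(): boolean
  focus(): void
}

// Ask for focus up to `attempts` times and report whether it took.
function ensureFocused(app: Focusable, attempts = 3): boolean {
  for (let i = 0; i < attempts; i++) {
    if (app.isFocused()) return true
    app.focus()
  }
  return app.isFocused()
}

// Fake app that ignores the first `ignoreCount` focus requests.
function makeStubbornApp(ignoreCount: number) {
  let focused = false
  let requests = 0
  return {
    isFocused: () => focused,
    focus: () => {
      requests += 1
      if (requests > ignoreCount) focused = true
    },
    requestCount: () => requests,
  }
}

const stubborn = makeStubbornApp(1)
console.log(ensureFocused(stubborn), stubborn.requestCount()) // true 2
```

If `ensureFocused` returns `false`, failing early with a clear message is usually easier to debug than a later timeout against the wrong window.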

If you want to launch a *specific* app (not the system default), use `App.exactName()`:

```ts theme={null}
App.exactName('Google Chrome').open(
  null,                            // no URL — just raise the app
  FocusPolicy.Steal,
  Visibility.Show,
  false,
)
```

`open(null, …)` is the canonical way to **raise an already-running app** to the foreground without changing its state.

***

## 4. Your first automation

Time to make something happen. Save this as `wikipedia-stats.ts`:

```ts theme={null}
// wikipedia-stats.ts — run with: simulang run wikipedia-stats.ts
import {
  AccessibilityTree,
  App,
  AriaRole,
  FocusPolicy,
  Visibility,
  type AccessibilityNodeJs,
} from '@simular-ai/simulang-js'

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms))

function flattenDFS(
  node: AccessibilityNodeJs,
  out: AccessibilityNodeJs[] = [],
): AccessibilityNodeJs[] {
  out.push(node)
  for (const child of node.children) flattenDFS(child, out)
  return out
}

const instance = App.defaultBrowser().open(
  'https://en.wikipedia.org', FocusPolicy.Steal, Visibility.Show, true,
)
await sleep(2500)
if (!instance.isAccessible()) instance.enableAccessibility()

const tree = AccessibilityTree.fromPid(instance.pid)
const link = flattenDFS(tree.snapshot(true)).find(
  (n) =>
    n.role === AriaRole.Link &&
    /special:statistics/i.test(n.description) &&
    n.refId != null,
)
if (!link) throw new Error('Could not find the Statistics link.')

tree.activate(link.refId)
console.log('Opened the Wikipedia statistics page.')
```

Run it. Wikipedia opens, Simulang walks the page's accessibility tree, finds the Statistics link, clicks it, and your browser jumps to Wikipedia's `Special:Statistics` page.

**Why `n.description` and not `n.name`?** On macOS, the web accessibility (AX) layer commonly exposes link text in the `description` field and leaves `name` empty. Other platforms differ, which is exactly the cross-platform variability chapter 7 addresses with a `labelOf` helper that checks every field a label might live in.

**macOS first-run prompt.** The first time a Simulang script touches another app, the OS will prompt you to grant accessibility permission to the terminal you ran it from (System Settings → Privacy & Security → Accessibility). Grant it once and you're set.

That's the whole Simulang loop in under forty lines. The shape is the same for every script you'll write, from this demo to a multi-step purchase flow:

1. **Open or attach to an app** — `App.defaultBrowser().open` (here), or `AccessibilityTree.fromPid` to drive something already running.
2. **Snapshot the accessibility tree** — `tree.snapshot(true)` returns the visible UI as a tree of nodes.
3. **Walk the tree** to find the node you want — any standard array or object operation works, since the snapshot is plain data.
4. **Act on its `refId`** — `tree.activate(refId)` for clicks, `tree.setValue(refId, text)` for typing, and a handful of others.

Don't worry about every line yet — the next few chapters break it down. The takeaway is that you've now seen the four moves that make up every Simulang script.

**A note on TypeScript.** This primer uses `.ts` so we can show type annotations where they're informative — `.mjs` works just as well.

***

## 5. The accessibility tree

You just used the accessibility tree without an introduction. Let's back up and look at what it is.

The **accessibility tree** is a snapshot of every visible widget in a window — the same data screen readers use to read a UI aloud. Every OS exposes one, and `@simular-ai/simulang-js` gives you a uniform Node API over it.

```ts theme={null}
import { AccessibilityTree } from '@simular-ai/simulang-js'

// Bind to the foreground window of whichever app is in front right now.
const tree = AccessibilityTree.fromForeground()
console.log('Bound to window:', tree.windowTitle)
```

You can also bind to a specific window by process ID — handy when you just opened the app yourself and have an `Instance.pid` (chapter 4 used this form):

```ts theme={null}
const tree = AccessibilityTree.fromPid(instance.pid)
```

Once you've got a tree, take a snapshot:

```ts theme={null}
const root = tree.snapshot(true)   // true = visible-only; almost always what you want
```

The shape of every node looks like this:

```ts theme={null}
interface AccessibilityNodeJs {
  role: AriaRole            // numeric enum: Button = 7, Link = 32, ...
  name: string              // accessible name
  value: string             // current value (for inputs)
  description: string       // platform-specific extra
  helpText: string          // tooltip-style hint
  refId?: number            // opaque handle for interactions (more below)
  children: AccessibilityNodeJs[]
  boundingBox: { x, y, width, height, ... }
}
```

You walk this tree like any other object graph:

```ts theme={null}
function flattenDFS(node, out = []) {
  out.push(node)
  for (const child of node.children) flattenDFS(child, out)
  return out
}

for (const node of flattenDFS(root).slice(0, 10)) {
  console.log(node.role, JSON.stringify(node.name).slice(0, 60))
}
```

**The most important property is `refId`.** It's an opaque integer that identifies a node *for this snapshot only*. You pass it to interaction methods like `tree.activate(refId)` to click. Two crucial properties:

* **A `refId` is valid only as long as the snapshot that produced it.** Take a new snapshot, and the old refIds are dead. We'll see how to cope with that in chapter 8.
* **A node only has a `refId` when the OS exposed one.** Some structural wrappers (groups, panels) don't have refIds; you can still see them in the tree, but you can't click them. Always check `refId != null` before acting.
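
The `refId != null` check can be packaged as a TypeScript type guard so that, after filtering, the compiler knows `refId` is a number. This is a minimal sketch on a hand-built tree: `AxNode`, `Actionable`, `isActionable`, and `flatten` are hypothetical local names, and the wrapper's role number `0` is a placeholder rather than a real `AriaRole` value.

```typescript
// Minimal stand-in for the node shape (only the fields used here).
interface AxNode {
  role: number
  name: string
  refId?: number
  children: AxNode[]
}

// A node whose refId is known to be present.
type Actionable = AxNode & { refId: number }

// Type guard: true only when a handle was exposed for this node.
// `!= null` is used instead of truthiness so a refId of 0 would not be dropped.
const isActionable = (n: AxNode): n is Actionable => n.refId != null

function flatten(node: AxNode, out: AxNode[] = []): AxNode[] {
  out.push(node)
  for (const child of node.children) flatten(child, out)
  return out
}

// Hand-built tree: a wrapper without a refId around two buttons (Button = 7).
const root: AxNode = {
  role: 0, // placeholder role for a structural wrapper
  name: 'Toolbar',
  children: [
    { role: 7, name: 'Save', refId: 11, children: [] },
    { role: 7, name: 'Toolbar', refId: 12, children: [] },
  ],
}

const actionable = flatten(root).filter(isActionable)
console.log(actionable.map((n) => `${n.name}#${n.refId}`).join(', ')) // Save#11, Toolbar#12
```

Note that the wrapper and one button share the name `Toolbar`; only the button survives the filter.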

The roles are an enum called `AriaRole`. The common ones:

```ts theme={null}
import { AriaRole } from '@simular-ai/simulang-js'

AriaRole.Button       // 7
AriaRole.Checkbox     // 10
AriaRole.Dialog       // 18
AriaRole.Document     // 20  — the root of a web page
AriaRole.Heading      // 29
AriaRole.Img          // 30
AriaRole.Link         // 32
AriaRole.MenuItem     // 43
AriaRole.Radio        // 54
AriaRole.Tab          // 72
AriaRole.Textbox      // 78
```

**Heads up.** `AriaRole` is a numeric enum. The TypeScript reverse-mapping trick — `AriaRole[someNode.role]` — does **not** work for numeric NAPI enums. Use the exported helper instead:

```ts theme={null}
import { ariaRoleToString } from '@simular-ai/simulang-js'
ariaRoleToString(node.role)  // "button"
```

**Recap.** Bind to a window → take a snapshot → walk the tree to find the node you want → act on its `refId`. The rest of this primer is elaborations on that loop.

***

## 6. Acting on elements

Once you have a `refId`, the `AccessibilityTree` object exposes a small family of action methods:

```ts theme={null}
tree.activate(refId)         // "click" — works for buttons, links, menuitems
tree.setValue(refId, text)   // type into a textbox / combobox
tree.toggle(refId)           // flip a checkbox or switch
tree.select(refId)           // choose a tab, radio button, or list item
tree.expandCollapse(refId)   // open/close a dropdown or tree item
tree.scrollIntoView(refId)   // bring the element on-screen
tree.focusElement(refId)     // focus (also raises the window)
tree.getBounds(refId)        // current screen coordinates
```

In practice you'll spend most of your time with **`activate`** and **`setValue`**. The rest are special-purpose helpers for cases where a plain click doesn't do the right thing.

**Sync, not async.** Every action method on `AccessibilityTree` is synchronous — they don't return Promises, and you don't `await` them. The `await` you saw in chapter 4 was for `sleep`, not for `tree.activate`. `tree.snapshot()` is also synchronous. The only things you `await` in a typical script are sleeps, polling helpers like `withSnapshot` (chapter 8), and Node stdlib promises (fetch, `node:fs/promises`).

You already saw `activate` in chapter 4. Filling a form looks the same — find the input, set its value, find the submit button, activate it:

```ts theme={null}
const nodes = flattenDFS(tree.snapshot(true))

const searchBox = nodes.find(
  (n) => n.role === AriaRole.Textbox && /search/i.test(n.name) && n.refId != null,
)
const searchBtn = nodes.find(
  (n) => n.role === AriaRole.Button && /^search$/i.test(n.name) && n.refId != null,
)
if (!searchBox || !searchBtn) throw new Error('search controls not found')

tree.setValue(searchBox.refId, 'simulang')   // ① type into the box
tree.activate(searchBtn.refId)               // ② click submit
```

Two details worth noticing — they apply to every action method:

1. The predicate checks **`n.refId != null`**. Without that, you might match a structural wrapper that has the right name but no way to be acted on.
2. Both `setValue` and `activate` use `refId`s from the *same* snapshot. As long as no new snapshot has happened between them, the refs stay valid. After any navigation or UI mutation, you'll need a fresh snapshot — chapter 8 shows the pattern.

***

## 7. Finding the element you want

Real UIs make element discovery harder than "find by name." A few practical patterns we use again and again.

**A note on the helpers in this chapter.** `labelOf`, `stripPUA`, `flattenDFS`, `pageNodes` below — and `withSnapshot` in the next chapter — are **not** exported from `@simular-ai/simulang-js`. They're a handful of lines each, and copying them into each script is faster than reaching for a dependency. The upside of script-local helpers is that you can tweak them when a project needs something different (a `labelOf` that includes `automationId`, a `pageNodes` scoped to a specific app). If you find yourself maintaining a project-wide variant, factor it into a local `lib/` directory and import it normally.

### 7.1 The label can live in any of four fields

Different platforms put a control's accessible text in different places:

* Native macOS apps usually put it in `name`.
* Text inputs put the *current value* in `value` (not `name`).
* Web AX on macOS often puts link text in `description`.
* Tooltips show up in `helpText`.

A single label-helper covers all four:

```ts theme={null}
const labelOf = (n) =>
  [n.name, n.value, n.description, n.helpText].filter(Boolean).join(' ').trim()
```

Now `labelOf(node)` is the answer to "what would a screen reader say about this element?", regardless of which field the OS chose.

### 7.2 Strip Private-Use-Area glyphs

Web pages frequently put icon-font glyphs (Font Awesome, Material Icons) into link labels. These come through as characters in Unicode's Private Use Area (`U+E000`–`U+F8FF`), and they break anchored regexes:

```ts theme={null}
// node.name might be "\uF08B Logout" (an icon-font glyph, then the text)
/^logout$/i.test(node.name)         // false ☹
```

Strip them:

```ts theme={null}
const stripPUA = (s) => s.replace(/[\uE000-\uF8FF]/g, '').replace(/\s+/g, ' ').trim()
const labelOf = (n) => stripPUA([n.name, n.value, n.description, n.helpText].filter(Boolean).join(' '))
```
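
To see the effect, here is a small self-contained check. It writes the Private Use Area range with explicit `\uE000`-`\uF8FF` escapes so the characters survive copy-paste. The `Labelled` type and the sample node are hypothetical, and `\uF08B` is just an arbitrary code point inside the range.

```typescript
// Only the four label-bearing fields (hypothetical local type).
interface Labelled {
  name: string
  value: string
  description: string
  helpText: string
}

// Remove Private Use Area code points, then collapse whitespace.
const stripPUA = (s: string): string =>
  s.replace(/[\uE000-\uF8FF]/g, '').replace(/\s+/g, ' ').trim()

const labelOf = (n: Labelled): string =>
  stripPUA([n.name, n.value, n.description, n.helpText].filter(Boolean).join(' '))

// Sample link whose text sits in `description`, prefixed by an icon glyph.
const logout: Labelled = { name: '', value: '', description: '\uF08B Logout', helpText: '' }

console.log(/^logout$/i.test(logout.description)) // false: the glyph defeats the anchor
console.log(/^logout$/i.test(labelOf(logout)))    // true
console.log(stripPUA('Sign-in'))                  // Sign-in (ordinary hyphens are kept)
```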

### 7.3 Scope to the page, not the chrome

A browser snapshot includes the URL bar, tab strip, bookmarks bar, dev-tools panes — none of which you usually want when you're looking for an "Add to cart" button. The convention is to flatten only the **last `AriaRole.Document` subtree**, which is the active tab's web content:

```ts theme={null}
function pageNodes(root) {
  const all = flattenDFS(root)
  const docs = all.filter((n) => n.role === AriaRole.Document)
  if (!docs.length) return all                       // not a browser — full tree
  return flattenDFS(docs[docs.length - 1])            // last = active tab
}
```

Then use `pageNodes(root)` instead of `flattenDFS(root)` when you're hunting for something inside the page.
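
The scoping is easy to check on a hand-built tree. The sketch below assumes the role numbers listed in chapter 5 (`Button = 7`, `Document = 20`, `Textbox = 78`). The `Role` object, the `node` builder, and the tree itself are hypothetical stand-ins, and real snapshots will be much larger.

```typescript
interface AxNode {
  role: number
  name: string
  children: AxNode[]
}

// Local stand-in for the AriaRole values used in this example.
const Role = { Button: 7, Document: 20, Textbox: 78 }

function flattenDFS(node: AxNode, out: AxNode[] = []): AxNode[] {
  out.push(node)
  for (const child of node.children) flattenDFS(child, out)
  return out
}

// Flatten only the last Document subtree; fall back to the full tree.
function pageNodes(root: AxNode): AxNode[] {
  const all = flattenDFS(root)
  const docs = all.filter((n) => n.role === Role.Document)
  const last = docs[docs.length - 1]
  return last ? flattenDFS(last) : all
}

const node = (role: number, name: string, children: AxNode[] = []): AxNode =>
  ({ role, name, children })

// Browser chrome first, then two documents; the last one is treated as the page.
const browserWindow = node(0, 'Browser window', [
  node(Role.Textbox, 'Address and search bar'),
  node(Role.Button, 'Reload'),
  node(Role.Document, 'Earlier document', [node(Role.Button, 'Add to cart (stale)')]),
  node(Role.Document, 'Shop', [node(Role.Button, 'Add to cart')]),
])

console.log(flattenDFS(browserWindow).length)                        // 7
console.log(pageNodes(browserWindow).map((n) => n.name).join(' | ')) // Shop | Add to cart
```

One caveat worth testing on your own pages: if embedded frames surface as nested `Document` nodes, "last in depth-first order" could select a frame rather than the page.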

### 7.4 Use the built-in search when it fits

`AccessibilityTree` also exposes a search method that mirrors the common case:

```ts theme={null}
import { TraversalOrder } from '@simular-ai/simulang-js'

const hits = tree.find(
  TraversalOrder.DepthFirst,
  AriaRole.Button,             // role filter (optional)
  'Submit',                    // name contains (optional)
  true,                        // visibleOnly
  1,                           // maxResults
)
if (hits[0]) tree.activate(hits[0].refId)
```

Use the built-in `find` for "fast path, one role, one name match." Drop back to manual DFS when you need spatial reasoning — for example, "the second textbox that appears between the heading 'Sign in' and the button 'Continue'."
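
As a rough illustration of that kind of spatial query, the sketch below works on an already flattened node list. `between` is a hypothetical helper, the `Role` numbers are the ones listed in chapter 5, and the node list is made up for the example.

```typescript
interface AxNode {
  role: number
  name: string
  refId?: number
}

const Role = { Button: 7, Heading: 29, Textbox: 78 }

// Nodes strictly between the first `start` match and the first `end` match after it.
function between(
  nodes: AxNode[],
  start: (n: AxNode) => boolean,
  end: (n: AxNode) => boolean,
): AxNode[] {
  const from = nodes.findIndex(start)
  if (from === -1) return []
  const to = nodes.findIndex((n, i) => i > from && end(n))
  if (to === -1) return []
  return nodes.slice(from + 1, to)
}

// Depth-first order as it might come out of a flatten step (made-up data).
const flat: AxNode[] = [
  { role: Role.Textbox, name: 'Search', refId: 1 },
  { role: Role.Heading, name: 'Sign in' },
  { role: Role.Textbox, name: '', refId: 2 },
  { role: Role.Textbox, name: '', refId: 3 },
  { role: Role.Button, name: 'Continue', refId: 4 },
]

const boxes = between(
  flat,
  (n) => n.role === Role.Heading && n.name === 'Sign in',
  (n) => n.role === Role.Button && n.name === 'Continue',
).filter((n) => n.role === Role.Textbox && n.refId != null)

console.log(boxes[1]?.refId) // 3: the second textbox between the two landmarks
```

The search box above the heading is excluded because it falls outside the two landmarks, which is the point of reasoning by position.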

***

## 8. Waiting for the UI

Pages mutate. Modals appear. The element you want shows up half a second after a click. Because `refId`s invalidate every time you call `snapshot()`, the safe pattern is a **polling loop that re-snapshots each tick and looks for what you want**:

```ts theme={null}
async function withSnapshot(
  tree,
  predicate,
  { timeoutMs = 8000, intervalMs = 250, label = 'node' } = {},
) {
  const deadline = Date.now() + timeoutMs
  while (Date.now() < deadline) {
    const root = tree.snapshot(true)               // ① fresh refIds every tick
    const hit = predicate(flattenDFS(root))        // ② caller decides "found"
    if (hit) return hit                            // ③ first match wins
    await sleep(intervalMs)
  }
  throw new Error(`Timed out waiting for ${label}`)
}
```

Used at the call site:

```ts theme={null}
const link = await withSnapshot(
  tree,
  (nodes) => nodes.find(
    (n) => n.role === AriaRole.Link && /^logout$/i.test(labelOf(n)) && n.refId != null,
  ),
  { label: '"Logout" link' },
)
tree.activate(link.refId)
```

Three reasons this pattern shows up in every nontrivial script:

1. The element might not be there yet — page is still loading, animation is mid-flight, modal hasn't opened.
2. The element might be there but its `refId` is from a now-stale snapshot. Polling re-issues the snapshot for you.
3. The named `label` gives you a useful error message when the wait genuinely times out: *"Timed out waiting for the Allow button in the blocked-download panel"* is much nicer to read at 3am than *"Timed out waiting for node."*
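
To watch the loop behave without a real app, the sketch below runs a typed copy of `withSnapshot` against a fake tree whose target node only appears on the third snapshot. `SnapshotSource` and `fakeTree` are hypothetical stand-ins; the fake also hands out a different `refId` on every snapshot to mimic handles that go stale.

```typescript
interface AxNode {
  role: number
  name: string
  refId?: number
  children: AxNode[]
}

// Anything that can produce a snapshot (hypothetical stand-in for the real tree).
interface SnapshotSource {
  snapshot(visibleOnly: boolean): AxNode
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

function flattenDFS(node: AxNode, out: AxNode[] = []): AxNode[] {
  out.push(node)
  for (const child of node.children) flattenDFS(child, out)
  return out
}

async function withSnapshot<T>(
  tree: SnapshotSource,
  predicate: (nodes: AxNode[]) => T | null | undefined | false,
  { timeoutMs = 8000, intervalMs = 250, label = 'node' } = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs
  while (Date.now() < deadline) {
    const hit = predicate(flattenDFS(tree.snapshot(true))) // fresh snapshot each tick
    if (hit) return hit
    await sleep(intervalMs)
  }
  throw new Error(`Timed out waiting for ${label}`)
}

// Fake tree: the "Allow" button shows up on snapshot number `appearsOnCall`.
function fakeTree(appearsOnCall: number) {
  let calls = 0
  return {
    callCount: () => calls,
    snapshot(_visibleOnly: boolean): AxNode {
      calls += 1
      const children: AxNode[] =
        calls >= appearsOnCall ? [{ role: 7, name: 'Allow', refId: calls, children: [] }] : []
      return { role: 20, name: 'page', children }
    },
  }
}

const demoTree = fakeTree(3)
withSnapshot(
  demoTree,
  (nodes) => nodes.find((n) => n.name === 'Allow' && n.refId != null),
  { intervalMs: 10, label: '"Allow" button' },
).then((hit) => console.log(`refId ${hit.refId} after ${demoTree.callCount()} snapshots`))
// prints: refId 3 after 3 snapshots
```

The returned node carries the `refId` from the snapshot that found it, so acting on it immediately afterwards uses a live handle.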

A predicate doesn't have to return a single node. When you need several pieces of state to all be visible at once, return a structured object:

```ts theme={null}
const { emailField, passField, loginBtn } = await withSnapshot(
  tree,
  (nodes) => {
    const headingIdx = nodes.findIndex((n) => /sign in/i.test(labelOf(n)))
    if (headingIdx === -1) return null
    const btnIdx = nodes.findIndex(
      (n, i) => i > headingIdx && n.role === AriaRole.Button && /sign in/i.test(labelOf(n)),
    )
    if (btnIdx === -1) return null
    const inputs = nodes
      .slice(headingIdx, btnIdx)
      .filter((n) => n.role === AriaRole.Textbox && n.refId != null)
    if (inputs.length < 2) return null
    return { emailField: inputs[0], passField: inputs[1], loginBtn: nodes[btnIdx] }
  },
  { label: 'sign-in form (email + password + button)' },
)
```

This is also how you handle login forms whose inputs have no accessible label at all: locate them **positionally**, between two well-labelled landmarks.

***

## 9. When things don't work

Sooner or later you'll get `Timed out waiting for "Continue" button`, or your `find` will silently return `undefined`. Here's the troubleshooting playbook.

### 9.1 Print the tree

The fastest way to debug a missing element is to dump what *is* there:

```ts theme={null}
import { ariaRoleToString } from '@simular-ai/simulang-js'

const lines = flattenDFS(tree.snapshot(true))
  .filter((n) => n.refId != null && labelOf(n))
  .map((n) => `${ariaRoleToString(n.role).padEnd(12)} ${JSON.stringify(labelOf(n)).slice(0, 60)}`)
console.log(lines.join('\n'))
```

Drop that wherever your script is stuck. It prints every actionable element on the page with its role and accessible label. Two outcomes:

* **The element isn't in the list.** The page hasn't rendered it yet. Sleep longer, or — better — wrap the lookup in `withSnapshot` so it polls.
* **It is in the list, but your predicate isn't matching.** Almost always: the label is in `description` or `value` instead of `name`, or there's a stray icon-font glyph you didn't strip. See chapter 7.

### 9.2 Common root causes

A handful of issues account for most "where did my element go" problems:

* **The label is in `description`, `value`, or `helpText`, not `name`.** Use `labelOf` (chapter 7) instead of `node.name` directly.
* **The element has no `refId`.** Structural wrappers (groups, panes) often share a name with their actionable children but don't get a refId of their own. Always filter with `n.refId != null` in your predicate.
* **You're looking inside the wrong subtree.** Browser chrome (URL bar, bookmarks bar) lives at the top of the snapshot; the active page is inside the last `AriaRole.Document`. Use `pageNodes` (chapter 7) when you only want page content.
* **The page renders after your snapshot.** Wrap the find in `withSnapshot` rather than bumping a brittle `sleep`.
* **The element appears, then disappears.** SPAs sometimes re-mount nodes mid-update. `withSnapshot` retries on a fresh snapshot each tick; bare `find` doesn't.
* **You're bound to the wrong window.** `AccessibilityTree.fromForeground()` binds to whichever app is frontmost *at the moment of the call*. If a notification stole focus during your `sleep`, you'll get its tree instead. `fromPid(instance.pid)` is the safer choice when you have a pid.
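
The first two causes can be checked mechanically. The sketch below is a hypothetical `explainMiss` helper (not part of the library) that takes a flat node list and the regex you expected to match, and reports near-misses: labels that match only outside `name`, and matching nodes with no `refId`. It assumes a regex without the `g` flag, since `test` is stateful on global regexes.

```typescript
interface AxNode {
  name: string
  value: string
  description: string
  helpText: string
  refId?: number
}

const FIELDS = ['name', 'value', 'description', 'helpText'] as const

const labelOf = (n: AxNode): string =>
  FIELDS.map((f) => n[f]).filter(Boolean).join(' ').replace(/\s+/g, ' ').trim()

// Report why a node that "looks right" might still be missed by a predicate.
function explainMiss(nodes: AxNode[], re: RegExp): string[] {
  const notes: string[] = []
  for (const n of nodes) {
    const label = labelOf(n)
    if (!re.test(label)) continue
    const fields = FIELDS.filter((f) => re.test(n[f]))
    if (fields.indexOf('name') === -1) {
      notes.push(`"${label}": matches via ${fields.join('/') || 'combined fields'}, not name`)
    }
    if (n.refId == null) notes.push(`"${label}": no refId, so it cannot be acted on`)
  }
  return notes
}

// Made-up nodes: a link labelled via description, and a wrapper without a refId.
const sample: AxNode[] = [
  { name: '', value: '', description: 'Logout', helpText: '', refId: 5 },
  { name: 'Logout', value: '', description: '', helpText: '' },
  { name: 'Cart', value: '', description: '', helpText: '', refId: 6 },
]

for (const note of explainMiss(sample, /^logout$/i)) console.log(note)
// "Logout": matches via description, not name
// "Logout": no refId, so it cannot be acted on
```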

### 9.3 Iterate in the REPL

When you're stuck on a predicate, `simulang run -i` is much faster than editing-and-rerunning the whole script. Bring the target app to the front first, then:

```
$ simulang run -i
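> const flattenDFS = (n, out = []) => { out.push(n); n.children.forEach((c) => flattenDFS(c, out)); return out }
> const tree = AccessibilityTree.fromForeground()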
> const root = tree.snapshot(true)
> flattenDFS(root).filter((n) => n.role === AriaRole.Button).map((n) => n.name)
[ 'Sign in', 'Cancel', '' ]
```

`fromForeground()` binds to whatever's active at the moment of the call — that's why the app needs to be focused before you run it.

***

## 10. A complete example

Let's tie everything together. The script below opens `automationexercise.com` (a public test site), confirms we're logged out, signs in with credentials from environment variables, and signs out again. We verify each state transition through the accessibility tree.

```ts theme={null}
// login-and-out.ts
import {
  AccessibilityTree,
  App,
  AriaRole,
  FocusPolicy,
  Visibility,
} from '@simular-ai/simulang-js'

const email = process.env.SITE_EMAIL
const password = process.env.SITE_PASSWORD
if (!email || !password) {
  console.error('Set SITE_EMAIL and SITE_PASSWORD before running.')
  process.exit(1)
}

const sleep = (ms) => new Promise((r) => setTimeout(r, ms))
const stripPUA = (s) => s.replace(/[\uE000-\uF8FF]/g, '').replace(/\s+/g, ' ').trim()
const labelOf = (n) => stripPUA([n.name, n.value, n.description, n.helpText].filter(Boolean).join(' '))

function flattenDFS(node, out = []) {
  out.push(node); for (const c of node.children) flattenDFS(c, out); return out
}

async function withSnapshot(tree, predicate, opts = {}) {
  const { timeoutMs = 8000, intervalMs = 250, label = 'node' } = opts
  const deadline = Date.now() + timeoutMs
  while (Date.now() < deadline) {
    const root = tree.snapshot(true)
    const hit = predicate(flattenDFS(root))
    if (hit) return hit
    await sleep(intervalMs)
  }
  throw new Error(`Timed out waiting for ${label}`)
}

const isLink = (n, re) => n.role === AriaRole.Link && re.test(labelOf(n)) && n.refId != null
const findLink = (nodes, re) => nodes.find((n) => isLink(n, re))
const SIGNUP_LOGIN = /^signup \/ login$/i
const LOGOUT = /^logout$/i

try {
  // 1. Open the site.
  const instance = App.defaultBrowser().open(
    'https://automationexercise.com', FocusPolicy.Steal, Visibility.Show, true,
  )
  await sleep(2500)
  if (!instance.isAccessible()) instance.enableAccessibility()
  const tree = AccessibilityTree.fromPid(instance.pid)

  // 2. Make sure we're logged out (self-heal from stale sessions).
  const navState = await withSnapshot(
    tree,
    (nodes) => {
      const logout = findLink(nodes, LOGOUT)
      if (logout) return { kind: 'logged-in', logout }
      if (findLink(nodes, SIGNUP_LOGIN)) return { kind: 'logged-out' }
      return null
    },
    { label: 'nav with Signup/Login or Logout' },
  )
  if (navState.kind === 'logged-in') {
    tree.activate(navState.logout.refId)
    await withSnapshot(
      tree,
      (nodes) => findLink(nodes, SIGNUP_LOGIN) && !nodes.some((n) => isLink(n, LOGOUT)),
      { label: 'logged-out state' },
    )
  }

  // 3. Click "Signup / Login".
  const loginLink = await withSnapshot(tree, (nodes) => findLink(nodes, SIGNUP_LOGIN), {
    label: '"Signup / Login" link',
  })
  tree.activate(loginLink.refId)

  // 4. Find the login form positionally and fill it.
  const { emailField, passField, loginBtn } = await withSnapshot(
    tree,
    (nodes) => {
      const hIdx = nodes.findIndex((n) => /^login to your account$/i.test(labelOf(n)))
      if (hIdx === -1) return null
      const btnIdx = nodes.findIndex(
        (n, i) => i > hIdx && n.role === AriaRole.Button && /^login$/i.test(labelOf(n)),
      )
      if (btnIdx === -1) return null
      const inputs = nodes.slice(hIdx, btnIdx).filter(
        (n) => n.role === AriaRole.Textbox && n.refId != null,
      )
      if (inputs.length < 2) return null
      return { emailField: inputs[0], passField: inputs[1], loginBtn: nodes[btnIdx] }
    },
    { label: 'login form' },
  )
  tree.setValue(emailField.refId, email)
  tree.setValue(passField.refId, password)
  tree.activate(loginBtn.refId)

  // 5. Confirm login by waiting for the Logout link, then click it.
  const logoutLink = await withSnapshot(tree, (nodes) => findLink(nodes, LOGOUT), {
    label: 'Logout link', timeoutMs: 10000,
  })
  tree.activate(logoutLink.refId)

  // 6. Confirm logout.
  await withSnapshot(
    tree,
    (nodes) => findLink(nodes, SIGNUP_LOGIN) && !nodes.some((n) => isLink(n, LOGOUT)),
    { label: 'logged-out confirmation' },
  )
  console.log('done.')
} catch (error) {
  console.error('Failed:', error instanceof Error ? error.message : error)
  process.exit(1)
}
```

Most real scripts are some variant of this shape:

* **Imports at the top**, secrets from `process.env`.
* **A handful of small helpers** (`sleep`, `labelOf`, `flattenDFS`, `withSnapshot`) — the chapter-7 note about script-local helpers applies here.
* **One big `try`/`catch`** with `process.exit(1)` on failure.
* **Numbered phases**, each one a `withSnapshot` that names what it's waiting for. The names double as breadcrumbs in the logs.

***

## 11. When accessibility isn't enough

The accessibility tree is the right tool nine times out of ten, but not always. Two escape hatches:

### 11.1 Hardware mouse and keyboard

When a control has no `refId` — a canvas, an embedded video player, an OS-level dialog the app doesn't expose — drop to real coordinates and key events:

```ts theme={null}
import {
  MouseController, KeyboardController,
  Button, Direction, Coordinate, Key,
} from '@simular-ai/simulang-js'

const mouse = new MouseController()
mouse.moveMouse(640, 360, Coordinate.Abs)     // absolute screen pixels
mouse.button(Button.Left, Direction.Click)    // press + release

const kb = new KeyboardController()
kb.text('hello world')                        // types via the OS input layer
kb.key(Key.Enter)                             // a single named key
```

`Coordinate.Abs` is absolute pixels; `Coordinate.Rel` is movement *relative to the current cursor position*. For drag-and-drop, send `Direction.Press`, move, then `Direction.Release`.
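
When you do have a bounding box for the target, a little arithmetic gives you the point to click. This sketch assumes `x` and `y` are the top-left corner in absolute screen pixels, which is an assumption to verify against the API reference. `Box`, `centerOf`, and `offsetTo` are hypothetical helper names.

```typescript
interface Box {
  x: number
  y: number
  width: number
  height: number
}

type Point = [number, number]

// Center of a box, rounded to whole pixels.
function centerOf(box: Box): Point {
  return [Math.round(box.x + box.width / 2), Math.round(box.y + box.height / 2)]
}

// Offset that moves the cursor from `from` to `to` (for a relative move).
function offsetTo(from: Point, to: Point): Point {
  return [to[0] - from[0], to[1] - from[1]]
}

const submitBox: Box = { x: 600, y: 340, width: 80, height: 40 }
const [cx, cy] = centerOf(submitBox)
console.log(cx, cy) // 640 360

const [dx, dy] = offsetTo([100, 100], [cx, cy])
console.log(dx, dy) // 540 260

// With the controller from the snippet above, the two forms would be:
//   mouse.moveMouse(cx, cy, Coordinate.Abs)
//   mouse.moveMouse(dx, dy, Coordinate.Rel)   // if the cursor is at (100, 100)
```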

### 11.2 Vision grounding

When you can *see* the element but neither the OS nor your regex can find it, take a screenshot and ask a vision model where it is:

```ts theme={null}
import {
  Screen, screenshotFull, GroundingModel,
  MouseController, Button, Direction, Coordinate,
} from '@simular-ai/simulang-js'

const shot = screenshotFull(true, Screen.mainScreen())   // hide the cursor
const [x, y] = shot.ground(GroundingModel.default(), 'the blue Continue button')

const mouse = new MouseController()                      // same controller as in 11.1
mouse.moveMouse(x, y, Coordinate.Abs)
mouse.button(Button.Left, Direction.Click)
```

`shot.ground(model, query)` returns absolute screen coordinates for whatever the model thinks best matches your natural-language query.

A few practical notes:

* Pass `true` to `screenshotFull` to hide the cursor — otherwise the model may describe the cursor as part of the UI.
* Grounding is **network-bound**: it needs `OPENROUTER_API_KEY` set.
* Vision grounding is your last resort, not your first move. It's slower, costs money, and is less reliable than the accessibility tree. Use it for the 1% of cases where nothing else works.
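
Since grounding fails without the key, it can help to check for it up front rather than midway through a run. The helper below is a hypothetical sketch (`requireEnv` is not part of the library). It takes the environment as an argument so it can be exercised with a fake one; a real script would pass `process.env`.

```typescript
type Env = Record<string, string | undefined>

// Return the requested variables, or throw one error naming every missing one.
function requireEnv<K extends string>(env: Env, names: readonly K[]): Record<K, string> {
  const missing = names.filter((name) => !env[name])
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
  const out = {} as Record<K, string>
  for (const name of names) out[name] = env[name] as string
  return out
}

// Demo with a fake environment and a placeholder value.
const fakeEnv: Env = { OPENROUTER_API_KEY: 'placeholder-key' }
const { OPENROUTER_API_KEY } = requireEnv(fakeEnv, ['OPENROUTER_API_KEY'])
console.log(OPENROUTER_API_KEY.length > 0) // true

// In a script: requireEnv(process.env, ['OPENROUTER_API_KEY'])
```

Reporting every missing variable in one error saves a round of fix-and-rerun when several are unset.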

***

## 12. Files, env vars, and shelling out

There are no Simulang-specific I/O wrappers. Use Node's standard library directly:

```ts theme={null}
import { readdirSync, readFileSync, writeFileSync, renameSync, statSync, existsSync } from 'node:fs'
import { spawn, spawnSync } from 'node:child_process'
import { join, dirname } from 'node:path'
import { fileURLToPath } from 'node:url'

// Resolve "the directory of this script."
const HERE = dirname(fileURLToPath(import.meta.url))

// Plain I/O.
writeFileSync(join(HERE, 'out.txt'), 'hello\n')
const text = readFileSync(join(HERE, 'out.txt'), 'utf8')

// Shell out — captured output.
const result = spawnSync('python3', ['transform.py', 'in.csv'], { encoding: 'utf8' })
if (result.status !== 0) throw new Error(result.stderr)

// Fire-and-forget — open a file with the OS default app.
spawn('open', [join(HERE, 'out.txt')], { detached: true, stdio: 'ignore' }).unref()
```

`import.meta.url` + `fileURLToPath` is the ES-module equivalent of the CommonJS `__dirname`. Use it any time you need to resolve a path relative to the script itself.

The `@simular-ai/simulang-js` package also ships convenience classes (`File`, `Directory`, `Clipboard`, `System`) for when you want cross-platform helpers, such as an OS clipboard read or a temp directory with an explicit lifecycle. They're worth using for clipboard and audio work; for plain file reads and writes, staying in `node:fs` keeps the concept count low.

***

## 13. Composing scripts

As workflows grow, you'll want to split them up. There are two natural ways.

### 13.1 Import helpers from sibling modules

The simplest: extract reusable functions into their own files and import them.

```ts theme={null}
// flow/login.mjs
export async function login(tree, { email, password }) {
  // … positional find + setValue + activate, as in chapter 10 …
}
```

```ts theme={null}
// flow/checkout.mjs
export async function placeOrder(tree, item) {
  // …
}
```

```ts theme={null}
// run.mjs
import { App, FocusPolicy, Visibility, AccessibilityTree } from '@simular-ai/simulang-js'
import { login } from './flow/login.mjs'
import { placeOrder } from './flow/checkout.mjs'

App.defaultBrowser().open('https://example.com', FocusPolicy.Steal, Visibility.Show, true)
const tree = AccessibilityTree.fromForeground()

await login(tree, { email: process.env.SITE_EMAIL, password: process.env.SITE_PASSWORD })
await placeOrder(tree, 'blue-shirt')
```

This is just ES modules — there's nothing Simulang-specific to learn. Pass shared resources (the `tree`, the `instance`) as arguments rather than reaching for module-level singletons.
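
One payoff of passing the `tree` in as an argument is that a flow can be dry-run against a stand-in. The sketch below assumes only that the tree exposes `activate(refId)`, as used elsewhere in this primer. `TreeLike`, `confirmDialog`, and the string type for `refId` are illustrative assumptions, not library definitions:

```ts
// Illustrative shape only: the single method this flow needs from the tree.
// The refId type is assumed to be a string for the sake of the example.
interface TreeLike {
  activate(refId: string): unknown
}

// The flow receives its tree as a parameter instead of importing a shared one.
// `findConfirm` looks up a fresh ref right before acting, so no ref is cached.
function confirmDialog(tree: TreeLike, findConfirm: () => string | undefined): boolean {
  const refId = findConfirm()
  if (refId === undefined) return false // nothing to click
  tree.activate(refId)
  return true
}

// A recording stub stands in for the real tree during a dry run.
const calls: string[] = []
const stubTree: TreeLike = { activate: (refId) => { calls.push(refId) } }
confirmDialog(stubTree, () => 'ref-1')
console.log(calls) // the stub recorded the one activation
```

With a module-level singleton, the same flow could only run against whatever tree the module happened to create.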

### 13.2 Run other scripts as subprocesses

When you want full isolation between phases — fresh process, fresh state, no shared imports — orchestrate from a parent script:

```ts theme={null}
// orchestrator.mjs
import { spawnSync } from 'node:child_process'
import { dirname, join } from 'node:path'
import { fileURLToPath } from 'node:url'

const HERE = dirname(fileURLToPath(import.meta.url))

function runStep(label, script) {
  console.log(`\n===== ${label} =====`)
  const result = spawnSync('simulang', ['run', join(HERE, script)], {
    stdio: 'inherit',   // stream the child's logs live
    env: process.env,   // pass env vars through (SITE_EMAIL, OPENROUTER_API_KEY, …)
  })
  if (result.status !== 0) {
    console.error(`"${label}" failed with exit ${result.status}.`)
    process.exit(result.status ?? 1)
  }
}

runStep('Sign in', 'sign-in.ts')
runStep('Place order', 'place-order.ts')
runStep('Download receipt', 'download-receipt.ts')
```

The big win: each step is **independently runnable**. If `place-order.ts` fails, you can iterate on it directly without re-running sign-in every time.
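
The same property lets an orchestrator skip ahead after a failure. A minimal sketch, assuming a hypothetical convention where you name the step to resume from; `Step` and `stepsFrom` are illustrative names, and how the starting label reaches the script (a CLI argument, an env var) is left open:

```ts
type Step = { label: string; script: string }

const STEPS: Step[] = [
  { label: 'Sign in', script: 'sign-in.ts' },
  { label: 'Place order', script: 'place-order.ts' },
  { label: 'Download receipt', script: 'download-receipt.ts' },
]

// Return the steps to run, starting at `from` (or all of them when omitted).
function stepsFrom(steps: Step[], from?: string): Step[] {
  if (from === undefined) return steps
  const start = steps.findIndex((s) => s.label === from)
  if (start === -1) throw new Error(`Unknown step: "${from}"`)
  return steps.slice(start)
}

// In the orchestrator, the fixed runStep(...) calls would become a loop:
//   for (const s of stepsFrom(STEPS, resumeFrom)) runStep(s.label, s.script)
```

Throwing on an unknown label is deliberate: a typo should stop the run rather than silently execute nothing.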

***

## 14. Common pitfalls

A short list of things that bite first-time users (and second- and third-time users, honestly).

```ts theme={null}
// ① refIds are tied to one snapshot. Don't cache them.
const btn = flattenDFS(tree.snapshot(true)).find(...)
await sleep(2000)         // ← the page may have re-rendered
tree.activate(btn.refId)  // ⚠ stale refId

// ✓ Re-snapshot inside withSnapshot and act on a fresh ref.
```

```ts theme={null}
// ② AriaRole is a numeric enum. AriaRole[role] does NOT give a string.
console.log(AriaRole[node.role])              // ⚠ undefined
console.log(ariaRoleToString(node.role))      // ✓ "button"
```

```ts theme={null}
// ③ FocusPolicy and Visibility are advisory.
App.exactName('Google Chrome').open(null, FocusPolicy.DoNotSteal, Visibility.Hidden, false)
// ⚠ Chromium / Electron apps and macOS Notes ignore these flags. The
//    app pops to the foreground and stays visible regardless.

// ✓ If you need certainty the app is focused, check and force it:
if (!instance.isFocused()) instance.focus()

// ✓ If you need it hidden, call .hide() after the launch settles:
await sleep(500)
instance.hide()
```

```ts theme={null}
// ④ The accessible label can live in any of four fields.
node.name === 'Sign in'           // ⚠ might be in description on web AX
labelOf(node) === 'Sign in'       // ✓ uses name|value|description|helpText
```

```ts theme={null}
// ⑤ Browser chrome contaminates snapshots.
nodes.find((n) => /save/i.test(labelOf(n)))   // ⚠ might match the URL bar
pageNodes(root).find(/* … */)                  // ✓ scope to the active tab
```

```ts theme={null}
// ⑥ Simulang has no built-in sleep / wait. Roll your own
//    (or import `setTimeout` from 'node:timers/promises').
const sleep = (ms) => new Promise((r) => setTimeout(r, ms))
```

```ts theme={null}
// ⑦ Top-level errors print ugly stack traces. Wrap and exit.
try {
  // … your script …
} catch (e) {
  console.error('Failed:', e instanceof Error ? e.message : e)
  process.exit(1)
}
```

One more worth mentioning that doesn't fit a one-liner:

* **Temp directories don't auto-clean.** If you use `Directory.temp()`, wrap the work in `try { … } finally { dir.remove() }`.
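
The cleanup pattern can be sketched with Node's standard library alone. This version deliberately uses `mkdtempSync` and `rmSync` rather than `Directory.temp()`, whose exact signature isn't shown in this primer; `withTempDir` is a hypothetical helper name:

```ts
import { existsSync, mkdtempSync, rmSync, writeFileSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// Create a temp directory, run `work` in it, and always remove it afterwards,
// whether `work` returns normally or throws.
function withTempDir<T>(work: (dir: string) => T): T {
  const dir = mkdtempSync(join(tmpdir(), 'simulang-'))
  try {
    return work(dir)
  } finally {
    rmSync(dir, { recursive: true, force: true })
  }
}

const usedDir = withTempDir((dir) => {
  writeFileSync(join(dir, 'scratch.txt'), 'temporary\n')
  return dir
})
console.log(existsSync(usedDir)) // false: the directory was removed on the way out
```

This helper is synchronous. For async work, make it `async` and `await work(dir)` inside the `try`, otherwise the `finally` runs before the work has finished.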

***

## 15. Where to go next

You now know enough to write real Simulang scripts. From here:

* **Clone the [Simulang Recipes](https://github.com/simular-ai/simulang-recipes) repo.** It's a growing collection of community-built automations — a Wordle solver, a 2048 bot, a Calendar-to-Docs digest, a daily news digest into Apple Notes, a TikTok-to-Slack forwarder, and more. Each recipe is a small, self-contained script with its own README; pick one close to what you want to build, run it, then read the source. Contributions welcome.
* **Read `index.d.ts`** in the resolved `@simular-ai/simulang-js` install. It's the typed surface of the library — roughly 1500 lines, organised by class. Source of truth for signatures and enums, and the entry point to everything we didn't cover here (audio capture, speech-to-text, screen recording).
* **Read the library's `CLAUDE.md`** next to `index.d.ts`. It covers idioms, lifecycle preconditions, and platform quirks — things types can't express.
* **Browse the example scripts** in `examples/` inside the install. They're the best place to see larger patterns: a full purchase flow, file downloads with rename, screenshot-and-ground, module composition.
* **Build your reflexes at the REPL.** When you're not sure how an API behaves, `simulang run -i` is faster than scripting it. Snapshots are cheap; experiment.

Welcome to Simulang. Have fun automating things that used to need a human in the chair.

***

## 16. Simulang in Claude Code

If you use **Claude Code** in the editor, you can drive the same desktop APIs without hand-writing every script: install the skill and use `/simulang` so Claude generates and runs automation for you. See **[Simulang with Claude Code](/simulang/simulang-claude-code)** for setup (`simulang init-claude`), example prompts, permissions, and tips.
