Simulang - Simular Developer Documentation

Simular Pro gives you flexible, precise control over the full desktop via Simulang, a specialized set of intuitive, easy-to-use action building blocks defined on top of JavaScript. Basic prompting of Simular in chat mode may produce a good initial plan (in Simulang), but complex workflows may require edits to some steps in the plan before they fully succeed. The best way to learn Simulang is by example.

General

Act

Completes a task specified in natural language, up to a maximum number of steps. Returns: None

function Act({
    task: String = "",
    maxSteps: Int = 100
})

Parameters

task

An arbitrary task.

maxSteps

Stop when this number of primitive actions have been executed.
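
A minimal sketch of calling Act with a step limit. It assumes that Simulang actions take a single object argument (as the signatures suggest) and run synchronously. The actions parameter and the name fileReceipts are hypothetical, used only so the sketch can run outside Simular Pro with a stand-in function.

```javascript
// Hypothetical wrapper: `actions` is an object holding the Simulang action Act.
// Assumption: Act takes one object argument and runs synchronously.
function fileReceipts(actions) {
  const { Act } = actions;
  // Cap the run at 30 primitive actions instead of the default 100.
  Act({
    task: "Move every PDF in the Downloads folder into a folder named Receipts",
    maxSteps: 30,
  });
}
```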

Application

Open

Open or switch to an application. Returns: None

function Open({
    app: String = null,
    url: String = null,
    waitForLoadComplete: Bool = true,
    waitTime: Int = 0
})

Parameters

app

Application name, e.g., Google Chrome, iMessage.

url

URL of a web page, e.g., google.com.

waitForLoadComplete

Whether or not to wait for the app or URL to complete loading. Default is true.

waitTime

Duration in seconds to wait after opening the app or URL, in addition to the wait for load.

Keyboard control

Type

Type using the keyboard. Returns: None

function Type({
    text: String,
    withReturn: Bool = false,
    waitTime: Int = 0,
    waitForLoadComplete: Bool = false
})

Parameters

text

A piece of text to type.

withReturn

Whether or not to press the return (enter) key after typing.

waitTime

Duration in seconds to wait after typing.

waitForLoadComplete

Whether or not to wait for a page to complete loading after typing.
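
As a rough sketch of how Open and Type might be combined, the helper below opens a page and submits a search. It assumes that actions take a single object argument and run synchronously, and that keyboard focus lands in the search field once the page has loaded. The actions parameter and the name searchInBrowser are hypothetical.

```javascript
// Hypothetical helper: `actions` holds the Simulang actions Open and Type.
// Assumption: each action takes one object argument and runs synchronously.
function searchInBrowser(actions, query) {
  const { Open, Type } = actions;
  // Open the page and wait for it to finish loading (the documented default).
  Open({ app: "Google Chrome", url: "google.com", waitForLoadComplete: true });
  // Type the query, press return, then wait for the results page to load.
  Type({ text: query, withReturn: true, waitForLoadComplete: true });
}
```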

Shortcut

Perform keyboard shortcut in the current application. Returns: None

function Shortcut({
    key: String,
    cmd: Bool = false,
    ctrl: Bool = false,
    option: Bool = false,
    shift: Bool = false,
    waitTime: Int = 0
})

Parameters

key

A key to be pressed.

cmd

Whether the command modifier should be pressed when tapping the key.

ctrl

Whether the control modifier should be pressed when tapping the key.

option

Whether the option modifier should be pressed when tapping the key.

shift

Whether the shift modifier should be pressed when tapping the key.

waitTime

Time in seconds to wait after executing the action.
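
The sketch below chains two shortcuts to select and copy everything in the current application. The key values "a" and "c" are assumptions about how keys are named, as is the single-object call style. The actions parameter and the name copyAll are hypothetical.

```javascript
// Hypothetical helper: `actions` holds the Simulang action Shortcut.
// Assumption: keys are named by their character, e.g. "a" or "c".
function copyAll(actions) {
  const { Shortcut } = actions;
  // Command+A: select everything in the current application.
  Shortcut({ key: "a", cmd: true });
  // Command+C: copy the selection, then wait one second.
  Shortcut({ key: "c", cmd: true, waitTime: 1 });
}
```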

Mouse Control

Click

Click on something, specified by either the at argument or the element argument. Disambiguate by specifying the spatial relation between the target element and an anchor concept. Available modes:

Default (mode left empty): text-based grounding.
“textAndScreenshot”: grounding using both text and vision.
“vision”: vision-only grounding. When using modes with vision, all other arguments are ignored except withCommand, clickType, and the wait arguments (waitForLoadComplete, waitTime).

Returns: None

function Click({
    at: String = "",
    mode: String = "",
    clickType: String = "left",
    withCommand: Bool = false,
    element: UIElement = null,
    spatialRelation: String = "",
    anchorConcept: String = "",
    prior: String = "none",
    position: String = "center",
    includeInvisible: Bool = false,
    waitForLoadComplete: Bool = false,
    waitTime: Int = 0
})

Parameters

at

Description of a target object to click. For higher accuracy, describe the target object by its role and value, e.g., “sign in button”, “first name textfield”.

mode

Options: “textAndScreenshot”, “vision”. Leave empty for default text-based grounding.

clickType

Type of click. Options are: “left” (default), “right”, and “doubleClick” (two quick left clicks).

withCommand

If true, agent presses the command key during click.

element

A UIElement. If given, ignores the at argument and directly clicks on this element.

spatialRelation

A comma-separated String of spatial relationship between the target object and element(s) that best match the anchorConcept. Options are: “closest”, “furthest”, “above”, “right”, “below”, “left”, “contains”, “containedIn”

anchorConcept

Description of an object used as an anchor with spatialRelation.

prior

An optional global spatial location prior used for disambiguation among elements that have similar descriptions. Options are: “left”, “right”, “top”, “bottom”. For example, “left” means to choose the left-most element among candidates.

position

Position within the frame of an element to click. Options are: topleft, topcenter, topright, middleleft, center, middleright, bottomleft, bottomcenter, bottomright, anywhere

includeInvisible

Whether or not to find the target from sections of the page that are not currently visible (i.e., needs scrolling).

waitForLoadComplete

If true, waits for page to finish loading after the click.

waitTime

Integer number of seconds to wait after click.
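
To illustrate disambiguation with an anchor, the sketch below clicks a target described in text that sits next to a labelled element. It assumes that “right” means the target lies to the right of the anchor, that relations are joined with a plain comma, and that actions take a single object argument. The actions parameter and the name clickButtonRightOf are hypothetical.

```javascript
// Hypothetical helper: `actions` holds the Simulang action Click.
function clickButtonRightOf(actions, target, anchor) {
  const { Click } = actions;
  Click({
    at: target,
    // Assumption: comma-separated relations with no spaces, read as
    // "to the right of the anchor, and the closest such match".
    spatialRelation: ["right", "closest"].join(","),
    anchorConcept: anchor,
    waitForLoadComplete: true,
  });
}
```

For example, clickButtonRightOf(actions, "edit button", "Invoice 1042") would target the edit button nearest to, and to the right of, the element described as “Invoice 1042”.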

Move

Moves the cursor to an object. Disambiguate by specifying the spatial relation between the target element and an anchor concept. All parameters other than “to” have the same definitions as in the Click action. Returns: None

function Move({
    to: String = "",
    element: UIElement = null,
    spatialRelation: String = "",
    anchorConcept: String = "",
    prior: String = "none",
    includeInvisible: Bool = false,
    waitForLoadComplete: Bool = false,
    waitTime: Int = 0
})

Parameters

to

Description of a target object to move the cursor to. For higher accuracy, describe the target object by its role and value, e.g., “sign in button”, “first name textfield”.

mode

See definition in Click.

element

See definition in Click.

spatialRelation

See definition in Click.

anchorConcept

See definition in Click.

prior

See definition in Click.

includeInvisible

See definition in Click.

waitForLoadComplete

See definition in Click.

waitTime

See definition in Click.

Drag

Drag on the screen, starting from the current mouse location. Returns:

function Drag({
    to: String = "",
    element: UIElement = null,
    destinationApp: String = null
})

Parameters

to

Description of the destination of the drag action, as a String.

element

The destination of the drag action, as a UIElement.

destinationApp

The target app, if the destination is not in the same application as the origin.

Scroll

Scroll on the screen in a specified direction. Returns: None

function Scroll({
    direction: String = "down",
    distance: Int = 200
})

Parameters

direction

The direction to scroll (up, down, left, or right). Default is down.

distance

The scroll distance in pixels. Default is 200.

Perception

stateSatisfies

Checks whether the current screen satisfies the specified condition. This is useful for checking complex conditions that cannot be easily checked by ConceptsExist with simple element descriptions. However, it may be slower and use more tokens than ConceptsExist for large pages. Returns: true if the current screen satisfies the condition, false otherwise.

function stateSatisfies({
    condition: String
})

Parameters

condition

A condition to check on the current screen

ConceptsExist

Checks whether all the concepts can be found on the current visible screen. Each concept must be a substring of the text description of an element in the current focused application. This is faster and more token-efficient than stateSatisfies for checking conditions that can be represented by a small set of element descriptions.

Use the element picker tool to check the text description of any element in the currently focused application.

Returns: If all concepts can be found, returns true, otherwise false.

function ConceptsExist({
    concepts: [String]
})

Parameters

concepts

An array of target concepts to find.
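
One way to act on the trade-off between the two checks is to try the cheaper ConceptsExist first and fall back to stateSatisfies only when it fails. The sketch below assumes single-object arguments and synchronous boolean results. The concept “Sign out”, the condition text, the actions parameter, and the name isSignedIn are all illustrative.

```javascript
// Hypothetical helper: `actions` holds ConceptsExist and stateSatisfies.
function isSignedIn(actions) {
  const { ConceptsExist, stateSatisfies } = actions;
  // Cheap check first: an element whose description contains "Sign out".
  if (ConceptsExist({ concepts: ["Sign out"] })) return true;
  // Fall back to the slower, more flexible natural-language check.
  return stateSatisfies({ condition: "The page shows that a user is signed in" });
}
```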

pageContent

Gets a JSON object containing the structural text content and base64 encoded image of the current screen. This object can be sent to a vision-language model for answering questions about the current screen. Returns: A JSON dictionary with the following fields:

text: A text description of the current web page;
imageFilePath: temporary location in memory of the screenshot (accessible by ask).

function pageContent({
})

Text Generation

ask

Runs a large vision-language model on the given input prompt string and a JSON dictionary context. Often used after pageContent(). Returns: String response from a large vision language model.

function ask({
    prompt: String,
    context: [[String: String]]
})

Parameters

prompt

String query to a large vision language model.

context

Array of JSON dictionaries, each containing the following fields:

text: an optional text description of the current web page.
imageFilePath: an optional path in memory of a screenshot taken by pageContent().

Also accepts a single JSON dictionary, such as the output of pageContent().
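
A minimal sketch of the pageContent-then-ask pattern described above: capture the current screen, then pass the result directly as context. It assumes both calls are synchronous and that ask takes a single object argument. The actions parameter and the name askAboutScreen are hypothetical.

```javascript
// Hypothetical helper: `actions` holds the Simulang actions pageContent and ask.
function askAboutScreen(actions, question) {
  const { pageContent, ask } = actions;
  // Capture the current screen as a { text, imageFilePath } dictionary.
  const page = pageContent();
  // A single dictionary is accepted as context, per the description of ask.
  return ask({ prompt: question, context: page });
}
```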

Wait

Wait

Puts the agent to sleep for a given amount of time. Returns: None

function Wait({
    waitTime: Int,
    unit: String = "s"
})

Parameters

unit

Options are “s” for seconds (default) and “ms” for milliseconds.

waitTime

Duration of time to wait in the given units.

WaitForConcepts

Waits until all concepts can be found in the current frontmost window. If not all concepts can be found within 10 seconds, the action returns failure. Returns: None

function WaitForConcepts({
    concepts: [String]
})

Parameters

concepts

An array of target concepts to find.
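
When 10 seconds is not enough, a similar wait can be approximated by polling ConceptsExist with Wait in between. This is only a sketch: it assumes single-object arguments and synchronous calls, and ConceptsExist may not search exactly the same scope as WaitForConcepts. The actions parameter and the name waitForConceptsUpTo are hypothetical.

```javascript
// Hypothetical helper: `actions` holds the Simulang actions ConceptsExist and Wait.
function waitForConceptsUpTo(actions, concepts, timeoutSeconds = 30) {
  const { ConceptsExist, Wait } = actions;
  for (let elapsed = 0; elapsed < timeoutSeconds; elapsed++) {
    if (ConceptsExist({ concepts })) return true;
    // Sleep one second between checks.
    Wait({ waitTime: 1, unit: "s" });
  }
  // One last check after the final wait.
  return ConceptsExist({ concepts });
}
```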

User interaction

Respond

Respond to the user with a short message. Optionally require the user to give a yes/no answer or input text to proceed. Returns:

If requireConfirm is true, then this function returns true if the user chooses “yes”, returns false if the user chooses “no”.
If requireTextInput is true, then returns the string value that the user submitted.
Otherwise, returns true.

function Respond({
    message: String,
    requireConfirm: Bool = false,
    requireTextInput: Bool = false
})

Parameters

message

A message to show to the user.

requireConfirm

Whether or not user confirmation is required to proceed with the remaining actions. Default is false. If requireConfirm is true, then the message must be phrased as a question, to which the only possible responses are “yes” and “no”.

requireTextInput

Whether or not to require the user to input text before proceeding. Default is false. If requireTextInput is true, then the user will be prompted to input text after the message is shown.
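
The return values above can gate a risky step behind user confirmation. The sketch below asks a yes/no question and clicks only on “yes”. It assumes single-object arguments and synchronous results. The actions parameter and the name confirmThenClick are hypothetical.

```javascript
// Hypothetical helper: `actions` holds the Simulang actions Respond and Click.
function confirmThenClick(actions, target, question) {
  const { Respond, Click } = actions;
  // With requireConfirm, Respond returns true for "yes" and false for "no".
  const confirmed = Respond({ message: question, requireConfirm: true });
  if (!confirmed) {
    Respond({ message: "Cancelled. No action was taken." });
    return false;
  }
  Click({ at: target });
  return true;
}
```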

System IO

CopyToClipboard

Copies a String to clipboard. Returns: None

function CopyToClipboard({
    text: String
})

Parameters

text

Text to be copied to the clipboard.

GetFromClipboard

Get the content of the current clipboard. Returns: Content of the current clipboard

function GetFromClipboard({
})

SaveScreenshot

Takes a screenshot of an element on the screen or the whole screen, and saves the screenshot as a PNG to a file. Returns: None

function SaveScreenshot({
    element: UIElement = null,
    fileName: String = null,
    directory: String = null
})

Parameters

element

If provided, limit the screenshot to the frame of the element. Default is the whole screen.

fileName

Name of the image file. Default is “simularSavedImage.png”.

ScreenshotToClipboard

Takes a screenshot of an element or the current page and saves it to the system clipboard. Returns: None

function ScreenshotToClipboard({
    element: UIElement = null
})

Parameters

element

If provided, limit the screenshot to the frame of the element. Default is the whole screen.

ReadFile

Read the contents of a file whose location is specified by path. Returns: Contents of the file as a String

function ReadFile({
    path: String
})

Parameters

path

Either an absolute path to a file, or a name of a file (assumed to be in the default app cache directory).

WriteToFile

Writes the given text to a file. If the file already exists, appends the text to it by default, with an option to overwrite the existing content instead. Unless a path is specified, writes to /Library/Caches/com.simular.Simular-Pro/SimularActionResult/. Throws an error if there is an existing non-folder file named SimularActionResult. Returns: None

function WriteToFile({
    text: String,
    path: String? = "SimularActionResult.txt",
    overwrite: Bool = false
})

Parameters

text

Text to write to a file.

path

Path of the file. The default location is /Library/Caches/com.simular.Simular-Pro/SimularActionResult/.txt. If path contains “/”, it is treated as a full path.

overwrite

Whether or not to overwrite the contents if path points to an existing file.
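
Because WriteToFile appends by default, a fresh multi-line file can be produced by overwriting on the first write only. The sketch below assumes single-object arguments and synchronous calls. The actions parameter and the name writeLines are hypothetical.

```javascript
// Hypothetical helper: `actions` holds the Simulang action WriteToFile.
function writeLines(actions, lines, path = "SimularActionResult.txt") {
  const { WriteToFile } = actions;
  lines.forEach((line, i) => {
    // Replace any old content on the first write, then append.
    WriteToFile({ text: line + "\n", path, overwrite: i === 0 });
  });
}
```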

Google Sheet control

GetGoogleSheetCellValue

Gets the value of a cell in a Google Sheet. Returns: Value of the cell

function GetGoogleSheetCellValue({
    cell: String
})

Parameters

cell

Label of a cell. Column is indicated by a capital letter and row is indicated by a number. For example “B42” is the cell at column B row 42.
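
Cell labels in this format are easy to take apart and rebuild in plain JavaScript, which helps when stepping through rows. The helpers below are illustrative and not part of Simulang. As a generalization beyond the single-letter columns described here, the pattern also accepts multi-letter columns such as “AA”.

```javascript
// Split a label such as "B42" into its column letters and row number.
function splitCellLabel(label) {
  const match = /^([A-Z]+)([1-9][0-9]*)$/.exec(label);
  if (match === null) throw new Error(`Not a cell label: ${label}`);
  return { column: match[1], row: Number(match[2]) };
}

// Return the label of the cell directly below the given one.
function cellBelow(label) {
  const { column, row } = splitCellLabel(label);
  return `${column}${row + 1}`;
}
```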

SetGoogleSheetCellValue

Sets the value of a Google Sheet cell. Returns: None

function SetGoogleSheetCellValue({
    cell: String,
    value: String
})

Parameters

cell

Label of a cell. Column is indicated by a capital letter and row is indicated by a number. For example “B42” is the cell at column B row 42.

value

Value to write to the cell.

GetGoogleSheetColumns

Gets the column id of each header in a given array of column headers in a Google Sheet. For example, if the sheet has column headers “website”, “description”, “date” in cells A1, B1, C1, respectively, then GetGoogleSheetColumns(headers: ["website", "description", "date"]) returns [“A”, “B”, “C”]. Note: This function currently assumes that the table headers are on row 1. Returns: Array of column ids, each a capital letter from A to Z.

function GetGoogleSheetColumns({
    headers: [String]
})

Parameters

headers

Array of column headers.
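
The column ids can be combined with a row number to read one record by header name. The sketch below assumes the returned ids come back in the same order as the headers passed in (as in the example above), and that actions take a single object argument and run synchronously. The actions parameter and the name readRow are hypothetical.

```javascript
// Hypothetical helper: `actions` holds GetGoogleSheetColumns and GetGoogleSheetCellValue.
function readRow(actions, headers, row) {
  const { GetGoogleSheetColumns, GetGoogleSheetCellValue } = actions;
  // Assumption: ids are returned in the same order as `headers`.
  const columns = GetGoogleSheetColumns({ headers });
  const record = {};
  headers.forEach((header, i) => {
    // Build a label such as "B2" from the column id and the row number.
    record[header] = GetGoogleSheetCellValue({ cell: `${columns[i]}${row}` });
  });
  return record;
}
```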

Advanced GUI functions

GetElements

Get elements that satisfy some conditions inside the current frontmost application or inside a root element (if given). For disambiguation, one can constrain the search to elements that satisfy certain spatial relations to anchor elements. This function supports multiple return types according to returnType. Returns: Depending on returnType: [UIElement], String, [String], [String: UIElement]

function GetElements({
    elementRoles: [String] = [],
    elementOverallDescription: String = "",
    threshold: Double = 0.75,
    root: UIElement = null,
    spatialRelation: String = "",
    anchorRole: String = "",
    anchorOverallDescription: String = "",
    anchorElements: [UIElement] = [],
    horizontalRank: Int = null,
    verticalRank: Int = null,
    sortBy: String = "",
    useNeighborForMissingDescription: Bool = false,
    returnType: String = "elementArray"
})

Parameters

elementRoles

Constrains the search to elements whose role is included in this array of roles. Required if elementOverallDescription is not given.

elementOverallDescription

Description of elements to get from the page. Required if elementRoles are not provided.

threshold

If elementOverallDescription is given, then accept candidate elements whose normalized string similarity to elementOverallDescription is above this threshold value.

root

If given, then the search is limited to elements contained within this root element.

spatialRelation

A comma-separated String of spatial relationships between the target elements and the anchor. Available options are: “closest”, “furthest”, “above”, “right”, “below”, “left”, “contains”, “containedIn”, “sameRow”, “sameColumn”

anchorRole

Role of element(s) used as anchor for spatial relation.

anchorOverallDescription

Description of an object used as an anchor with spatialRelation.

anchorElements

Elements to use as anchor for spatial relation constraints. If anchorElements is provided, then anchorRole and anchorOverallDescription are ignored.

horizontalRank

If given, sorts the elements by x-coordinate of frame midpoint and returns the element with this rank. Left-most element has rank 1.

verticalRank

If given, sorts the elements by y-coordinate of frame midpoint and returns the element with this rank. Top-most element has rank 1.

sortBy

Returns the found elements in sorted order, by “x” (left to right) or “y” (top to bottom). Used only if returnType is “elementArray”.

useNeighborForMissingDescription

Whether or not to use the description of an element’s neighbor as a substitute if the element’s description is empty. This is only used if returnType involves returning element descriptions.

returnType

Options are: “elementArray” (default), “string”, “stringArray”, “strToElemDict”.

elementArray: returns an array of UIElements
string: returns a semicolon-separated string of descriptions of found elements
stringArray: returns an array of String descriptions of found elements
strToElemDict: returns a [String: Element] dictionary with element description as keys and corresponding element as value
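
A sketch of using GetElements together with Click: find matching elements sorted top to bottom, then click the first one through the element argument. The role name "link" is an assumption, since valid role names are not listed here, as are the single-object, synchronous call style and the array result. The actions parameter and the name clickTopMatch are hypothetical.

```javascript
// Hypothetical helper: `actions` holds the Simulang actions GetElements and Click.
function clickTopMatch(actions, description) {
  const { GetElements, Click } = actions;
  const found = GetElements({
    elementRoles: ["link"], // assumed role name
    elementOverallDescription: description,
    threshold: 0.8, // stricter than the default 0.75
    sortBy: "y", // top to bottom
    returnType: "elementArray",
  });
  if (found.length === 0) return false;
  // Passing element makes Click ignore the at argument.
  Click({ element: found[0] });
  return true;
}
```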

GetAttributeOfElement

Searches for an element that matches the input criteria and gets the element’s value for a specified attribute. Returns: String value of an attribute of an element

function GetAttributeOfElement({
    elementRole: String = "",
    elementOverallDescription: String = "",
    attribute: String = "",
    threshold: Double = 0.75,
    root: UIElement = null,
    spatialRelation: String = "",
    anchorRole: String = "",
    anchorOverallDescription: String = "",
    anchorElements: [UIElement] = [],
    horizontalRank: Int = null,
    verticalRank: Int = null
})

Parameters

elementRole

Role of the target element. Required if elementOverallDescription is not given.

elementOverallDescription

Description of the element. Required if elementRole is not provided.

attribute

Valid options are: “role”, “description”, “title”, “value”. For example, use “value” to get the value of text elements.

threshold

If elementOverallDescription is given, then accept candidate elements whose normalized string similarity with elementOverallDescription is above this threshold value.

root

If given, then the search is limited to elements contained within this root element.

spatialRelation

A comma-separated String of spatial relationships between the target elements and the anchor.

anchorRole

Role of element(s) used as anchor for spatial relation.

anchorOverallDescription

Description of an object used as an anchor with spatialRelation.

anchorElements

Elements to use as anchor for spatial relation constraints. If anchorElements is provided, then anchorRole and anchorOverallDescription are ignored.

horizontalRank

If given, sorts the elements by x-coordinate of frame midpoint and returns the element with this rank. Left-most element has rank 1.

verticalRank

If given, sorts the elements by y-coordinate of frame midpoint and returns the element with this rank. Top-most element has rank 1.

GetContent

Get text content from the current frontmost window or a region corresponding to the provided concept or element. Returns: If inElement argument is given or the frontmost window was used (because neither inConcept nor inElement was given), then returns a single String. Otherwise, returns a [String] array with one String per root element.

function GetContent({
    inConcept: String = "",
    inElement: UIElement = null,
    format: String = "flat"
})

Parameters

inConcept

Gets content in elements that match this concept.

inElement

Get content in this element.

format

Format of the returned content. Options are: “flat” (unformatted text), “json”, “xml”, “xmlSlim” (same as XML, with empty tags removed)

GetCells

Get all cells from a row or column element. Either row or column must be given. Returns: An array of cell elements contained in the given row or column. If input is a row, the output array is sorted by increasing x-coordinate (left to right). If input is a column, the output array is sorted by increasing y-coordinate (top to bottom).

function GetCells({
    row: UIElement = null,
    column: UIElement = null
})

Parameters

row

A row element that contains one or more cells.

column

A column element that contains one or more cells.

GetCellValue

Get the value of a given cell element. Returns: Value contained in the cell.

function GetCellValue({
    cell: UIElement
})

Parameters

cell

A cell element.
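
GetCells and GetCellValue compose naturally: the sketch below reads a row element into an array of values, left to right. It assumes single-object arguments, synchronous calls, and that GetCells returns a regular array. The actions parameter and the name rowValues are hypothetical.

```javascript
// Hypothetical helper: `actions` holds the Simulang actions GetCells and GetCellValue.
function rowValues(actions, rowElement) {
  const { GetCells, GetCellValue } = actions;
  // For a row, GetCells returns the cells sorted left to right.
  return GetCells({ row: rowElement }).map((cell) => GetCellValue({ cell }));
}
```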

GetCellLabel

Get the label of the given cell element in Excel. Returns: The cell’s label as a String, e.g., “A1”.

function GetCellLabel({
    cell: UIElement
})

Parameters

cell

A cell element.

GetCellIndices

Given an array of table cell values, returns a corresponding array of cell indices. For example, if the table has value1 in cell A10, then GetCellIndices(cellValues: ["value1"]) returns [“A10”]. Returns: [String] array of cell indices.

function GetCellIndices({
    cellValues: [String]
})

Parameters

cellValues

Values of the cells to look up.

GetTableColumn

Given a header or an index String, returns the column under it as an [index: Element] dictionary. For example, if the table has a column with header “Website” in cell A1, and elements elem1 and elem2 under it, then this function returns [“A2”: elem1, “A3”: elem2]. Returns: Dictionary of [String: UIElement] pairs for all entries in the column under the header.

function GetTableColumn({
    header: String = null,
    index: String = null
})

Parameters

header

Value of the column header.

index

Index of the column header, e.g., “B42”.

GetStructuredDescription

Gets XML-formatted description of the contents in each element. Returns: An array of String [s_1, ..., s_n], where each s_i is an XML-formatted description of the contents rooted at u_i.

function GetStructuredDescription({
    fromElements: [UIElement]
})

Parameters

fromElements

An array of UIElements [u_1, ..., u_n].
