Official Website
Paper
Video
💡 Introduction
Welcome to Agent S, an open-source framework designed to enable autonomous interaction with computers through an Agent-Computer Interface. Our mission is to build intelligent GUI agents that can learn from past experiences and perform complex tasks autonomously on your computer. Whether you’re interested in AI, automation, or contributing to cutting-edge agent-based systems, we’re excited to have you here!
🛠️ Installation & Setup
❗Warning❗: If you are on a Linux machine, creating a conda environment will interfere with pyatspi. As of now, there’s no clean solution for this issue. Proceed through the installation without using conda or any virtual environment.
Clone the repository:
Setup Retrieval from Web using Perplexica
Agent S works best with web-knowledge retrieval. To enable this feature, you need to set up Perplexica:
- Ensure Docker Desktop is installed and running on your system.
- Navigate to the directory containing the project files.
- Rename the sample.config.toml file to config.toml. For Docker setups, you need only fill in the following fields:
  - OPENAI: Your OpenAI API key. You only need to fill this in if you wish to use OpenAI’s models.
  - OLLAMA: Your Ollama API URL. Enter it as http://host.docker.internal:PORT_NUMBER. If you installed Ollama on port 11434, use http://host.docker.internal:11434. For other ports, adjust accordingly. You need to fill this in if you wish to use Ollama’s models instead of OpenAI’s.
  - GROQ: Your Groq API key. You only need to fill this in if you wish to use Groq’s hosted models.
  - ANTHROPIC: Your Anthropic API key. You only need to fill this in if you wish to use Anthropic models.
    Note: You can change these after starting Perplexica from the settings dialog.
  - SIMILARITY_MEASURE: The similarity measure to use. This is filled in by default; you can leave it as is if you are unsure about it.
- Ensure you are in the directory containing the docker-compose.yaml file and execute:
- Our implementation of Agent S incorporates the Perplexica API to integrate a search engine capability, which allows for a more convenient and responsive user experience. If you want to tailor the API to your settings and specific requirements, you may modify the URL and the request parameters in agent_s/query_perplexica.py. For a comprehensive guide on configuring the Perplexica API, please refer to the Perplexica Search API Documentation.
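To make the "URL and request parameters" mentioned above more concrete, here is a minimal sketch of a search request to a locally running Perplexica instance. It is not the contents of agent_s/query_perplexica.py. The endpoint URL, the port, and the field names (focusMode, query, history) are assumptions based on the Perplexica Search API and may differ in your version, so check them against the Perplexica documentation before relying on them.

```python
import json
import urllib.request

# Assumption: Perplexica's search endpoint is served at this address.
# Adjust the host, port, and path to match your own deployment.
PERPLEXICA_URL = "http://localhost:3001/api/search"


def build_search_payload(query, focus_mode="webSearch"):
    """Build the JSON body for a search request.

    The field names used here are assumptions; verify them against the
    Perplexica Search API documentation for your version.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    return {"focusMode": focus_mode, "query": query, "history": []}


def query_perplexica(query, url=PERPLEXICA_URL, timeout=30):
    """POST the query to Perplexica and return the decoded JSON response."""
    body = json.dumps(build_search_payload(query)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling query_perplexica("your question") requires the Perplexica containers to be running. build_search_payload can be inspected on its own to see what would be sent.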
Setup Paddle-OCR Server
Switch to a new terminal where you will run Agent S. Set the OCR_SERVER_ADDRESS environment variable as shown below. For a better experience, add the following line directly to your .bashrc (Linux) or .zshrc (macOS) file.
❗Warning❗: The agent will directly run Python code to control your computer. Please use with care.
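Before launching the agent, it can help to confirm that OCR_SERVER_ADDRESS is actually visible to the Python process. The helper below is a small sketch for that check. The function name get_ocr_server_address is hypothetical and not part of Agent S, and the sketch deliberately does not assume any particular server address.

```python
import os


def get_ocr_server_address(env=None):
    """Return the configured OCR server address, or raise if it is missing.

    `env` defaults to the process environment; passing a plain dict makes
    the check easy to exercise without touching real environment variables.
    """
    env = os.environ if env is None else env
    address = env.get("OCR_SERVER_ADDRESS", "").strip()
    if not address:
        raise RuntimeError(
            "OCR_SERVER_ADDRESS is not set; export it in this terminal "
            "or add it to your shell startup file."
        )
    return address
```

If the variable was added to .bashrc or .zshrc, open a new terminal (or source the file) before running the check so the value is picked up.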
🚀 Usage
CLI
Run agent_s on your computer using:
gui_agents SDK
To deploy Agent S on macOS or Windows:
See cli_app.py for more details on how the inference loop works.