Introduction
Hermes Agent is an open-source AI agent framework by Nous Research that runs entirely on your local machine. Unlike cloud-based assistants, Hermes gives you full privacy, zero API costs, and complete control over your data.
This guide walks you through the entire setup process—from installing Ollama to running your first query in the Hermes web dashboard. No prior experience with local LLMs required.
What You'll Need
Before starting, make sure your system meets these requirements:
- macOS, Linux, or Windows (with WSL2)
- At least 16GB RAM for smaller models (9B–12B parameters)
- 32GB+ RAM recommended for larger models (27B–31B parameters)
- Git installed on your system
- Node.js v20+ (for browser tools)
- A stable internet connection for downloading models
Note: Hermes uses Ollama as its local inference engine. Every model you run is downloaded to your machine, so ensure you have enough disk space (each model ranges from 5GB to 25GB).
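As a quick preflight, you can check both numbers from the terminal. This is a minimal sketch for Linux/macOS; Ollama stores models under ~/.ollama by default, but nothing here is Hermes-specific:

```shell
# Show free space on the filesystem holding your home directory
# (where Ollama keeps downloaded models by default).
df -h ~

# Show installed memory: free(1) exists on Linux, macOS uses sysctl.
if command -v free >/dev/null 2>&1; then
  free -h
else
  sysctl -n hw.memsize | awk '{printf "%.0f GB RAM\n", $1/1073741824}'
fi
```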
Step 1: Install Ollama
Ollama is the engine that runs large language models locally. Installing it takes less than a minute.
Open your terminal and run:
curl -fsSL https://ollama.com/install.sh | sh
This command downloads and installs Ollama along with the CLI tools you need to pull and run models.

Verify the installation:
ollama --version
You should see the Ollama version printed. If not, restart your terminal and try again.
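If the version check passes but later commands hang, it is worth confirming the Ollama daemon itself is up. Ollama exposes a local HTTP API on port 11434 by default, and its root endpoint replies with a short status string:

```shell
# Prints "Ollama is running" when the daemon is up;
# otherwise prints a hint to start it manually.
curl -s http://localhost:11434 || echo "Ollama daemon not responding; try: ollama serve"
```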
Step 2: Choose Your Model
Hermes Agent supports both cloud models (via API keys) and local models (via Ollama). For this guide, we focus on local models—because that's where the real power of Hermes shines.
Head to the Ollama model library and search for your preferred model. The two most popular families for Hermes are Qwen 3.5 and Gemma 4.

Model Recommendations by RAM
| RAM | Recommended Models | Size | Best For |
|---|---|---|---|
| 16GB | gemma4:e4b | ~4B effective | Fast responses, coding, everyday tasks |
| 16GB | qwen3.5:9b | ~9B parameters | Reasoning, multimodal, agentic workflows |
| 32GB | gemma4:31b | ~31B parameters | Best raw quality, deep reasoning, complex coding |
| 32GB | qwen3.5:27b | ~27B parameters | Advanced reasoning, long-context tasks, research |
Why these models?
- Gemma 4 (Google): Apache 2.0 licensed, natively multimodal, optimized for reasoning and coding. The E4B variant is the "sweet spot" for desktop users—fast and capable.
- Qwen 3.5 (Alibaba): Open-weight, 256K context window, strong at agentic tool use and visual understanding. The 9B and 27B variants balance speed and intelligence.
Tip: If you're unsure, start with qwen3.5:9b. It's the most forgiving beginner model and handles a wide range of tasks well.
Step 3: Download Your Model
Once you've chosen a model, download it with a single command. Ollama handles everything—downloading weights, verifying integrity, and preparing the model for inference.
For 16GB RAM systems:
# Option A: Gemma 4 E4B (fast, lightweight)
ollama run gemma4:e4b
# Option B: Qwen 3.5 9B (balanced, capable)
ollama run qwen3.5:9b
For 32GB RAM systems:
# Option A: Gemma 4 31B (best quality, dense architecture)
ollama run gemma4:31b
# Option B: Qwen 3.5 27B (advanced reasoning, 256K context)
ollama run qwen3.5:27b

When you run ollama run [model] for the first time, Ollama pulls the weights automatically. First-time downloads take 5–20 minutes depending on your internet speed and model size: the 9B model is ~6GB, and the 27B model is ~16GB.
Verify the model is ready:
ollama list
You should see your downloaded model in the list with its size and modification date.
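As a final smoke test, you can send a single prompt without entering the interactive chat, since ollama run also accepts a prompt as an argument. The model tag below assumes you pulled qwen3.5:9b; substitute whichever tag appears in your ollama list output:

```shell
# One-off prompt: the model replies once and ollama exits.
# Guarded so the script degrades gracefully if ollama is missing.
if command -v ollama >/dev/null 2>&1; then
  ollama run qwen3.5:9b "Reply with one word: ready"
else
  echo "ollama not found on PATH"
fi
```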
Step 4: Install Hermes Agent
With Ollama and your model ready, it's time to install Hermes Agent.
Run the official installer script:
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

The installer will:
- Detect your operating system
- Check for required dependencies (uv, Python 3.11+, Git, Node.js, ripgrep, ffmpeg)
- Install any missing tools automatically
- Clone the Hermes Agent repository to ~/.hermes/hermes-agent
- Set up the Python virtual environment
If you already have Hermes installed, the script updates your existing installation to the latest version.
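You can confirm the install (or an update) landed by inspecting the clone the script creates; the path below is the one the installer uses per the list above:

```shell
# The installer clones into ~/.hermes/hermes-agent; show the latest
# commit it checked out, or a hint if the directory is missing.
if [ -d "$HOME/.hermes/hermes-agent" ]; then
  git -C "$HOME/.hermes/hermes-agent" log -1 --oneline
else
  echo "Hermes not found; re-run the installer script"
fi
```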
Step 5: Select Your Model in Hermes
Launch Hermes to open the model picker:
hermes
Hermes Agent supports multiple models simultaneously. When you first run this command, it presents a model picker where you choose which model to use for your session.

Navigating the picker:
- Use ↑/↓ arrow keys to navigate
- Press Enter to select
- Press ← to go back
Recommended setup for beginners:
- Choose qwen3.5:9b or gemma4:e4b from the local section
- Cloud models (kimi-k2.5, glm-5.1, etc.) require API keys and are optional
Cloud vs Local: Cloud models offer frontier-level intelligence but cost money per request. Local models are free forever after download but require more RAM. For this guide, stick with local.
Step 6: Start Hermes
With your model selected, start Hermes Agent:
hermes

Hermes will:
- Load the selected model into memory
- Initialize its tool ecosystem (web search, file operations, code execution)
- Present you with a chat interface
First launch takes 30–60 seconds as the model is loaded into RAM. Subsequent launches are faster.
Step 7: Test Your Setup
Time for your first query. Hermes supports natural language—just type what you want.
Try asking something that requires real-time information:
> can you give me current world news quick

Hermes will:
- Analyze your request
- Use the DuckDuckGo search tool to find current information
- Summarize the results into readable headlines
What just happened? Hermes didn't rely on its training data (which has a cutoff date). Instead, it used a tool (DuckDuckGo search) to fetch real-time information. This is the core power of agentic AI.
Other things to try:
- "Write a Python script that fetches weather data"
- "Explain quantum computing like I'm five"
- "Analyze this CSV file and plot the trends" (drag a file into the terminal)
Step 8: Launch the Web Dashboard
While the CLI is powerful, the web dashboard offers a richer experience—especially for long conversations and visual workflows.
Launch it with:
hermes dashboard

The hermes dashboard command builds the web UI and serves it at http://127.0.0.1:9119. Hermes will:
- Build the web UI (takes ~10 seconds on first run)
- Start a local server
- Open your browser automatically
The dashboard runs entirely locally—no data leaves your machine.
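Because the server binds to 127.0.0.1, it is only reachable from your own machine. A quick way to confirm it is serving (the port is the default mentioned above):

```shell
# Probe the dashboard port and report whether anything answered.
if curl -s -o /dev/null http://127.0.0.1:9119; then
  echo "dashboard is up"
else
  echo "dashboard not running; start it with: hermes dashboard"
fi
```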
Step 9: Explore the Dashboard
The Hermes dashboard is your mission control center. Here's what each tab does:

| Tab | Purpose |
|---|---|
| Status | System health, model load status, active tools |
| Sessions | Chat history, manage multiple conversations |
| Analytics | Token usage, response times, cost tracking |
| Logs | Debug output, tool execution traces |
| Cron | Schedule recurring tasks and automations |
| Skills | Browse and install community skill packs |
| Example | Pre-built prompts and workflow templates |
| Config | Model settings, tool preferences, API keys |
| Keys | Manage cloud provider API keys (optional) |
Key features to explore:
- Search your conversation history
- Switch models mid-session without restarting
- View tool calls in real-time (see exactly what Hermes is doing)
- Export conversations as Markdown or JSON
Troubleshooting
"Ollama not found"
Restart your terminal after installing Ollama. If it still fails, add Ollama to your PATH:
export PATH="$PATH:/usr/local/bin"
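That export only lasts for the current shell session. To make it permanent, append the same line to your shell's startup file (~/.bashrc shown here; zsh users would use ~/.zshrc instead):

```shell
# Persist the PATH change so new terminal sessions pick it up.
echo 'export PATH="$PATH:/usr/local/bin"' >> ~/.bashrc
```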
"Model download is stuck"
Large models can take 20+ minutes. If it freezes, cancel (Ctrl+C) and retry:
ollama pull qwen3.5:9b
"Hermes says 'model not found'"
Make sure you've downloaded the model via Ollama first:
ollama list
If the model isn't listed, re-run the download command.
"Out of memory errors"
Your model is too large for your RAM. Switch to a smaller variant:
- 16GB RAM → use 9B or E4B models
- 32GB RAM → use up to 27B models
You can also close other applications to free up RAM.
"Dashboard won't open"
Check if port 9119 is already in use:
lsof -i :9119
If something is using it, kill the process or wait for it to finish.
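If you decide to free the port, you can terminate the owning process in one step. This sketch assumes lsof is installed; its -t flag prints just the PID(s):

```shell
# Terminate whatever is listening on port 9119.
pid=$(lsof -ti :9119 2>/dev/null)
if [ -n "$pid" ]; then
  kill $pid   # unquoted on purpose: lsof may print several PIDs
else
  echo "port 9119 is already free"
fi
```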
Next Steps
Now that Hermes is running, here's what to explore next:
- Add API keys for cloud models (Claude Opus 4.7, Kimi K2.6, GPT 5.4) in the Config tab
- Install skills from the community repository to extend capabilities
- Set up cron jobs for automated research and monitoring
- Read the AI Models guide to understand which model fits each task
- Compare coding plans to decide if you need cloud API access
Hermes Agent is designed to grow with you. Start local, add cloud models when you need frontier-level reasoning, and build custom skills for your specific workflows.
Welcome to local-first AI.