OpenClaw gives any local LLM real hands on your machine: mouse, keyboard, screenshot analysis, browser automation, and shell access, all behind a safety layer that asks you before anything irreversible. It pairs vision-capable models (LLaVA, Qwen-VL, Pixtral, Llama-3.2-Vision) with a robust action runtime, so agents can actually do work (book a flight, debug a repo, rearrange a spreadsheet) without ever leaving your desktop. The architecture is modular: swap the planner, the executor, or the model independently, and nothing requires a cloud round-trip.
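
The core loop is simple: capture the screen, let the model pick an action, execute it, repeat. Below is a minimal sketch of that loop in Python, using pyautogui and Ollama's HTTP chat API as stand-ins; the prompt format and the JSON action schema are illustrative assumptions, not OpenClaw's actual interface.

```python
# Minimal see -> decide -> act loop. Illustrative only: pyautogui and
# Ollama's HTTP chat API stand in for OpenClaw's runtime, and the JSON
# action schema below is an assumption, not OpenClaw's real format.
import base64
import io
import json

import pyautogui
import requests

OLLAMA_CHAT = "http://localhost:11434/api/chat"  # Ollama's default endpoint

def capture_screen_b64() -> str:
    """Grab the screen and return it as a base64-encoded PNG."""
    buf = io.BytesIO()
    pyautogui.screenshot().save(buf, format="PNG")
    return base64.b64encode(buf.getvalue()).decode()

def decide(goal: str, screen_b64: str) -> dict:
    """Ask a local vision model for the next action, expecting JSON back."""
    resp = requests.post(OLLAMA_CHAT, json={
        "model": "llava",
        "stream": False,
        "messages": [{
            "role": "user",
            "content": (
                f"Goal: {goal}\n"
                'Reply with JSON only: {"action": "click", "x": <int>, "y": <int>}'
                ' or {"action": "done"}'
            ),
            "images": [screen_b64],
        }],
    })
    resp.raise_for_status()
    # Assumes the model obeyed the JSON-only instruction.
    return json.loads(resp.json()["message"]["content"])

goal = "Open the browser's address bar"
for _ in range(10):  # hard step cap so a confused model can't loop forever
    action = decide(goal, capture_screen_b64())
    if action["action"] == "done":
        break
    if action["action"] == "click":
        pyautogui.click(action["x"], action["y"])
```

In the real runtime every action would pass through the safety layer before execution; the step cap here is just the simplest liveness guard.
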
- Vision + action loop: see the screen, decide, click
- Sandboxed action runtime with user confirmation (gate sketched below)
- Pluggable models: Ollama, llama.cpp, vLLM (backend sketch below)
- Browser, shell, filesystem & app-level tools
- Record, replay & share agent trajectories (record/replay sketch below)
- Cross-platform (macOS, Linux, Windows)
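
The safety layer reduces to a confirmation gate: actions judged reversible run immediately, everything else waits for an explicit yes. A minimal sketch, assuming a hypothetical action taxonomy; the category and function names are illustrative, not OpenClaw's:

```python
# Confirmation gate sketch. The IRREVERSIBLE set and function names are
# hypothetical; OpenClaw's real safety layer may classify actions differently.
IRREVERSIBLE = {"delete_file", "send_email", "submit_form", "run_shell"}

def confirm_and_run(action: str, run, *args):
    """Run the action directly if it is reversible; ask the user first if not."""
    if action in IRREVERSIBLE:
        answer = input(f"Agent wants '{action}' with args {args}. Allow? [y/N] ")
        if answer.strip().lower() != "y":
            return None  # user declined; the agent must replan
    return run(*args)
```

A stricter variant would invert the check and confirm everything not on an explicit allow-list.
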
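Model pluggability comes down to a thin completion interface that each engine implements. The `Backend` protocol below is a hypothetical name, but the two implementations talk to real local endpoints: Ollama's `/api/generate` and vLLM's OpenAI-compatible `/v1/chat/completions` (default port 8000). The model names are placeholders:

```python
# Pluggable backend sketch. The Backend protocol is hypothetical; the HTTP
# endpoints are the real defaults for locally running Ollama and vLLM servers.
from typing import Protocol

import requests

class Backend(Protocol):
    def complete(self, prompt: str) -> str: ...

class OllamaBackend:
    def complete(self, prompt: str) -> str:
        r = requests.post("http://localhost:11434/api/generate", json={
            "model": "llama3.2-vision",  # placeholder model name
            "prompt": prompt,
            "stream": False,
        })
        r.raise_for_status()
        return r.json()["response"]

class VLLMBackend:
    def complete(self, prompt: str) -> str:
        # vLLM exposes an OpenAI-compatible server, so this also covers any
        # other engine speaking that protocol.
        r = requests.post("http://localhost:8000/v1/chat/completions", json={
            "model": "Qwen/Qwen2-VL-7B-Instruct",  # placeholder model name
            "messages": [{"role": "user", "content": prompt}],
        })
        r.raise_for_status()
        return r.json()["choices"][0]["message"]["content"]
```

Swapping engines is then a one-line change at the call site, which is what keeps the planner and executor decoupled from the model.
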
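Recording a trajectory is just writing the action stream down; replay feeds it back through any executor. A minimal sketch assuming a JSONL on-disk format; the schema is an assumption, not OpenClaw's shareable trajectory format:

```python
# Trajectory record/replay sketch. The JSONL schema ({"t": ..., "action": ...})
# is an assumption made for illustration.
import json
import time

def record(path: str, action: dict) -> None:
    """Append one timestamped action to a JSONL trajectory file."""
    with open(path, "a") as f:
        f.write(json.dumps({"t": time.time(), **action}) + "\n")

def replay(path: str, execute) -> None:
    """Re-run every recorded action through the given executor callback."""
    with open(path) as f:
        for line in f:
            execute(json.loads(line))
```

Sharing a run then means sharing the file: anyone can pipe it back through their own executor, or through a dry-run callback that only prints each action.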