# PRD: Agent Smith - C++ LLM Agent Framework

## 1. Overview
Agent Smith is a high-performance, modular C++ framework for building LLM-powered agents. It aims to provide a clean abstraction for interacting with LLM providers, managing agent state, and executing tools. The framework strongly leans on `libmw` for everyday system tasks, networking, and data handling.

## 2. Core Components

### 2.1 LLM Provider Interface (`LlmClient`)
*   **Abstraction:** Unified interface focused on two primary backends:
    *   OpenAI-compatible endpoints (OpenAI, Local models running OpenAI-compatible servers like vLLM/llama.cpp server).
    *   Google Gemini API.
*   **Capabilities:**
    *   Support for standard request/response cycles via `libmw`'s HTTP client (`mw::url` wrapper around cURL).
    *   Asynchronous streaming of tokens.
    *   Configurable parameters (temperature, max tokens, etc.).
    *   System prompt management.

### 2.2 Agent Core (`Agent`)
*   **Logic:** Manages the conversational loop and decision-making process.
*   **Orchestration:** Coordinates between the `LlmClient`, `Memory`, and `Tool` systems.
*   **State Management:** Maintains the current status and internal reasoning of the agent.

### 2.3 Tool/Action System (`Tool` & MCP)
*   **Definition:** Interface for defining functions that agents can invoke. The framework will natively support standard **tool calls**.
*   **Model Context Protocol (MCP):** Full support for MCP to allow standardized communication, capability discovery, and seamless interaction with external servers, tools, and data sources.
*   **Discovery:** Mechanism to provide tool schemas (JSON Schema) to the LLM.
*   **Execution:** Automated parsing of LLM-generated arguments and execution of C++ functions.
*   **Feedback:** Structured way to return tool results back to the LLM context.

### 2.4 Agent Skills
*   **Modular Capabilities:** Support for defining, loading, and activating specialized "skills".
*   **Extension:** Skills will bundle specialized system prompts, contextual knowledge, and restricted toolsets to guide the agent through specific, complex workflows or domain-specific tasks.
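
A skill bundle might look like the following; only the concept comes from this PRD, the field names are placeholders:

```cpp
#include <string>
#include <vector>

// Hypothetical skill bundle: a specialized system prompt, contextual
// knowledge to inject, and a restricted toolset (referenced by name).
struct Skill {
    std::string name;
    std::string system_prompt;                 // specialized instructions
    std::vector<std::string> knowledge_docs;   // contextual knowledge
    std::vector<std::string> allowed_tools;    // restricted toolset
};
```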

### 2.5 Memory & Context Management (`Memory`)
*   **Storage:** Methods for storing and retrieving conversation history. Local persistence can be implemented using `mw::sqlite`.
*   **Strategies:**
    *   Rolling window (fixed token/message limit).
    *   Summary-based memory.
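
The rolling-window strategy can be sketched in a few lines; this version trims on a fixed message count (a token-based limit would trim on a running token count instead), and the persistence layer via `mw::sqlite` is omitted:

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

struct ChatMessage { std::string role; std::string content; };

class RollingWindowMemory {
    std::deque<ChatMessage> history_;
    std::size_t max_messages_;
public:
    explicit RollingWindowMemory(std::size_t max_messages)
        : max_messages_(max_messages) {}

    void add(ChatMessage m) {
        history_.push_back(std::move(m));
        while (history_.size() > max_messages_)
            history_.pop_front();              // evict oldest turns first
    }

    std::vector<ChatMessage> context() const {
        return {history_.begin(), history_.end()};
    }
};
```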

## 3. Functional Requirements

### 3.1 Asynchronous Operations
*   If asynchronous execution is necessary (e.g., for non-blocking LLM calls or tool executions), the framework will utilize **C++ standard coroutines**.

### 3.2 JSON Integration
*   Robust serialization and deserialization for API communication and tool argument parsing using **`nlohmann/json`** (leveraging `libmw`'s integration).

### 3.3 Streaming Support
*   Real-time processing of token streams for both UI feedback and intermediate agent reasoning.

### 3.4 Resilience, Error Handling & Logging
*   **Error Propagation:** Utilize `mw::E<T>` (`std::expected` wrapper) extensively for safe, exception-free error propagation across public interfaces, especially for network and JSON parsing tasks.
*   **Retry Logic:** Built-in retries for network failures and rate limiting.
*   **Validation:** Validation of LLM outputs to handle malformed or hallucinated tool calls, returning explicit error payloads back to the conversational loop.
*   **Logging:** Comprehensive logging handled via `libmw`'s integration with **`spdlog`**.

### 3.5 Command-Line Interface
*   **CLI Options:** The executable must support command-line arguments to specify:
    *   API Key (`--api-key` or `-k`).
    *   LLM API Endpoint (`--endpoint` or `-e`) to support OpenAI-compatible local servers or custom proxies. The endpoint should just be the base URL (e.g. `https://api.openai.com/v1`); paths are appended automatically via `mw::URL`.
    *   Help information (`--help` or `-h`).

### 3.6 Testing & Validation
*   **Unit Testing:** The project must contain comprehensive unit test coverage for all public interfaces (Agent logic, Memory, Tools, Skills, and LLM clients).
*   **Frameworks:** Rely on GoogleTest (`gtest`) and GoogleMock (`gmock`).
*   **Mocking:** Network operations in `LlmClient` tests should be replaced with injected `mw::HTTPSessionMock` instances.

## 4. Technical Constraints
*   **Language Standard:** C++23.
*   **Build System:** CMake.
*   **Dependencies:** Heavy reliance on `libmw` for core utilities, networking (`mw::url`), database (`mw::sqlite`), and JSON processing (`nlohmann/json`).
*   **Portability:** Target Linux/Unix systems primarily.
*   All URL manipulation should be done with the `mw::URL` class.
*   All functions that can fail should return `mw::E<T>`.

## 5. Future Work
*   **Vector Database Integration (RAG):** Implement Retrieval-Augmented Generation to allow agents to perform semantic search across large external knowledge bases.