# Technical Design Document: Agent Smith Framework
## 1. Introduction
This document outlines the technical design for **Agent Smith**, a modular C++23 framework for building LLM-powered agents. It is designed for efficiency and builds on `libmw` for common system utilities, networking, and data handling.
This guide is structured to help a new graduate software engineer understand the architecture and implement the system systematically.
---
## 2. High-Level Architecture
The framework relies on a decoupled architecture centered around the `Agent` class.
* **`Agent`**: The orchestrator. It manages the conversational loop, maintains state via `Memory`, interacts with the `LlmClient`, and dynamically executes `Tools` and `Skills`.
* **`LlmClient`**: Handles network communication with LLM APIs (OpenAI and Gemini) using `mw::url`.
* **`Memory`**: Stores the conversation history. Can be in-memory or backed by `mw::sqlite` for persistence.
* **`Tool`**: An interface representing a callable action. Includes native tools and integrations via the Model Context Protocol (MCP).
* **`Skill`**: A configuration that bundles a specific system prompt and a subset of tools to focus the agent on a specific workflow.
* **CLI**: A command-line interface implemented using `cxxopts` to allow users to configure the agent's connection parameters (API key and endpoint) at runtime.
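A minimal sketch of the CLI wiring is shown below. The option names (`--api-key`, `--endpoint`), the default endpoint, and the program name are illustrative assumptions, not a fixed interface.
```cpp
#include <cxxopts.hpp>
#include <iostream>
#include <string>

int main(int argc, char** argv)
{
    // Option names and defaults are illustrative only.
    cxxopts::Options options("agent-smith", "LLM-powered agent framework");
    options.add_options()
        ("k,api-key", "LLM provider API key", cxxopts::value<std::string>())
        ("e,endpoint", "LLM API base endpoint",
            cxxopts::value<std::string>()->default_value("https://api.openai.com/v1"))
        ("h,help", "Print usage");

    auto result = options.parse(argc, argv);
    if(result.count("help") || !result.count("api-key"))
    {
        std::cout << options.help() << std::endl;
        return 0;
    }

    const auto api_key = result["api-key"].as<std::string>();
    const auto endpoint = result["endpoint"].as<std::string>();
    // ... construct the LlmClient and Agent with these parameters.
    return 0;
}
```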
---
## 3. Data Models
We use `nlohmann/json` (provided via `libmw`) for all dynamic data structures and serialization.
```cpp
#include <string>
#include <optional>
#include <vector>
#include <variant>
#include <nlohmann/json.hpp>
// 1. System Message: Defines the agent's persona and overarching rules.
struct SystemMessage
{
std::string content;
};
// 2. User Message: Input provided by the human user.
struct UserMessage
{
std::string content;
};
// Represents a single tool call requested by the Assistant
struct ToolCall
{
std::string id;
std::string name;
nlohmann::json arguments;
};
// 3. Assistant Message: The LLM's response. Can contain text or tool calls.
struct AssistantMessage
{
std::optional<std::string> content;
std::vector<ToolCall> tool_calls;
};
// 4. Tool Result Message: The result of executing a tool, sent back to LLM.
struct ToolResultMessage
{
std::string tool_call_id;
std::string content;
};
// The unified Message type using std::variant for compile-time type safety.
using Message = std::variant<
SystemMessage,
UserMessage,
AssistantMessage,
ToolResultMessage>;
// Standalone serialization functions for each message type
nlohmann::json toJson(const SystemMessage& msg);
nlohmann::json toJson(const UserMessage& msg);
nlohmann::json toJson(const ToolCall& msg);
nlohmann::json toJson(const AssistantMessage& msg);
nlohmann::json toJson(const ToolResultMessage& msg);
nlohmann::json toJson(const Message& msg);
```
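As an illustration of the wire format, a minimal sketch of three of these functions follows. It assumes an OpenAI-style `role`/`content` message layout; the concrete `LlmClient` may need to adapt the field names per provider, and the remaining overloads follow the same pattern.
```cpp
// Sketch only: assumes an OpenAI-style chat message layout.
nlohmann::json toJson(const UserMessage& msg)
{
    return {{"role", "user"}, {"content", msg.content}};
}

nlohmann::json toJson(const ToolResultMessage& msg)
{
    return {
        {"role", "tool"},
        {"tool_call_id", msg.tool_call_id},
        {"content", msg.content}};
}

// Dispatches to the per-type overloads declared above via std::visit.
nlohmann::json toJson(const Message& msg)
{
    return std::visit([](const auto& m) { return toJson(m); }, msg);
}
```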
---
## 4. Component Design
### 4.1 LlmClient Interface
The `LlmClient` abstracts the underlying LLM provider. This component uses `mw::E<>` for explicit error handling and C++ coroutines for asynchronous task management.
```cpp
#include <mw/error.hpp>
// A placeholder for a coroutine-based Task type supplied by the implementation.
template<typename T> class Task;
class LlmClient
{
public:
virtual ~LlmClient() = default;
// Submits the history and tools, returning the Assistant's response or an error
virtual Task<mw::E<Message>> generateResponse(
const std::vector<Message>& history,
const nlohmann::json& available_tools_schema
) = 0;
};
```
* **Implementation Steps:**
* Implement `OpenAiClient`. Allow dependency injection of `mw::HTTPSessionInterface` (defaulting to `mw::HTTPSession`) to enable mocked network testing.
* Use `mw::URL` to safely parse custom endpoints and append standard paths (like `chat/completions`).
* Deserialize the response body into an `AssistantMessage`, returning `std::unexpected(mw::runtimeError(...))` on JSON parsing failures (see the sketch after this list).
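A hedged sketch of that deserialization step follows. The field names assume the OpenAI chat-completions response shape (`choices[0].message`, `tool_calls`, `function.arguments`), `parseAssistantMessage` is a hypothetical helper name, and `mw::E` is assumed to behave like `std::expected`.
```cpp
#include <expected>
#include <string>
#include <mw/error.hpp>
#include <nlohmann/json.hpp>

// Sketch only: converts an OpenAI-style chat-completions body into an AssistantMessage.
mw::E<AssistantMessage> parseAssistantMessage(const std::string& body)
{
    auto json = nlohmann::json::parse(body, /*cb=*/nullptr, /*allow_exceptions=*/false);
    if(json.is_discarded() || !json.contains("choices") || json["choices"].empty())
    {
        return std::unexpected(mw::runtimeError("Malformed LLM response body"));
    }

    AssistantMessage msg;
    const auto& message = json["choices"][0]["message"];
    if(message.contains("content") && !message["content"].is_null())
    {
        msg.content = message["content"].get<std::string>();
    }
    for(const auto& call : message.value("tool_calls", nlohmann::json::array()))
    {
        ToolCall tc;
        tc.id = call["id"].get<std::string>();
        tc.name = call["function"]["name"].get<std::string>();
        // OpenAI encodes the arguments as a JSON string, so parse it again.
        tc.arguments = nlohmann::json::parse(call["function"]["arguments"].get<std::string>());
        msg.tool_calls.push_back(std::move(tc));
    }
    return msg;
}
```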
### 4.2 Tool and MCP System
A `Tool` must expose its JSON Schema so the LLM understands its parameters, and an execution method.
```cpp
class Tool
{
public:
virtual ~Tool() = default;
virtual std::string name() const = 0;
virtual std::string description() const = 0;
// Returns the JSON Schema describing the arguments the tool accepts
virtual nlohmann::json parametersSchema() const = 0;
// Executes the tool with the JSON arguments provided by the LLM
virtual Task<mw::E<std::string>> execute(const nlohmann::json& arguments) = 0;
};
```
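For reference, a concrete tool might look like the sketch below. The `CalculatorTool` name echoes Phase 4 of the implementation plan; the body assumes `Task` is an awaitable coroutine type whose promise accepts `co_return`, and the tool and parameter names are illustrative.
```cpp
#include <format>
#include <string>

// Sketch only: a trivial tool that adds two numbers supplied by the LLM.
class CalculatorTool : public Tool
{
public:
    std::string name() const override { return "add_numbers"; }
    std::string description() const override { return "Adds two numbers and returns the sum."; }

    nlohmann::json parametersSchema() const override
    {
        return {
            {"type", "object"},
            {"properties", {
                {"a", {{"type", "number"}}},
                {"b", {{"type", "number"}}}}},
            {"required", {"a", "b"}}};
    }

    Task<mw::E<std::string>> execute(const nlohmann::json& arguments) override
    {
        if(!arguments.contains("a") || !arguments.contains("b"))
        {
            co_return std::unexpected(mw::runtimeError("Missing argument 'a' or 'b'"));
        }
        const double sum = arguments["a"].get<double>() + arguments["b"].get<double>();
        co_return std::format("{}", sum);
    }
};
```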
**Tool Registry:**
To support multiple agents and centralized tool management, tools are owned by a global or application-level `ToolRegistry`. Registration throws on duplicate names so collisions fail loudly instead of silently replacing a tool.
```cpp
#include <format>
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>
class ToolRegistry
{
public:
// Registers a tool. Throws an exception if the tool name already exists
// to prevent silent collisions.
void registerTool(std::unique_ptr<Tool> tool)
{
if(tools_.contains(tool->name()))
{
throw std::runtime_error(
std::format("Tool '{}' is already registered.", tool->name()));
}
tools_[tool->name()] = std::move(tool);
}
// Registers a tool with a namespace prefix to avoid collisions
// (e.g., from different MCP servers).
void registerToolWithNamespace(
const std::string& ns,
std::unique_ptr<Tool> tool)
{
// Implementation detail: the stored tool must report the prefixed name
// from name(), e.g. via a wrapper class (sketched after this block).
std::string prefixed_name = std::format("{}_{}", ns, tool->name());
if(tools_.contains(prefixed_name))
{
throw std::runtime_error(
std::format("Tool '{}' is already registered.", prefixed_name));
}
// ... wrap tool and store with prefixed_name
}
Tool* getTool(const std::string& name) const
{
auto it = tools_.find(name);
return it != tools_.end() ? it->second.get() : nullptr;
}
std::vector<Tool*> getAllTools() const
{
std::vector<Tool*> result;
for(const auto& [name, tool] : tools_)
{
result.push_back(tool.get());
}
return result;
}
private:
std::unordered_map<std::string, std::unique_ptr<Tool>> tools_;
};
```
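The wrapper alluded to in `registerToolWithNamespace` could be sketched as follows; `NamespacedTool` is a hypothetical name, not part of the public design.
```cpp
// Hypothetical wrapper: forwards everything to the inner tool but reports the
// namespaced name, so the registry and the LLM see e.g. "github_search".
class NamespacedTool : public Tool
{
public:
    NamespacedTool(std::string prefixed_name, std::unique_ptr<Tool> inner)
        : prefixed_name_(std::move(prefixed_name)), inner_(std::move(inner)) {}

    std::string name() const override { return prefixed_name_; }
    std::string description() const override { return inner_->description(); }
    nlohmann::json parametersSchema() const override { return inner_->parametersSchema(); }
    Task<mw::E<std::string>> execute(const nlohmann::json& arguments) override
    {
        return inner_->execute(arguments);
    }

private:
    std::string prefixed_name_;
    std::unique_ptr<Tool> inner_;
};
```
With this wrapper, `registerToolWithNamespace` would store `std::make_unique<NamespacedTool>(prefixed_name, std::move(tool))` under `prefixed_name`.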
* **Handling Collisions & Namespacing:** LLMs require unique tool names. If an MCP server provides a tool named `search` and another server also provides `search`, the `ToolRegistry` will reject the second registration. To resolve this, external tools (like those from MCP) should be registered using `registerToolWithNamespace` (e.g., `github_search` vs `local_search`).
* **MCP Integration:**
* Implement an `McpClient` that can parse MCP server manifests.
* The `McpClient` will act as a factory, dynamically generating `Tool` objects that map to remote MCP functions and registering them with the `ToolRegistry` using the server's name as a namespace.
### 4.3 Agent Skills
A `Skill` overrides the agent's default behavior, giving it a new persona and restricted capabilities.
```cpp
struct Skill
{
std::string name;
std::string system_prompt;
// Names of the tools allowed for this skill
std::vector<std::string> allowed_tools;
};
```
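For example, a hypothetical code-review skill could be declared like this (prompt text and tool names are illustrative):
```cpp
const Skill code_review_skill{
    .name = "code_review",
    .system_prompt = "You are a meticulous code reviewer. Comment only on the diff provided.",
    .allowed_tools = {"read_file", "search_code"}};
```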
### 4.4 Memory Management
The `Memory` interface manages the conversation history that forms the LLM's context window.
```cpp
class Memory
{
public:
virtual ~Memory() = default;
virtual void addMessage(const Message& msg) = 0;
virtual std::vector<Message> getHistory() const = 0;
virtual void clear() = 0;
};
```
* **Implementation Steps:**
* Create `InMemoryMemory` (a simple `std::vector` wrapper; see the sketch after this list).
* Create `SqliteMemory` using `mw::sqlite` to persist `Message` structs to a database file.
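A minimal sketch of the in-memory variant:
```cpp
#include <vector>

// Stores the conversation history in a std::vector; no persistence.
class InMemoryMemory : public Memory
{
public:
    void addMessage(const Message& msg) override { history_.push_back(msg); }
    std::vector<Message> getHistory() const override { return history_; }
    void clear() override { history_.clear(); }

private:
    std::vector<Message> history_;
};
```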
### 4.5 The Agent Core
The `Agent` ties the components together in its main conversational loop.
```cpp
#include <memory>
#include <vector>
class Agent
{
public:
Agent(
std::unique_ptr<LlmClient> client,
std::unique_ptr<Memory> memory,
ToolRegistry& tool_registry);
// Grants the agent permission to use a specific tool from the registry
mw::E<void> allowTool(const std::string& tool_name);
void activateSkill(const Skill& skill);
// The main entry point for user interaction
Task<mw::E<std::string>> run(const std::string& user_input);
private:
std::unique_ptr<LlmClient> client_;
std::unique_ptr<Memory> memory_;
ToolRegistry& tool_registry_;
// List of tool names this agent is currently allowed to use
std::vector<std::string> allowed_tools_;
std::optional<Skill> current_skill_;
};
```
**The `run` Loop Logic (pseudo-code; a C++ sketch follows the list):**
1. Append `user_input` as a `UserMessage` to `memory_`.
2. `loop`:
1. Fetch `history` from `memory_`.
2. Resolve `allowed_tools_` against the `tool_registry_` and convert them into a single JSON Schema array.
3. `response_msg = co_await client_->generateResponse(history, tools_schema)`.
4. Append `response_msg` (which is an `AssistantMessage`) to `memory_`.
5. `if response_msg contains tool_calls`:
1. For each `call` in `tool_calls`:
1. Find matching `Tool* tool` in `tool_registry_`.
2. `result = co_await tool->execute(call.arguments)`.
3. Append a `ToolResultMessage` (containing `result` and `call.id`) to `memory_`.
6. `else`:
1. Break loop.
3. Extract the text `content` from the final `AssistantMessage` in the variant and return it.
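A hedged C++ sketch of this loop follows. It assumes `mw::E` behaves like `std::expected`, that `Task` is an awaitable coroutine type whose promise accepts `co_return`, and that `buildToolsSchema()` is a hypothetical private helper that turns `allowed_tools_` into the JSON Schema array for the LLM API; error formatting is simplified.
```cpp
Task<mw::E<std::string>> Agent::run(const std::string& user_input)
{
    memory_->addMessage(UserMessage{user_input});

    while(true)
    {
        const auto history = memory_->getHistory();
        const nlohmann::json tools_schema = buildToolsSchema();  // hypothetical helper

        auto response = co_await client_->generateResponse(history, tools_schema);
        if(!response)
        {
            co_return std::unexpected(response.error());
        }

        memory_->addMessage(*response);
        // The client is expected to return an AssistantMessage; a production
        // version would validate the variant alternative before std::get.
        const auto& assistant = std::get<AssistantMessage>(*response);

        if(assistant.tool_calls.empty())
        {
            co_return assistant.content.value_or("");
        }

        for(const auto& call : assistant.tool_calls)
        {
            std::string result_text = "Error: unknown tool '" + call.name + "'";
            if(Tool* tool = tool_registry_.getTool(call.name))
            {
                auto result = co_await tool->execute(call.arguments);
                // How the error is rendered depends on mw::E's error type.
                result_text = result ? *result : std::string("Error: tool execution failed");
            }
            memory_->addMessage(ToolResultMessage{call.id, result_text});
        }
    }
}
```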
---
## 5. Build System & Dependencies
* **CMake:** The project will be built using CMake (`CMakeLists.txt`).
* **C++23:** Ensure compiler flags support standard C++23 (`-std=c++23`).
* **libmw:** Include `libmw` via `FetchContent` or as a Git submodule.
* Link against `mw::core`, `mw::url`, `mw::sqlite`.
* **Logging:** Use `spdlog` (bundled with `libmw`) for detailed debug and execution logging.
---
## 6. Implementation Plan (For the Developer)
To build this systematically, follow these phases:
1. **Phase 1: Foundations & Build Setup**
* Create the `CMakeLists.txt`. Integrate `libmw`.
* Define the core data structures (`Message` and its variant alternatives, `ToolCall`, `Skill`).
2. **Phase 2: Networking & LLM Client**
* Implement the `LlmClient` interface.
* Build `OpenAiClient` using `mw::url`. Test it with a hardcoded prompt to ensure you can parse the JSON responses.
3. **Phase 3: The Core Loop**
* Implement the `Agent` class and a simple `InMemoryMemory` class.
* Get a basic back-and-forth chat working in the terminal *without* tools.
4. **Phase 4: Tool Calling**
* Implement the `Tool` interface.
* Create a simple mock tool (e.g., `CalculatorTool`).
* Implement the `Agent::run` loop logic to parse tool calls, execute them, and feed the results back to the LLM.
5. **Phase 5: Persistence & Skills**
* Implement `SqliteMemory` utilizing `mw::sqlite`.
* Add the ability to load and switch between `Skill` profiles.
6. **Phase 6: Async Polish & MCP**
* Finalize coroutine integration for all network and execution paths.
* Research and implement an `McpClient` to communicate with external MCP servers over stdio.