# AnyFS Architecture

## System Overview

AnyFS is a three-tier system:

```
Agent Harnesses (Claude Code, Codex, Gemini, Cursor, OpenClaw)
       |
       | Adapters extract agent state
       v
AnyFS CLI / SDK (local, runs on user's machine)
       |
       | Encrypted payloads only
       v
AnyFS Cloud (API server + storage gateway + object storage)
```

All encryption and decryption happen locally. The cloud never sees the plaintext of encrypted namespaces; only the public `artifacts` namespace is stored unencrypted.

## Production Deployment

Current deployment on GKE (`us-central1`):

| Component | URL | Backend |
|-----------|-----|---------|
| API Server | `https://api.anyfs.ai` | FastAPI + Cloud SQL PostgreSQL |
| Blob Storage | `https://storage.anyfs.ai` | Node.js storage gateway + GCS |

## Key Hierarchy (E2E Encryption)

Three-level key hierarchy using AES-256-GCM:

```
Level 1: Wrap Key
  = SHA256(user_id + ":" + device_id)
  Not stored anywhere; re-derived from identity
       |
       | encrypts
       v
Level 2: Root Key
  = 256-bit random key (os.urandom(32))
  Stored encrypted in ~/.anyfs/keys/rootkey.enc
       |
       | encrypts (one DEK per namespace per snapshot)
       v
Level 3: DEK (Data Encryption Key)
  = 256-bit random key per payload
  Stored wrapped inside each envelope
       |
       | encrypts
       v
Plaintext payload (agent memory, skills, transcripts, etc.)
```
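
The derivations in the diagram can be sketched with the standard library alone. This is a minimal illustration, not the AnyFS implementation: the function names and the placeholder ids are hypothetical, and UTF-8 encoding of the `user_id:device_id` string is an assumption.

```python
import hashlib
import os


def derive_wrap_key(user_id: str, device_id: str) -> bytes:
    """Level 1: SHA256(user_id + ":" + device_id); re-derived, never stored."""
    # Assumption: the concatenated string is UTF-8 encoded before hashing.
    return hashlib.sha256(f"{user_id}:{device_id}".encode("utf-8")).digest()


def new_random_key() -> bytes:
    """Levels 2 and 3: a fresh 256-bit random key from the OS CSPRNG."""
    return os.urandom(32)


wrap_key = derive_wrap_key("usr_example", "dev_example")  # placeholder ids
root_key = new_random_key()  # would be stored wrapped in rootkey.enc
dek = new_random_key()       # would be stored wrapped inside an envelope
```

Because the wrap key is a pure function of the two identifiers, the same identity always re-derives the same key, while the root key and each DEK are independent random values.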

### Encryption Details

- **Algorithm**: AES-256-GCM (Authenticated Encryption with Associated Data)
- **Nonce**: 12 bytes, randomly generated per encryption operation
- **AAD (Additional Authenticated Data)**: namespace name for DEK wrapping, user_id for root key wrapping
- **Library**: Python `cryptography` (AESGCM)
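
A minimal sketch of how these parameters fit together with the `cryptography` library's `AESGCM` class. The helper names `seal` and `open_sealed` and the dictionary layout are hypothetical; only the algorithm, the 12-byte nonce, and the use of the namespace name as AAD for DEK wrapping come from the list above.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def seal(key: bytes, plaintext: bytes, aad: bytes) -> dict:
    """Encrypt with AES-256-GCM using a fresh random 12-byte nonce."""
    nonce = os.urandom(12)
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, aad)
    return {"nonce": nonce, "ciphertext": ciphertext}


def open_sealed(key: bytes, box: dict, aad: bytes) -> bytes:
    """Decrypt; raises InvalidTag if the key, ciphertext, or AAD do not match."""
    return AESGCM(key).decrypt(box["nonce"], box["ciphertext"], aad)


root_key = os.urandom(32)
dek = os.urandom(32)

# Wrap the DEK under the root key, binding it to the "memory" namespace.
wrapped = seal(root_key, dek, aad=b"memory")
unwrapped = open_sealed(root_key, wrapped, aad=b"memory")

# The same wrapped DEK cannot be opened under a different namespace name.
try:
    open_sealed(root_key, wrapped, aad=b"skills")
    aad_mismatch_rejected = False
except InvalidTag:
    aad_mismatch_rejected = True
```

Binding the namespace as AAD means a wrapped DEK copied into an envelope for another namespace fails authentication instead of decrypting.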

### Envelope Structure

Each encrypted namespace payload is stored as an envelope:

```json
{
  "envelope_id": "env_abc123def456",
  "alg": "AES256-GCM",
  "namespace": "memory",
  "wrapped_dek": {
    "nonce": "<base64>",
    "ciphertext": "<base64>"
  },
  "payload": {
    "nonce": "<base64>",
    "ciphertext": "<base64>"
  }
}
```
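
A consumer might sanity-check an envelope before attempting decryption. The sketch below is an assumption-laden illustration: `parse_envelope` is a hypothetical helper, and it assumes standard (not URL-safe) base64 and treats all five top-level fields as required.

```python
import base64
import json

REQUIRED_FIELDS = ("envelope_id", "alg", "namespace", "wrapped_dek", "payload")


def parse_envelope(text: str) -> dict:
    """Parse an envelope and check its structure (no decryption happens here)."""
    env = json.loads(text)
    missing = [name for name in REQUIRED_FIELDS if name not in env]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    for part in ("wrapped_dek", "payload"):
        nonce = base64.b64decode(env[part]["nonce"], validate=True)
        base64.b64decode(env[part]["ciphertext"], validate=True)
        if len(nonce) != 12:
            raise ValueError(f"{part}: nonce must be 12 bytes")
    return env


def b64(raw: bytes) -> str:
    return base64.b64encode(raw).decode("ascii")


# Dummy values with the right shape; real envelopes carry actual ciphertext.
sample = json.dumps({
    "envelope_id": "env_abc123def456",
    "alg": "AES256-GCM",
    "namespace": "memory",
    "wrapped_dek": {"nonce": b64(bytes(12)), "ciphertext": b64(bytes(48))},
    "payload": {"nonce": b64(bytes(12)), "ciphertext": b64(bytes(64))},
})
envelope = parse_envelope(sample)
```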

### Password-Based Sharing

To share without exposing the root key:

1. Unwrap DEK using root key
2. Derive share key: `SHA256(password + ":" + namespace)`
3. Re-wrap DEK with share key
4. Add `"share_mode": "password"` to envelope

The recipient decrypts with only the password:

1. Derive share key: `SHA256(password + ":" + namespace)`
2. Unwrap DEK with share key
3. Decrypt payload with DEK
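
Both sides of the exchange can be sketched as follows. The function names are hypothetical, and two details are assumptions rather than facts from this document: the share wrap reuses the namespace name as AAD, and the payload itself is encrypted without AAD.

```python
import hashlib
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def derive_share_key(password: str, namespace: str) -> bytes:
    """SHA256(password + ":" + namespace), as in the steps above."""
    return hashlib.sha256(f"{password}:{namespace}".encode("utf-8")).digest()


def rewrap_dek(dek: bytes, password: str, namespace: str) -> dict:
    """Sender side: wrap an already-unwrapped DEK under the share key."""
    nonce = os.urandom(12)
    key = derive_share_key(password, namespace)
    # Assumption: the namespace name is used as AAD for the share wrap.
    wrapped = AESGCM(key).encrypt(nonce, dek, namespace.encode("utf-8"))
    return {"nonce": nonce, "ciphertext": wrapped}


def open_share(share: dict, payload: dict, password: str, namespace: str) -> bytes:
    """Recipient side: password -> share key -> DEK -> plaintext."""
    key = derive_share_key(password, namespace)
    dek = AESGCM(key).decrypt(
        share["nonce"], share["ciphertext"], namespace.encode("utf-8")
    )
    # Assumption: the payload is encrypted without AAD.
    return AESGCM(dek).decrypt(payload["nonce"], payload["ciphertext"], None)


# Demo: encrypt a payload under a DEK, share it, and open it with the password.
dek = os.urandom(32)
payload_nonce = os.urandom(12)
payload = {
    "nonce": payload_nonce,
    "ciphertext": AESGCM(dek).encrypt(payload_nonce, b'{"note": "hi"}', None),
}
share = rewrap_dek(dek, "Pass123", "memory")
recovered = open_share(share, payload, "Pass123", "memory")
```

The payload ciphertext is never re-encrypted; only the small wrapped DEK changes, which is why the root key stays private.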

## Agent Identity Model

```
User (usr_xxx)
  |
  +-- Device (dev_xxx)
  |     |
  |     +-- Agent: Claude Code (agt_aaa)
  |     +-- Agent: Codex CLI (agt_bbb)
  |     +-- Agent: Cursor (agt_ccc)
  |
  +-- Device 2 (dev_yyy)
        |
        +-- Agent: Claude Code (agt_ddd)  <- different agent, different identity
```

- Each `anyfs init` creates a `user_id` + `device_id` pair
- Each `anyfs agent register` creates an `agent_id` bound to the workspace
- Agents authenticate to the API with agent tokens (`agtok_xxx`)
- Agents can be standalone (self-service) or bound to a user account (via claim code)
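
The hierarchy can be modeled with a few dataclasses. This is only an illustrative sketch: the class names, the `register` method, and the random id suffix format are hypothetical; only the `usr_`, `dev_`, and `agt_` prefixes come from the diagram.

```python
import secrets
from dataclasses import dataclass, field


def new_id(prefix: str) -> str:
    """Hypothetical id format: prefix plus 12 random hex characters."""
    return f"{prefix}_{secrets.token_hex(6)}"


@dataclass
class Agent:
    harness: str
    agent_id: str = field(default_factory=lambda: new_id("agt"))


@dataclass
class Device:
    device_id: str = field(default_factory=lambda: new_id("dev"))
    agents: list[Agent] = field(default_factory=list)

    def register(self, harness: str) -> Agent:
        """Every registration yields a new agent identity on this device."""
        agent = Agent(harness)
        self.agents.append(agent)
        return agent


@dataclass
class User:
    user_id: str = field(default_factory=lambda: new_id("usr"))
    devices: list[Device] = field(default_factory=list)


user = User()
laptop, desktop = Device(), Device()
user.devices += [laptop, desktop]
first = laptop.register("Claude Code")
second = desktop.register("Claude Code")  # same harness, different identity
```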

## Adapter Architecture

Each agent harness has an adapter implementing the `AgentAdapter` protocol:

```python
from pathlib import Path
from typing import Protocol


class AgentAdapter(Protocol):
    def discover(self, workspace: str) -> list[Path]: ...        # Find agent state files
    def extract(self, workspace: str) -> list[dict]: ...         # Parse into raw records
    def normalize(self, raw: list[dict]) -> dict: ...            # Group by namespace
    def bundle(self, normalized: dict) -> "CaptureResult": ...   # Create capture payload
```
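
To make the four stages concrete, here is a toy adapter that satisfies the protocol structurally. It is not one of the real AnyFS adapters: `MarkdownNotesAdapter`, the rule that every `*.md` file counts as memory, and the shape of `CaptureResult` are all hypothetical.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class CaptureResult:
    """Hypothetical shape: records grouped by namespace."""
    namespaces: dict


class MarkdownNotesAdapter:
    """Toy adapter: treats every *.md file in the workspace as agent memory."""

    def discover(self, workspace: str) -> list[Path]:
        return sorted(Path(workspace).rglob("*.md"))

    def extract(self, workspace: str) -> list[dict]:
        return [
            {"path": str(path), "text": path.read_text(encoding="utf-8")}
            for path in self.discover(workspace)
        ]

    def normalize(self, raw: list[dict]) -> dict:
        return {"memory": raw}

    def bundle(self, normalized: dict) -> CaptureResult:
        return CaptureResult(namespaces=normalized)


# Run the pipeline against a throwaway workspace.
with tempfile.TemporaryDirectory() as workspace:
    Path(workspace, "notes.md").write_text("remember this", encoding="utf-8")
    adapter = MarkdownNotesAdapter()
    result = adapter.bundle(adapter.normalize(adapter.extract(workspace)))
```

Because `Protocol` uses structural typing, the adapter does not need to inherit from `AgentAdapter`; matching method signatures are enough for a type checker.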

### Namespace Classification

Files are classified into namespaces:

| Namespace | Content | Encrypted |
|-----------|---------|-----------|
| `artifacts` | Code, documents | No (public) |
| `memory` | Agent memory, context | Yes |
| `soul` | Identity, beliefs | Yes |
| `skills` | Instructions, prompts, CLAUDE.md | Yes |
| `process` | Session transcripts, logs | Yes |
| `user` | User-created data | Yes |
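
A classifier along these lines could map paths to namespaces and look up the encryption policy. The `ENCRYPTED` map mirrors the table; the matching rules are hypothetical apart from `CLAUDE.md` belonging to `skills`, and falling back to the encrypted `user` namespace for unknown files is a design assumption of this sketch.

```python
from pathlib import PurePosixPath

# Encryption policy per namespace, copied from the table above.
ENCRYPTED = {
    "artifacts": False,
    "memory": True,
    "soul": True,
    "skills": True,
    "process": True,
    "user": True,
}

# Hypothetical rule set, for illustration only.
CODE_SUFFIXES = {".py", ".ts", ".go", ".rs"}


def classify(path: str) -> str:
    p = PurePosixPath(path)
    if p.name == "CLAUDE.md":
        return "skills"
    if p.suffix == ".jsonl":      # assumed: session transcripts and logs
        return "process"
    if p.suffix in CODE_SUFFIXES:
        return "artifacts"
    return "user"                 # assumed fallback: encrypt unknown files


def needs_encryption(path: str) -> bool:
    return ENCRYPTED[classify(path)]
```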

## Universal Agent Session Format (UASF)

The canonical transcript format for cross-harness session transfer:

```python
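from __future__ import annotations
from dataclasses import dataclass
@dataclass  # illustrative: the source specifies only the field layout
class CanonicalTranscript:
    session: TranscriptSessionMeta  # id, agent, model, cwd, git_branch
    events: list[TranscriptEvent]   # Typed event list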
```

### Event Types

| Event Type | Description |
|------------|-------------|
| `user_message` | User input |
| `assistant_text` | Assistant response text |
| `thinking` | Model thinking/reasoning |
| `file_read` | File read operation |
| `file_edit` | File edit operation |
| `file_write` | File write/create |
| `shell_command` | Shell command execution |
| `code_search` | Code search (grep, glob, web) |
| `error` | Error events |
| `file_snapshot` | Full file state capture |
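
The event types map naturally onto an enum, which lets a parser reject unknown types early. In this sketch the enum values come from the table, while the `data` field on `TranscriptEvent` and the `parse_event` helper are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class EventType(str, Enum):
    USER_MESSAGE = "user_message"
    ASSISTANT_TEXT = "assistant_text"
    THINKING = "thinking"
    FILE_READ = "file_read"
    FILE_EDIT = "file_edit"
    FILE_WRITE = "file_write"
    SHELL_COMMAND = "shell_command"
    CODE_SEARCH = "code_search"
    ERROR = "error"
    FILE_SNAPSHOT = "file_snapshot"


@dataclass
class TranscriptEvent:
    type: EventType
    data: dict = field(default_factory=dict)  # hypothetical payload field


def parse_event(raw: dict) -> TranscriptEvent:
    """Raises ValueError if raw["type"] is not a known event type."""
    return TranscriptEvent(EventType(raw["type"]), raw.get("data", {}))


event = parse_event({"type": "file_edit", "data": {"path": "app.py"}})
```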

### Adapter Coverage

| Agent | Parse | Export (Native) | Inject (Context) |
|-------|-------|-----------------|-------------------|
| Claude Code | Full | Full | Full |
| Codex CLI | Full | Full | Full |
| Gemini CLI | Full | - | Full |
| Cursor | - | - | Full (.mdc rules) |
| OpenClaw | Full | - | - |
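
The coverage table can be encoded as data to check whether a session can move between two harnesses. The rule used here is an assumption of this sketch, not a documented AnyFS behavior: the source needs Parse support, and the target receives the session via native Export if available, otherwise via context Inject.

```python
from typing import Optional

# True = "Full" in the table above, False = "-".
COVERAGE = {
    "Claude Code": {"parse": True, "export": True, "inject": True},
    "Codex CLI": {"parse": True, "export": True, "inject": True},
    "Gemini CLI": {"parse": True, "export": False, "inject": True},
    "Cursor": {"parse": False, "export": False, "inject": True},
    "OpenClaw": {"parse": True, "export": False, "inject": False},
}


def transfer_mode(source: str, target: str) -> Optional[str]:
    """Return "export", "inject", or None if no path exists (assumed rule)."""
    if not COVERAGE[source]["parse"]:
        return None
    if COVERAGE[target]["export"]:
        return "export"
    if COVERAGE[target]["inject"]:
        return "inject"
    return None
```

Under this rule, Cursor can only act as a target and OpenClaw can only act as a source.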

## Data Flow: Quick Share

```
1. anyfs quick-share --workspace . --password Pass123
   |
   +-> capture: ClaudeCodeAdapter.extract() -> normalize() -> CaptureResult
   |
   +-> snapshot: encrypt each namespace with unique DEK, wrap DEK with root key
   |
   +-> share: for each namespace:
   |     re-wrap DEK with SHA256(password:namespace)
   |     write .share.json
   |
   +-> upload: POST /blobs/upload to storage.anyfs.ai (storage gateway)
   |     storage gateway -> object storage backend
   |     returns CID (content-addressed hash)
   |
   +-> return: {cid, gateway_url, password}
```
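
The upload step returns a content-addressed identifier. This document does not specify the CID encoding, so the sketch below uses a plain SHA-256 hex digest as a stand-in to show the property that matters: the id is a function of the uploaded bytes alone.

```python
import hashlib


def content_id(blob: bytes) -> str:
    """Stand-in for the gateway's CID: hex SHA-256 of the uploaded bytes."""
    return hashlib.sha256(blob).hexdigest()


envelope_bytes = b'{"envelope_id": "env_abc123def456"}'  # placeholder upload
cid = content_id(envelope_bytes)
same = content_id(envelope_bytes)
other = content_id(envelope_bytes + b" ")
```

Since each encryption uses a fresh random nonce, re-sharing the same plaintext produces different envelope bytes and therefore a different identifier.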

## Data Flow: Receiving a Share

```
1. anyfs share open https://storage.anyfs.ai/raw/<cid> -p Pass123
   |
   +-> download: fetch share envelope from storage gateway
   |
   +-> derive key: SHA256("Pass123:memory")
   |
   +-> unwrap DEK: AES-256-GCM decrypt with derived key
   |
   +-> decrypt payload: AES-256-GCM decrypt with DEK
   |
   +-> write: output decrypted JSON to filesystem
```
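
Before downloading, a client has to pull the CID out of the share URL. A minimal sketch, assuming the `/raw/<cid>` path layout shown above; `cid_from_share_url` is a hypothetical helper and the CID in the example is a placeholder.

```python
from urllib.parse import urlparse


def cid_from_share_url(url: str) -> str:
    """Extract <cid> from https://<host>/raw/<cid>, rejecting other shapes."""
    parts = urlparse(url)
    prefix = "/raw/"
    if parts.scheme != "https" or not parts.path.startswith(prefix):
        raise ValueError(f"not a share URL: {url}")
    cid = parts.path[len(prefix):]
    if not cid or "/" in cid:
        raise ValueError(f"malformed CID in URL: {url}")
    return cid


cid = cid_from_share_url("https://storage.anyfs.ai/raw/examplecid123")
```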

## Storage Backends (Storage Gateway)

The storage gateway supports multiple storage backends:

| Backend | Config | Use Case |
|---------|--------|----------|
| `local` | `STORAGE_LOCAL_ROOT` | Development |
| `gcs` | `GCS_BUCKET` + `GCP_PROJECT_ID` | GCP deployment |
| `s3` | `S3_BUCKET` + AWS credentials | AWS deployment |

Current production uses GCS behind the public storage gateway. Files are encrypted
client-side before upload, and public retrieval goes through `https://storage.anyfs.ai/raw/<cid>`.
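
Backend selection can be validated against the settings in the table. The production gateway is a Node.js service, so this Python sketch only illustrates the check; `backend_settings` is hypothetical, and because the table does not name the AWS credential variables, they are not checked here.

```python
# Required settings per backend, taken from the table above.
REQUIRED = {
    "local": ("STORAGE_LOCAL_ROOT",),
    "gcs": ("GCS_BUCKET", "GCP_PROJECT_ID"),
    "s3": ("S3_BUCKET",),  # AWS credentials are also needed; names not given
}


def backend_settings(backend: str, env: dict) -> dict:
    """Return the settings for a backend, or raise if any are missing."""
    if backend not in REQUIRED:
        raise ValueError(f"unknown backend: {backend}")
    missing = [name for name in REQUIRED[backend] if not env.get(name)]
    if missing:
        raise ValueError(f"{backend}: missing settings {missing}")
    return {name: env[name] for name in REQUIRED[backend]}


settings = backend_settings(
    "gcs", {"GCS_BUCKET": "example-bucket", "GCP_PROJECT_ID": "example-project"}
)
```

In a real process, `env` would typically be `os.environ`.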

## AFPKG Archive Format

Packaged agent state for publishing and distribution:

```
package.afpkg (ZIP)
  manifest.public.json     # Readable without keys
  artifacts/public/        # Unencrypted files
  payloads/private/        # Encrypted envelopes per namespace
  lineage.json             # Fork/derivation graph
  policy.json              # Visibility + fork policy
```
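
Since an `.afpkg` file is a ZIP archive, the standard `zipfile` module is enough to sketch writing and reading one. The member paths follow the layout above, but the per-namespace file names under `payloads/private/` and the empty `lineage.json` and `policy.json` bodies are assumptions made for this example.

```python
import io
import json
import zipfile


def build_package(manifest: dict, envelopes: dict) -> bytes:
    """Write a minimal archive in memory; envelopes maps namespace -> envelope."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("manifest.public.json", json.dumps(manifest))
        zf.writestr("lineage.json", json.dumps([]))  # assumed empty shape
        zf.writestr("policy.json", json.dumps({}))   # assumed empty shape
        for namespace, envelope in envelopes.items():
            # Assumed naming: one JSON envelope file per namespace.
            zf.writestr(f"payloads/private/{namespace}.json", json.dumps(envelope))
    return buf.getvalue()


def read_manifest(package: bytes) -> dict:
    """The manifest is readable without any keys."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return json.loads(zf.read("manifest.public.json"))


package = build_package(
    {"name": "example-package"},
    {"memory": {"envelope_id": "env_abc123def456", "namespace": "memory"}},
)
manifest = read_manifest(package)
```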

## Lineage System

Tracks parent-child relationships between packages:

```python
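from dataclasses import dataclass
from datetime import datetime

@dataclass  # illustrative: the source specifies only the field layout
class LineageRecord:
    parent_package_id: str
    child_package_id: str
    created_at: datetime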
```

Lineage records are created automatically on `anyfs fork`. The record set is append-only, so history is immutable.
Records are queryable via `anyfs lineage show <package_id>`.
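
An append-only lineage log and an ancestor walk can be sketched as below. `LineageLog` and its methods are hypothetical, the frozen dataclass is one way to model immutability, and the walk assumes each package has at most one parent.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: records cannot be mutated once created
class LineageRecord:
    parent_package_id: str
    child_package_id: str
    created_at: datetime


class LineageLog:
    """Append-only: records can be added but never edited or removed."""

    def __init__(self) -> None:
        self._records: list[LineageRecord] = []

    def record_fork(self, parent: str, child: str) -> LineageRecord:
        record = LineageRecord(parent, child, datetime.now(timezone.utc))
        self._records.append(record)
        return record

    def ancestors(self, package_id: str) -> list[str]:
        """Walk parent links from a package up to its root, nearest first."""
        parents = {r.child_package_id: r.parent_package_id for r in self._records}
        chain: list[str] = []
        while package_id in parents and parents[package_id] not in chain:
            package_id = parents[package_id]
            chain.append(package_id)
        return chain


log = LineageLog()
log.record_fork("pkg_a", "pkg_b")  # placeholder package ids
log.record_fork("pkg_b", "pkg_c")
chain = log.ancestors("pkg_c")
```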
