Architecture & Design Decisions
This document captures key architectural decisions, design patterns, and learnings from building sideBar.Table of Contents
- Design Patterns
- Key Architectural Decisions
- Data Flow
- URL Routing Policy
- Security Model
- Performance Considerations
- Learnings & Trade-offs
Design Patterns
UI Simplicity vs AI Context Richness
Key Insight: What you display in the UI can be minimal and user-friendly, while the AI context can be far more detailed and structured. This separation of concerns allows for:- Clean, uncluttered user interfaces
- Rich, contextual AI interactions
- Flexibility to enhance AI capabilities without UI changes
Example: Weather Data
UI Display:- Temperature (e.g., “18°C”)
- Simple weather icon
- Basic description (e.g., “Partly cloudy”)
- Make weather-aware suggestions (“It might rain tomorrow, perhaps indoor activities?”)
- Understand context (“The wind is strong from the west, which explains…”)
- Provide detailed recommendations without cluttering the UI
Example: Location Data
UI Display:- “London, UK”
Example: User Profile
UI Display:- Name and avatar
- Basic settings toggles
Progressive Enhancement Pattern
Features are built with progressive enhancement in mind:- Core Functionality: Works with minimal data
- Enhanced with Context: Becomes smarter with additional information
- Graceful Degradation: Falls back cleanly when data unavailable
Key Architectural Decisions
1. JSONB Message Storage
Decision: Store conversation messages as JSONB array in PostgreSQL rather than separate messages table. Rationale:- Single query retrieves entire conversation (no joins)
- Flexible schema for evolving message structure
- GIN indexing enables fast search across message content
- Tool calls naturally nested within messages
- ✅ Simpler queries, better read performance
- ✅ Atomic conversation updates
- ❌ Harder to query individual messages across conversations
- ❌ Large conversations could hit size limits (mitigated by pagination)
backend/api/models/conversation.py:15
2. Server-Sent Events (SSE) for Streaming
Decision: Use SSE instead of WebSockets for chat streaming. Rationale:- Unidirectional communication (server → client) sufficient for streaming tokens
- Simpler protocol than WebSockets
- Better compatibility with proxies and load balancers
- Built-in reconnection in EventSource API
- HTTP-based, easier to secure and monitor
- ✅ Simpler implementation
- ✅ Better compatibility
- ✅ Automatic reconnection
- ❌ Unidirectional only (acceptable for this use case)
backend/api/routers/chat.py:64 (stream endpoint)
3. Skill-Based Tool System
Decision: Implement tools as discrete Python scripts following Agent Skills specification rather than inline Python functions. Rationale:- Sandboxing and security isolation
- Reusability across different AI agents
- Clear separation of concerns
- Easier to validate and test
- Community sharing and standardization
- ✅ Strong security boundaries
- ✅ Reusable, standardized format
- ✅ Easy to validate and share
- ❌ Additional subprocess overhead
- ❌ More complex debugging
backend/api/executors/skill_executor.py:20
4. Supabase + R2 for Persistence
Decision: Use Supabase Postgres for relational data and Cloudflare R2 for object storage. Rationale:- Managed Postgres with built-in pooling and RLS
- Object storage for workspace files and assets
- Clear separation between structured data and blobs
- Scales independently of the API layer
- ✅ Managed infrastructure and backups
- ✅ RLS for user scoping
- ❌ External dependency on Supabase/R2 availability
- ❌ Requires careful credential management
backend/api/config.py (Supabase settings), backend/api/services/storage/
5. Doppler for Secrets Management
Decision: Use Doppler for centralized secrets management rather than .env files. Rationale:- Centralized secret storage
- Environment-specific configurations
- Audit logging for secret access
- Team collaboration without sharing secrets
- Automatic rotation support
- ✅ Better security and audit trail
- ✅ Team-friendly secret sharing
- ✅ Environment separation
- ❌ External dependency
- ❌ Requires Doppler token for local dev
docker-compose.yml:40 (DOPPLER_TOKEN env var)
5. FastAPI + SvelteKit Split
Decision: Separate backend (FastAPI) and frontend (SvelteKit) rather than monolithic framework. Rationale:- Independent scaling of frontend and backend
- Technology flexibility (best tool for each layer)
- Clear API boundaries
- Easier to add mobile clients later
- Better developer experience (hot reload for both)
- ✅ Flexibility and scalability
- ✅ Clear separation of concerns
- ✅ Better DX with independent hot reload
- ❌ Additional deployment complexity
- ❌ CORS and authentication coordination
6. Storage Abstraction + Tmpfs
Decision: Route file operations through a storage abstraction (R2) and keep skill scratch space on tmpfs. Rationale:- Avoid persistent local workspace volumes
- Centralized storage for files across containers
- Ephemeral local data reduces risk and cleanup burden
- ✅ Consistent storage across environments
- ✅ Reduced local state
- ❌ Network dependency for file access
- ❌ Slightly higher latency for large file operations
backend/api/services/storage/, docker-compose.yml tmpfs config
Skill Output Metadata
- File-producing skills return
file_idplus derivative metadata instead of raw storage paths. - AI context is standardized in
{user_id}/files/{file_id}/ai/ai.mdwith backward-compatible frontmatter.
Data Flow
Chat Message Lifecycle
Tool Execution Flow
Security Model
Defense in Depth
sideBar implements multiple security layers:1. Container Security
- Non-root user (UID 1000)
- Dropped Linux capabilities (ALL)
- Read-only root filesystem
- No new privileges
- Tmpfs with noexec
docker-compose.yml:49-58
2. Storage Scoping
- File operations routed through R2-backed storage service
- Workspace access scoped by user ID and category
- Path normalization to prevent traversal
3. Resource Limits
- Execution timeout: 30 seconds per skill
- Output size: 10MB maximum
- Concurrency: 5 simultaneous skill executions
- Memory: Container limits via Docker
backend/api/config.py:31-33
4. Skill Sandboxing
- Subprocess isolation
- Minimal environment variables
- No inherited file descriptors
- Read-only skill directory mount
- Tmpfs working directory for ephemeral writes
backend/api/executors/skill_executor.py:45-70
5. Authentication
- Bearer token for all API endpoints
- Future: JWT with expiry and refresh tokens
backend/api/auth.py
6. Row-Level Security (RLS)
- Supabase policies enforce per-user access
- Session user ID propagated via
SET app.user_id
backend/api/db/session.py, Supabase policies
Security Trade-offs
Strict but Usable:- Path jailing limits flexibility but prevents major security issues
- Write allowlists require explicit configuration but prevent accidents
- Timeouts may interrupt long-running tasks but prevent hangs
- JWT with user identity and expiry
- Rate limiting per user
- Skill-level permission system
- Output sanitization for XSS prevention
Performance Considerations
1. Denormalization for Reads
Pattern: Store computed values alongside source data. Examples:message_counton conversations (avoid counting JSONB array)first_messagepreview (avoid parsing JSONB)- User settings cached in memory
2. Caching Strategy
Weather Cache
- TTL: 30 minutes (1800s)
- Key: Rounded lat/lon (2 decimal places)
- In-memory dictionary
- Reduces API calls for nearby locations
backend/api/routers/weather.py:18-40
Database Connection Pooling
- SQLAlchemy connection pool
- Reuses connections across requests
- Configurable pool size
3. Streaming vs Buffering
Decision: Stream tokens immediately rather than buffering complete responses. Benefits:- Lower perceived latency
- Better UX (see response forming)
- Handles long responses gracefully
yield in async generator
4. GIN Indexing for JSONB
- Full-text search across conversation messages
- Fast lookups within nested structures
- Enables complex queries without scanning
backend/api/models/conversation.py:25
5. Lazy Loading
- Skills loaded on-demand, not at startup
- User settings fetched per-request (with caching)
- Frontend components code-split via SvelteKit
Performance Monitoring
Frontend Metrics
- Web Vitals: Captured in
frontend/src/lib/utils/performance.tsand sent to/api/v1/metrics/web-vitals. - Chat Metrics: Captured in
frontend/src/lib/utils/chatMetrics.tsand sent to/api/v1/metrics/chat. - Transport: Uses
navigator.sendBeaconwith fetch fallback to avoid blocking UX. - Env Controls: Metrics endpoints and sampling are driven by
PUBLIC_*env vars (see.env.example).
Backend Metrics
- Prometheus:
/metricsexposes HTTP, chat, tool, storage, and DB pool metrics (backend/api/metrics.py). - Ingestion:
backend/api/routers/metrics.pyaccepts web-vitals and chat metrics from the frontend.
Error Monitoring
- Sentry (Frontend): Initialized in
frontend/src/lib/config/sentry.tsandfrontend/src/hooks.client.ts. - Sentry (Backend): Configured in
backend/api/main.pyviaSENTRY_*settings.
Learnings & Trade-offs
What Worked Well
1. JSONB for Message Storage
Initially skeptical, but the simplicity of single-query conversation retrieval outweighed the downsides. GIN indexing makes search fast enough.2. Agent Skills Specification
Standardized format made it easy to add new capabilities. Community patterns emerging.3. SSE for Streaming
Much simpler than WebSockets for unidirectional streaming. Reconnection handled automatically by browser.4. Doppler Integration
Eliminated .env file sharing in team. Environment separation became trivial.5. Storage Scoping
Sleep well knowing skills cannot escape their scoped storage prefixes. Zero security incidents.What We’d Do Differently
1. Earlier Investment in Testing
Added tests later rather than from the start. TDD would have caught integration issues earlier.2. Schema Migrations from Day One
Used Alembic from the start. Hand-crafted SQL migrations became painful.3. Structured Logging Earlier
Added structured logging later. Would have helped debugging production issues.4. API Versioning
No versioning initially. Breaking changes harder to manage. We later added/api/v1/ paths with deprecation middleware for legacy /api/* routes (sunset: 2026-06-01).
Interesting Challenges
1. Corporate SSL Interception
Company MITM proxy broke httpx SSL verification. Created custom client with configurable SSL verification. Solution:JINA_SSL_VERIFY=false option in config
2. Multi-turn Tool Use Loops
Claude sometimes requests multiple tools sequentially. Had to implement loop with max iterations to prevent infinite loops. Solution: Max 5 tool use rounds per message3. Message ID Deduplication
Duplicate messages on reconnect. Frontend generates IDs to prevent duplicates on backend. Solution:user_message_id in request payload
4. Skill Output Size
Some skills (PDF extraction) returned massive outputs, breaking responses. Solution: 10MB output limit with truncationFuture Architectural Considerations
1. Multi-User Support
Currently single-user with extensibleuser_id field. Full multi-user requires:
- JWT authentication with user identity
- User-scoped data isolation
- Per-user rate limiting
- Billing and quotas
2. Skill Marketplace
Community skill sharing requires:- Skill signing and verification
- Dependency management
- Version compatibility
- Security reviews
3. Horizontal Scaling
Current architecture allows horizontal scaling of:- ✅ Frontend (stateless SvelteKit)
- ✅ Backend API (FastAPI stateless with external DB)
- ⚠️ Skill execution (needs distributed locking for concurrency limits)
- Redis for distributed caching and locking
- Shared storage for files (R2 already supports this)
- Database connection pooling
4. Real-time Collaboration
Multiple users editing same note requires:- WebSocket bidirectional communication
- Operational transforms or CRDTs
- Conflict resolution
- Presence indicators
References
- Agent Skills Spec: https://agentskills.io/specification
- FastAPI Docs: https://fastapi.tiangolo.com
- SvelteKit Docs: https://kit.svelte.dev
- PostgreSQL JSONB: https://www.postgresql.org/docs/current/datatype-json.html
- Server-Sent Events: https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events
Last Updated: 2026-01-04 Document Owner: Architecture Team Review Cycle: Quarterly or on major architectural changes