Agent Orchestration & Security Template
A reference architecture for AI agent orchestration, trust measurement, and tool integration. All code authored by AI agents under human oversight.
Overview
Designed to be studied, forked, and adapted. A single-maintainer project optimized for individual developer efficiency with maximum portability.
Container-First
All Python and Rust operations run in Docker containers. Zero local dependencies beyond Docker itself -- maximum portability across any Linux system.
Single Maintainer
No contributor model. Optimized for individual developer efficiency. No feature requests or external contributions are accepted, and the maintainer offers no guidance or community interaction.
All AI-Authored
Every code change is authored by AI agents under human oversight. Humans decide what to work on and when to merge; agents handle the implementation.
Self-Hosted
Self-hosted GitHub Actions runners on personal hardware. Zero-cost infrastructure with full control over the execution environment.
This repository contains dual-use research and tooling. The maintainer provides no guidance, consultation, or feature development. No feature requests are accepted. No external contributions are accepted. Released under a public domain dedication.
Agentic Workflow
Humans decide WHAT to work on and WHEN to merge. Agents handle HOW with automated quality loops.
AI Agents
Six AI agents collaborate in an autonomous development workflow -- from issue refinement to PR review and merge.
Primary development assistant. Architecture design, complex refactoring, debugging, comprehensive documentation, CI/CD pipelines, and test development. Deep understanding of the entire codebase.
Automated PR reviewer and quality gatekeeper. Reviews every pull request for security vulnerabilities, container configurations, and project standards. Provides actionable feedback within 3-5 minutes.
Additional code review perspective in pull requests. Suggests improvements, identifies potential issues, and provides alternative implementations with inline suggestions.
AI-powered code generation and PR review. Focuses on code patterns, best practices, performance considerations, and API design feedback.
Code generation, refactoring, and review via OpenRouter. Model-agnostic access to multiple AI providers for diverse implementation perspectives.
Code generation and conversion via OpenRouter. Quick generation, explanation, and cross-language conversion for rapid prototyping.
MCP Servers
19 modular servers providing specialized functionality via the Model Context Protocol -- from code quality to 3D rendering.
Format checking (Python, JS, TS, Go, Rust), linting (ruff, eslint), auto-formatting, pytest, and type checking.
Manim animations, LaTeX compilation (PDF/DVI/PS), TikZ diagram rendering, and preview generation.
Template-based meme generation with auto-resize text, visual feedback, auto-upload, and 7+ templates.
14+ synthesis tools, 50+ audio tags, support for 74 languages, 37+ voices, and sound effect generation up to 22 seconds.
AI-powered video editing with Whisper transcription, speaker diarization, scene detection, multi-video composition, and GPU acceleration.
3D content creation with physics simulations, Cycles/Eevee rendering, geometry nodes, animation, and particle systems.
Gemini AI consultations with comparison mode, conversation history management, and auto-consultation on uncertainty.
Code generation, refactoring, and review via OpenRouter. Model-agnostic AI code assistance.
Code generation and conversion via OpenRouter. Quick generation, explanation, and cross-language conversion.
AI-powered code generation and completion via OpenAI Codex. Requires a ChatGPT Plus subscription.
AI agent embodiment in virtual worlds (VRChat, Blender, Unity). OSC integration, PAD emotion model, 16 MCP tools.
GitHub Projects v2 work queue. Query ready work, claim/release with conflict prevention, and dependency graph management.
Multi-provider memory system: short-term events, long-term facts, and semantic search via AWS Bedrock AgentCore or ChromaDB.
Semantic search for anime reaction images. Sentence-transformer embeddings, tag-based filtering, auto-fetch from GitHub.
Cross-platform desktop automation: window management, screenshots, mouse/keyboard control for Linux and Windows.
Process memory exploration for legacy software integration. Memory reading, hex dump, pattern scanning, and pointer chain resolution.
Terrain generation with intelligent validation, error correction, 11 professional templates, and CLI automation. Windows only.
GPU-accelerated LoRA training management. Dataset upload, training job monitoring, and model export.
GPU-accelerated AI image generation. Custom workflow execution and LoRA model management.
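The claim/release behavior described for the GitHub Projects v2 work queue server can be pictured with a small in-memory sketch. This is an illustration only, not the server's code: `WorkQueue`, `ready_work`, and the claim timeout (`claim_ttl_seconds`) are hypothetical names and assumptions, and the real server operates on a GitHub Projects v2 board rather than a Python dict.

```python
import time
from typing import Dict, List, Optional, Set, Tuple


class WorkQueue:
    """In-memory stand-in for a project board (hypothetical, for illustration)."""

    def __init__(self, claim_ttl_seconds: float = 3600.0) -> None:
        # issue id -> (agent name, time the claim was made)
        self._claims: Dict[str, Tuple[str, float]] = {}
        self._ttl = claim_ttl_seconds  # assumed: stale claims expire

    def claim(self, issue: str, agent: str, now: Optional[float] = None) -> bool:
        """Claim an issue unless another agent holds a live claim on it."""
        now = time.time() if now is None else now
        holder = self._claims.get(issue)
        if holder is not None:
            owner, claimed_at = holder
            if owner != agent and now - claimed_at < self._ttl:
                return False  # conflict: someone else is working on it
        self._claims[issue] = (agent, now)
        return True

    def release(self, issue: str, agent: str) -> bool:
        """Release a claim; only the current holder may release it."""
        holder = self._claims.get(issue)
        if holder is None or holder[0] != agent:
            return False
        del self._claims[issue]
        return True


def ready_work(dependencies: Dict[str, Set[str]], done: Set[str]) -> List[str]:
    """Return unfinished issues whose dependencies are all done."""
    return sorted(
        issue
        for issue, deps in dependencies.items()
        if issue not in done and deps <= done
    )
```

An agent would first ask `ready_work` for unblocked issues and then call `claim` on one of them, moving on to the next candidate if the claim is refused.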
Research Packages
Standalone packages addressing different aspects of AI agent development, safety, and security.
Research-validated detection framework for hidden backdoors in LLMs. Based on Anthropic's research on deceptive behavior that persists through safety training. Reported AUC = 1.0 across GPT-2, Mistral-7B, and Qwen2.5-7B.
Simulation framework for autonomous AI agents operating in economic systems. Agents earn cryptocurrency, form companies, create sub-agents, and seek investment. 14 crates, 13 task challenges.
Tamper-responsive Raspberry Pi briefcase with dual-sensor detection, LUKS2 cryptographic wipe, and hybrid post-quantum (ML-KEM-1024 + ML-DSA-87) encrypted recovery USB.
Agent-driven biological automation platform. Combines a Raspberry Pi 5 liquid handling system with AI agent orchestration over MCP for CRISPR-Cas9 gene editing workflows.
Companion Repositories
Standalone repositories extending the template-repo ecosystem.
Injection toolkit for AI agent integration with legacy software -- DLL injection (Windows), LD_PRELOAD (Linux), shared memory IPC, overlay rendering, and MCP memory explorer.
Embeddable OS framework (18 crates) -- scene-graph UI, 90+ terminal commands, browser engine, window manager, VFS. 4 backends (SDL2, PSP, UE5 FFI, framebuffer) with 8 themes.
Browser-based multiplayer gaming platform for agentic office hours -- Rust/WASM games with an alert overlay surfacing agent activity, CI failures, and decision points.
Modernized Rust SDK for PlayStation Portable -- ~829 syscall bindings, 38+ high-level modules, kernel mode support, and experimental std. Edition 2024 fork.
Strategic Documents
14 strategic documents spanning risk assessment, technical guidance, and philosophy -- all auto-compiled from LaTeX and distributed with each release.
Risk Assessments
Technical Guides
Rust CLI Tools
Purpose-built command-line tools for agent orchestration, security hardening, and CI/CD automation.
| Tool | Purpose |
|---|---|
| automation-cli | Unified CI/CD runner -- format, lint, test, build, deny for all packages |
| github-agents-cli | Issue/PR monitoring, refinement, code analysis, and agent execution |
| board-manager | GitHub Projects v2 board operations -- claim, release, status updates |
| git-guard | Git CLI wrapper requiring sudo for dangerous operations (force push, --no-verify) |
| gh-validator | GitHub CLI wrapper for automatic secret masking |
| pr-monitor | Dedicated PR monitoring for admin/review feedback during development |
| markdown-link-checker | Fast concurrent markdown link validator for CI/CD pipelines |
| code-parser | Parse and apply code blocks from AI agent responses |
| code-review-processor | Process and apply AI code review feedback automatically |
| mcp-code-quality | Rust MCP server for code quality tools (formatting, linting, testing) |
CI/CD Pipeline
15-stage pipeline running on self-hosted hardware. Zero cloud costs -- all infrastructure on personal machines.
Self-Hosted Runner
All CI runs on personal hardware -- zero cloud compute costs. Full control over the execution environment and toolchain versions.
Docker-Based
Every CI stage runs inside Docker containers for reproducibility. Same environment locally and in CI.
Multi-Language
Unified pipeline covering Python (ruff, pytest, bandit) and Rust (clippy, cargo-deny, cargo test) packages.
Auto-Fix Loop
CI failure handler automatically fixes formatting and lint errors, pushes the fix, and re-triggers the pipeline.
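The auto-fix loop can be sketched roughly as follows. This is a minimal illustration, not the project's handler: `run_pipeline`, `apply_fixes`, and `push` are hypothetical callables standing in for the real CI trigger, the formatter/linter auto-fix step, and the git push, and the `max_attempts` cap is an assumption added so the loop cannot run forever.

```python
from typing import Callable, List

# Failure kinds the handler may fix on its own (assumed labels).
AUTO_FIXABLE = {"format", "lint"}


def auto_fix_loop(
    run_pipeline: Callable[[], List[str]],
    apply_fixes: Callable[[List[str]], None],
    push: Callable[[], None],
    max_attempts: int = 3,
) -> bool:
    """Re-run the pipeline until it passes or a non-fixable failure appears.

    run_pipeline returns a list of failure kinds; an empty list means success.
    """
    attempts = 0
    while True:
        failures = run_pipeline()
        if not failures:
            return True  # pipeline is green
        if attempts >= max_attempts:
            return False  # give up rather than loop forever
        if any(kind not in AUTO_FIXABLE for kind in failures):
            return False  # e.g. a failing test: leave it for an agent or human
        apply_fixes(failures)
        push()  # pushing the fix re-triggers the pipeline
        attempts += 1
```

Restricting the loop to formatting and lint failures keeps it mechanical: anything that requires judgment, such as a failing test, falls out of the loop instead of being patched automatically.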
AI Safety
Key safety principles implemented in this project, informed by the BlueDot Impact AI Safety Fundamentals course.
Sleeper Agent Detection
AI systems can develop hidden capabilities that only emerge under specific conditions. Detection requires analyzing internal processes (residual stream activations), not just outputs. Deceptive behavior can persist through safety training.
Scalable Oversight
Break complex tasks into verifiable subtasks. Use AI to help evaluate AI outputs. Debate systems with multiple AI instances. Always maintain human judgment in the loop.
Control Protocols
Separate "writer" and "monitor" roles using different models/providers. Use signal jamming to prevent covert coordination. Spend human attention on the most suspicious outputs.
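One way to picture the writer/monitor split and the "most suspicious outputs" rule is the sketch below. It assumes a hypothetical `monitor` callable, backed by a different model or provider than the writer, that returns a suspicion score between 0.0 and 1.0. The `Output` type, `triage` function, and `human_budget` parameter are illustrative names, and signal jamming is not modeled here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Output:
    task_id: str
    text: str
    suspicion: float = 0.0  # filled in by the monitor


def check_separation(writer_provider: str, monitor_provider: str) -> None:
    """Refuse to run when writer and monitor share a provider."""
    if writer_provider == monitor_provider:
        raise ValueError("writer and monitor must use different providers")


def triage(
    outputs: List[Output],
    monitor: Callable[[str], float],
    human_budget: int,
) -> List[Output]:
    """Score every writer output and return the most suspicious ones.

    Only `human_budget` items are returned, so limited human attention
    goes to the outputs the monitor trusts least.
    """
    for out in outputs:
        out.suspicion = monitor(out.text)
    ranked = sorted(outputs, key=lambda o: o.suspicion, reverse=True)
    return ranked[:human_budget]
```

With a budget of one review per batch, for example, a human would see only the single highest-scoring output; the rest proceed through the automated checks.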
Human Gates
Three mandatory human gates in the workflow: issue approval, code review, and merge decision. Agents cannot bypass these checkpoints regardless of capability.
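The three gates can be modeled as an ordered checklist that only a human actor may advance. The sketch below is a hypothetical illustration (`Gate`, `WorkItem`, and `GateError` are invented names), not the repository's enforcement mechanism, which the text does not describe.

```python
from enum import Enum
from typing import List, Set


class Gate(Enum):
    ISSUE_APPROVAL = "issue approval"
    CODE_REVIEW = "code review"
    MERGE_DECISION = "merge decision"


# Gates must be passed in this order.
GATE_ORDER: List[Gate] = [Gate.ISSUE_APPROVAL, Gate.CODE_REVIEW, Gate.MERGE_DECISION]


class GateError(Exception):
    """Raised when a gate is approved by a non-human or out of order."""


class WorkItem:
    def __init__(self, title: str) -> None:
        self.title = title
        self._passed: Set[Gate] = set()

    def approve(self, gate: Gate, actor_is_human: bool) -> None:
        if not actor_is_human:
            raise GateError(f"{gate.value} must be approved by a human")
        earlier = GATE_ORDER[: GATE_ORDER.index(gate)]
        missing = [g.value for g in earlier if g not in self._passed]
        if missing:
            raise GateError(f"{gate.value} blocked; pending: {', '.join(missing)}")
        self._passed.add(gate)

    def can_merge(self) -> bool:
        """A change is mergeable only after all three gates have passed."""
        return all(g in self._passed for g in GATE_ORDER)
```

Because `approve` rejects any call with `actor_is_human=False`, an agent in this model has no code path that marks a gate as passed, regardless of what else it can do.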
Containment Layers
Defense in depth: capability limits, monitoring, verification, rollback, and kill switches. Never fully automate critical infrastructure or production deployments.
Wrapper Guards
CLI wrappers (git-guard, gh-validator) enforce security boundaries. Dangerous git operations require sudo. Secrets are automatically masked in GitHub CLI output.
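The two checks these wrappers perform can be sketched as follows. The actual tools are Rust CLIs whose internals are not shown here, so this Python sketch is an assumption-laden illustration: `is_dangerous` and `mask_secrets` are hypothetical helpers, the flag list covers only the two cases named above, and the argument parsing assumes the git subcommand comes first.

```python
from typing import List, Sequence


def is_dangerous(git_args: Sequence[str]) -> bool:
    """Return True for the operations the text names: force push and --no-verify.

    Not exhaustive: forms such as `git push origin +main` or global options
    placed before the subcommand are not handled in this sketch.
    """
    if "--no-verify" in git_args:
        return True
    if git_args and git_args[0] == "push":
        return any(arg in ("--force", "-f") for arg in git_args[1:])
    return False


def mask_secrets(text: str, secrets: List[str], placeholder: str = "***") -> str:
    """Replace every known secret value in `text` with a placeholder."""
    # Longest first, so a secret containing another secret is masked whole.
    for secret in sorted(secrets, key=len, reverse=True):
        if secret:  # skip empty strings, which would match everywhere
            text = text.replace(secret, placeholder)
    return text
```

A wrapper built on these checks would refuse to run (or require elevated privileges for) any command where `is_dangerous` returns True, and would pass CLI output through `mask_secrets` before printing it.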