TL;DR
RepoReaper is an autonomous code-audit agent that parses repository ASTs, prefetches key contexts into a vector cache, and performs Just-In-Time reads when more context is needed. It uses async I/O, a hybrid BM25/vector retrieval stack, bilingual prompts, and can be run locally or in Docker; a public demo is available but subject to shared API limits.
What happened
A developer published RepoReaper, an agentic system that automates architectural analysis and semantic code search by combining AST-aware parsing with a dynamic retrieval cache. On cold start it scans a repository's Abstract Syntax Trees to build a lightweight symbol map, then prefetches a small set of architecturally important files to warm a persistent vector store. During interactive Q&A the agent runs a ReAct-style loop: queries are rewritten into precise English search terms, the agent runs hybrid BM25+vector retrieval, and if the returned context is insufficient it issues tool commands to fetch files from GitHub, index them, and re-attempt the answer in the same inference cycle. The project emphasizes non-blocking ingestion via asyncio/httpx, runs behind Gunicorn/Uvicorn workers with a disk-backed ChromaDB vector store, and includes UI features for bilingual (English/Chinese) prompts and streaming responses. A public demo exists but uses shared API quotas that can hit 403/429 limits; local deployment is recommended for full performance.
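The retrieve-then-fetch cycle described above can be sketched as a toy loop. Everything here is a stand-in, not RepoReaper's actual tool interface: "sufficient context" is reduced to a substring check, and the "tool call" is a dictionary lookup standing in for a GitHub fetch plus indexing.

```python
# Toy rendition of the JIT loop: retrieve, and if context is insufficient,
# fetch and index a missing file, then retry within the same cycle.
def answer(query: str, index: dict[str, str], repo: dict[str, str],
           max_steps: int = 3) -> str:
    for _ in range(max_steps):
        hits = [text for text in index.values() if query in text]
        if hits:  # "sufficient context" check, trivialized for the sketch
            return f"answered from: {hits[0]}"
        # JIT read: pull an unindexed file from the repo and index it
        missing = next((name for name in repo if name not in index), None)
        if missing is None:
            break
        index[missing] = repo[missing]  # stands in for fetch + embed + store
    return "insufficient context"

repo = {"a.py": "def warm_cache(): ...", "b.py": "def react_loop(): ..."}
print(answer("react_loop", {}, repo))  # → answered from: def react_loop(): ...
```

The point of doing this inside one inference cycle is that the model can recover from a cold cache without a new user turn: retrieval failure triggers a tool action rather than a refusal.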
Why it matters
- AST-aware chunking preserves code structure so an LLM sees coherent function and class boundaries rather than arbitrary text slices.
- Treating the vector store as a dynamic cache reduces upfront indexing while enabling targeted Just-In-Time reads for missing context.
- Hybrid retrieval (dense vectors + BM25 + RRF) aims to balance semantic matches with exact code-level signals like identifiers and signatures.
- Asynchronous ingestion and stateless worker design allow higher throughput and safer multi-worker deployments with persistent disk-backed vectors.
- Built-in bilingual prompt handling lowers friction for teams working in English and Chinese without separate frontends.
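The hybrid-retrieval point above hinges on Reciprocal Rank Fusion, which merges ranked lists without needing comparable scores. A minimal sketch, assuming two ranked lists (dense-vector hits and BM25 hits); the filenames are invented and k=60 is the conventional constant:

```python
# Reciprocal Rank Fusion: each document scores sum(1 / (k + rank)) across
# every ranking it appears in, so items ranked well by either retriever rise.
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["auth.py", "cache.py", "main.py"]   # semantic (embedding) matches
sparse = ["main.py", "auth.py", "utils.py"]  # exact-term (BM25) matches
print(rrf_fuse([dense, sparse]))
# → ['auth.py', 'main.py', 'cache.py', 'utils.py']
```

Note how `auth.py`, ranked highly by both retrievers, beats `main.py`, which one retriever ranked first but the other ranked last.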
Key facts
- Parses repositories using Python's ast module to build a symbol map (classes/functions) at cold start.
- Prefetches roughly 10–20 files deemed architecturally relevant to warm the vector cache before interactive use.
- Uses a ReAct loop: query rewrite, hybrid retrieval, and tool-triggered JIT file fetching when retrieval results are insufficient.
- Hybrid search stack combines BAAI/bge-m3 embeddings for dense retrieval and BM25 (Rank-BM25/BM25Okapi) for sparse matching; results use Reciprocal Rank Fusion (RRF).
- Implements structure-aware chunking: splits by class/method and injects parent class signature/docstrings into child chunks.
- Built on Python 3.10+ with a FastAPI/asyncio core; uses httpx for async I/O and ChromaDB for persistent vectors, and runs behind Gunicorn with Uvicorn workers.
- Provides bilingual support with dynamic prompt switching and a frontend language toggle; UI streams responses via Server-Sent Events and can render architecture diagrams with Mermaid.js.
- Public demo exists but is limited by shared API quotas; repository includes instructions for local and Docker deployment and requires GitHub and LLM API keys.
- Repository is published under an MIT license and shows community interest (38 stars at time of capture).
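The cold-start symbol map from the first bullet can be sketched with the standard-library `ast` module. The function below is illustrative, not RepoReaper's actual code:

```python
# Walk a module's AST and record top-level classes, their methods, and
# module-level functions -- the kind of lightweight map a cold start needs.
import ast

def build_symbol_map(source: str, path: str) -> dict:
    tree = ast.parse(source)
    symbols = {"path": path, "classes": {}, "functions": []}
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            methods = [n.name for n in node.body
                       if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
            symbols["classes"][node.name] = methods
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols["functions"].append(node.name)
    return symbols

demo = "class Cache:\n    def get(self): ...\n\ndef warm(): ..."
print(build_symbol_map(demo, "cache.py"))
# → {'path': 'cache.py', 'classes': {'Cache': ['get']}, 'functions': ['warm']}
```

Because `ast.parse` never executes the code, this scan is safe to run on untrusted repositories.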
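The structure-aware chunking bullet (split by class/method, inject the parent class signature and docstring into child chunks) can also be made concrete. This is a hypothetical sketch of the technique, not the project's implementation:

```python
# Split a module into per-method chunks, prepending the parent class header
# and docstring so each chunk remains self-describing for retrieval.
import ast

def chunk_by_method(source: str) -> list[str]:
    chunks = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            header = f"class {node.name}:"
            doc = ast.get_docstring(node)
            if doc:
                header += f'\n    """{doc}"""'
            for child in node.body:
                if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    chunks.append(header + "\n" +
                                  ast.get_source_segment(source, child))
    return chunks

src = '''class Store:
    """Disk-backed vector store."""
    def add(self, doc):
        return doc
'''
for chunk in chunk_by_method(src):
    print(chunk)
```

The payoff versus fixed-size text slices: a chunk for `Store.add` still tells the retriever and the LLM which class it belongs to and what that class is for.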
What to watch next
- Public demo may encounter API rate limits (HTTP 403/429); deploy locally for unrestricted use (confirmed in the source).
- How well the JIT-fetching ReAct loop performs on large, unfamiliar codebases in practice — empirical evaluation results are not provided in the source.
- Community adoption, third-party integrations, and real-world audit accuracy over time — not confirmed in the source.
Quick glossary
- Abstract Syntax Tree (AST): A tree representation of source code structure where nodes denote constructs like classes, functions, and expressions.
- Retrieval-Augmented Generation (RAG): An approach where an LLM is augmented with external retrieved documents or embeddings to ground generation in context.
- BM25: A sparse retrieval algorithm that ranks documents based on exact term matching and term frequency signals.
- Vector embeddings: Numeric representations of text or code used for semantic similarity search with dense retrieval models.
- ReAct loop: A reasoning-and-acting pattern where a model alternates between generating reasoning steps and issuing actions (tools) to gather more information.
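The BM25 entry above can be made concrete with a toy scorer. This is a teaching sketch of the Okapi BM25 formula with common defaults (k1=1.5, b=0.75), not the Rank-BM25 library's API:

```python
# Score each document for a query: per-term IDF weighted by a saturating,
# length-normalized term frequency.
import math

def bm25_scores(query: list[str], docs: list[list[str]],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    scores = []
    for doc in docs:
        score = 0.0
        for term in query:
            df = sum(1 for d in docs if term in d)       # document frequency
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)  # smoothed IDF
            tf = doc.count(term)
            norm = tf + k1 * (1 - b + b * len(doc) / avgdl)  # length norm
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [["vector", "cache", "warm"], ["ast", "parse", "cache"], ["react", "loop"]]
print(bm25_scores(["cache"], docs))  # first two docs score > 0, third is 0
```

For code search the exact-match behavior matters: an identifier like `bge_m3` either appears in a chunk or it doesn't, which is the signal BM25 contributes alongside dense embeddings.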
Reader FAQ
What does RepoReaper do?
It autonomously analyzes repository architecture using AST parsing, seeds a vector cache with key files, and performs JIT file reads during interactive code Q&A.
Which LLMs and embeddings are supported?
The project lists compatibility with OpenAI SDK-style integrations and recommends DeepSeek or SiliconFlow for LLMs and BAAI/bge-m3 for embeddings.
How can I run it?
You can run it locally with Python (recommended) or via Docker; the repository includes a quick-start and environment variable examples for GitHub and LLM API keys.
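A hedged sketch of what a local run looks like, assuming conventional names throughout: the environment variable names, the `app:app` module path, and the worker count are all assumptions; check the repository's quick-start for the exact values it expects.

```shell
# Hypothetical variable names -- the repo's .env example is authoritative.
export GITHUB_TOKEN="ghp_..."   # GitHub API key (assumption)
export LLM_API_KEY="sk-..."     # LLM provider key (assumption)

pip install -r requirements.txt
# Gunicorn + Uvicorn workers, matching the stack named in Key facts;
# "app:app" is a placeholder for the actual ASGI entry point.
gunicorn app:app -k uvicorn.workers.UvicornWorker -w 2
```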
Is there a hosted demo?
Yes; a public demo is available but uses shared API quotas and may return 403/429 rate-limit responses—local deployment is suggested for production use.
Is RepoReaper production-ready?
Not confirmed in the source.
Sources
- Show HN: RepoReaper – AST-aware, JIT-loading code audit agent (Python/AsyncIO)
- PurCL/RepoAudit: An autonomous LLM-agent …
Related posts
- Developers might’ve stuck with Stack Overflow if community were kinder
- LaTeX ‘Coffee Stains’ package documentation (PDF) — comments and notes
- KeelTest: AI-powered VS Code extension that generates pytest suites