Intelligent Document Processing
Production AI Pipeline — Deep Technical Reference
A production IDP platform processing legal contracts with a 4-agent pipeline, 5-layer hallucination detection, 3-tier eval architecture, and full observability. 94% extraction accuracy. <1% hallucination rate. 75% auto-approval. $18K/month (down from $120K).
Unified Pipeline — Both Lanes
Both extraction and query lanes route through the same pipeline: Input → Guard → Cache → Agent Pipeline (Supervisor → Research → Analysis → Critic) → Hallucination Detection → Route → Output.
User Request
React + FastAPI · <50ms · $0
Document upload (PDF/DOCX) or natural language query from compliance analysts, legal counsel, or operations leadership.
PII Redaction
Regex + DynamoDB + KMS · <1ms · $0
Detect SSN, email, phone, CC → tokenize. Reversible for authorized reviewers via KMS-encrypted DynamoDB mapping.
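The detect-and-tokenize step can be sketched as follows. This is a minimal illustration, not the platform's implementation: the regex patterns, the token format, and the `redact`/`restore` helpers are assumptions, and a plain dict stands in for the KMS-encrypted DynamoDB mapping.

```python
import re
import uuid

# Illustrative patterns only (assumed); production detectors need stricter validation.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "CC": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
}


def redact(text, vault):
    """Replace each PII match with an opaque token; record token -> original in vault."""
    for label, pattern in PII_PATTERNS.items():
        def _tokenize(match, label=label):
            token = f"[{label}_{uuid.uuid4().hex[:8]}]"
            vault[token] = match.group(0)  # hypothetical stand-in for the encrypted mapping
            return token
        text = pattern.sub(_tokenize, text)
    return text


def restore(text, vault):
    """Reverse the tokenization for an authorized reviewer."""
    for token, original in vault.items():
        text = text.replace(token, original)
    return text


vault = {}
original = "SSN 123-45-6789, contact jane@example.com"
redacted = redact(original, vault)
print(redacted)
print(restore(redacted, vault) == original)
```

Keeping the mapping outside the redacted text is what makes the redaction reversible only for callers who can read the vault.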
Injection Detection
Regex + Bedrock Guardrails · <15ms · ~$0.001
Pattern match + ML scoring. >0.7 REJECT + security alert. 0.3-0.7 SANITIZE (XML wrap). <0.3 PASS.
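The three-way threshold decision might look like the sketch below. The score is assumed to be a float in [0, 1] produced upstream; the function names and the `<user_input>` tag are hypothetical, and no Bedrock Guardrails call is shown.

```python
from xml.sax.saxutils import escape


def route_injection(score):
    """Map an injection score to an action using the thresholds stated above."""
    if score > 0.7:
        return "REJECT"    # caller would also raise a security alert
    if score >= 0.3:
        return "SANITIZE"  # 0.3-0.7 inclusive
    return "PASS"


def sanitize(user_text):
    """Wrap untrusted text in an XML element so it is treated as data, not instructions."""
    return f"<user_input>{escape(user_text)}</user_input>"


for s in (0.1, 0.5, 0.9):
    print(s, route_injection(s))
print(sanitize("ignore previous <instructions>"))
```

Escaping before wrapping prevents the input from closing the wrapper tag itself.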
Semantic Cache
ElastiCache Redis 7 · <5ms · $0 on hit
SHA-256 hash lookup. 35% hit rate. TTL 1hr. Saves ~$700/month in LLM costs. ROI: 14x.
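A hash-keyed lookup with a one-hour TTL can be sketched as below. An in-memory dict stands in for Redis, and the whitespace/case normalization before hashing is an assumption (the source only states SHA-256 and the TTL). The `HashCache` class and its injectable `clock` are hypothetical names.

```python
import hashlib
import time


class HashCache:
    """In-memory stand-in for the Redis cache: SHA-256 key, fixed TTL."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    @staticmethod
    def key(text):
        # Assumed normalization: lowercase and collapse whitespace before hashing.
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, text):
        k = self.key(text)
        entry = self._store.get(k)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[k]  # expired: evict and report a miss
            return None
        return value

    def set(self, text, value):
        self._store[self.key(text)] = (value, self.clock() + self.ttl)


cache = HashCache()
cache.set("What is the  termination clause?", {"answer": "30 days notice"})
print(cache.get("what is the termination clause?"))
```

A hit skips the Analysis agent entirely, which is where the stated LLM savings would come from.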
Supervisor
LangGraph AgentState · 3ms · $0
Deterministic state router. Pure if/else. Routes: Research → Analysis → Critic → retry or done.
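A pure if/else router over a state dict could look like the sketch below. It is written in plain Python and does not use the LangGraph API; the state field names and the retry limit are assumptions chosen for illustration.

```python
from typing import Optional, TypedDict


class AgentState(TypedDict, total=False):
    # Hypothetical fields; the real state schema is not given in the source.
    documents: Optional[list]
    analysis: Optional[dict]
    critic_passed: Optional[bool]
    retries: int


MAX_RETRIES = 2  # assumed limit


def next_step(state):
    """Choose the next node from the state alone; no LLM call involved."""
    if state.get("documents") is None:
        return "research"
    if state.get("analysis") is None:
        return "analysis"
    if state.get("critic_passed") is None:
        return "critic"
    if state["critic_passed"] is False and state.get("retries", 0) < MAX_RETRIES:
        return "retry"
    return "done"


print(next_step({}))
print(next_step({"documents": ["clause 4.2"], "analysis": {}, "critic_passed": False}))
```

Because the router only inspects state, the same input always yields the same route, which keeps it cheap and easy to test.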
Research
Aurora PostgreSQL · 15ms · $0
Hybrid search: pgvector (cosine) + BM25 (keyword) + Reciprocal Rank Fusion. P@5 = 0.85.
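Reciprocal Rank Fusion merges the two ranked lists by summing 1 / (k + rank) for each document across the lists. The sketch below assumes the vector and keyword searches have already returned ordered document IDs; the constant k = 60 is a commonly used default and an assumption here, as are the sample IDs.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score descending; break ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["c1", "c2", "c3"]  # hypothetical pgvector cosine results
bm25_hits = ["c3", "c1", "c4"]    # hypothetical keyword results
print(reciprocal_rank_fusion([vector_hits, bm25_hits]))
```

Documents that appear in both lists accumulate two terms, so agreement between the retrievers is rewarded without having to calibrate their raw scores against each other.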
Analysis
AWS Bedrock (tiered) · ~2.8s · $0.01-$0.50
The only agent that incurs LLM cost. Tiered model routing: 60% Haiku ($0.01) / 30% Sonnet ($0.05) / 10% Opus ($0.50).
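The blended cost implied by the tier split can be worked out directly. The sketch assumes the dollar figures are per-request costs and that the percentages are shares of Analysis requests; both readings are assumptions about how the source quotes them.

```python
# (model, share of requests, assumed cost per request in USD)
TIERS = [
    ("Haiku", 0.60, 0.01),
    ("Sonnet", 0.30, 0.05),
    ("Opus", 0.10, 0.50),
]


def blended_cost(tiers):
    """Expected cost per request: sum of share * cost over all tiers."""
    return sum(share * cost for _, share, cost in tiers)


total = blended_cost(TIERS)
print(f"blended cost per request: ${total:.3f}")
for name, share, cost in TIERS:
    print(f"{name}: {share * cost / total:.0%} of spend")
```

Under these assumptions the blended cost is $0.071 per request, and the 10% of traffic sent to Opus accounts for roughly 70% of spend, which is why keeping most requests on the cheaper tiers matters.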
Critic
Python + fuzz · 8ms · $0
4-6 deterministic checks. LLM-free. Source verification, field validation, date consistency, denied topics.
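Two of the listed checks, source verification and date consistency, can be sketched without any LLM call. The standard-library `difflib.SequenceMatcher` is used here as a stand-in for whatever fuzzy-matching library the pipeline uses; the 0.85 similarity threshold and the function names are assumptions.

```python
from datetime import date
from difflib import SequenceMatcher


def source_supported(value, source, threshold=0.85):
    """True if some window of the source approximately matches the extracted value."""
    value, source = value.lower().strip(), source.lower()
    if not value or len(source) < len(value):
        return False
    if value in source:
        return True  # exact match, no fuzzy comparison needed
    n = len(value)
    # Slide a window of the value's length across the source and keep the best ratio.
    best = max(
        SequenceMatcher(None, value, source[i:i + n]).ratio()
        for i in range(len(source) - n + 1)
    )
    return best >= threshold


def dates_consistent(effective, termination):
    """A contract should not terminate before it takes effect."""
    return effective <= termination


text = "This Agreement is made by Acme Corporation and Beta LLC."
print(source_supported("Acme Corporation", text))
print(source_supported("Gamma Industries", text))
print(dates_consistent(date(2024, 1, 1), date(2025, 1, 1)))
```

An extracted value with no close match in the source text is a strong hallucination signal, and the check costs only string comparisons.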
Hallucination Detection
Custom scoring · ~5ms · $0
5-layer weighted composite. Source verification (0.25), cross-validation (0.25), historical (0.20), schema (0.15), LLM confidence (0.15).
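With the stated weights, which sum to 1.0, the composite can be sketched as a weighted sum. The sketch assumes each layer emits a score in [0, 1] and that the layers are combined linearly; the key names and the sample scores are hypothetical.

```python
WEIGHTS = {
    "source_verification": 0.25,
    "cross_validation": 0.25,
    "historical": 0.20,
    "schema": 0.15,
    "llm_confidence": 0.15,
}


def composite_confidence(layer_scores):
    """Weighted sum of per-layer scores; every layer must be present."""
    missing = WEIGHTS.keys() - layer_scores.keys()
    if missing:
        raise ValueError(f"missing layer scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * layer_scores[name] for name in WEIGHTS)


example = {
    "source_verification": 1.0,
    "cross_validation": 0.9,
    "historical": 0.8,
    "schema": 1.0,
    "llm_confidence": 0.7,
}
print(round(composite_confidence(example), 2))  # 0.89
```

The model's own confidence carries only 15% of the weight, so a confidently wrong answer cannot reach a high composite score without support from the deterministic layers.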
Confidence Routing
Logic · <1ms · $0
≥0.85 auto-approve (75%). 0.60-0.85 needs review (20%). <0.60 reject (5%).
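The routing rule reduces to two threshold comparisons. In this sketch the function name and the returned labels are hypothetical; the thresholds are the ones stated above.

```python
def route_by_confidence(confidence):
    """Map a composite confidence score to a disposition."""
    if confidence >= 0.85:
        return "auto_approve"
    if confidence >= 0.60:
        return "needs_review"
    return "reject"


for score in (0.92, 0.85, 0.70, 0.59):
    print(score, route_by_confidence(score))
```

A score of exactly 0.85 is auto-approved, while anything from 0.60 up to but not including 0.85 goes to a human reviewer.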
Structured Result
FastAPI + DynamoDB · <1ms · $0
JSON + per-field confidence + full audit trail. 7-year immutable retention.