LEXIS AI - Legal Intelligence Platform
Built an AI-powered contract intelligence platform that reduced 4-hour legal reviews to 90 seconds with 96.3% accuracy, saving $2.4M annually.
The Challenge
The client's legal department was drowning in contract volume. Senior attorneys billing at $600 per hour were spending an average of four hours per contract on initial review — reading through indemnification clauses, liability caps, change-of-control provisions, and cross-references to governing law. With 1,200+ contracts flowing through the pipeline each quarter, the team was burning through $2.8 million annually on first-pass reviews alone, and the backlog was growing by 15% quarter over quarter.
The problem was compounded by inconsistency. Different attorneys flagged different risks depending on their specialization and experience level. Junior associates missed subtle carve-outs in limitation-of-liability clauses roughly 23% of the time, and cross-referencing against the firm's 50,000+ precedent library was practically impossible within the time constraints of a standard review cycle. The firm had tried rules-based contract analysis tools, but they broke down on non-standard clause structures and couldn't handle the nuance of context-dependent risk assessment.
Beyond pure cost, the bottleneck was creating strategic risk. Deals were delayed by days waiting for legal review. The general counsel needed a system that could perform clause-level risk analysis at machine speed while maintaining the accuracy standard required for Fortune 500 transactions — and critically, every citation and risk flag had to be traceable back to a specific precedent or regulatory source. Zero tolerance for hallucinated legal references.
The Solution
We designed and built a retrieval-augmented generation pipeline purpose-built for legal contract analysis. The core insight was that legal reasoning is clause-atomic: the meaning of a contract lives at the individual clause level, not at the paragraph or section level. We built a custom clause segmentation engine using a fine-tuned legal NER model that identifies clause boundaries, cross-references, and defined-term dependencies with 98.7% boundary accuracy.
The retrieval layer operates on a vector store of 50,000+ legal precedents indexed at the clause level. Each clause is embedded using a domain-adapted embedding model fine-tuned on 200,000 legal clause pairs, stored in pgvector with metadata including jurisdiction, contract type, risk category, and historical outcome. At query time, the system performs hybrid retrieval — combining dense vector similarity with BM25 keyword matching — followed by a cross-encoder re-ranker trained specifically on legal relevance judgments. This multi-stage retrieval achieves 94.2% recall at k=10, compared to 71% for vanilla cosine similarity.
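As an illustration of the hybrid step, here is a minimal sketch of merging a dense ranking with a BM25 ranking. How the production system fuses the two lists is not stated here, so reciprocal rank fusion (RRF) is used as an assumed stand-in. The function names, the constant k=60, and the clause ids are hypothetical. The helper recall_at_k shows how a figure like "recall at k=10" is typically computed.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of clause ids into one ranking.

    Each list contributes 1 / (k + rank) per id, so ids that rank well
    in both the dense and the sparse list rise to the top.
    RRF is an assumption here, not the platform's confirmed method.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, clause_id in enumerate(ranking, start=1):
            scores[clause_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def recall_at_k(retrieved, relevant, k=10):
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


# Hypothetical clause ids returned by each retriever, best first.
dense_hits = ["c1", "c2", "c3"]
bm25_hits = ["c3", "c1", "c4"]
fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
```

In a setup like the one described, the fused list would then go to the cross-encoder re-ranker, which scores each query and clause pair directly.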
The generation layer uses a multi-step verification architecture we call "cite-then-generate." Rather than asking the LLM to generate analysis and hoping it cites correctly, we first retrieve relevant precedents, then ask the model to identify which specific precedents support each risk flag, and finally generate the analysis constrained to only reference verified sources. Every claim in the output links back to a specific clause in a specific precedent document. We implemented a post-generation verification pass that cross-checks all citations against the source documents and flags any hallucinated references before they reach the attorney.
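The post-generation verification pass can be sketched as follows. This is a simplified illustration, not the production implementation: the inline citation format [precedent_id#clause_id], the ids, and the function name are all assumptions made for the example. The idea is that any citation not found in the retrieved set is treated as hallucinated.

```python
import re

# Hypothetical inline citation format: [P-1042#7.2]
CITATION_PATTERN = re.compile(r"\[([A-Za-z0-9_\-]+)#([A-Za-z0-9_.\-]+)\]")


def verify_citations(analysis, retrieved):
    """Split citations in the generated text into verified and hallucinated.

    `retrieved` maps (precedent_id, clause_id) to the clause text that was
    actually supplied to the model. A citation outside that set cannot be
    traced to a source, so it is flagged.
    """
    verified, hallucinated = [], []
    for match in CITATION_PATTERN.finditer(analysis):
        key = (match.group(1), match.group(2))
        if key in retrieved:
            verified.append(key)
        else:
            hallucinated.append(key)
    return verified, hallucinated


retrieved = {("P-1042", "7.2"): "The cap shall not apply to gross negligence."}
analysis = (
    "The cap excludes gross negligence [P-1042#7.2]. "
    "The carve-out is unusual for this sector [P-9999#1.1]."
)
verified, hallucinated = verify_citations(analysis, retrieved)
```

A fuller check would also compare quoted text against the cited clause. This sketch only confirms that each cited source exists in the retrieved set.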
The system processes contracts through a pipeline of clause extraction, risk classification (across 47 risk categories), precedent matching, comparative analysis, and final report generation. The entire pipeline runs in under 90 seconds for a standard 40-page commercial agreement. Attorneys receive a structured risk report with clause-by-clause annotations, risk scores, relevant precedents, and suggested negotiation positions — all with full citation trails.
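The stage ordering and the 90-second target can be pictured with a small pipeline runner. This is a sketch under stated assumptions: ReviewState, run_pipeline, the stage functions, and the precedent id P-0001 are hypothetical, and the stage bodies are toy stand-ins for the real models, included only so the example executes.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class ReviewState:
    """Accumulates intermediate results as a contract moves through the stages."""
    contract_text: str
    clauses: List[str] = field(default_factory=list)
    risk_flags: dict = field(default_factory=dict)   # clause index -> risk category
    precedents: dict = field(default_factory=dict)   # clause index -> precedent ids
    comparisons: dict = field(default_factory=dict)  # clause index -> comparison note
    report: Optional[str] = None
    timings: dict = field(default_factory=dict)      # stage name -> seconds


Stage = Callable[[ReviewState], ReviewState]


def run_pipeline(state: ReviewState, stages: List[Tuple[str, Stage]],
                 budget_seconds: float = 90.0) -> ReviewState:
    """Run stages in order, timing each and enforcing a total latency budget."""
    for name, stage in stages:
        start = time.perf_counter()
        state = stage(state)
        state.timings[name] = time.perf_counter() - start
        if sum(state.timings.values()) > budget_seconds:
            raise TimeoutError(f"latency budget exceeded after stage {name!r}")
    return state


# Toy stand-ins for the real stages (assumptions, not the platform's logic).
def extract_clauses(state):
    state.clauses = [c.strip() for c in state.contract_text.split("\n\n") if c.strip()]
    return state


def classify_risk(state):
    for i, clause in enumerate(state.clauses):
        if "liability" in clause.lower():
            state.risk_flags[i] = "limitation_of_liability"
    return state


def match_precedents(state):
    state.precedents = {i: ["P-0001"] for i in state.risk_flags}  # hypothetical id
    return state


def compare_to_precedents(state):
    state.comparisons = {i: "differs from matched precedent" for i in state.precedents}
    return state


def generate_report(state):
    state.report = "\n".join(
        f"clause {i}: {cat} (precedents: {', '.join(state.precedents[i])})"
        for i, cat in state.risk_flags.items()
    )
    return state


STAGES = [
    ("clause_extraction", extract_clauses),
    ("risk_classification", classify_risk),
    ("precedent_matching", match_precedents),
    ("comparative_analysis", compare_to_precedents),
    ("report_generation", generate_report),
]

contract = "1. Fees are due in 30 days.\n\n2. Liability is capped at fees paid."
state = run_pipeline(ReviewState(contract), STAGES)
```

Recording per-stage timings makes it possible to see which stage consumes the latency budget when a long contract approaches the limit.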
Technical Architecture
The platform runs on a three-tier architecture designed for low-latency inference and auditability. The ingestion tier handles document processing: PDFs and Word documents are parsed using a custom document processor that preserves structural hierarchy (articles, sections, clauses, sub-clauses). A fine-tuned legal NER model segments documents into atomic clauses and extracts defined terms, cross-references, and party identifiers. Processed clauses are embedded using our domain-adapted embedding model and stored in PostgreSQL with pgvector, alongside full metadata in a relational schema.
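To make "atomic clauses with structural hierarchy and cross-references" concrete, here is a deliberately simple sketch. The platform uses a fine-tuned NER model for this step. The regular expressions below are only a stand-in so the example is runnable, and the numbering convention, field names, and sample text are assumptions.

```python
import re

# Assumed numbering convention: "7", "7.1", "7.2.3" at the start of a line.
CLAUSE_HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(.*)$")
# Assumed cross-reference wording: "Section 7.2" or "Clause 7.2".
CROSS_REF = re.compile(r"\b(?:Section|Clause)\s+(\d+(?:\.\d+)*)")


def segment_clauses(text):
    """Split numbered contract text into clause records with parent links.

    Lines that do not start with a clause number are treated as
    continuations of the previous clause. A regex cannot match the
    robustness of a trained model (a wrapped line starting with
    "30 days" would be misread as a new clause, for example).
    """
    clauses = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = CLAUSE_HEADING.match(line)
        if match:
            number, body = match.groups()
            parent = number.rsplit(".", 1)[0] if "." in number else None
            clauses.append({"number": number, "parent": parent, "text": body})
        elif clauses:
            clauses[-1]["text"] += " " + line
    for clause in clauses:
        clause["cross_refs"] = CROSS_REF.findall(clause["text"])
    return clauses


sample = (
    "7 Liability\n"
    "7.1 The Supplier's liability is capped as set out in Section 7.2.\n"
    "7.2 The cap does not apply to fraud\n"
    "or wilful misconduct.\n"
)
clauses = segment_clauses(sample)
```

Each record could then be embedded and stored as one row, with the parent number and cross-references kept as metadata alongside the vector.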
The retrieval and reasoning tier is the core intelligence layer. It consists of a FastAPI service that orchestrates the multi-stage retrieval pipeline: query expansion, hybrid search (dense + sparse), cross-encoder re-ranking, and context assembly. The reasoning engine uses Claude and GPT-4 in a dual-model architecture — one model generates initial analysis, the other performs adversarial verification. Both models operate within strict prompt constraints that enforce citation requirements and prevent confabulation. The entire reasoning chain is logged for auditability.
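The generate-then-verify handoff between the two models, together with the audit log, might look roughly like this. The sketch is model-agnostic on purpose: the generator and verifier are passed in as plain callables, so no vendor API is assumed, and the function names, record fields, and toy models are all hypothetical.

```python
import json
from typing import Callable, Dict, List

# Assumed interfaces: each model is wrapped in a plain callable.
Generator = Callable[[str, Dict[str, str]], str]
Verifier = Callable[[str, Dict[str, str], str], List[str]]


def dual_model_review(clause: str, context: Dict[str, str],
                      generator: Generator, verifier: Verifier,
                      audit_log: List[str]) -> dict:
    """One model drafts the analysis, the other returns a list of objections.

    The full exchange is appended to the audit log as a JSON line so the
    reasoning chain can be reconstructed later.
    """
    draft = generator(clause, context)
    objections = verifier(clause, context, draft)
    record = {
        "clause": clause,
        "context_ids": sorted(context),
        "draft": draft,
        "objections": objections,
        "accepted": len(objections) == 0,
    }
    audit_log.append(json.dumps(record))
    return record


# Toy stand-ins for the two models, only to make the sketch executable.
def toy_generator(clause, context):
    return f"Risk: uncapped liability, supported by {next(iter(context))}"


def toy_verifier(clause, context, draft):
    cited = any(precedent_id in draft for precedent_id in context)
    return [] if cited else ["draft cites no retrieved precedent"]


log: List[str] = []
context = {"P-0001": "Liability is capped at fees paid."}
record = dual_model_review("Liability is unlimited.", context,
                           toy_generator, toy_verifier, log)
```

Keeping the verifier's output as a list of objections, rather than a single pass or fail flag, leaves room to show attorneys why a draft was rejected.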
The presentation tier serves a React frontend with real-time streaming of analysis results via WebSocket. Attorneys see a split-pane interface: the original contract on the left with highlighted risk clauses, and the AI analysis on the right with expandable precedent references. The system includes an annotation layer where attorneys can accept, modify, or reject AI recommendations, creating a feedback loop that continuously improves model performance. Infrastructure runs on AWS with ECS for container orchestration, ElastiCache for caching frequently accessed precedents, and CloudWatch for monitoring pipeline latency and accuracy metrics.
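One way the accept, modify, or reject annotations could feed back into model improvement is by tracking acceptance rates per risk category, so that weak categories can be prioritized for retraining. The sketch below is an assumption about how such a summary might be computed. The record fields, category names, and function name are hypothetical.

```python
from collections import Counter, defaultdict

VALID_DECISIONS = {"accept", "modify", "reject"}


def summarize_feedback(annotations):
    """Aggregate attorney decisions into per-category acceptance rates."""
    counts = defaultdict(Counter)
    for annotation in annotations:
        decision = annotation["decision"]
        if decision not in VALID_DECISIONS:
            raise ValueError(f"unknown decision: {decision!r}")
        counts[annotation["risk_category"]][decision] += 1

    summary = {}
    for category, decisions in counts.items():
        total = sum(decisions.values())
        summary[category] = {
            "total": total,
            "acceptance_rate": decisions["accept"] / total,
        }
    return summary


# Hypothetical annotation records captured by the review interface.
annotations = [
    {"risk_category": "liability_cap", "decision": "accept"},
    {"risk_category": "liability_cap", "decision": "reject"},
    {"risk_category": "indemnification", "decision": "accept"},
]
summary = summarize_feedback(annotations)
```

Modified and rejected recommendations are the most informative examples, since they pair a model output with an attorney's correction.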
Results
Contract review time reduced from 4 hours to 90 seconds per document
96.3% risk detection accuracy on blind tests against senior associates
$2.4M in annual direct cost savings from reduced attorney hours on first-pass review
Zero hallucinated legal citations in production across 4,800+ reviews
Tech Stack
This platform has fundamentally changed how our legal team operates. What used to take senior attorneys half a day now takes 90 seconds — and the accuracy is remarkable. The fact that every risk flag traces back to a specific precedent gives our partners the confidence to rely on it for high-stakes transactions.