Cutting case-research time from 6 hours to 22 minutes with a RAG system over 80,000 court filings
RAG legal-research assistant for a litigation boutique.
“We did not buy software. We built a research associate that reads everything we have ever written and never forgets a case.”
The challenge
The client is a 12-attorney litigation boutique. Every new case opened with two to six hours of associate time spent reading prior filings to understand opposing counsel's patterns, the judge's tendencies, and the precedents being argued. That work was billable, but it was also limiting the firm's capacity to take on new matters.
Existing legal-tech AI tools were too generic: they searched public case law but had no access to the firm's accumulated 80,000-document corpus of filings, briefs, and internal memos. The partners refused to upload anything client-confidential to a third-party vendor.
How we approached it
Days 1–4: Ingest pipeline. Built a Lambda-driven ingest pipeline that watched the firm's S3 bucket of court filings (PDF and DOCX), with OCR via Textract for scanned documents. Chunked at the section level with overlap, embedded with Cohere's v3 embedding model, and indexed in Pinecone. Tagged every chunk with jurisdiction, court, judge, opposing counsel, date, and the firm's internal matter ID.
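As a rough illustration of section-level chunking with overlap and per-chunk tagging, here is a minimal sketch. It assumes the filing has already been split into sections of plain text; the window sizes, the `Chunk` type, and the function names are hypothetical and not taken from the firm's pipeline.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_section(text: str, max_words: int = 200, overlap: int = 40) -> list[str]:
    """Split one section into word windows that share `overlap` words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    if not words:
        return []
    step = max_words - overlap
    windows = []
    for start in range(0, len(words), step):
        windows.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reaches the end of the section
    return windows


def chunk_filing(sections: list[str], filing_meta: dict, **kwargs) -> list[Chunk]:
    """Chunk every section and copy the filing-level tags onto each chunk."""
    chunks = []
    for section_index, section in enumerate(sections):
        for window_index, window in enumerate(chunk_section(section, **kwargs)):
            # Assumed tags: the filing's own metadata plus the chunk's position.
            meta = dict(filing_meta, section=section_index, window=window_index)
            chunks.append(Chunk(window, meta))
    return chunks
```

Copying the filing-level tags (judge, court, matter ID, and so on) onto every chunk is what lets a later query filter by those fields before any semantic matching happens.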
Days 5–8: Retrieval + reranker. Hybrid retrieval: vector search for semantic match plus BM25 for exact matching of citation strings (Westlaw-style citations). Cohere reranker on the top 50 candidates to surface the 8 most relevant. Claude Sonnet 4 as the answer model with a strict "answer only from the provided documents; cite every claim with the matter ID + page number" system prompt.
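The write-up does not say how the vector and BM25 result lists were merged before reranking. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as the method actually used. The function name, the constant `k = 60`, and the chunk IDs are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60, top_n: int = 50
) -> list[str]:
    """Merge several ranked lists of chunk IDs into one candidate list."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # A chunk earns more credit the higher it ranks in each list.
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is stable.
    ordered = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return ordered[:top_n]


# Hypothetical usage: IDs from the vector index and from the BM25 index.
vector_hits = ["a", "b", "c"]
bm25_hits = ["c", "a", "d"]
candidates = reciprocal_rank_fusion([vector_hits, bm25_hits], top_n=3)
```

RRF uses only rank positions, so it sidesteps the problem that cosine similarities and BM25 scores sit on different scales. The fused candidates would then go to the reranker, which picks the final 8.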
Days 9–10: UI and evals. Lightweight Next.js app inside the firm's Cloudflare-protected intranet. Used 100 historical research questions as the eval set, scored against attorney-validated answers. Hit 99.2% citation accuracy on the first release.
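The exact definition of citation accuracy is not given here. One plausible reading is the share of citations in the model's answers that also appear in the attorney-validated set for the same question, which the sketch below computes. The `[MATTER-ID, p. N]` citation format, the regular expression, and the function names are assumptions for illustration.

```python
import re

# Assumed citation format: "[M-101, p. 4]" (matter ID, then page number).
CITATION_PATTERN = re.compile(r"\[(?P<matter>[A-Z0-9-]+), p\. (?P<page>\d+)\]")


def extract_citations(answer: str) -> set[tuple[str, int]]:
    """Return the distinct (matter ID, page) pairs cited in one answer."""
    return {(m["matter"], int(m["page"])) for m in CITATION_PATTERN.finditer(answer)}


def citation_accuracy(
    answers: list[str], validated: list[set[tuple[str, int]]]
) -> float:
    """Fraction of cited (matter, page) pairs that the attorneys validated."""
    if len(answers) != len(validated):
        raise ValueError("need one validated citation set per answer")
    correct = total = 0
    for answer, gold in zip(answers, validated):
        cited = extract_citations(answer)
        total += len(cited)
        correct += len(cited & gold)
    return correct / total if total else 0.0


# Hypothetical two-question eval set.
answers = [
    "The court held X [M-101, p. 4] and later Y [M-202, p. 9].",
    "See the earlier motion [M-101, p. 7].",
]
validated = [{("M-101", 4), ("M-202", 9)}, {("M-101", 8)}]
accuracy = citation_accuracy(answers, validated)
```

Because the score is counted per citation rather than per question, a set of 100 questions can yield a figure that is not a whole percentage.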
The whole system runs in the firm's AWS account. No client data ever leaves their tenant.
The outcome
Median case-research time dropped from 6 hours to 22 minutes. Associates now use the assistant as a first pass and spend their billable hours on the harder analysis: strategy, drafting, and argumentation. The firm took on three additional matters in the quarter following launch without adding headcount.
The managing partner estimated $340k/year in recaptured billable hours, now redirected to higher-value work. The system pays for itself in its first month of use.
Stack
- Anthropic Claude Sonnet 4
- Cohere reranker
- Pinecone
- AWS Lambda
- Linear-driven workflow
Team
One senior engineer + a legal-domain reviewer
Handoff
Full code in their AWS org with IaC templates. Eval harness re-runnable on every model change. Quarterly model-upgrade reviews. The firm now adds new jurisdictions to the index in-house.