Intelligent Document Processing with LLMs: From PDFs to Structured Data [Case Study]
Executive Summary / Key Results
NorthBridge Logistics, a mid-market freight forwarder processing 30,000+ pages of shipment documents each month, transformed its back-office operations with an LLM-powered intelligent document processing pipeline. Moving from brittle OCR templates to document AI with structured LLM extraction, they achieved measurable, sustained impact across speed, accuracy, and cost.
- 93.7% field-level accuracy across 28 key fields; 98.2% on critical totals and dates
- 85% reduction in manual data-entry hours (920 to 138 hours per month)
- 3.4x faster cycle time from receipt to system of record (36 hours to 10.5 hours)
- $512,000 in annualized savings and a 6-week payback; 212% first-year ROI
- 40% fewer invoice disputes caused by data errors and missing documentation
- 99.9% successful redaction of sensitive PII with governed access and audit logs
This case study explains how we designed, deployed, and scaled LLM extraction to consistently convert PDFs, scans, and emails into clean, structured data that flows into NorthBridge’s ERP and transportation management system (TMS).
Background / Challenge
NorthBridge Logistics moves thousands of shipments each week. Every move generates paperwork: bills of lading, commercial invoices, packing lists, proofs of delivery, customs forms, certificates of origin, and carrier rate confirmations. Much of it arrives as PDFs and image scans attached to emails or dropped into shared folders.
Before the project, a team of 12 document specialists typed key information into the ERP and TMS. The legacy process relied on template-based OCR set up for a handful of standard layouts. Any time a trading partner changed a logo, font, or column order, fields broke. Handwritten notes, skewed scans, and multi-language documents led to frequent exceptions. The backlog kept growing and SLAs slipped.
The impact showed up on the P&L and in the customer experience:
- Average processing time from receipt to system of record stretched to 36 hours.
- Template maintenance consumed 10–15 hours per week and still missed edge cases.
- Error-induced disputes rose, tying up cash and straining relationships.
- The team spent 920 hours each month on manual entry and rework, limiting growth.
NorthBridge’s leaders wanted a solution that could handle real-world variability, scale with demand, and integrate cleanly with existing systems. They needed accuracy, not a demo; reliability, not another dashboard.
Solution / Approach
We designed an intelligent document processing solution centered on LLM extraction, built to withstand messy, real-life documents and evolving partner formats. Rather than forcing every vendor into a fixed template, the pipeline treats each document as a structured information problem that can be understood with context, governed with confidence thresholds, and verified with business rules.
The core approach combined four pillars:
- Layout-aware OCR and classification. We first converted all inputs into high-fidelity text with coordinates, preserving tables and reading order. A lightweight classifier routed each file to the right extractor for bills of lading, invoices, packing lists, proofs of delivery, or customs forms. This ensured downstream prompts had the right schema expectations.
- LLM extraction to a strict schema. We prompted an LLM to read the document holistically and return a structured JSON payload matching a predefined schema for each document type. The schema specified field names, types, units, normalization rules, and required/optional indicators. Rather than asking the model to be creative, we asked it to be consistent.
- Business rule validation and grounding. Extracted values were validated against rules from NorthBridge’s domain. For example, item quantities multiplied by unit weight needed to match total weight within a tolerance. Vendor names were grounded using a canonical list from the ERP. Purchase order numbers were cross-checked against open orders. Where mismatches occurred, the system flagged low confidence and routed the case for review.
- Human-in-the-loop where it matters. We introduced a review interface that showed the original document side by side with extracted fields, model highlights, and validation messages. Reviewers corrected only the few low-confidence fields, not entire pages. This kept people in control while letting the system handle the boring majority.
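To make the validation-and-grounding pillar concrete, here is a minimal sketch of what such a rules layer can look like. The field names, canonical vendor list, and 2% tolerance are illustrative assumptions for this article, not NorthBridge’s production schema; the pattern is what matters: each rule returns a pass/fail plus a message, and any failure routes the document to review.

```python
# Sketch of a business-rule validation layer (illustrative names and values).
# Each rule returns a (passed, message) pair; failures route the case to review.

CANONICAL_VENDORS = {"Acme Freight", "Blue Line Carriers"}  # e.g. synced from the ERP
WEIGHT_TOLERANCE = 0.02  # hypothetical 2% tolerance on weight cross-checks

def check_weights(line_items, total_weight):
    """Quantities x unit weight must match the stated total within tolerance."""
    if total_weight <= 0:
        return False, "total weight missing or zero"
    computed = sum(item["qty"] * item["unit_weight"] for item in line_items)
    drift = abs(computed - total_weight) / total_weight
    return drift <= WEIGHT_TOLERANCE, f"weight drift {drift:.1%}"

def check_vendor(name):
    """Extracted vendor names are grounded against the ERP's canonical list."""
    return name in CANONICAL_VENDORS, f"vendor {name!r}"

def validate(extracted):
    """Return the list of failed-rule messages; empty means auto-post."""
    failures = []
    for passed, msg in (
        check_weights(extracted["line_items"], extracted["total_weight"]),
        check_vendor(extracted["vendor"]),
    ):
        if not passed:
            failures.append(msg)
    return failures
```

In production the rule set was larger (PO cross-checks, tax math, date sanity), but every rule followed this same deterministic shape, which made failures explainable to reviewers.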
For readers planning their own rollout, we cover how these building blocks connect across data sources, APIs, and workflows in our guide, Integrations & Intelligent Automation: A Complete Guide.
Implementation
The program ran in four phases over eight weeks, with measurable checkpoints and a controlled pilot before the full rollout.
Phase 1: Discovery and baseline (Weeks 1–2). We audited two years of documents, sampled 5,000 files across eight document types, and established baseline metrics. Legacy OCR achieved 74% field-level accuracy across the sampled set. Average processing time: 36 hours. Manual effort: 920 hours per month. Cost per document, inclusive of labor and software: $2.20. We also identified where errors hurt most: invoice totals, dates, and mismatched purchase orders.
Phase 2: Prototype and schema design (Weeks 3–4). We defined extraction schemas for each document type: 28 fields for invoices, 19 for bills of lading, and 14 for proofs of delivery, with normalized units and controlled vocabularies. We created few-shot prompts with examples covering common layouts and edge cases, and added output-format instructions so the model returned consistent JSON. Early tests reached 91% F1 on invoices and 88% on bills of lading.
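The prompt assembly from this phase can be sketched as follows. The schema fields and the few-shot example are invented for illustration, not NorthBridge’s actual prompts; the point is the structure: a strict schema, worked examples that anchor formats and edge cases, and an explicit JSON-only instruction.

```python
import json

# Illustrative invoice schema (a subset; field names are assumptions for this article).
INVOICE_SCHEMA = {
    "invoice_number": "string, exactly as printed",
    "invoice_date": "string, ISO 8601 (YYYY-MM-DD)",
    "currency": "string, three-letter ISO 4217 code",
    "total": "number, decimal",
}

# One hypothetical few-shot example covering a European number format.
FEW_SHOT = [
    {"snippet": "Inv #A-1043, 12 Mar 2024, Total EUR 1.204,50",
     "output": {"invoice_number": "A-1043", "invoice_date": "2024-03-12",
                "currency": "EUR", "total": 1204.50}},
]

def build_prompt(document_text):
    """Assemble a strict-schema extraction prompt for one document."""
    parts = [
        "Extract the fields below from the invoice and return ONLY a JSON",
        "object matching this schema. Use null for fields you cannot find.",
        json.dumps(INVOICE_SCHEMA, indent=2),
    ]
    for ex in FEW_SHOT:  # examples anchor layout variations and edge cases
        parts.append(f"Document:\n{ex['snippet']}")
        parts.append(f"JSON:\n{json.dumps(ex['output'])}")
    parts.append(f"Document:\n{document_text}\nJSON:")
    return "\n\n".join(parts)
```

Because the schema travels inside the prompt, adding a field or tightening a normalization rule is a one-line change rather than a template rebuild.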
Phase 3: Pilot and integration (Weeks 5–7). We processed 10,000 live documents through the new pipeline while keeping the legacy system as a fallback. We wired the outputs to NorthBridge’s ERP and TMS using event-driven APIs, and moved low-confidence exceptions to a reviewer queue. We also built validation layers to compare line totals, taxes, and dates against ERP data. This is where integration pays off; the practical patterns we used echo those in our playbook, AI Integration with CRM, ERP, and Help Desk: A Practical Playbook.
Phase 4: Rollout and observability (Week 8). We cut over to production. The team received dashboards for throughput, accuracy by field, model confidence, and exception rates. We added guardrails: performance alerts, PII redaction checks, and audit logs. Retraining and schema updates now follow a change-management process aligned with IT and compliance.
Under the hood, a few design choices made the difference:
- Structured output with strict typing reduced post-processing effort. For example, currency fields were always decimals with three-letter ISO codes, and dates normalized to ISO 8601.
- Context windows included both text and lightweight metadata, such as vendor lists and open purchase orders, to ground the model.
- Confidence scores combined model logit signals, validation rule outcomes, and discrepancy checks against ERP data to drive routing decisions.
- The reviewer UI showed why the model believed a field value was correct, with the exact text span and coordinates highlighted. This boosted reviewer speed and trust.
- Sensitive PII was redacted at the pre-processing stage and governed with role-based access, with a full audit trail for compliance.
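The confidence-scoring design choice above can be sketched as a simple routing function. The threshold and the multiplicative penalties are hypothetical values chosen for illustration; the production system tuned these against pilot data per document type.

```python
# Hedged sketch of per-field routing: combine model confidence with
# validation and ERP-agreement signals to decide auto-post vs review.
# Threshold and penalty factors are illustrative assumptions.

REVIEW_THRESHOLD = 0.85

def route_field(model_confidence, validation_passed, matches_erp):
    """Return ('auto' | 'review', combined_score) for one extracted field."""
    score = model_confidence
    if not validation_passed:
        score *= 0.5   # a failed business rule halves effective confidence
    if not matches_erp:
        score *= 0.6   # disagreement with ERP data is a strong down-weight
    decision = "auto" if score >= REVIEW_THRESHOLD else "review"
    return decision, score
```

Routing per field, rather than per document, is what let reviewers correct a handful of values instead of re-keying whole pages.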
As mailboxes and shared folders continued to receive documents, RPA bots handled intake logistics like downloading attachments and filing them by shipment number. These bots triggered the LLM pipeline and updated status in case management. If you are orchestrating bots and agents across the workflow, our guide RPA + AI: Orchestrating Autonomous Agents and Bots for End-to-End Automation is a helpful reference.
Mini-case: The vendor onboarding packet
During the pilot, NorthBridge onboarded a new carrier whose packets included W-9 forms, insurance certificates, and bank letters in a single combined PDF. Historically, this was a manual outlier. We added a light classification step to split the packet and route each subdocument to the correct schema. The LLM extracted tax IDs and coverage dates, and the validation layer compared them to requested values. Human reviewers only checked two fields flagged as low confidence because of faint scan quality. What used to take 25 minutes per packet dropped to under 4 minutes end to end.
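The light classification step described above can be sketched as follows. The keyword rules here are invented for this example (the real step used a trained classifier); the sketch shows the shape of the logic: detect each page’s type, then group consecutive same-type pages into subdocuments for routing to the right schema.

```python
# Illustrative packet-splitting sketch; keyword rules are assumptions,
# standing in for a trained page classifier.

ROUTES = {
    "w9": ("W-9", "Request for Taxpayer"),
    "insurance_certificate": ("Certificate of Insurance", "Coverage"),
    "bank_letter": ("Bank letter", "Account verification"),
}

def classify_page(page_text):
    """Return the first document type whose keywords appear on the page."""
    lowered = page_text.lower()
    for doc_type, keywords in ROUTES.items():
        if any(k.lower() in lowered for k in keywords):
            return doc_type
    return "unknown"

def split_packet(pages):
    """Group consecutive pages of the same type into (type, pages) subdocuments."""
    groups = []
    for text in pages:
        doc_type = classify_page(text)
        if groups and groups[-1][0] == doc_type:
            groups[-1][1].append(text)
        else:
            groups.append((doc_type, [text]))
    return groups
```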
Results with specific metrics
Accuracy. Across 10,000 pilot documents, the pipeline delivered 93.7% field-level accuracy, with 98.2% on critical fields including invoice total, invoice date, and PO number. Bills of lading reached 92.1% overall accuracy despite varied carrier formats and frequent handwritten notations. For proofs of delivery, signature detection and date capture achieved 96.4% accuracy.
Speed. Average processing time declined from 36 hours to 10.5 hours, a 71% improvement. Same-day processing rose from 18% of volume to 64%. Seasonal peaks, which previously created week-long backlogs, were absorbed without overtime.
Cost. Manual effort fell 85%, from 920 to 138 hours per month, saving 782 hours monthly. At a conservative fully loaded rate, that equates to approximately 4.9 full-time equivalents redeployed to revenue-generating tasks. All-in cost per document dropped from $2.20 to $0.68, including compute, storage, and a small human-in-the-loop budget.
Cash cycle and disputes. Data-entry errors were a major source of invoice disputes and delayed payments. With cleaner data and automated attachment of proof documents, NorthBridge saw a 40% reduction in disputes tied to data mismatches. Days sales outstanding improved by 3.2 days for the affected customer segments.
Compliance and governance. The platform redacted PII automatically for 99.9% of instances before reviewers saw documents, and maintained full audit logs of who viewed or changed data fields. Quarterly internal audits reported zero high-severity issues related to document handling.
ROI. The program paid for itself in six weeks. First-year ROI reached 212%, driven by labor savings, reduced dispute write-offs, and avoided costs of template maintenance.
People experience. Reviewers moved from monotonous typing to quality assurance focused on exceptions. Average time to resolve a flagged field fell to 41 seconds because the UI highlighted the exact evidence. Employee satisfaction scores in the back office rose 18 points during the first quarter after rollout.
Key Takeaways
LLM extraction is not magic; it is an engineering approach that, when combined with the right guardrails, handles the messy middle where legacy OCR fails and humans burn out. NorthBridge’s results are repeatable because they rest on fundamentals:
- Start with your documents, not a generic model. Define the schema, examples, and validation that reflect your real work.
- Ground the model in your business data. Use vendor lists, open POs, and policy rules to cross-check and boost confidence.
- Keep people in the loop at the right moments. Route low-confidence fields for quick review rather than dumping entire documents.
- Integrate cleanly with your systems of record. Event-driven APIs, idempotent upserts, and clear error handling reduce downstream friction.
- Measure outcomes, not demos. Track accuracy by field, time to post, exception rates, and cost per document to prove ROI and guide improvements.
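The "idempotent upserts" takeaway deserves a concrete sketch, because it is what makes retries safe: re-posting the same extraction after a timeout or a reviewer correction must update, never duplicate. The table layout and key choice below are illustrative (SQLite standing in for the system of record), not NorthBridge’s ERP interface.

```python
# Minimal idempotent-upsert sketch: the document's natural key
# (invoice number) drives INSERT-or-UPDATE, so retries are harmless.
import sqlite3

def upsert_invoice(conn, invoice):
    """Insert the extracted invoice, or update it if the key already exists."""
    conn.execute(
        """INSERT INTO invoices (invoice_number, vendor, total)
           VALUES (:invoice_number, :vendor, :total)
           ON CONFLICT(invoice_number) DO UPDATE SET
             vendor = excluded.vendor,
             total = excluded.total""",
        invoice,
    )
    conn.commit()
```

Paired with event-driven delivery, this pattern means a duplicate webhook or a pipeline replay converges to the same final record instead of raising reconciliation work downstream.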
If you are assessing the broader automation landscape, see how document AI fits into an ecosystem of connectors, orchestration, and human oversight in our practical guide, Integrations & Intelligent Automation: A Complete Guide. And when it is time to connect document AI outputs to your CRM, ERP, or help desk, follow the steps in our hands-on playbook, AI Integration with CRM, ERP, and Help Desk: A Practical Playbook. For end-to-end workflows that mix bots, agents, and people, learn how to design resilient orchestration with our overview, RPA + AI: Orchestrating Autonomous Agents and Bots for End-to-End Automation.
About NorthBridge Logistics
NorthBridge Logistics is a North American freight forwarder specializing in multi-modal shipments for mid-market manufacturers and retailers. With operations spanning customs brokerage, warehousing, and last-mile delivery, NorthBridge manages thousands of shipments each week and serves customers across consumer goods, industrial equipment, and healthcare.
As the business scaled, document variability and manual data entry became a bottleneck. By partnering with our team to implement LLM-powered intelligent document processing, NorthBridge unlocked a faster, more reliable back office that can keep pace with growth while improving the customer experience.
Ready to turn PDFs and scans into clean, structured data your systems trust? Schedule a consultation and we will map your documents, design your extraction schema, and stand up a pilot that proves value in weeks, not months.
![Intelligent Document Processing with LLMs: From PDFs to Structured Data [Case Study]](https://images.pexels.com/photos/3619325/pexels-photo-3619325.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=650&w=940)