New: Now supporting Gemini 2.5 Flash for lightning-fast document extraction.

Build and deploy OCR pipelines.

ocr-use is a developer platform that turns PDFs into structured data at lightning speed.

Processing Pipeline

S3 to database. Automatically.

Watch your documents flow through our intelligent pipeline—from raw PDFs to structured, validated data in your database.

S3 Bucket (invoice_2024.pdf) → OCR Processing → Typed JSON (Zod validated) → Insert → Database (PostgreSQL)

Simple Integration

From S3 to database in one call.

CLI or API. Your choice.

# Process a PDF from S3
ocr-use process \
--source s3://bucket/invoice.pdf \
--schema ./invoice.schema.ts \
--output postgres://db
✓ Processed in 2.3s
✓ Extracted 47 fields
✓ Validated with Zod
✓ Inserted to database

Intelligent Routing

Best model for every document.

We automatically route each PDF through the optimal OCR engine based on document type and complexity.

invoice_2024.pdf: Gemini 2.5 Flash
Structured tables detected → Fast extraction
handwritten_form.pdf: PaddleOCR
Handwriting detected → Specialized model
research_paper.pdf: Docling
Complex layout → Document understanding
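
To make the routing idea concrete, here is a minimal sketch of how the three examples above could map to a decision function. The `DocumentFeatures` flags, the priority order (handwriting first, then layout), and the fallback to Gemini 2.5 Flash are all assumptions for illustration; the page does not describe how ocr-use actually detects document type or breaks ties.

```typescript
// Illustrative only: feature flags, priority order, and the default engine
// are assumptions, not ocr-use's documented routing logic.
type Engine = "Gemini 2.5 Flash" | "PaddleOCR" | "Docling";

interface DocumentFeatures {
  hasStructuredTables: boolean;
  hasHandwriting: boolean;
  hasComplexLayout: boolean;
}

interface RoutingDecision {
  engine: Engine;
  reason: string;
}

function routeDocument(features: DocumentFeatures): RoutingDecision {
  // Assumed priority: the most specialized need wins.
  if (features.hasHandwriting) {
    return { engine: "PaddleOCR", reason: "Handwriting detected" };
  }
  if (features.hasComplexLayout) {
    return { engine: "Docling", reason: "Complex layout" };
  }
  // Assumed default: the fast path handles tables and everything else.
  return {
    engine: "Gemini 2.5 Flash",
    reason: features.hasStructuredTables
      ? "Structured tables detected"
      : "No special handling needed",
  };
}

const decision = routeDocument({
  hasStructuredTables: true,
  hasHandwriting: false,
  hasComplexLayout: false,
});
console.log(`${decision.engine} (${decision.reason})`);
```

Returning the reason alongside the engine keeps each routing choice explainable, which mirrors the "detected → action" pairs listed above.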

Built for AI startups

Processing 2,000 to 10,000 PDFs daily? We handle the complexity so you can focus on your product.

Zero Configuration

Scales instantly from 1 to 10,000 PDFs per day. No infrastructure to manage, no models to fine-tune.

Type-Safe Schemas

Define your data structure with Zod or OpenAPI. We validate every extraction and catch errors before they hit your database.
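
As a rough idea of what a Zod schema for an invoice might look like, here is a small sketch. The field names (`invoiceNumber`, `issuedOn`, `total`, `lineItems`) and the `validateExtraction` helper are hypothetical. How ocr-use expects a schema file to be structured or exported is not stated on this page, so treat this as plain Zod usage rather than the platform's required format.

```typescript
import { z } from "zod";

// Hypothetical invoice shape; every field name here is an assumption.
const LineItemSchema = z.object({
  description: z.string().min(1),
  quantity: z.number().positive(),
  unitPrice: z.number().nonnegative(),
});

const InvoiceSchema = z.object({
  invoiceNumber: z.string().min(1),
  issuedOn: z.string().regex(/^\d{4}-\d{2}-\d{2}$/), // ISO date, e.g. 2024-03-15
  total: z.number().nonnegative(),
  lineItems: z.array(LineItemSchema).min(1),
});

// The static type is derived from the schema, so the two cannot drift apart.
type Invoice = z.infer<typeof InvoiceSchema>;

type ValidationResult =
  | { ok: true; data: Invoice }
  | { ok: false; problems: string[] };

// Check raw extraction output before anything is written to the database.
function validateExtraction(raw: unknown): ValidationResult {
  const result = InvoiceSchema.safeParse(raw);
  if (result.success) {
    return { ok: true, data: result.data };
  }
  return {
    ok: false,
    problems: result.error.issues.map(
      (issue) => `${issue.path.join(".")}: ${issue.message}`
    ),
  };
}
```

Using `safeParse` instead of `parse` returns failures as data rather than throwing, so a bad extraction can be logged or retried without interrupting a batch.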

Token Pricing

Pay per token, not per page. Process a 100-page document or a 1-page receipt—you only pay for what you extract.
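
For a feel of the arithmetic, here is a small estimate helper. The rate constant `PRICE_PER_1K_TOKENS_USD` is a made-up placeholder, not an ocr-use price, and the sketch assumes billing is linear in token count, which this page implies but does not spell out.

```typescript
// Placeholder rate for illustration only; not an actual ocr-use price.
const PRICE_PER_1K_TOKENS_USD = 0.01;

// Estimate cost assuming billing is strictly proportional to token count.
function estimateCostUsd(
  tokens: number,
  pricePer1kUsd: number = PRICE_PER_1K_TOKENS_USD
): number {
  if (!Number.isFinite(tokens) || tokens < 0) {
    throw new RangeError("tokens must be a non-negative, finite number");
  }
  return (tokens / 1000) * pricePer1kUsd;
}

// Hypothetical token counts: page count never enters the calculation.
const receiptTokens = 400; // assumed: short 1-page receipt
const reportTokens = 60_000; // assumed: 100-page document
console.log(estimateCostUsd(receiptTokens).toFixed(4));
console.log(estimateCostUsd(reportTokens).toFixed(4));
```

Under this model a sparse 100-page document with little extractable text could cost less than a dense 10-page one, since only the token count matters.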