Extraction Overview

Learn how Ottimate extracts structured data from invoice documents

Document extraction is the foundation of Ottimate’s invoice automation platform. Ottimate uses a proprietary AI model trained specifically on invoices and purchase documents to extract structured data from uploaded files. This enables downstream automation including coding, approvals, and ERP synchronization.

How extraction works

The extraction pipeline begins when a document is uploaded to Ottimate. Documents can be uploaded via:

  • Email forwarding - Send invoices to your dedicated Ottimate inbox
  • Desktop upload - Upload invoices to an Ottimate Location via the Dashboard UI
  • API upload - Upload files programmatically using the POST /invoices/upload endpoint

Supported file formats

Ottimate accepts the following file types:

  • PDF
  • JPG / JPEG
  • PNG
  • TIFF

The maximum file size is 25 MB per file.
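The format and size limits above can be checked on the client before a file is submitted. The following is a minimal sketch, not part of the Ottimate API: the function name check_upload_candidate is hypothetical, the ".tif" alias is an assumption, and the 25 MB limit is assumed to mean 25 * 1024 * 1024 bytes, since the exact definition is not stated here.

```python
from pathlib import Path
from typing import List

# File types listed above. ".tif" is an assumed alias for TIFF and is not
# confirmed by this page.
ALLOWED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".tiff", ".tif"}

# Assumption: 25 MB is interpreted as 25 * 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 25 * 1024 * 1024


def check_upload_candidate(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(
            f"file is {size_bytes} bytes, over the {MAX_FILE_SIZE_BYTES}-byte limit"
        )
    return problems
```

A check like this only looks at the file name and size; it does not verify that the file contents match the extension.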

Uploads vs. invoices

Understanding the distinction between uploads and invoices is critical when working with the Ottimate API.

When you upload a document to Ottimate, it creates an upload record, not an invoice. The upload represents the raw file awaiting extraction. An invoice is only created after extraction completes and structured data has been captured from the document.

This means:

  • An upload exists immediately after you submit a file
  • An invoice only exists after successful extraction
  • You should poll for upload status to track extraction progress
  • Once extraction completes, you can retrieve the invoice using the invoice ID

Extraction processing

Extraction is asynchronous. When you upload a document, Ottimate queues it for processing and returns immediately with an upload ID. The actual extraction happens in the background.

Uploaded files are processed through one of two pipelines:

  • AI extraction - Fully automated extraction using Ottimate’s proprietary model. This is the faster option and is suitable for most standard invoice formats.
  • Human-in-the-loop extraction - A quality-focused pipeline where trained operators verify and correct AI-extracted data. This option provides higher accuracy for complex or non-standard documents but has longer processing times.

Your account’s extraction pipeline is configured during onboarding based on the account tier.

Checking extraction status

To monitor the progress of your uploads, call the GET /invoices/uploads endpoint. This returns a list of uploads that have been submitted but not yet fully processed.

Check upload status
curl -X GET "https://api.ottimate.com/invoices/uploads" \
  -H "X-Api-Key: your-api-key-here" \
  -H "X-API-Version: 1.0.0"

The response includes the status of each upload:

Response
{
  "version": "1.0.0",
  "count": 1,
  "results": [
    {
      "id": "b3e3e3e3-3e3e-4e3e-8e3e-3e3e3e3e3e3e",
      "filename": "invoice-abc.pdf",
      "status": "pending_processing",
      "date": "2025-10-10T12:00:00Z",
      "no_of_pages": 1,
      "ottimate_location_name": "Downtown Location",
      "preview_url": "https://ottimate.com/#b3e3e3e3-3e3e-4e3e-8e3e-3e3e3e3e3e3e"
    }
  ]
}
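The same request and response can be handled from Python with the standard library alone. This is a sketch under stated assumptions: the Upload dataclass and the function names are hypothetical helpers, only a subset of the response fields is modeled, and error handling (non-200 responses, timeouts, pagination, if any) is left out.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import List

UPLOADS_URL = "https://api.ottimate.com/invoices/uploads"


@dataclass
class Upload:
    """Subset of the fields shown in the sample response."""
    id: str
    filename: str
    status: str
    preview_url: str


def build_uploads_request(api_key: str, api_version: str = "1.0.0") -> urllib.request.Request:
    """Build (but do not send) the GET request with the two documented headers."""
    return urllib.request.Request(
        UPLOADS_URL,
        headers={"X-Api-Key": api_key, "X-API-Version": api_version},
        method="GET",
    )


def parse_uploads(body: str) -> List[Upload]:
    """Parse a response body shaped like the sample into Upload records."""
    payload = json.loads(body)
    return [
        Upload(
            id=item["id"],
            filename=item["filename"],
            status=item["status"],
            preview_url=item["preview_url"],
        )
        for item in payload.get("results", [])
    ]


def fetch_uploads(api_key: str) -> List[Upload]:
    """Send the request and parse the result. Error handling is omitted."""
    request = build_uploads_request(api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_uploads(response.read().decode("utf-8"))
```

Keeping request construction and parsing separate from the network call makes both easy to test without contacting the API.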

Upload status values

Status | Description
--- | ---
pending_processing | The upload is queued and awaiting extraction
split | The uploaded file contains multiple pages that require manual review to confirm whether they belong to the same invoice or should be separated into distinct invoices
grouped | Page grouping has been confirmed and the upload is ready for invoice creation
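One way to act on these status values is to group pending uploads by status and surface the ones that are waiting on a manual page-grouping review. The sketch below works on an already-parsed response dictionary; the function names are hypothetical, and because this page does not say whether other status values exist, unknown statuses are kept rather than rejected.

```python
from typing import Dict, List

# Per the table above, "split" uploads wait on a manual review of page grouping.
NEEDS_MANUAL_REVIEW = {"split"}


def group_ids_by_status(response: dict) -> Dict[str, List[str]]:
    """Map each status value to the IDs of the uploads currently in that status."""
    grouped: Dict[str, List[str]] = {}
    for upload in response.get("results", []):
        grouped.setdefault(upload["status"], []).append(upload["id"])
    return grouped


def review_links(response: dict) -> List[str]:
    """Return preview_url values for uploads that need a manual review."""
    return [
        upload["preview_url"]
        for upload in response.get("results", [])
        if upload["status"] in NEEDS_MANUAL_REVIEW
    ]
```

The links returned by review_links could, for example, be forwarded to whoever confirms page groupings.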

Using the preview URL

The preview_url field provides a direct link to the Ottimate extraction UI where users can:

  • Review extracted data and make corrections
  • Confirm page groupings for multi-page uploads
  • Manually export the invoice once review is complete

Once an upload no longer appears in the GET /invoices/uploads response, extraction has completed and the invoice is available through the invoices API.
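The completion rule above (an upload is done once it disappears from the pending list) lends itself to a simple polling loop. This is a minimal sketch, not an official client: wait_until_processed is a hypothetical helper, list_pending_ids is any caller-supplied function that returns the IDs currently reported by GET /invoices/uploads, and the 30-second interval and 20-attempt cap are arbitrary assumptions rather than documented recommendations.

```python
import time
from typing import Callable, Iterable


def wait_until_processed(
    upload_id: str,
    list_pending_ids: Callable[[], Iterable[str]],
    poll_interval_seconds: float = 30.0,
    max_attempts: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until upload_id is absent from the pending list.

    Returns True once the ID is absent, or False if max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        if upload_id not in set(list_pending_ids()):
            return True
        # Skip the final sleep: there is no further attempt to wait for.
        if attempt < max_attempts - 1:
            sleep(poll_interval_seconds)
    return False
```

Two caveats apply to a loop like this. An upload in the split status waits on manual review, so it can stay in the list until someone confirms the page grouping. The loop also treats "not listed" as "processed", so it should only be started for an ID that has already been accepted by the upload endpoint.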

What you can do with extraction

  • Upload invoice files - Submit PDF, image, or scanned documents for extraction
  • Track upload progress - Monitor the status of documents in the extraction queue
  • Automate intake workflows - Build integrations that automatically submit invoices from email, vendor portals, or other sources