MessyDocs

How we test OCR accuracy

Every OCR tool claims high accuracy and almost none of them say what they tested it on. This page is the method behind our India OCR accuracy benchmark: the documents we ran, what we did and did not measure, and how we score each column. Read it before you quote a number from us.

The test documents

We deliberately use hard, real inputs rather than a clean printed English PDF, because anything reads a clean printed PDF. Three documents do the separating:

Measured versus positioning-verified

Our benchmark mixes two grades of evidence, and we label every cell as one or the other because pretending they are the same is how benchmarks lie.

Measured means we ran the three documents above through the engine we use, on its standard path with no pipeline tuned for these documents, and checked the output by hand against the source document. Only our own product carries measured numbers, and you can hold us to them.

Positioning-verified means we read the tool's own site and pricing page: what it charges, which languages it claims, what shape its output takes, and whether it offers any India-facing content an Indian accountant could use. We have not fed our three documents to any competitor engine, so we publish no competitor accuracy score anywhere. A pricing figure is something we can check; a head-to-head accuracy figure is something we did not run, so we do not print one. If you want a competitor's accuracy on your worst bill, the honest answer is the same one we give about ourselves: upload it and look.
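
One way to keep the two grades from blurring is to make the label part of the cell itself. The sketch below is illustrative only, not our benchmark code: the names Evidence and Cell, the tool name "CompetitorX", and the cell text are all assumptions made up for the example. It shows a cell that refuses to hold an accuracy score unless it is marked as measured.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Evidence(Enum):
    """The two grades of evidence described above."""
    MEASURED = "measured"
    POSITIONING_VERIFIED = "positioning-verified"


@dataclass(frozen=True)
class Cell:
    """One cell of a benchmark table (hypothetical structure)."""
    tool: str
    column: str
    evidence: Evidence
    value: str                        # what the cell displays
    accuracy: Optional[float] = None  # fraction in [0, 1], measured cells only

    def __post_init__(self) -> None:
        # An accuracy score is only allowed where documents were actually run.
        if self.evidence is Evidence.POSITIONING_VERIFIED and self.accuracy is not None:
            raise ValueError(
                "positioning-verified cells must not carry an accuracy score"
            )


# A positioning-verified cell with no accuracy number is accepted.
pricing = Cell(
    tool="CompetitorX",               # made-up tool name
    column="India pricing",
    evidence=Evidence.POSITIONING_VERIFIED,
    value="taken from the tool's own pricing page",
)
print(pricing.evidence.value)
```

Trying to build a positioning-verified cell with an accuracy value raises a ValueError, so a number nobody measured cannot reach the table by accident.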

How we score the four columns

A single accuracy headline hides where a tool fails. A model can read 98% of the characters on a page and still be useless if the 2% it drops are the GSTIN and the tax split. So we score four separate things:

  1. Raw-text accuracy. Did it read the characters on the page, including regional script and handwriting? Checked character by character against the source.
  2. Structured GST-field accuracy. Did it correctly pull the fields you book: GSTIN, taxable value, CGST, SGST, total? This is the column that decides whether the tool saves time. We check each field by hand against the bill.
  3. Output shape. Can you get a row per bill out, in a shape you import into Tally or paste into Excel, or does it stop at a wall of text?
  4. India pricing. Is there a plan a solo accountant can afford in rupees, or is it priced per page in dollars for a US enterprise? Taken from the tool's own pricing page.
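
To make the gap between columns 1 and 2 concrete, here is a minimal sketch of how the two scores could be computed. It is an assumption for illustration, not our scoring code: we check by hand, and the function names, the edit-distance definition of character accuracy, and the sample bill text (including the GSTIN, which is made up) are all hypothetical.

```python
from decimal import Decimal, InvalidOperation


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def char_accuracy(source: str, ocr: str) -> float:
    """Column 1: share of source characters read correctly (assumed definition)."""
    if not source:
        return 1.0 if not ocr else 0.0
    return max(0.0, 1.0 - levenshtein(source, ocr) / len(source))


GST_FIELDS = ("gstin", "taxable_value", "cgst", "sgst", "total")


def field_matches(field: str, truth, extracted) -> bool:
    """Compare one field: GSTIN as text, every other field as an amount."""
    if extracted is None:
        return False
    if field == "gstin":
        return "".join(str(truth).split()).upper() == "".join(str(extracted).split()).upper()
    try:
        return (Decimal(str(truth).replace(",", "").strip())
                == Decimal(str(extracted).replace(",", "").strip()))
    except InvalidOperation:
        return False


def field_accuracy(truth: dict, extracted: dict) -> float:
    """Column 2: share of the five GST fields extracted correctly."""
    hits = sum(field_matches(f, truth[f], extracted.get(f)) for f in GST_FIELDS)
    return hits / len(GST_FIELDS)


# Made-up bill text: the OCR output differs from the source in two characters.
source = "GSTIN 27ABCDE1234F1Z5 TAXABLE 1000.00 CGST 90.00 SGST 90.00 TOTAL 1180.00"
ocr = "GSTIN 27ABCDE1234F1Z6 TAXABLE 1000.00 CGST 90.00 SGST 80.00 TOTAL 1180.00"

truth = {"gstin": "27ABCDE1234F1Z5", "taxable_value": "1000.00",
         "cgst": "90.00", "sgst": "90.00", "total": "1180.00"}
extracted = {"gstin": "27ABCDE1234F1Z6", "taxable_value": "1000.00",
             "cgst": "90.00", "sgst": "80.00", "total": "1180.00"}

print(f"raw-text accuracy:  {char_accuracy(source, ocr):.1%}")
print(f"GST-field accuracy: {field_accuracy(truth, extracted):.1%}")
```

In this made-up case two wrong characters leave raw-text accuracy above 97%, yet only three of the five fields you would book are correct, because the two errors landed in the GSTIN and the SGST amount.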

What we will not fake

We include our own product in the table because leaving it out would be less honest than disclosing it. We score competitors only on positioning, never on an accuracy number we did not earn. When we re-test or add a tool, we update the date at the top of the benchmark.