India OCR accuracy benchmark: how the major tools handle real GST bills

By MessyDocs team · Updated 23 May 2026

Every OCR tool claims high accuracy. None of them publish what "accuracy" means on the documents Indian CAs actually deal with: a four-column GST bill, a handwritten chit, a regional-language receipt. So we ran our own test, on our own bills, and scored four things that matter in this work rather than one number that doesn't.

This page is meant to be cited. If a row here is useful, quote it.

What we measured, and why one accuracy number is a lie

A single "98% accurate" headline hides where a tool fails. A model can read 98% of the characters on a page and still be useless to you if the 2% it drops are the GSTIN and the tax split, or if it returns everything as one scrambled line you have to re-sort. So we scored four separate things:

Raw-text accuracy. Did it read the characters on the page, including regional script and handwriting?
Structured GST-field accuracy. Did it correctly pull the fields you actually book: GSTIN, taxable value, CGST, SGST, total? This is the one that decides whether the tool saves you time.
Tally-ready output. Can you get a row per bill out, in a shape you import into Tally or paste into Excel, or does it stop at a wall of text?
India pricing. Is there a plan a solo CA can afford in rupees, or is it USD-per-page priced for a US enterprise?

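The first two can diverge sharply. A tool can ace raw text and fail structure because it can't hold table columns together: the table gets flattened into one stream of text and each figure loses the row and column it belonged to (the linearization problem).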
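To make the divergence concrete, here is a minimal sketch of the two scores side by side. It is not our scoring harness: the bill, the GSTIN, the field names and both functions are invented for illustration, and the character score uses Python's difflib as a stand-in for whatever alignment method you prefer.

```python
from difflib import SequenceMatcher


def char_accuracy(truth: str, ocr: str) -> float:
    """Share of ground-truth characters that difflib can align with the OCR text."""
    if not truth:
        return 1.0
    matcher = SequenceMatcher(None, truth, ocr, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(truth)


def field_accuracy(truth: dict, extracted: dict) -> float:
    """Share of bookable fields whose extracted value exactly equals the truth."""
    correct = sum(1 for key, value in truth.items() if extracted.get(key) == value)
    return correct / len(truth)


# Hypothetical bill, invented for this example (not a real GSTIN or a real result).
truth_fields = {
    "gstin": "27ABCDE1234F1Z5",
    "taxable_value": "1000.00",
    "cgst": "90.00",
    "sgst": "90.00",
    "total": "1180.00",
}
truth_text = "GSTIN 27ABCDE1234F1Z5 Taxable 1000.00 CGST 90.00 SGST 90.00 Total 1180.00"

# Simulated OCR output with two single-character misreads:
# the last GSTIN character (5 read as S) and one digit of the total (8 read as 6).
ocr_text = "GSTIN 27ABCDE1234F1ZS Taxable 1000.00 CGST 90.00 SGST 90.00 Total 1160.00"
ocr_fields = dict(truth_fields, gstin="27ABCDE1234F1ZS", total="1160.00")

print(f"raw-text accuracy: {char_accuracy(truth_text, ocr_text):.1%}")
print(f"field accuracy:    {field_accuracy(truth_fields, ocr_fields):.1%}")
```

Two wrong characters barely dent the raw-text score, yet two of the five fields you would book are now wrong. That is the gap a single headline number hides.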

The test documents

We deliberately used hard, real inputs rather than a clean printed sample:

A Hindi-language bill (our "Hindi.png" test asset). Devanagari header and line items, mixed with Latin-character figures.
A handwritten invoice (a Ghana commercial invoice, dense handwriting). Stands in for the handwritten kirana chit: hard, real handwriting with names and numbers.

These are the documents that separate tools. Anything reads a clean printed English PDF.

Two kinds of cell in the table below, kept apart on purpose

This page mixes two grades of evidence, and we label which is which because pretending they're the same is how benchmarks lie.

Measured (only our own model). Numbers we got by running the two test documents through the engine we use and checking the output by hand against the source bill. These are the rows with percentages.
Positioning-verified (every competitor). What the tool charges, what languages it claims, whether it has any India-facing presence, what its output shape is. This comes from the tool's own site and pricing page, not from us feeding it our bills. We have not run our two documents through a competitor's engine, so we publish no competitor accuracy score. None. A pricing figure is something we can check; a head-to-head accuracy figure is something we did not run, so we don't print one.

If you want a competitor's accuracy on your worst bill, the honest answer is the same one we give about ourselves: upload it and look. That's the whole reason this category needs a benchmark that admits what it didn't test.

Results

Tool	What it is	Raw-text on messy Indic/handwritten	Structured GST fields	Output shape	India pricing
MessyDocs (our model) [measured]	Purpose-built OCR for Indian bills	Hindi bill read near-complete (~100% on our Hindi.png asset); Ghana handwritten invoice high-90s. Both checked by hand.	Pulls GSTIN, taxable value, CGST/SGST as fields	Row-per-bill export aimed at Excel/Tally import	Rupee pricing, built for solo CAs
HandwritingOCR.com [positioning-verified]	Early-entrant OCR with broad document coverage	Lists several Indic languages; we have not run our bills through it	General field extraction, no India GST/HSN schema in its pitch	Structured text/export	USD and GBP pricing, no rupee plan; no India-facing or CA content we could find
IndicLedger.in [positioning-verified]	Live Indian competitor on this exact wedge	Pitches handwritten and thermal, Hindi/Marathi/Hinglish. We have not run our bills through it.	Pitches vendor + GSTIN + taxable value + CGST/SGST	"→ Excel" is its stated output	India-focused; pricing not the blocker, distribution is
Invoicing-SaaS tools with built-in capture (e.g. TrulyInvoice) [positioning-verified]	Invoicing/voucher product, OCR is a feature	Built for printed/PDF invoices, not the headline on handwritten or regional script	Strong GST-field structure for printed input	Tally/voucher output is the core	Rupee pricing, usually trial-then-paid (check the live plan)
Plain-text OCR utilities (free Indic OCR pages) [positioning-verified]	Free, single-purpose text extractors	Can read regional script and hand back the characters	None: plain text only, no field structure	Text blob, you structure it yourself	Free, but no structure and no India workflow
Western document-AI APIs (e.g. Nanonets, US extractors) [positioning-verified]	Developer document-extraction APIs	Strong general OCR architecture; not tuned or marketed for Indic scripts	Generic field extraction, no GST/HSN schema out of the box	Structured JSON, not Tally-shaped	USD per-page or high monthly minimums, steep for a solo CA; no rupee plan
Enterprise GST suites (e.g. ClearTax capture) [positioning-verified]	Compliance suite, capture is one module	Built for printed/e-invoice input	Strong GST fields, reconciliation focus	Feeds its own compliance flow	Enterprise/India pricing, often demo-gated for the capture piece

A blunt note on the competitor rows: every one is positioning-verified, not accuracy-measured. We read their sites, checked their pricing and language claims, and noted whether they show up for an Indian CA at all. We did not run our Hindi and handwritten documents through their engines, so we have not earned the right to print an accuracy number for them, and we won't. Treat their cells as "here's what the tool says it is and what it costs," and treat only our percentages as a measured result you can hold us to.

What the numbers actually tell you

Three takeaways stand out from running this.

The gap is structure, not reading. Most tools can read most of the characters. Where they split is whether they hold a four-column GST bill together as rows and hand you bookable fields. That's the line between saving time and creating a re-sorting chore.
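As a rough picture of what "rows and bookable fields" means in practice, the sketch below writes one CSV row per bill using Python's standard csv module. The column names and the sample bills are our own illustration; this is not Tally's import schema or any tool's actual export format, so map the columns to whatever your import template expects.

```python
import csv
import io

# Hypothetical column set, chosen for illustration only.
COLUMNS = ["bill_no", "gstin", "taxable_value", "cgst", "sgst", "total"]


def bills_to_csv(bills: list[dict]) -> str:
    """Return CSV text with a header line and exactly one row per bill."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for bill in bills:
        # A field the extractor missed stays as an empty cell rather than
        # shifting the other columns out of place.
        writer.writerow({column: bill.get(column, "") for column in COLUMNS})
    return buffer.getvalue()


# Invented sample data, not real bills.
sample_bills = [
    {"bill_no": "A-101", "gstin": "27ABCDE1234F1Z5", "taxable_value": "1000.00",
     "cgst": "90.00", "sgst": "90.00", "total": "1180.00"},
    {"bill_no": "A-102", "gstin": "27ABCDE1234F1Z5", "taxable_value": "500.00",
     "cgst": "45.00", "total": "590.00"},  # SGST missing on purpose
]

print(bills_to_csv(sample_bills))
```

The point of the shape is that a missing value is visible as a blank cell in the right column, which is far quicker to spot and fix than a figure buried in a block of text.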

Handwriting is solved enough to use, not enough to trust. Our handwritten test came back in the high-90s. That's good, and it's also why you still check the GSTIN and the total by eye: the misses are confident and plausible, not blank.
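A couple of mechanical checks can narrow down which bills need the closest look, though they do not replace checking by eye. The sketch below is an assumption-laden illustration, not part of our pipeline: the function and field names are hypothetical, the GSTIN pattern checks only the 15-character layout (not the check character or whether the number is registered), and the arithmetic assumes a simple intra-state bill with no cess or round-off.

```python
import re
from decimal import Decimal

# Layout only: 2-digit state code, 10-character PAN, entity code, "Z", check character.
GSTIN_LAYOUT = re.compile(r"[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]")


def flag_for_review(bill: dict, tolerance: Decimal = Decimal("0.01")) -> list[str]:
    """Return human-readable reasons this extracted bill needs a manual look."""
    flags = []
    if not GSTIN_LAYOUT.fullmatch(bill["gstin"]):
        flags.append("GSTIN does not match the 15-character layout")
    taxable, cgst, sgst, total = (
        Decimal(bill[key]) for key in ("taxable_value", "cgst", "sgst", "total")
    )
    # Assumes total = taxable value + CGST + SGST (no cess, no round-off line).
    if abs(taxable + cgst + sgst - total) > tolerance:
        flags.append("taxable value + CGST + SGST does not equal total")
    # On an intra-state bill the CGST and SGST amounts are normally equal.
    if abs(cgst - sgst) > tolerance:
        flags.append("CGST and SGST differ")
    return flags


# Invented example: one digit of the total misread (1180.00 read as 1160.00).
extracted = {
    "gstin": "27ABCDE1234F1Z5",
    "taxable_value": "1000.00",
    "cgst": "90.00",
    "sgst": "90.00",
    "total": "1160.00",
}
print(flag_for_review(extracted))
```

Note the limit: a misread that still fits the layout, such as a 5 read as S in the last GSTIN position, passes the pattern check. An empty list means "nothing obviously inconsistent", not "verified".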

Pricing is a real filter for Indian CAs. A tool can be technically excellent and still be wrong for a solo practitioner because it's priced per page in dollars. The rupee question isn't a footnote; for this audience it decides adoption.

One more thing the table doesn't capture, because it isn't an accuracy or price question: where your data goes. The consumer versions of ChatGPT and Gemini, the obvious free route, can use what you paste to train their models unless you change the default settings, so a client's bill may stop being private the moment you drop it in. For a CA choosing where to send confidential documents, that belongs on the checklist next to accuracy and rupee pricing.

How we tested, and what we did not test

This is the part that makes the page defensible, so read it before you quote a row.

What we tested (measured). Two documents, run through the engine we use on its standard path, no tuned pipeline: our Hindi.png asset and a Ghana commercial invoice with dense handwriting. Every GST field in our own row was checked by hand against the source bill. Those percentages are ours and you can hold us to them.

What we did not test (and won't fake). We did not feed those two documents to any competitor's engine. So there is no competitor accuracy score anywhere on this page, by design. The competitor cells are positioning-verified only: pricing taken from the tool's own pages, language and feature claims taken from how the tool markets itself, India presence judged by whether an Indian CA would find any India-facing or CA-specific content. That's a real, checkable layer of comparison, and it's the layer that actually decides adoption for a solo practitioner (a tool priced per page in dollars is wrong for this audience no matter how good its OCR is). It is not a head-to-head accuracy result, and we don't dress it up as one.

Why publish our own product in the table at all. Hiding it would be more dishonest than disclosing it. Judge our percentages the way you'd judge anyone's, and re-run your own worst bill before you decide. If you get a result that contradicts ours, tell us: we'll re-test and update the date at the top.

For the workflow that sits on top of these results, see the messy-bills guide and the Indian-language extraction guide. For the two documents that separate the tools most, we went deeper: Hindi invoice to Excel and thermal receipt to Excel. Bank statements put the same engine through a different test: what QuickBooks and Zoho miss on Indian bank statements walks the import gaps, and why run-to-run reproducibility is non-negotiable for a CA explains why a figure you cannot reproduce is a figure you cannot book.