India OCR accuracy benchmark: how the major tools handle real GST bills
Every OCR tool claims high accuracy. None of them publish what "accuracy" means on the documents Indian CAs actually deal with: a four-column GST bill, a handwritten chit, a regional-language receipt. So we ran our own test, on our own bills, and scored four things that matter in this work rather than one number that doesn't.
This page is meant to be cited. If a row here is useful, quote it.
What we measured, and why one accuracy number is a lie
A single "98% accurate" headline hides where a tool fails. A model can read 98% of the characters on a page and still be useless to you if the 2% it drops are the GSTIN and the tax split, or if it returns everything as one scrambled line you have to re-sort. So we scored four separate things:
- Raw-text accuracy. Did it read the characters on the page, including regional script and handwriting?
- Structured GST-field accuracy. Did it correctly pull the fields you actually book: GSTIN, taxable value, CGST, SGST, total? This is the one that decides whether the tool saves you time.
- Tally-ready output. Can you get a row per bill out, in a shape you import into Tally or paste into Excel, or does it stop at a wall of text?
- India pricing. Is there a plan a solo CA can afford in rupees, or is it USD-per-page priced for a US enterprise?
The first two can diverge sharply. A tool can ace raw text and still fail structure because it flattens a table into one stream of text and loses which value belongs to which column (the linearization problem).
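To show how the first two scores can come apart, here is a minimal Python sketch. It is an illustration under our own assumptions, not necessarily how the figures on this page were computed: the function names, the field names, and the sample bill are all made up for the example.

```python
from collections import Counter


def token_recall(truth_text: str, ocr_text: str) -> float:
    """Share of ground-truth tokens present in the OCR output, ignoring order."""
    truth = Counter(truth_text.split())
    got = Counter(ocr_text.split())
    matched = sum(min(count, got[token]) for token, count in truth.items())
    return matched / sum(truth.values())


def field_accuracy(truth_fields: dict, extracted_fields: dict) -> float:
    """Share of ground-truth fields whose extracted value matches exactly."""
    correct = sum(
        1 for name, value in truth_fields.items()
        if extracted_fields.get(name) == value
    )
    return correct / len(truth_fields)


# Hypothetical bill: every figure here is illustrative, not a test result.
truth_text = "Taxable 1000.00 CGST 90.00 SGST 90.00 Total 1180.00"
truth_fields = {
    "taxable_value": "1000.00",
    "cgst": "90.00",
    "sgst": "90.00",
    "total": "1180.00",
}

# A linearized read: every token survived, but labels and amounts came apart,
# so two amounts ended up attached to the wrong field.
ocr_text = "Taxable CGST SGST Total 90.00 90.00 1000.00 1180.00"
extracted_fields = {
    "taxable_value": "90.00",
    "cgst": "90.00",
    "sgst": "1000.00",
    "total": "1180.00",
}

print(token_recall(truth_text, ocr_text))              # 1.0
print(field_accuracy(truth_fields, extracted_fields))  # 0.5
```

Every token was read, so the raw-text score is perfect, yet half the bookable fields are wrong. That is the divergence a single headline number hides.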
The test documents
We deliberately used hard, real inputs rather than a clean printed sample:
- A Hindi-language bill (our "Hindi.png" test asset). Devanagari header and line items, mixed with Latin-character figures.
- A handwritten invoice (a Ghana commercial invoice, dense handwriting). Stands in for the handwritten kirana chit: hard, real handwriting with names and numbers.
These are the documents that separate tools. Anything reads a clean printed English PDF.
Two kinds of cell in the table below, kept apart on purpose
This page mixes two grades of evidence, and we label which is which because pretending they're the same is how benchmarks lie.
- Measured (only our own model). Numbers we got by running the two test documents through the engine we use and checking the output by hand against the source bill. These are the rows with percentages.
- Positioning-verified (every competitor). What the tool charges, what languages it claims, whether it has any India-facing presence, what its output shape is. This comes from the tool's own site and pricing page, not from us feeding it our bills. We have not run our two documents through a competitor's engine, so we publish no competitor accuracy score. None. A pricing figure we can check; a head-to-head accuracy decimal we did not run, so we don't print one.
If you want a competitor's accuracy on your worst bill, the honest answer is the same one we give about ourselves: upload it and look. That's the whole reason this category needs a benchmark that admits what it didn't test.
Results
| Tool | What it is | Raw-text on messy Indic/handwritten | Structured GST fields | Output shape | India pricing |
|---|---|---|---|---|---|
| MessyDocs (our model) [measured] | Purpose-built OCR for Indian bills | Hindi bill read near-complete (~100% on our Hindi.png asset); Ghana handwritten invoice high-90s. Both checked by hand. | Pulls GSTIN, taxable value, CGST/SGST as fields | Row-per-bill export aimed at Excel/Tally import | Rupee pricing, built for solo CAs |
| HandwritingOCR.com [positioning-verified] | Early-entrant OCR with broad document coverage | Lists several Indic languages; we have not run our bills through it | General field extraction, no India GST/HSN schema in its pitch | Structured text/export | USD and GBP pricing, no rupee plan; no India-facing or CA content we could find |
| IndicLedger.in [positioning-verified] | Live Indian competitor on this exact wedge | Pitches handwritten and thermal, Hindi/Marathi/Hinglish. We have not run our bills through it. | Pitches vendor + GSTIN + taxable value + CGST/SGST | "→ Excel" is its stated output | India-focused; pricing not the blocker, distribution is |
| Invoicing-SaaS tools with built-in capture (e.g. TrulyInvoice) [positioning-verified] | Invoicing/voucher product, OCR is a feature | Built for printed/PDF invoices, not the headline on handwritten or regional script | Strong GST-field structure for printed input | Tally/voucher output is the core | Rupee pricing, usually trial-then-paid (check the live plan) |
| Plain-text OCR utilities (free Indic OCR pages) [positioning-verified] | Free, single-purpose text extractors | Can read regional script and hand back the characters | None: plain text only, no field structure | Text blob, you structure it yourself | Free, but no structure and no India workflow |
| Western document-AI APIs (e.g. Nanonets, US extractors) [positioning-verified] | Developer document-extraction APIs | Strong general OCR architecture; not tuned or marketed for Indic scripts | Generic field extraction, no GST/HSN schema out of the box | Structured JSON, not Tally-shaped | USD per-page or high monthly minimums, steep for a solo CA; no rupee plan |
| Enterprise GST suites (e.g. ClearTax capture) [positioning-verified] | Compliance suite, capture is one module | Built for printed/e-invoice input | Strong GST fields, reconciliation focus | Feeds its own compliance flow | Enterprise/India pricing, often demo-gated for the capture piece |
A blunt note on the competitor rows: every one is positioning-verified, not accuracy-measured. We read their sites, checked their pricing and language claims, and noted whether they show up for an Indian CA at all. We did not run our Hindi and handwritten documents through their engines, so we have not earned the right to print an accuracy number for them, and we won't. Treat their cells as "here's what the tool says it is and what it costs," and treat only our percentages as a measured result you can hold us to.
What the numbers actually tell you
Three takeaways stand out from running this.
The gap is structure, not reading. Most tools can read most of the characters. Where they split is whether they hold a four-column GST bill together as rows and hand you bookable fields. That's the line between saving time and creating a re-sorting chore.
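To make "rows and bookable fields" concrete, here is a minimal Python sketch of a row-per-bill export. The column names and the sample values are assumptions made up for illustration; they are not Tally's import schema and not our product's actual output format.

```python
import csv
import io

# Hypothetical column set; rename or reorder to match your own import template.
COLUMNS = ["vendor", "gstin", "taxable_value", "cgst", "sgst", "total"]


def bills_to_csv(bills: list) -> str:
    """Write one CSV row per bill; fields a tool failed to extract stay blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for bill in bills:
        writer.writerow({column: bill.get(column, "") for column in COLUMNS})
    return buffer.getvalue()


# Illustrative bills only: the vendor names and GSTIN-shaped strings are invented.
sample_bills = [
    {
        "vendor": "Sharma Traders",
        "gstin": "29ABCDE1234F1ZW",
        "taxable_value": "1000.00",
        "cgst": "90.00",
        "sgst": "90.00",
        "total": "1180.00",
    },
    {"vendor": "Kirana chit", "total": "250.00"},  # partial extraction
]

print(bills_to_csv(sample_bills))
```

Leaving a missing field blank, rather than guessing, keeps the gaps visible when the sheet is opened in Excel.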
Handwriting is solved enough to use, not enough to trust. Our handwritten test came back in the high-90s. That's good, and it's also why you still check the GSTIN and the total by eye: the misses are confident and plausible, not blank.
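Part of that check can be automated. The sketch below is a Python illustration under stated assumptions, not a feature of any tool in the table. It assumes the regular-taxpayer GSTIN layout and the commonly documented Luhn mod-36 check character, and it assumes an intra-state bill where total = taxable value + CGST + SGST. The function names are ours, and the GSTIN body is made up.

```python
import re
from decimal import Decimal

GSTIN_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# Assumed layout: 2-digit state code, 10-character PAN, entity code, "Z", check character.
GSTIN_PATTERN = re.compile(r"^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]$")


def gstin_check_char(first_14: str) -> str:
    """Check character for the first 14 characters (Luhn mod-36, assumed scheme)."""
    total = 0
    factor = 2  # weights alternate 2, 1, 2, ... starting from the rightmost character
    for ch in reversed(first_14):
        product = GSTIN_CHARS.index(ch) * factor
        total += product // 36 + product % 36
        factor = 1 if factor == 2 else 2
    return GSTIN_CHARS[(36 - total % 36) % 36]


def gstin_looks_valid(gstin: str) -> bool:
    """True if the string has the assumed layout and its check character matches."""
    gstin = gstin.strip().upper()
    if not GSTIN_PATTERN.match(gstin):
        return False
    return gstin_check_char(gstin[:14]) == gstin[14]


def totals_add_up(taxable, cgst, sgst, total, tolerance="0.01") -> bool:
    """True if taxable + CGST + SGST matches the printed total within the tolerance."""
    computed = Decimal(taxable) + Decimal(cgst) + Decimal(sgst)
    return abs(computed - Decimal(total)) <= Decimal(tolerance)


body = "29ABCDE1234F1Z"  # made-up value, not a real registration
gstin = body + gstin_check_char(body)
print(gstin_looks_valid(gstin))                                  # True
print(gstin_looks_valid(gstin.replace("3", "8", 1)))             # False: one misread digit
print(totals_add_up("1000.00", "90.00", "90.00", "1180.00"))     # True
print(totals_add_up("1000.00", "90.00", "90.00", "1130.00"))     # False: total misread
```

A passing check does not prove the read is right; it only catches misreads that break the checksum or the arithmetic. A plausible but wrong vendor name, or a GSTIN copied correctly from the wrong party, still needs your eye.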
Pricing is a real filter for Indian CAs. A tool can be technically excellent and still be wrong for a solo practitioner because it's priced per page in dollars. The rupee question isn't a footnote; for this audience it decides adoption.
One more thing the table doesn't capture, because it isn't an accuracy or price question: where your data goes. The consumer versions of ChatGPT and Gemini, the obvious free route, can use what you paste to train their models unless you change the default settings, so a client's bill may stop being private the moment you drop it in. For a CA choosing where to send confidential documents, that belongs on the checklist next to accuracy and rupee pricing.
How we tested, and what we did not test
This is the part that makes the page defensible, so read it before you quote a row.
What we tested (measured). Two documents, run through the engine we use on its standard path, no tuned pipeline: our Hindi.png asset and a Ghana commercial invoice with dense handwriting. Every GST field in our own row was checked by hand against the source bill. Those percentages are ours and you can hold us to them.
What we did not test (and won't fake). We did not feed those two documents to any competitor's engine. So there is no competitor accuracy score anywhere on this page, by design. The competitor cells are positioning-verified only: pricing taken from the tool's own pages, language and feature claims taken from how the tool markets itself, India presence judged by whether an Indian CA would find any India-facing or CA-specific content. That's a real, checkable layer of comparison, and it's the layer that actually decides adoption for a solo practitioner (a tool priced per page in dollars is wrong for this audience no matter how good its OCR is). It is not a head-to-head accuracy result, and we don't dress it up as one.
Why publish our own product in the table at all. Hiding it would be more dishonest than disclosing it. Judge our percentages the way you'd judge anyone's, and re-run your own worst bill before you decide. If you get a result that contradicts ours, tell us, we'll re-test, and we'll update the date at the top.
For the workflow that sits on top of these results, see the messy-bills guide and the Indian-language extraction guide. For the two documents that separate the tools most, we went deeper: Hindi invoice to Excel and thermal receipt to Excel.