Frequently asked questions

15 questions about Image Table to Excel OCR.

Q1. Are images really not uploaded?
Yes. The site is static S3 hosting, so there is no upload endpoint to send anything to. Open DevTools → Network, run OCR, and you will see no request that carries your image. After the page first loads and the language models are cached, you can even go offline.
Q2. Which image formats are supported?
PNG · JPG · WEBP are the officially supported formats. Most other browser-readable formats (GIF, BMP) also work, but we only test those three. PDF is not supported yet; export the PDF page to PNG and load that instead.
Q3. How accurate is Korean recognition?
tesseract.js Korean (kor) reaches 95%+ on clean printed tables, and roughly 80–90% on typical web/app screenshots. Handwritten or low-quality scans can drop below 60%. Tables heavy in numbers/Latin letters tend to score even higher because the English model also runs.
Q4. Why not use CLOVA OCR or Google Vision?
Those APIs score higher but (1) require paid plans and (2) ship your image to their servers. This tool prioritises browser-only, free usage, so tesseract.js is the default. Need CLOVA-grade accuracy on-prem? Get in touch.
Q5. How big an image can I upload?
In practice, keep images under ~5MB and ~3000px on the long edge. Beyond that, the browser tab may run short of memory. Cropping to just the table region is faster and more accurate anyway.
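As a rough illustration of those limits, here is a minimal TypeScript sketch of a pre-flight size check. The names (checkImageSize, MAX_BYTES, MAX_LONG_EDGE_PX) are hypothetical and the constants simply restate the guidance above; this is not the tool's actual code.

```typescript
// Hypothetical limits that restate the FAQ guidance; not the tool's real constants.
const MAX_BYTES = 5 * 1024 * 1024;
const MAX_LONG_EDGE_PX = 3000;

interface ImageInfo {
  bytes: number;
  width: number;
  height: number;
}

interface SizeCheck {
  ok: boolean;       // true when the image is within both limits
  scale: number;     // factor that brings the long edge down to the limit (1 = no resize)
  reasons: string[]; // human-readable list of exceeded limits
}

function checkImageSize(img: ImageInfo): SizeCheck {
  const reasons: string[] = [];
  const longEdge = Math.max(img.width, img.height);
  if (img.bytes > MAX_BYTES) reasons.push("file larger than ~5MB");
  if (longEdge > MAX_LONG_EDGE_PX) reasons.push("long edge over ~3000px");
  const scale = longEdge > MAX_LONG_EDGE_PX ? MAX_LONG_EDGE_PX / longEdge : 1;
  return { ok: reasons.length === 0, scale, reasons };
}
```

For example, a 6000×3000 image exceeds the long-edge guideline and gets a suggested scale of 0.5, while a 2000×1000 image of 1MB passes as-is.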
Q6. Why is the first run slow?
We download the Korean (~12MB) and English (~8MB) tesseract language models once from a CDN and cache them in the browser (IndexedDB). Later runs in the same browser start immediately.
Q7. Rows / columns split incorrectly — how do I fix it?
Double-click any cell in the preview to edit it inline. To drop a whole row, tick its checkbox and click "Delete selected rows". Missing rows can be added with "Add row". For heavy column restructuring, export to XLSX first and edit in Excel.
Q8. Korean "원" or "%" symbols keep garbling.
We post-correct patterns like 12,345 원 into 12,345원. But if tesseract already mis-read the glyph, clean-up can't recover it — raise the resolution or fix the cell manually.
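The kind of post-correction described above can be sketched as a single regular-expression pass. This is only an illustration: the function name tidyUnits is hypothetical, and applying the same rule to "%" is an assumption based on the question, not a confirmed behaviour of the tool.

```typescript
// Illustrative only: removes whitespace between a digit and a trailing "원" or "%".
// The real clean-up rules used by the tool may differ.
function tidyUnits(text: string): string {
  return text.replace(/(\d)\s+(원|%)/g, "$1$2");
}

tidyUnits("12,345 원"); // "12,345원"
```

Note that a pass like this only closes up spacing. It cannot repair a glyph that OCR has already read as a different character.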
Q9. How are merged cells handled?
We only see pixels, not merge metadata. Text spanning a merged range typically lands in the leftmost column of that range. Split or move cells in the preview as needed, or re-merge in Excel after exporting.
Q10. Can I paste the result into Google Sheets?
Most reliable: download XLSX and File → Import in Google Sheets. CSV export has a UTF-8 BOM so Korean shows up cleanly in both Sheets and Excel. A "copy to clipboard" button is on the roadmap.
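To show what a UTF-8 BOM on a CSV export means in practice, here is a minimal sketch. The helper names (escapeCsvField, toCsvWithBom) are hypothetical and the quoting follows common CSV conventions; the tool's own exporter may be written differently.

```typescript
// U+FEFF at the start of the text becomes the bytes EF BB BF once encoded as UTF-8.
const UTF8_BOM = "\uFEFF";

// Quote a field if it contains a comma, a double quote, or a line break,
// and double any embedded quotes.
function escapeCsvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Join rows with CRLF and prepend the BOM so spreadsheet apps detect UTF-8.
function toCsvWithBom(rows: string[][]): string {
  const body = rows
    .map((row) => row.map(escapeCsvField).join(","))
    .join("\r\n");
  return UTF8_BOM + body;
}
```

Without the BOM, Excel in particular may guess a legacy encoding and show Korean text as garbled characters, which is why the marker is worth the three extra bytes.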
Q11. Can you batch-convert multi-page PDFs?
The MVP handles single images only. Multi-page PDF batch OCR is on the roadmap. For now, export each PDF page to PNG (e.g. Preview.app on macOS, or any PDF viewer) and process them one by one.
Q12. Does it work on mobile?
Runs on iOS Safari 16+ and Android Chrome. Mobile memory/CPU limits make it best suited for table images up to ~2MP. Use a desktop for larger images.
Q13. Do you log or store anything?
We use GA4 and Naver Analytics for page-view counts only. No image contents or recognised text ever leave your browser, because nothing is transmitted to begin with. localStorage only stores your language preference; IndexedDB only caches the tesseract language models.
Q14. My company blocks file uploads. Is this tool safe to use?
The site is pure static hosting (S3 + CloudFront) — there is no backend to receive anything. Parsing and OCR happen inside your browser in the tesseract.js WebAssembly runtime. Some enterprises restrict even third-party JS, so check with your security team. On-prem builds are available on request.
Q15. Does handwritten text work?
The tesseract.js models we ship are print-optimised; handwriting accuracy is low (30–50%). Cropping one row at a time helps a little, but true handwriting OCR is a specialised service (e.g. CLOVA OCR) beyond this tool's scope.