Frequently asked questions
15 questions about Image Table to Excel OCR.
- Q1. Are images really not uploaded?
- Yes. The site is static S3 hosting, so there is no upload endpoint to send anything to. Open DevTools → Network, run OCR, and you will see no request carrying your image. After the page has loaded once and the language models are cached, you can even go offline.
- Q2. Which image formats are supported?
- PNG · JPG · WEBP are the officially supported formats. Most browser-readable formats (GIF, BMP) also work, but we only test the three. PDF is not supported yet — export the PDF page to PNG and upload that.
- Q3. How accurate is Korean recognition?
- tesseract.js Korean (kor) reaches 95%+ on clean printed tables, and roughly 80–90% on typical web/app screenshots. Handwritten or low-quality scans can drop below 60%. Tables heavy in numbers/Latin letters tend to score even higher because the English model also runs.
- Q4. Why not use CLOVA OCR or Google Vision?
- Those APIs score higher but (1) require paid plans and (2) ship your image to their servers. This tool prioritises browser-only, free usage, so tesseract.js is the default. Need CLOVA-grade accuracy on-prem? Get in touch.
- Q5. How big an image can I upload?
- In practice, keep images under ~5MB and ~3000px on the long edge. Beyond that, the browser tab may run short of memory. Cropping to just the table region is faster and more accurate anyway.
- Q6. Why is the first run slow?
- We download the Korean (~12MB) and English (~8MB) tesseract language models once from a CDN and cache them in the browser (IndexedDB). Later runs in the same browser start immediately.
- Q7. Rows / columns split incorrectly — how do I fix it?
- Double-click any cell in the preview to edit inline. To drop a whole row, tick its checkbox and click "Delete selected rows". Missing rows can be added with "Add row". For heavy column restructuring, export to XLSX first and edit in Excel.
- Q8. Korean "원" or "%" symbols keep getting garbled.
- We post-correct patterns like "12,345 원" into "12,345원". But if tesseract already mis-read the glyph, clean-up can't recover it; raise the resolution or fix the cell manually.
- Q9. How are merged cells handled?
- We only see pixels, not merge metadata. Text spanning a merged range typically lands in the leftmost column of that range. Split or move cells in the preview as needed, or re-merge in Excel after exporting.
- Q10. Can I paste the result into Google Sheets?
- Most reliable: download XLSX and File → Import in Google Sheets. CSV export has a UTF-8 BOM so Korean shows up cleanly in both Sheets and Excel. A "copy to clipboard" button is on the roadmap.
- Q11. Can you batch-convert multi-page PDFs?
- The MVP handles single images only. Multi-page PDF batch OCR is on the roadmap. For now, export each PDF page to PNG (e.g. Preview.app on macOS, or any PDF viewer) and process them one by one.
- Q12. Does it work on mobile?
- Runs on iOS Safari 16+ and Android Chrome. Mobile memory/CPU limits make it best suited for table images up to ~2MP. Use a desktop for larger images.
- Q13. Do you log or store anything?
- We use GA4 and Naver Analytics for page-view counts only. No image contents or recognised text ever leave your browser — nothing is transmitted to begin with. localStorage only stores your language preference; IndexedDB only caches the tesseract language model.
- Q14. My company blocks file uploads. Is this tool safe to use?
- The site is pure static hosting (S3 + CloudFront) — there is no backend to receive anything. Parsing and OCR happen inside your browser in the tesseract.js WebAssembly runtime. Some enterprises restrict even third-party JS, so check with your security team. On-prem builds are available on request.
- Q15. Does handwritten text work?
- The tesseract.js models we ship are print-optimised; handwriting accuracy is low (30–50%). Cropping one row at a time helps a little, but true handwriting OCR is a specialised service (e.g. CLOVA OCR) beyond this tool's scope.
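
On Q5: if cropping is not an option, one way to stay under the ~3000px guideline is to shrink the image before running OCR. The sketch below only computes the target dimensions. The names fitLongEdge and MAX_LONG_EDGE_PX are hypothetical and are not part of this tool; the actual resizing (for example by drawing onto a canvas) is left out.

```typescript
// Soft limit taken from the Q5 guidance (~3000px on the long edge).
const MAX_LONG_EDGE_PX = 3000;

interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: scale a size down so its long edge fits the limit,
// keeping the aspect ratio. Sizes already within the limit are returned as-is.
function fitLongEdge(size: Size, maxLongEdge: number = MAX_LONG_EDGE_PX): Size {
  const longEdge = Math.max(size.width, size.height);
  if (longEdge <= maxLongEdge) {
    return size;
  }
  const scale = maxLongEdge / longEdge;
  return {
    width: Math.round(size.width * scale),
    height: Math.round(size.height * scale),
  };
}
```

Downscaling makes small text smaller too, so cropping to the table region remains the better first step.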
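
On Q8: the kind of post-correction described there can be sketched with two regular expressions. This is an illustration only, assuming the rule is "remove whitespace between a digit and the unit symbol"; tidyUnits is a hypothetical name and the tool's actual rules may differ.

```typescript
// Hypothetical post-correction helper, not the tool's real implementation.
function tidyUnits(text: string): string {
  return text
    // "12,345 원" -> "12,345원": drop whitespace between a digit and the won sign
    .replace(/(\d)\s+원/g, "$1원")
    // "45 %" -> "45%": same rule for the percent sign
    .replace(/(\d)\s+%/g, "$1%");
}
```

A rule like this only repairs spacing. If the recogniser read "원" as a different glyph, the pattern never matches, which is why the answer to Q8 points to higher resolution or manual edits.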
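
On Q10: a UTF-8 BOM is the character U+FEFF placed at the very start of the file, which Excel uses as a hint that the CSV is UTF-8. A minimal sketch of producing such a CSV string follows. The helper names escapeCell and toCsvWithBom are hypothetical, and the quoting rules shown are the common RFC 4180 ones, not necessarily what this tool emits.

```typescript
// U+FEFF becomes the bytes EF BB BF once the string is encoded as UTF-8.
const BOM = "\uFEFF";

// Quote cells that contain a comma, a double quote, or a line break,
// and double any embedded double quotes.
function escapeCell(cell: string): string {
  return /[",\r\n]/.test(cell) ? `"${cell.replace(/"/g, '""')}"` : cell;
}

// Hypothetical helper: join rows into CSV text with a leading BOM.
function toCsvWithBom(rows: string[][]): string {
  const body = rows
    .map((row) => row.map(escapeCell).join(","))
    .join("\r\n");
  return BOM + body;
}
```

In a browser, the resulting string could be wrapped in a Blob with type "text/csv;charset=utf-8" and offered as a download; string parts of a Blob are encoded as UTF-8.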