Are images really not uploaded?

Correct, nothing is uploaded. The site is static S3 hosting, so there is no upload endpoint to send anything to. Open DevTools → Network, run OCR, and the upload traffic stays at zero bytes. Once the page has loaded and the language models are cached, you can even go offline and it keeps working.

Which image formats are supported?

PNG · JPG · WEBP are the officially supported formats. Most browser-readable formats (GIF, BMP) also work, but we only test the three. PDF is not supported yet — export the PDF page to PNG and upload that.
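The three support tiers above can be expressed as a small check on a file's MIME type. This is a minimal sketch, not the tool's actual code: `classifyFormat`, `TESTED_TYPES`, and the decision to treat every other `image/*` type as "untested" are assumptions made for illustration.

```typescript
// Hypothetical helper: the names and the tiering rule are illustrative only.
type FormatSupport = "tested" | "untested" | "unsupported";

// The three officially tested formats named in the answer above.
const TESTED_TYPES: string[] = ["image/png", "image/jpeg", "image/webp"];

function classifyFormat(mimeType: string): FormatSupport {
  const type = mimeType.toLowerCase();
  if (TESTED_TYPES.indexOf(type) !== -1) {
    return "tested";
  }
  // Assumption: any other image type the browser can decode (GIF, BMP, ...)
  // is allowed through but flagged as untested.
  if (type.indexOf("image/") === 0) {
    return "untested";
  }
  // Everything else, including application/pdf, is rejected.
  return "unsupported";
}
```

In a browser, the string to pass in would typically be the `type` property of the selected `File`.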

How accurate is Korean recognition?

The tesseract.js Korean model (kor) reaches 95%+ accuracy on clean printed tables and roughly 80–90% on typical web/app screenshots. Handwritten or low-quality scans can drop below 60%. Tables heavy in digits and Latin letters tend to score even higher because the English model runs alongside.
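This page does not say how the percentages above were measured. If you want to check accuracy on your own tables, one common approach is character-level accuracy: 1 minus the edit distance between the expected and recognised text, divided by the expected length. The sketch below assumes that definition; `editDistance` and `charAccuracy` are illustrative names, not part of tesseract.js or this tool.

```typescript
// Levenshtein distance: minimum number of single-character insertions,
// deletions, or substitutions needed to turn `a` into `b`.
function editDistance(a: string, b: string): number {
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b.
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) {
    prev.push(j);
  }
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Character accuracy in the range 0..1 (assumed definition, see lead-in).
function charAccuracy(expected: string, recognised: string): number {
  if (expected.length === 0) {
    return recognised.length === 0 ? 1 : 0;
  }
  const errors = editDistance(expected, recognised);
  return Math.max(0, 1 - errors / expected.length);
}
```

For example, if the cell should read `합계 100` and OCR returns `합계 1O0` (letter O instead of zero), that is one error in six characters, or about 83% by this measure.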

Why not use CLOVA OCR or Google Vision?

Those APIs score higher but (1) require paid plans and (2) ship your image to their servers. This tool prioritises browser-only, free usage, so tesseract.js is the default. Need CLOVA-grade accuracy on-prem? Get in touch.

How big an image can I upload?

In practice, keep images under about 5 MB and 3000 px on the long edge. Beyond that, the browser tab can run short of memory and freeze. Cropping to just the table region is faster and more accurate anyway.
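If you resize images before loading them, the long-edge guideline translates into a simple proportional scale. The following is a minimal sketch under that assumption; `ImageSize` and `fitLongEdge` are hypothetical names, and the tool itself may or may not resize anything for you.

```typescript
interface ImageSize {
  width: number;
  height: number;
}

// Scale a size down proportionally so its longer side is at most maxLongEdge.
// Sizes already within the limit are returned unchanged (never upscaled).
function fitLongEdge(size: ImageSize, maxLongEdge: number = 3000): ImageSize {
  const longEdge = Math.max(size.width, size.height);
  if (longEdge <= maxLongEdge) {
    return { width: size.width, height: size.height };
  }
  const scale = maxLongEdge / longEdge;
  return {
    width: Math.round(size.width * scale),
    height: Math.round(size.height * scale),
  };
}
```

A 6000×4000 photo maps to 3000×2000, while a 1200×800 screenshot is left alone. The resulting dimensions could then be used as the target size when redrawing the image in any image editor.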

Why is the first run slow?

On the first run, the tool downloads the Korean (~12 MB) and English (~8 MB) tesseract language models from a CDN and caches them in the browser (IndexedDB). Later runs in the same browser start immediately.

Rows / columns split incorrectly — how do I fix it?

Double-click any cell in the preview to edit it inline. To drop a whole row, tick its checkbox and click "Delete selected rows". To restore a missing row, click "Add row". For heavy column restructuring, export to XLSX first and edit in Excel.

The Korean "원" (won) and "%" symbols keep getting garbled.

We post-correct patterns like 12,345 원 into 12,345원. But if tesseract already mis-read the glyph, clean-up can't recover it — raise the resolution or fix the cell manually.
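The kind of clean-up described above can be approximated with a single regular expression. This is a sketch of the idea, not the tool's real post-processing: `joinUnitSuffix` is a hypothetical name, and applying the same rule to "%" is an assumption based on the question.

```typescript
// Remove whitespace between a digit and a trailing "원" or "%".
// Only the gap is removed; characters that OCR mis-read are left as they are.
function joinUnitSuffix(cell: string): string {
  // (\d)   the last digit of the number
  // \s+    one or more whitespace characters inserted by OCR
  // (원|%) the unit symbol that should sit directly after the number
  return cell.replace(/(\d)\s+(원|%)/g, "$1$2");
}
```

Because the pattern requires a digit immediately before the gap, ordinary text such as `원 가격` is untouched. If the glyph itself came out wrong (say, `12,345 윈`), the pattern does not match and the cell still needs a manual fix.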

How are merged cells handled?

We only see pixels, not merge metadata. Text spanning a merged range typically lands in the leftmost column of that range. Split or move cells in the preview as needed, or re-merge in Excel after exporting.

Can I paste the result into Google Sheets?

Not directly yet. The most reliable route is to download the XLSX and use File → Import in Google Sheets. The CSV export includes a UTF-8 BOM, so Korean shows up cleanly in both Sheets and Excel. A "copy to clipboard" button is on the roadmap.
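For readers curious what "CSV with a UTF-8 BOM" means in practice, here is a minimal sketch of building such a file's text. The function names are hypothetical and the quoting rules are the usual CSV conventions, which may differ in detail from what the tool emits.

```typescript
// U+FEFF at the start of the text becomes the bytes EF BB BF when the
// string is encoded as UTF-8, which is the marker spreadsheet apps look for.
const UTF8_BOM = "\uFEFF";

// Quote a field if it contains a comma, a double quote, or a line break,
// doubling any embedded double quotes.
function escapeCsvField(value: string): string {
  if (/[",\r\n]/.test(value)) {
    return '"' + value.replace(/"/g, '""') + '"';
  }
  return value;
}

// Join rows into CSV text with CRLF line endings and a leading BOM.
function toCsvWithBom(rows: string[][]): string {
  const lines = rows.map(function (row: string[]): string {
    return row.map(function (field: string): string {
      return escapeCsvField(field);
    }).join(",");
  });
  return UTF8_BOM + lines.join("\r\n");
}
```

Note that an amount such as `12,345원` contains a comma, so it has to be quoted to stay in a single column.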

Can you batch-convert multi-page PDFs?

The MVP handles single images only. Multi-page PDF batch OCR is on the roadmap. For now, export each PDF page to PNG (e.g. Preview.app on macOS, or any PDF viewer) and process them one by one.

Does it work on mobile?

Yes. It runs on iOS Safari 16+ and Android Chrome. Because of mobile memory and CPU limits, it is best suited to table images up to ~2MP. Use a desktop for larger images.
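To judge whether a photo fits the ~2MP guideline, multiply its width by its height. The sketch below also returns the factor by which each side would need to shrink to fit. It assumes 1 MP means 1,000,000 pixels, and `scaleForMegapixels` is an illustrative name rather than anything the tool exposes.

```typescript
// Return the per-side scale factor (0..1] needed to bring an image within
// a megapixel budget. Images already within budget return 1.
function scaleForMegapixels(
  width: number,
  height: number,
  maxMegapixels: number = 2
): number {
  const pixels = width * height;
  const budget = maxMegapixels * 1000000; // assumption: 1 MP = 1,000,000 px
  if (pixels <= budget) {
    return 1;
  }
  // Area scales with the square of the side length, hence the square root.
  return Math.sqrt(budget / pixels);
}
```

A 1600×1200 screenshot (1.92 MP) returns 1, while a 4000×2000 photo (8 MP) returns 0.5, meaning it would need to be halved on each side.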

Do you log or store anything?

We use GA4 and Naver Analytics for page-view counts only. No image contents or recognised text ever leave your browser — nothing is transmitted to begin with. localStorage only stores your language preference; IndexedDB only caches the tesseract language model.

My company blocks file uploads. Is this tool safe to use?

The site is pure static hosting (S3 + CloudFront) — there is no backend to receive anything. Parsing and OCR happen inside your browser in the tesseract.js WebAssembly runtime. Some enterprises restrict even third-party JS, so check with your security team. On-prem builds are available on request.

Does handwritten text work?

The tesseract.js models we ship are print-optimised; handwriting accuracy is low (30–50%). Cropping one row at a time helps a little, but true handwriting OCR is a specialised service (e.g. CLOVA OCR) beyond this tool's scope.

Frequently Asked Questions

15 questions about the image-table-to-Excel OCR tool.

Written by 김지광 (operator) · Last updated May 16, 2026 · bal.pe.kr micro SaaS

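Q1. Is nothing really uploaded to a server?: Correct. This tool is a static site hosted on AWS S3, and there is no backend to receive images or text. Open the DevTools Network tab and run OCR, and you can confirm for yourself that upload traffic is 0 bytes. It keeps working even if you disconnect from the network after the page's first load (once the language models are cached).
Q2. Which image formats are supported?: PNG · JPG · WEBP are supported. Most other browser-readable images (GIF, BMP, etc.) also work, but only the first three are officially tested. PDF is not currently supported; export the PDF page to PNG and upload that.
Q3. How accurate is Korean recognition?: The tesseract.js Korean model (kor) reaches 95%+ accuracy on clean printed tables and around 80–90% on typical web/app screenshots. Handwriting and low-quality scans can fall below 60%. Tables with many digits, Latin letters, and special symbols are actually recognised more accurately than Korean text (the English model runs alongside).
Q4. Why not use CLOVA OCR or Google Vision?: Those APIs are more accurate, but (1) they require moving to a paid plan and (2) they require sending images to an external server. This tool prioritises free, browser-only use, so it uses tesseract.js by default. If you need CLOVA OCR accuracy for internal company use, ask about the on-premises variant.
Q5. How large an image can I upload?: In practice we recommend 5 MB or less and 3000 px or less on the long edge. Beyond that, the browser comes under memory pressure and the tab may freeze. For larger images, cropping to just the table and uploading that is much faster and more accurate.
Q6. Why does the first run take so long?: Because the Korean model (~12 MB) and the English model (~8 MB) are downloaded once from a CDN and cached in the browser (IndexedDB). After that, OCR starts immediately in the same browser.
Q7. Rows or columns were split incorrectly. How do I fix it?: Double-click any cell in the result table to edit it. If a whole row is wrong, select it with the checkbox on the left and click "Delete selected rows". If a row is missing, use "Add row" to create an empty row at the bottom and fill it in manually. If the column structure needs major changes, it is faster to export to XLSX first and edit in Excel.
Q8. The Korean "원" / "%" symbols keep getting garbled.: This tool applies per-cell post-processing that automatically joins spaced text such as 12,345 원 into 12,345원. However, if the character itself was already damaged at the OCR stage, it cannot be restored. In that case, raise the image resolution or double-click the cell and fix it manually.
Q9. How are merged cells handled?: The tool sees only image pixels and has no access to cell-merge information, so text in a merged range tends to land in the leftmost of the columns that range covers. If needed, cut and move the cell to another position in the result table, or export to XLSX and restore the merge in Excel.
Q10. Can I paste the result straight into Google Sheets?: The most reliable way is to download the XLSX and open it via "File → Import" in Google Sheets. The CSV download includes a UTF-8 BOM, so Korean text displays correctly in both Google Sheets and Excel. A "Copy to clipboard" button is planned.
Q11. Do you support batch conversion of multi-page PDFs?: The current MVP supports only one image at a time. PDF batch processing is on the roadmap. For now, export the pages to PNG using Excel's "PDF → screenshot" or macOS Preview, then process them one at a time.
Q12. Does it work on mobile?: It works on iOS Safari 16 or later and Android Chrome. Mobile devices have memory and CPU constraints, so table images of about 2 MP or less are a good fit. For large images, a desktop is recommended.
Q13. Do you store data or keep logs?: We only count visits using GA4 and Naver Analytics. Uploaded images and recognised text are not stored on any server (nothing is transmitted in the first place). The browser's localStorage holds only the language toggle setting, and IndexedDB holds only the tesseract language model cache.
Q14. My company's security policy prohibits file uploads. Can I use it?: This tool is static hosting (S3 + CloudFront), and there is no backend at all to receive images or text. Parsing and OCR run only inside the browser, in the tesseract.js WebAssembly runtime. Some companies also restrict running external JS, so check your security team's policy first. On-premises deployment is also available if needed (get in touch).
Q15. Are handwritten tables recognised?: The tesseract.js model currently in use is specialised for print, so handwriting accuracy is low (30–50%). Splitting a handwritten table into one image per row and running each separately improves things a little, but this is an area that needs dedicated handwriting OCR (e.g. CLOVA OCR).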