Dots.OCRIntelligent OCR Data Extraction

Extract text, tables, and formulas from any document or image instantly. Supports 100+ languages with AI precision. Free and open source.

Start→

dots.ocr - Multilingual Document OCR and Layout Parsing Tool Interface

Drop your file here or click to browse

Supports PDF, JPG, PNG, WEBP files up to 10MB

Advanced Document Intelligence

Everything you need for intelligent document processing

dots.ocr combines cutting-edge AI with practical document processing needs. Extract, analyze, and structure data from any document or image with unprecedented accuracy and speed.

Multilingual Document Processing: Process documents in 100+ languages with state-of-the-art accuracy. From English to Chinese, Arabic to Tamil - dots.ocr handles complex multilingual content with unified layout detection.
Advanced Table & Formula Extraction: Extract complex tables, mathematical formulas, and structured data from PDFs and images. Perfect for research papers, financial reports, and technical documents.
Lightning-Fast AI Processing: Built on a compact 1.7B vision-language model for optimal speed and accuracy. Process documents 10x faster than traditional OCR while maintaining superior quality.

Trusted by developers worldwide

Empowering thousands of developers to build intelligent document processing solutions.

Languages Supported: 100+
Document Processing Accuracy: 95%+
API Response Time: <2s
Documents Processed: 1M+

Frequently Asked Questions

About dots.ocr

What is dots.ocr?: dots.ocr is a state-of-the-art multilingual document parser that unifies layout detection and content recognition in a single vision-language model. Built on a compact 1.7B-parameter LLM, it delivers exceptional performance for text, tables, formulas, and reading order across 100+ languages.
Who should use dots.ocr?: dots.ocr is perfect for researchers, data scientists, businesses, and developers who need to extract structured data from documents and images. It's ideal for processing research papers, financial reports, invoices, forms, and any complex documents.

Features & Capabilities

What types of content can dots.ocr extract?: dots.ocr can extract text, tables, mathematical formulas, and maintain reading order from PDFs, scanned documents, and images. It handles complex layouts, multilingual content, and preserves document structure with high accuracy.
How accurate is dots.ocr compared to other OCR tools?: dots.ocr achieves state-of-the-art performance on OmniDocBench and other benchmarks, often outperforming much larger commercial models while being significantly faster and more efficient.

Getting Started

Is dots.ocr free and open source?: Yes! dots.ocr is completely open source and free to use. You can find the source code, documentation, and installation instructions on our GitHub repository.
How do I get started with dots.ocr?: You can try our live demo online, or install dots.ocr locally using our quick start guide. We support multiple deployment options including vLLM for production use and Hugging Face for development.