
Easy Deployment with Docker: Get Started with Dots.OCR in Minutes

Learn how to deploy Dots.OCR using our official Docker images with vLLM integration. Our containerized solution makes it simple to set up a production-ready OCR service that can process documents across 100+ languages with enterprise-grade performance and scalability.

Containerized Excellence

Our official Docker images provide the fastest and most reliable way to deploy Dots.OCR in any environment. Built on the proven vLLM foundation with CUDA 12.8 support, our containers include all necessary dependencies and optimizations for maximum performance. The images are thoroughly tested and ready for production deployment.

Quick Start Guide

Getting started with Dots.OCR is incredibly simple:

```bash
# Pull the official image
docker pull rednotehilab/dots.ocr:vllm-openai-v0.9.1

# Run with GPU support
docker run --gpus all -p 8000:8000 \
  rednotehilab/dots.ocr:vllm-openai-v0.9.1
```

That's it! Your OCR service will be running on port 8000 with full GPU acceleration and support for 100+ languages.
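Model weights are loaded when the container starts, so the API may take a little while to come up. Here is a minimal readiness check in Python, polling the OpenAI-compatible `/v1/models` endpoint; the 600-second timeout is an arbitrary choice, not a documented default:

```python
import time

import requests

def wait_for_server(base_url: str = "http://localhost:8000", timeout: int = 600) -> None:
    """Poll the model list endpoint until the server answers."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            resp = requests.get(f"{base_url}/v1/models", timeout=5)
            if resp.status_code == 200:
                print("Server is ready:", resp.json())
                return
        except requests.RequestException:
            pass  # Container is still starting up
        time.sleep(5)
    raise TimeoutError("Dots.OCR server did not become ready in time")

wait_for_server()
```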

Production-Ready Configuration

Our Docker images are optimized for production workloads with:

• Pre-configured vLLM server for optimal inference performance
• CUDA 12.8 runtime for GPU acceleration
• OpenAI-compatible API endpoints for easy integration
• Automatic model weight management
• Memory optimization for efficient resource usage
• Health check endpoints for monitoring (see the probe sketch after this list)

The container includes all necessary NVIDIA libraries and dependencies, eliminating compatibility issues.
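The health check endpoint can double as a container-level probe. Below is a minimal sketch that assumes the container exposes vLLM's standard `/health` endpoint on port 8000; it exits 0 when healthy, so an orchestrator can invoke it as a healthcheck command:

```python
# health_probe.py - exits 0 when healthy, 1 otherwise
import sys

import requests

try:
    resp = requests.get("http://localhost:8000/health", timeout=5)
    sys.exit(0 if resp.status_code == 200 else 1)
except requests.RequestException:
    sys.exit(1)
```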

Scalable Architecture

Deploy Dots.OCR at scale using container orchestration:

```yaml
# docker-compose.yml
version: '3.8'
services:
  dots-ocr:
    image: rednotehilab/dots.ocr:vllm-openai-v0.9.1
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    ports:
      - "8000:8000"
    environment:
      - CUDA_VISIBLE_DEVICES=0
      - VLLM_GPU_MEMORY_UTILIZATION=0.95
```

Scale horizontally by adding more container instances or configure multi-GPU setups for maximum throughput.
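For illustration, here is a naive client-side round-robin across several replicas; the ports and replica count are hypothetical, and a production deployment would put a real load balancer (nginx, HAProxy, or a Kubernetes Service) in front instead:

```python
import itertools

import requests

# Hypothetical setup: three dots-ocr containers published on ports 8000-8002.
REPLICAS = ["http://localhost:8000", "http://localhost:8001", "http://localhost:8002"]
backends = itertools.cycle(REPLICAS)

def ocr_request(payload: dict) -> dict:
    """Send one chat-completions request to the next replica in rotation."""
    base_url = next(backends)
    resp = requests.post(f"{base_url}/v1/chat/completions", json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json()
```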

API Integration

Our Docker deployment provides OpenAI-compatible endpoints for seamless integration:

```python
import requests

# Process a document via the API
response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "dots.ocr",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract all text"},
                {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}
            ]
        }]
    }
)
```

This familiar API format makes it easy to integrate with existing applications and workflows.
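Continuing the snippet above: since the server speaks the OpenAI chat-completions format, the recognized text should sit in the first choice's message content (the exact field layout is an assumption based on that format):

```python
# The OCR output is the assistant message of the first choice.
result = response.json()
text = result["choices"][0]["message"]["content"]
print(text)
```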

Monitoring and Observability

Monitor your Dots.OCR deployment with built-in endpoints:

• `/health` - Service health status
• `/metrics` - Prometheus-compatible metrics (scraped in the sketch below)
• `/v1/models` - Available model information
• GPU utilization and memory usage tracking
• Request latency and throughput metrics

Integrate with your existing monitoring stack using these standardized endpoints.
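Outside of a full Prometheus setup, the metrics endpoint can also be scraped directly. A quick sketch follows; the `vllm:` metric-name prefix is an assumption that depends on the vLLM version bundled in the image:

```python
import requests

# Fetch Prometheus-format metrics and print the vLLM counters/gauges.
metrics = requests.get("http://localhost:8000/metrics", timeout=5).text
for line in metrics.splitlines():
    if line.startswith("vllm:"):
        print(line)
```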

Advanced Configuration

Customize your deployment with environment variables:

```bash
# Advanced configuration
docker run --gpus all \
  -e VLLM_GPU_MEMORY_UTILIZATION=0.9 \
  -e VLLM_TENSOR_PARALLEL_SIZE=2 \
  -e VLLM_MAX_MODEL_LEN=4096 \
  -p 8000:8000 \
  rednotehilab/dots.ocr:vllm-openai-v0.9.1
```

Optimize for your specific hardware configuration and workload requirements. Multi-GPU setups, memory tuning, and batch processing parameters can all be adjusted via environment variables.
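A rough way to compare settings is to time a fixed request against each configuration. The benchmark below is only a sketch: `sample.jpg` is a placeholder document image you supply, and ten sequential requests say nothing about batched throughput:

```python
import base64
import statistics
import time

import requests

# Encode a placeholder document image for the request payload.
with open("sample.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

payload = {
    "model": "dots.ocr",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Extract all text"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
}

# Time ten sequential requests and report the median end-to-end latency.
latencies = []
for _ in range(10):
    start = time.perf_counter()
    requests.post("http://localhost:8000/v1/chat/completions",
                  json=payload, timeout=300)
    latencies.append(time.perf_counter() - start)

print(f"median latency: {statistics.median(latencies):.2f}s")
```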

Enterprise Support

For enterprise deployments, consider:

• Load balancing across multiple container instances
• Kubernetes deployment with auto-scaling
• Persistent volume mounts for model caching
• Custom security configurations
• Private registry deployment

Our Docker images provide the foundation for robust, scalable OCR services that can handle enterprise-grade workloads with confidence.
