Docling Graph Documentation
What is Docling Graph?
Docling-Graph turns documents into validated Pydantic objects, then builds a directed knowledge graph with explicit semantic relationships.
This transformation enables high-precision use cases in chemistry, finance, and legal domains, where AI must capture exact entity connections (compounds and reactions, instruments and dependencies, properties and measurements) rather than rely on approximate text embeddings.
This toolkit supports two extraction paths: local VLM extraction via Docling, and LLM-based extraction using either local runtimes (vLLM, Ollama) or API providers (Mistral, OpenAI, Gemini, IBM WatsonX), all orchestrated through a flexible, config-driven pipeline.
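The document-to-graph idea described above can be sketched with plain Pydantic and NetworkX. This is a minimal illustration, not Docling Graph's own API: the `Compound` and `Reaction` models, the `to_graph` helper, the edge labels, and the sample data are all hypothetical. The `extracted` dictionary stands in for whatever an extraction backend might return.

```python
from typing import List

import networkx as nx
from pydantic import BaseModel, ValidationError


# Hypothetical schema: these models are illustrative, not shipped templates.
class Compound(BaseModel):
    name: str
    formula: str


class Reaction(BaseModel):
    name: str
    reactants: List[Compound]
    products: List[Compound]


def to_graph(reaction: Reaction) -> nx.DiGraph:
    """Turn one validated Reaction into a directed graph with labeled edges."""
    graph = nx.DiGraph()
    graph.add_node(reaction.name, kind="Reaction")
    for compound in reaction.reactants:
        graph.add_node(compound.name, kind="Compound", formula=compound.formula)
        graph.add_edge(compound.name, reaction.name, label="REACTANT_OF")
    for compound in reaction.products:
        graph.add_node(compound.name, kind="Compound", formula=compound.formula)
        graph.add_edge(reaction.name, compound.name, label="PRODUCES")
    return graph


# Stand-in for data an extraction backend might return.
extracted = {
    "name": "water synthesis",
    "reactants": [
        {"name": "hydrogen", "formula": "H2"},
        {"name": "oxygen", "formula": "O2"},
    ],
    "products": [{"name": "water", "formula": "H2O"}],
}

reaction = Reaction(**extracted)  # validation happens here
graph = to_graph(reaction)

# Incomplete data is rejected before it can reach the graph.
try:
    Reaction(**{"name": "broken", "reactants": [{"name": "hydrogen"}], "products": []})
    rejected = False
except ValidationError:
    rejected = True
```

Validating first and building the graph second means every node and edge comes from data that already satisfies the schema, which is the property the high-precision use cases above depend on.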
Key Features
- ✍🏻 Multi-Format Input: Ingest PDFs, images, URLs, raw text, Markdown and more.
- 🧠 Flexible Extraction: VLM or LLM-based (vLLM, Ollama, Mistral, Gemini, WatsonX, etc.)
- 🔨 Smart Graphs: Convert Pydantic models to NetworkX graphs with stable node IDs
- 📦 Multiple Export: CSV (Neo4j-compatible), Cypher scripts, JSON, Markdown
- 📊 Rich Visualizations: Interactive HTML and detailed Markdown reports
- ⚙️ Type-Safe Configuration: Pydantic-based validation
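To make "stable node IDs" and "Cypher scripts" more concrete, here is one possible approach using only the standard library. It is an assumption for illustration, not a description of how Docling Graph computes IDs or formats its exports: `stable_node_id` and `to_cypher` are hypothetical helpers, and the ID format and statement shape are invented for this sketch.

```python
import hashlib
import json


def stable_node_id(label: str, identity: dict) -> str:
    """Derive a deterministic ID from a node's label and identifying fields."""
    # Sorting keys makes the ID independent of field order.
    canonical = json.dumps(identity, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(f"{label}|{canonical}".encode("utf-8")).hexdigest()
    return f"{label}_{digest[:12]}"


def to_cypher(label: str, node_id: str, properties: dict) -> str:
    """Render one node as a Cypher MERGE statement (hypothetical output shape)."""
    # json.dumps is used for quoting; adequate for simple strings and numbers.
    props = ", ".join(
        f"n.{key} = {json.dumps(value)}" for key, value in sorted(properties.items())
    )
    return f"MERGE (n:{label} {{id: {json.dumps(node_id)}}}) SET {props};"


first = stable_node_id("Compound", {"name": "water", "formula": "H2O"})
second = stable_node_id("Compound", {"formula": "H2O", "name": "water"})
statement = to_cypher("Compound", first, {"name": "water", "formula": "H2O"})
```

Because the ID depends only on content, re-running an extraction on the same document yields the same node IDs, so a `MERGE`-style import updates existing nodes instead of duplicating them.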
Quick Navigation
Getting Started
- Set up your environment with the uv package manager
- Run your first extraction in 5 minutes
- Understand the pipeline stages and components
- Learn how documents flow through the system
Core Documentation
- Overview, architecture, and core concepts
- Installation, schema definition, pipeline configuration, extraction, and more
- CLI reference, Python API, examples, and advanced topics
- Detailed API documentation
- Contributing and development guide
Resources
Documentation
- GitHub Repository - Source code and issues
- PyPI Package - Install via pip/uv
- Contributing Guidelines - How to contribute
Community
- GitHub Issues - Report bugs and request features
- GitHub Discussions - Ask questions and share ideas
Related Projects
Next Steps
Need Help?
- Installation Issues: See Installation Guide
- Template Questions: See Schema Definition
- Configuration Help: See Pipeline Configuration
- Error Messages: See Error Handling