Docling Graph Documentation¶

What is Docling Graph?¶

Docling-Graph turns documents into validated Pydantic objects, then builds a directed knowledge graph with explicit semantic relationships.

This transformation enables high-precision use cases in domains such as chemistry, finance, and law, where AI must capture exact connections between entities (compounds and reactions, instruments and dependencies, properties and measurements) rather than rely on approximate text embeddings.

This toolkit supports two extraction paths: local VLM extraction via Docling, and LLM-based extraction using either local runtimes (vLLM, Ollama) or API providers (Mistral, OpenAI, Gemini, IBM WatsonX), all orchestrated through a flexible, config-driven pipeline.
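
To make the idea of validated objects and explicit relationships concrete, here is a minimal, dependency-free sketch. It is not the Docling Graph API: plain dataclasses stand in for Pydantic models, a list of triples stands in for a NetworkX directed graph, and every name in it (`Compound`, `Reaction`, `reaction_to_edges`, the `REACTANT_IN` and `PRODUCES` labels) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str
    formula: str

    def __post_init__(self):
        # Minimal validation, standing in for Pydantic field validation.
        if not self.name or not self.formula:
            raise ValueError("Compound needs a non-empty name and formula")


@dataclass(frozen=True)
class Reaction:
    name: str
    reactants: tuple[Compound, ...]
    products: tuple[Compound, ...]

    def __post_init__(self):
        if not self.reactants or not self.products:
            raise ValueError("Reaction needs at least one reactant and one product")


def reaction_to_edges(reaction: Reaction) -> list[tuple[str, str, str]]:
    """Return directed (source, relation, target) triples for one reaction."""
    edges = [(c.name, "REACTANT_IN", reaction.name) for c in reaction.reactants]
    edges += [(reaction.name, "PRODUCES", c.name) for c in reaction.products]
    return edges


# Hypothetical extracted data: one reaction with two reactants and one product.
h2 = Compound("hydrogen", "H2")
o2 = Compound("oxygen", "O2")
water = Compound("water", "H2O")
combustion = Reaction("hydrogen combustion", (h2, o2), (water,))

edges = reaction_to_edges(combustion)
for source, relation, target in edges:
    print(f"{source} -[{relation}]-> {target}")
```

Each edge carries a named, directed relation, so a query such as "which compounds does this reaction produce?" is answered by following `PRODUCES` edges rather than by comparing embeddings.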

Key Features¶

✍🏻 Multi-Format Input: Ingest PDFs, images, URLs, raw text, Markdown, and more
🧠 Flexible Extraction: VLM-based or LLM-based (vLLM, Ollama, Mistral, Gemini, WatsonX, etc.)
🔨 Smart Graphs: Convert Pydantic models to NetworkX graphs with stable node IDs
📦 Multiple Export Formats: CSV (Neo4j-compatible), Cypher scripts, JSON, Markdown
📊 Rich Visualizations: Interactive HTML and detailed Markdown reports
⚙️ Type-Safe Configuration: Pydantic-based validation
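
As an illustration of what stable node IDs and a Neo4j-compatible CSV export can look like, the sketch below derives a deterministic ID by hashing a node's label and identifying fields, then writes rows under `neo4j-admin` style headers (`:ID`, `:LABEL`). This is an assumption-laden example using only the Python standard library, not how Docling Graph itself computes IDs or lays out its CSV files; `stable_node_id` and `nodes_to_csv` are hypothetical helpers.

```python
import csv
import hashlib
import io


def stable_node_id(label: str, key_fields: dict[str, str]) -> str:
    """Derive a deterministic ID from a node's label and identifying fields."""
    # Sort the field names so that dict ordering never changes the hash.
    parts = [f"{k}={key_fields[k]}" for k in sorted(key_fields)]
    canonical = label + "|" + "|".join(parts)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def nodes_to_csv(nodes: list[tuple[str, dict[str, str]]]) -> str:
    """Write nodes as CSV text using neo4j-admin style header names."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([":ID", ":LABEL", "name"])
    for label, fields in nodes:
        writer.writerow([stable_node_id(label, fields), label, fields["name"]])
    return buffer.getvalue()


# Hypothetical nodes: (label, identifying fields).
nodes = [
    ("Compound", {"name": "water", "formula": "H2O"}),
    ("Compound", {"name": "oxygen", "formula": "O2"}),
]
print(nodes_to_csv(nodes))
```

Because the ID depends only on the node's content, re-running an extraction over the same document yields the same IDs, which keeps repeated imports from creating duplicate nodes.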

Getting Started¶

Installation - Set up your environment with the uv package manager
Quick Start - Run your first extraction in 5 minutes
Architecture - Understand the pipeline stages and components
Key Concepts - Learn how documents flow through the system

Core Documentation¶

Introduction - Overview, architecture, and core concepts
Fundamentals - Installation, schema definition, pipeline configuration, extraction, and more
Usage - CLI reference, Python API, examples, and advanced topics
Reference - Detailed API documentation
Community - Contributing and development guide

Resources¶

Documentation¶

GitHub Repository - Source code and issues
PyPI Package - Install via pip/uv
Contributing Guidelines - How to contribute

Community¶

GitHub Issues - Report bugs and request features
GitHub Discussions - Ask questions and share ideas

Related Projects¶

Docling - Document processing engine
Pydantic - Data validation library
NetworkX - Graph library

Need Help?¶

Installation Issues: See Installation Guide
Template Questions: See Schema Definition
Configuration Help: See Pipeline Configuration
Error Messages: See Error Handling