Architecture¶
System Architecture¶
Docling Graph follows a modular, pipeline-based architecture with clear separation of concerns:

Core Components¶
Document Processor¶
Converts documents to structured format using Docling with OCR or Vision pipelines.
Location: docling_graph/core/extractors/document_processor.py
Extraction Backends¶
- VLM Backend: Direct extraction from images using vision-language models (local only)
- LLM Backend: Text-based extraction supporting local (vLLM, Ollama) and remote APIs
Location: docling_graph/core/extractors/backends/
Processing Strategies¶
- One-to-One: Each page produces a separate model (invoice batches, ID cards)
- Many-to-One: Multiple pages merged into a single model (research papers, reports)
Location: docling_graph/core/extractors/strategies/
Document Chunker¶
Splits large documents while preserving semantic coherence and respecting structure.
Location: docling_graph/core/extractors/document_chunker.py
Graph Converter¶
Transforms Pydantic models to NetworkX graphs with stable node IDs and automatic deduplication.
Location: docling_graph/core/converters/graph_converter.py
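The "stable node IDs" the converter relies on can be sketched as deterministic hashing over an entity's identifying fields. This is an illustrative sketch, not the library's actual implementation; the function name `stable_node_id` and the choice of SHA-256 are assumptions:

```python
import hashlib

def stable_node_id(label: str, key_fields: dict) -> str:
    """Derive a deterministic node ID from a label and identifying fields.

    Sorting the keys makes the ID independent of field order, so the same
    entity always hashes to the same ID and re-adding it to the graph
    deduplicates instead of creating a twin node.
    """
    payload = label + "|" + "|".join(f"{k}={key_fields[k]}" for k in sorted(key_fields))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
```

Because the ID depends only on content, two chunks that mention the same entity resolve to the same node.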
Exporters & Visualizers¶
Export graphs in CSV, Cypher, JSON formats and generate interactive HTML visualizations.
Location: docling_graph/core/exporters/, docling_graph/core/visualizers/
Complete Pipeline Flow¶
%%{init: {'theme': 'redux-dark', 'look': 'default', 'layout': 'elk'}}%%
flowchart TB
%% 1. Define Classes
classDef input fill:#E3F2FD,stroke:#90CAF9,color:#0D47A1
classDef config fill:#FFF8E1,stroke:#FFECB3,color:#5D4037
classDef output fill:#E8F5E9,stroke:#A5D6A7,color:#1B5E20
classDef decision fill:#FFE0B2,stroke:#FFB74D,color:#E65100
classDef data fill:#EDE7F6,stroke:#B39DDB,color:#4527A0
classDef operator fill:#F3E5F5,stroke:#CE93D8,color:#6A1B9A
classDef process fill:#ECEFF1,stroke:#B0BEC5,color:#263238
%% 2. Define Nodes
A@{ shape: terminal, label: "Input Source" }
A1@{ shape: procs, label: "1. Input Normalization<br/>Type Detection & Validation" }
A2{"Input Type"}
%% Ingestion Paths
B@{ shape: procs, label: "2a. Docling Conversion<br/>Generates Images & Markdown" }
B2@{ shape: lin-proc, label: "2b. Text Processing<br/>Direct to Markdown" }
B3@{ shape: lin-proc, label: "2c. Load DoclingDocument<br/>Pre-parsed Content" }
%% Strategy Decision
C{"3. Backend"}
%% Extraction Paths
D@{ shape: lin-proc, label: "4a. VLM Extraction<br/>Page-by-Page (Images)" }
E@{ shape: lin-proc, label: "4b. Markdown Prep<br/>Merge Text Content" }
%% Chunking Logic (LLM Path)
F{"5. Chunking"}
G@{ shape: tag-proc, label: "6a. Hybrid Chunking<br/>Semantic + Token-Aware" }
H@{ shape: tag-proc, label: "6b. Full Document<br/>Context Window Permitting" }
I@{ shape: procs, label: "7. Batch Extraction<br/>LLM Inference" }
%% Convergence & Validation
J@{ shape: tag-proc, label: "8. Pydantic Validation<br/>Per-Chunk/Page Check" }
K{"9. Consolidation"}
L@{ shape: lin-proc, label: "10a. Smart Merge<br/>Programmatic/Reduce" }
M@{ shape: lin-proc, label: "10b. LLM Consolidation<br/>Refinement Loop" }
%% Graph & Export
N@{ shape: procs, label: "11. Graph Conversion<br/>Pydantic → NetworkX" }
O@{ shape: tag-proc, label: "12. Node ID Generation<br/>Stable Hashing" }
P@{ shape: tag-proc, label: "13. Export<br/>CSV/Cypher/JSON" }
Q@{ shape: tag-proc, label: "14. Visualization<br/>HTML + Reports" }
%% 3. Define Connections
A --> A1
A1 --> A2
%% Routing Inputs
A2 -- "PDF/Image" --> B
A2 -- "Text/MD" --> B2
A2 -- "DoclingDoc" --> B3
%% Routing to Backend Strategy
B --> C
B2 & B3 --> E
%% Backend Decisions
C -- VLM --> D
C -- LLM --> E
%% LLM Path: Markdown -> Chunking -> Extraction
E --> F
F -- Yes --> G
F -- No --> H
G --> I
H --> I
%% VLM Path: Direct to Validation (Skips Chunking)
D --> J
%% LLM Path: Join Validation
I --> J
%% Consolidation
J --> K
K -- "Rule-Based" --> L
K -- "AI-Based" --> M
%% Final Stages
L --> N
M --> N
N --> O
O --> P
P --> Q
%% 4. Apply Classes
class A input
class A1,B,I,N process
class B2,B3,D,E,L,M process
class A2,C,F,K decision
class G,H,J,O operator
class P,Q output
Stage-by-Stage Breakdown¶
Stage 1: Template Loading¶
# Load Pydantic template
template = import_template("module.Template")
# Validate structure
validate_template(template)
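A template is a plain Pydantic model describing what to extract. The `Invoice`/`LineItem` schema below is illustrative, not shipped with the library; any Pydantic model importable as `"module.ClassName"` works:

```python
from pydantic import BaseModel

class LineItem(BaseModel):
    """One line of an invoice (illustrative schema)."""
    description: str = ""
    amount: float = 0.0

class Invoice(BaseModel):
    """Top-level extraction target (illustrative schema)."""
    invoice_number: str
    vendor: str
    items: list[LineItem] = []
```

Saved as e.g. `my_templates.py`, this would be loaded with `import_template("my_templates.Invoice")`.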
Stage 2: Document Conversion¶
# Convert using Docling
doc = processor.convert_to_docling_doc(source)
# Extract markdown
markdown = processor.extract_full_markdown(doc)
Stage 3: Extraction¶
# Choose backend
if backend == "vlm":
    models = vlm_backend.extract_from_document(source, template)
else:
    models = llm_backend.extract_from_markdown(markdown, template)
Stage 4: Consolidation (if needed)¶
if len(models) > 1:
    if llm_consolidation:
        final_model = llm_backend.consolidate(models, template)
    else:
        final_model = programmatic_merge(models)
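The programmatic path can be pictured as a reduce over the per-chunk models: keep the first model's values, fill empty fields from later models, and concatenate list fields. This is a minimal sketch under those assumptions (the real merge rules may differ); the `Doc` demo model is purely illustrative:

```python
from pydantic import BaseModel

class Doc(BaseModel):
    """Tiny demo model for the merge below (illustrative only)."""
    title: str = ""
    tags: list[str] = []

def programmatic_merge(models: list[BaseModel]) -> BaseModel:
    """Reduce-style merge: first model wins, later models fill empty
    scalars and extend list fields."""
    merged = models[0].model_dump()
    for model in models[1:]:
        for field, value in model.model_dump().items():
            if isinstance(merged[field], list) and isinstance(value, list):
                merged[field] = merged[field] + value  # concatenate lists
            elif merged[field] in (None, "", 0):
                merged[field] = value  # fill empty scalar fields
    return type(models[0]).model_validate(merged)

merged = programmatic_merge([Doc(title="", tags=["a"]), Doc(title="Report", tags=["b"])])
```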
Stage 5: Graph Conversion¶
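Conceptually, the converter walks the validated model: nested `BaseModel` fields become nodes and edges, scalar fields become node attributes. A minimal sketch of that idea; `model_to_graph` and the `Invoice`/`Vendor` schema are illustrative, not the library's actual converter:

```python
import networkx as nx
from pydantic import BaseModel

class Vendor(BaseModel):
    name: str

class Invoice(BaseModel):
    invoice_number: str
    vendor: Vendor

def model_to_graph(model: BaseModel) -> nx.DiGraph:
    """Flatten a Pydantic model: nested models become nodes, nested
    fields become labelled edges, scalars become node attributes."""
    graph = nx.DiGraph()
    root = type(model).__name__
    graph.add_node(root)
    for field_name, value in model:  # pydantic models iterate as (name, value)
        if isinstance(value, BaseModel):
            child = type(value).__name__
            graph.add_node(child, **value.model_dump())
            graph.add_edge(root, child, relation=field_name)
        else:
            graph.nodes[root][field_name] = value
    return graph

graph = model_to_graph(Invoice(invoice_number="INV-1", vendor=Vendor(name="Acme")))
```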
Stage 6: Export¶
# Export in multiple formats
csv_exporter.export(graph, output_dir)
cypher_exporter.export(graph, output_dir)
json_exporter.export(graph, output_dir)
Protocol-Based Design¶
Docling Graph uses Python Protocols for type-safe, flexible interfaces:
class ExtractionBackendProtocol(Protocol):
    """Protocol for extraction backends"""

    def extract_from_document(self, source: str, template: Type[BaseModel]) -> List[BaseModel]: ...
Benefits: type safety, easy mocking, clear contracts, flexible implementations.
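The "easy mocking" benefit follows from structural typing: any class with a matching method satisfies the protocol, no inheritance required. A sketch (the `FakeBackend` test double is illustrative, and `runtime_checkable` is only needed for `isinstance` checks):

```python
from typing import List, Protocol, Type, runtime_checkable
from pydantic import BaseModel

@runtime_checkable
class ExtractionBackendProtocol(Protocol):
    """Protocol for extraction backends."""
    def extract_from_document(self, source: str, template: Type[BaseModel]) -> List[BaseModel]: ...

class FakeBackend:
    """Test double: satisfies the protocol structurally, without inheriting from it."""
    def extract_from_document(self, source, template):
        return []  # pretend nothing was extracted

backend: ExtractionBackendProtocol = FakeBackend()  # type-checks: shapes match
```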
Configuration¶
Location: docling_graph/config.py
Purpose: Type-safe configuration using Pydantic
class PipelineConfig(BaseModel):
    """Single source of truth for all defaults"""

    source: str
    template: Union[str, Type[BaseModel]]
    backend: Literal["llm", "vlm"] = "llm"
    inference: Literal["local", "remote"] = "local"
    processing_mode: Literal["one-to-one", "many-to-one"] = "many-to-one"
    use_chunking: bool = True
    llm_consolidation: bool = False
    export_format: Literal["csv", "cypher"] = "csv"
    output_dir: str = "outputs"
    # ... additional settings
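Because the config is a Pydantic model, invalid values fail fast at construction time rather than mid-pipeline. A sketch using a trimmed-down copy of the fields shown above (the `"invoices/batch_01.pdf"` source and `"my_templates.Invoice"` path are illustrative):

```python
from typing import Literal, Type, Union
from pydantic import BaseModel, ValidationError

class PipelineConfig(BaseModel):
    """Trimmed copy of the config fields shown above, for demonstration."""
    source: str
    template: Union[str, Type[BaseModel]]
    backend: Literal["llm", "vlm"] = "llm"
    processing_mode: Literal["one-to-one", "many-to-one"] = "many-to-one"

config = PipelineConfig(source="invoices/batch_01.pdf", template="my_templates.Invoice")

try:
    PipelineConfig(source="x", template="y", backend="gpt")
except ValidationError:
    pass  # values outside the Literal set are rejected immediately
```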
Error Handling¶
Location: docling_graph/exceptions.py
Hierarchy:
DoclingGraphError (base)
├── ConfigurationError
├── ClientError
├── ExtractionError
├── ValidationError
├── GraphError
└── PipelineError
Structured Errors:
try:
    run_pipeline(config)
except ClientError as e:
    print(f"Error: {e.message}")
    print(f"Details: {e.details}")
    print(f"Cause: {e.cause}")
Extensibility¶
Docling Graph is designed for extension:
- LLM Providers: Implement LLMClientProtocol
- Pipeline Stages: Implement PipelineStage
- Export Formats: Extend BaseExporter
See Custom Backends for details.
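As a sketch of the third extension point, a custom exporter only needs an `export(graph, output_dir)` method matching the built-in exporters used in Stage 6. The `GraphMLExporter` below is hypothetical (in the real library it would extend `BaseExporter`); it leans on NetworkX's built-in GraphML writer:

```python
import tempfile
from pathlib import Path
import networkx as nx

class GraphMLExporter:
    """Hypothetical exporter adding GraphML output alongside CSV/Cypher/JSON."""

    def export(self, graph: nx.Graph, output_dir: str) -> Path:
        out_path = Path(output_dir) / "graph.graphml"
        out_path.parent.mkdir(parents=True, exist_ok=True)
        nx.write_graphml(graph, out_path)  # NetworkX ships a GraphML writer
        return out_path

graph = nx.Graph([("Invoice", "Vendor")])
out = GraphMLExporter().export(graph, tempfile.mkdtemp())
```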
Next Steps¶
Now that you understand the architecture:
- Installation - Set up your environment
- Schema Definition - Create Pydantic templates
- Pipeline Configuration - Configure the pipeline