CLI Recipes

This guide provides ready-to-use CLI commands for all example scripts. Run these from your project root directory.


Quick Reference

| Recipe | Script | Backend | Use Case |
|--------|--------|---------|----------|
| 01: VLM from Image | 01_quickstart_vlm_image.py | VLM | Forms, invoices, ID cards |
| 02: LLM from PDF | 02_quickstart_llm_pdf.py | LLM (Remote) | Research papers, reports |
| 03: URL Processing | 03_url_processing.py | LLM (Remote) | Remote documents |
| 04: Input Formats | 04_input_formats.py | LLM | Text, Markdown, JSON |
| 05: Processing Modes | 05_processing_modes.py | LLM (Local) | Mode comparison |
| 06: Export Formats | 06_export_formats.py | VLM | CSV, Cypher, JSON |
| 07: Local Inference | 07_local_inference.py | LLM (Local) | Offline processing |
| 08: Chunking | 08_chunking_consolidation.py | LLM (Remote) | Large documents |
| 09: Batch Processing | 09_batch_processing.py | VLM | Multiple documents |
| 10: Multi-Provider | 10_provider_configs.py | LLM (Remote) | Provider comparison |

📍 VLM from Image

Python Script: 01_quickstart_vlm_image.py

Use Case: Extract structured data from invoice images

CLI Command:

uv run docling-graph convert "docs/examples/data/invoice/sample_invoice.jpg" \
    --template "docs.examples.templates.billing_document.BillingDocument" \
    --output-dir "outputs/cli_01" \
    --backend "vlm" \
    --processing-mode "one-to-one" \
    --docling-pipeline "vision"

When to Use:

  • ✅ Single-page forms or invoices
  • ✅ ID cards, badges, receipts
  • ✅ Image files (JPG, PNG)
  • ✅ Structured layouts
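
Once a run completes, you can list the generated artifacts and open the interactive graph view with the inspect command (covered under Common Options below):

# List generated artifacts
ls outputs/cli_01/

# Open the interactive graph view
uv run docling-graph inspect outputs/cli_01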

📍 LLM from PDF

Python Script: 02_quickstart_llm_pdf.py

Use Case: Extract structured data from multi-page research papers

Prerequisites:

export MISTRAL_API_KEY="your-api-key"
uv sync

CLI Command:

uv run docling-graph convert "docs/examples/data/research_paper/rheology.pdf" \
    --template "docs.examples.templates.rheology_research.ScholarlyRheologyPaper" \
    --output-dir "outputs/cli_02" \
    --backend "llm" \
    --inference "remote" \
    --provider "mistral" \
    --model "mistral-large-latest" \
    --processing-mode "many-to-one" \
    --use-chunking \
    --no-llm-consolidation

When to Use:

  • ✅ Multi-page documents
  • ✅ Text-heavy content
  • ✅ Research papers, reports
  • ✅ Complex narratives

📍 URL Processing

Python Script: 03_url_processing.py

Use Case: Download and process documents from URLs

Prerequisites:

export MISTRAL_API_KEY="your-api-key"
uv sync

CLI Command:

uv run docling-graph convert "https://arxiv.org/pdf/2207.02720" \
    --template "docs.examples.templates.rheology_research.ScholarlyRheologyPaper" \
    --output-dir "outputs/cli_03" \
    --backend "llm" \
    --inference "remote" \
    --provider "mistral" \
    --model "mistral-large-latest" \
    --processing-mode "many-to-one" \
    --use-chunking

When to Use:

  • ✅ arXiv papers
  • ✅ Web-hosted PDFs
  • ✅ Automated ingestion
  • ✅ Remote document processing
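
If you want to keep a local copy for reproducibility, the same result can be approximated in two steps: download with curl, then convert the local file. A minimal sketch (the local filename is arbitrary):

# Download the paper, then process the local copy
curl -L -o rheology.pdf "https://arxiv.org/pdf/2207.02720"

uv run docling-graph convert "rheology.pdf" \
    --template "docs.examples.templates.rheology_research.ScholarlyRheologyPaper" \
    --output-dir "outputs/cli_03_local" \
    --backend "llm" \
    --inference "remote" \
    --provider "mistral" \
    --model "mistral-large-latest" \
    --processing-mode "many-to-one" \
    --use-chunking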

📍 Input Formats

Python Script: 04_input_formats.py

Use Case: Process text, Markdown, and DoclingDocument formats

Text File:

# Create sample text file
echo "Title: Sample Document
Summary: This is a test document.
Key Points:
- Point 1
- Point 2" > sample.txt

# Process text file
uv run docling-graph convert "sample.txt" \
    --template "docs.examples.templates.simple.SimpleDocument" \
    --output-dir "outputs/cli_04_text" \
    --backend "llm" \
    --inference "remote" \
    --provider "mistral"

Markdown File:

# Process markdown file
uv run docling-graph convert "README.md" \
    --template "docs.examples.templates.simple.SimpleDocument" \
    --output-dir "outputs/cli_04_markdown" \
    --backend "llm"

When to Use:

  • ✅ Documentation files
  • ✅ Plain text content
  • ✅ Reprocessing (DoclingDocument; see the sketch after this list)
  • ✅ Skip OCR for speed
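
The recipe also covers reprocessing a DoclingDocument export. A hedged sketch, assuming a DoclingDocument JSON file produced by an earlier run (the input path here is hypothetical):

# Reprocess an existing DoclingDocument export (skips OCR and layout analysis)
uv run docling-graph convert "outputs/previous_run/document.json" \
    --template "docs.examples.templates.simple.SimpleDocument" \
    --output-dir "outputs/cli_04_docling" \
    --backend "llm" \
    --inference "remote" \
    --provider "mistral"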

📍 Processing Modes

Python Script: 05_processing_modes.py

Use Case: Compare one-to-one vs many-to-one modes

Prerequisites:

ollama serve
ollama pull llama3:8b
uv sync

One-to-One Mode:

uv run docling-graph convert "docs/examples/data/id_card/multi_french_id_cards.pdf" \
    --template "docs.examples.templates.id_card.IDCard" \
    --output-dir "outputs/cli_05_one_to_one" \
    --backend "llm" \
    --inference "local" \
    --provider "ollama" \
    --model "llama3:8b" \
    --processing-mode "one-to-one" \
    --no-use-chunking

Many-to-One Mode:

uv run docling-graph convert "docs/examples/data/id_card/multi_french_id_cards.pdf" \
    --template "docs.examples.templates.id_card.IDCard" \
    --output-dir "outputs/cli_05_many_to_one" \
    --backend "llm" \
    --inference "local" \
    --provider "ollama" \
    --model "llama3:8b" \
    --processing-mode "many-to-one" \
    --use-chunking
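
A quick, hedged way to compare the two runs is to list what each produced; one-to-one processes each page separately, while many-to-one consolidates into a single result (the exact file layout may vary by version):

# Compare the artifacts from each mode
ls -R outputs/cli_05_one_to_one/
ls -R outputs/cli_05_many_to_one/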


📍 Export Formats

Python Script: 06_export_formats.py

Use Case: Generate different export formats for Neo4j

CSV Export (Bulk Import):

uv run docling-graph convert "docs/examples/data/invoice/sample_invoice.jpg" \
    --template "docs.examples.templates.billing_document.BillingDocument" \
    --output-dir "outputs/cli_06_csv" \
    --backend "vlm" \
    --export-format "csv"
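
A hedged sanity check on the export is to peek at the CSV headers (file locations as used in the Neo4j import step below):

# Inspect the first rows of the generated CSVs
head -n 2 outputs/cli_06_csv/docling_graph/nodes.csv
head -n 2 outputs/cli_06_csv/docling_graph/edges.csv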

Cypher Export (Script):

uv run docling-graph convert "docs/examples/data/invoice/sample_invoice.jpg" \
    --template "docs.examples.templates.billing_document.BillingDocument" \
    --output-dir "outputs/cli_06_cypher" \
    --backend "vlm" \
    --export-format "cypher"

Neo4j Import:

# CSV bulk import
neo4j-admin database import full \
    --nodes=outputs/cli_06_csv/docling_graph/nodes.csv \
    --relationships=outputs/cli_06_csv/docling_graph/edges.csv

# Cypher script
cat outputs/cli_06_cypher/docling_graph/graph.cypher | \
    cypher-shell -u neo4j -p password
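
After either import, a quick sanity check from cypher-shell (adjust credentials to your instance):

# Count imported nodes
cypher-shell -u neo4j -p password "MATCH (n) RETURN count(n) AS nodes;"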


📍 Local Inference

Python Script: 07_local_inference.py

Use Case: Privacy-focused offline processing

Prerequisites:

ollama serve
ollama pull llama3:8b
uv sync

CLI Command:

uv run docling-graph convert "docs/examples/data/research_paper/rheology.pdf" \
    --template "docs.examples.templates.rheology_research.ScholarlyRheologyPaper" \
    --output-dir "outputs/cli_07" \
    --backend "llm" \
    --inference "local" \
    --provider "ollama" \
    --model "llama3:8b" \
    --processing-mode "many-to-one" \
    --use-chunking

When to Use:

  • ✅ Privacy-sensitive documents
  • ✅ Offline processing
  • ✅ No API costs
  • ✅ Development and testing

📍 Chunking & Consolidation

Python Script: 08_chunking_consolidation.py

Use Case: Compare consolidation strategies

Prerequisites:

export MISTRAL_API_KEY="your-api-key"
uv sync

Programmatic Merge (Fast):

uv run docling-graph convert "docs/examples/data/research_paper/rheology.pdf" \
    --template "docs.examples.templates.rheology_research.ScholarlyRheologyPaper" \
    --output-dir "outputs/cli_08_programmatic" \
    --backend "llm" \
    --inference "remote" \
    --provider "mistral" \
    --processing-mode "many-to-one" \
    --use-chunking \
    --no-llm-consolidation

LLM Consolidation (Intelligent):

uv run docling-graph convert "docs/examples/data/research_paper/rheology.pdf" \
    --template "docs.examples.templates.rheology_research.ScholarlyRheologyPaper" \
    --output-dir "outputs/cli_08_llm" \
    --backend "llm" \
    --inference "remote" \
    --provider "mistral" \
    --processing-mode "many-to-one" \
    --use-chunking \
    --llm-consolidation
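
To quantify the "fast" vs "intelligent" trade-off on your own documents, prefix either command with time; wall-clock numbers will vary with document size and provider latency. For example:

# Time the programmatic-merge run
time uv run docling-graph convert "docs/examples/data/research_paper/rheology.pdf" \
    --template "docs.examples.templates.rheology_research.ScholarlyRheologyPaper" \
    --output-dir "outputs/cli_08_programmatic" \
    --backend "llm" \
    --inference "remote" \
    --provider "mistral" \
    --processing-mode "many-to-one" \
    --use-chunking \
    --no-llm-consolidation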


📍 Batch Processing

Python Script: 09_batch_processing.py

Use Case: Process multiple documents efficiently

Bash Script:

#!/bin/bash
# Process all invoices in a directory

for file in docs/examples/data/invoice/*.jpg; do
    filename=$(basename "$file" .jpg)
    echo "Processing $filename..."

    uv run docling-graph convert "$file" \
        --template "docs.examples.templates.billing_document.BillingDocument" \
        --output-dir "outputs/cli_09/$filename" \
        --backend "vlm" \
        --processing-mode "one-to-one"
done

echo "Batch processing complete!"


📍 Multi-Provider

Python Script: 10_provider_configs.py

Use Case: Compare different LLM providers

Prerequisites:

# Set API keys for providers you want to test
export OPENAI_API_KEY="sk-..."
export MISTRAL_API_KEY="..."
export GEMINI_API_KEY="..."
export WATSONX_API_KEY="..."
export WATSONX_PROJECT_ID="..."

uv sync

OpenAI:

uv run docling-graph convert "docs/examples/data/research_paper/rheology.pdf" \
    --template "docs.examples.templates.rheology_research.ScholarlyRheologyPaper" \
    --output-dir "outputs/cli_10_openai" \
    --backend "llm" \
    --inference "remote" \
    --provider "openai" \
    --model "gpt-4-turbo-preview"

Mistral:

uv run docling-graph convert "docs/examples/data/research_paper/rheology.pdf" \
    --template "docs.examples.templates.rheology_research.ScholarlyRheologyPaper" \
    --output-dir "outputs/cli_10_mistral" \
    --backend "llm" \
    --inference "remote" \
    --provider "mistral" \
    --model "mistral-large-latest"

Gemini:

uv run docling-graph convert "docs/examples/data/research_paper/rheology.pdf" \
    --template "docs.examples.templates.rheology_research.ScholarlyRheologyPaper" \
    --output-dir "outputs/cli_10_gemini" \
    --backend "llm" \
    --inference "remote" \
    --provider "gemini" \
    --model "gemini-1.5-pro"

WatsonX:

uv run docling-graph convert "docs/examples/data/research_paper/rheology.pdf" \
    --template "docs.examples.templates.rheology_research.ScholarlyRheologyPaper" \
    --output-dir "outputs/cli_10_watsonx" \
    --backend "llm" \
    --inference "remote" \
    --provider "watsonx" \
    --model "ibm/granite-4-h-small"


Common Options

Visualization

# View interactive graph
uv run docling-graph inspect outputs/cli_01

# Open specific HTML file (macOS; use xdg-open on Linux)
open outputs/cli_01/docling_graph/graph.html

Debugging

# Verbose output
uv run docling-graph --verbose convert ...

# Check version
uv run docling-graph --version

# Get help
uv run docling-graph convert --help

Configuration

# Initialize config file
uv run docling-graph init

# Use custom config
uv run docling-graph convert --config custom_config.yaml ...

Troubleshooting

🐛 API Key Issues

# Check if key is set
echo $MISTRAL_API_KEY

# Set key for current session
export MISTRAL_API_KEY="your-key"

# Set permanently (add to ~/.bashrc or ~/.zshrc)
echo 'export MISTRAL_API_KEY="your-key"' >> ~/.bashrc

🐛 Ollama Issues

# Check if Ollama is running
curl http://localhost:11434

# Start Ollama
ollama serve

# List available models
ollama list

# Pull a model
ollama pull llama3:8b

🐛 Installation Issues

# Reinstall dependencies from scratch
uv sync --reinstall

# Check Python version
python --version  # Should be 3.10+

# Verify installation
uv run python -c "import docling_graph; print(docling_graph.__version__)"

Next Steps

  1. Python API → Programmatic usage
  2. Examples → Real-world examples
  3. Advanced Topics → Custom backends