Installation
Before you start, please make sure you have Python 3.10+ available.
Building DGT from source lets you make changes to the code base. To install from source, clone the repository and install with the following commands:
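For example (the repository URL below is inferred from the project name; adjust it if the source lives elsewhere or you are working from a fork):

```bash
# Clone the repository and change into its directory
git clone https://github.com/foundation-model-stack/fms-dgt.git
cd fms-dgt
```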
Now let's set up your virtual environment.
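One option is Python's built-in `venv` module (the environment name `.venv` is just a convention):

```bash
# Create and activate a virtual environment in the repository directory
python -m venv .venv
source .venv/bin/activate   # on Windows: .venv\Scripts\activate
```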
To install packages, we recommend starting off with the following:
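A minimal editable install from the repository root might look like this (the `dev` extra is an assumption; check the project's `pyproject.toml` for the exact extras it defines):

```bash
# Editable install so local changes to the code base take effect immediately
pip install -e .

# If you plan to develop, the project may define a dev extra, e.g.:
pip install -e ".[dev]"
```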
If you plan on contributing, install the pre-commit hooks to keep code formatting clean:
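A typical setup, assuming the repository ships a pre-commit configuration, looks like this:

```bash
# Install the pre-commit tool and register its git hooks
pip install pre-commit
pre-commit install

# Optionally run the hooks once against all files
pre-commit run --all-files
```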
Large Language Model (LLM) Dependencies
DGT uses Large Language Models (LLMs) to generate synthetic data. The following LLM inference engines are supported:
| Engine | Additional Installation | Environment Variables | Supported APIs |
|---|---|---|---|
| Ollama | - | - | completion, chat_completion |
| WatsonX | - | `WATSONX_API_KEY=""`, `WATSONX_PROJECT_ID=""` | completion, chat_completion |
| OpenAI | - | `OPENAI_API_KEY=""` | completion, chat_completion |
| Azure OpenAI | - | `AZURE_OPENAI_API_KEY=""` | completion, chat_completion |
| Anthropic Claude | - | `ANTHROPIC_API_KEY=""` | chat_completion |
| vLLM | `pip install -e ".[vllm]"` | - | completion, chat_completion |
Most of the aforementioned LLM inference engines use environment variables to specify configuration settings. You can either export those environment variables prior to every run or save them in a `.env` file at the base of the fms-dgt repository directory.
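For example, a `.env` file at the repository root might look like the following; the values shown are placeholders, and you only need the variables required by the engine you use:

```bash
# .env at the base of the fms-dgt repository
OPENAI_API_KEY="..."        # replace with your OpenAI key
WATSONX_API_KEY="..."       # only needed when using WatsonX
WATSONX_PROJECT_ID="..."    # only needed when using WatsonX
```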
Warning
The vLLM dependencies require a Linux OS and CUDA.