Usage

This module allows you to connect to any API endpoint that implements the OpenAI API specification. Many LLM providers such as Llama.cpp, LMStudio, LocalAI, vLLM, or self-hosted solutions offer OpenAI-compatible APIs.

Setup

Before using the OpenAI-compatible functionality, you need to configure the connection details:

-- Set server information for current job
CALL dbsdk_v1.openai_compatible_setserverforjob('127.0.0.1');
CALL dbsdk_v1.openai_compatible_setportforjob(8080);
CALL dbsdk_v1.openai_compatible_setprotocolforjob('http');

-- Optional: Set the model to use
CALL dbsdk_v1.openai_compatible_setmodelforjob('gpt-3.5-turbo');

-- Optional: Set API key if your server requires authentication
-- Many local deployments don't require authentication
CALL dbsdk_v1.openai_compatible_setapikeyforjob('your-api-key');

-- Optional: Set a custom base path if your provider uses a non-standard path
-- By default, the standard OpenAI path '/v1' is used
CALL dbsdk_v1.openai_compatible_setbasepathforjob('/v1');

To make these settings persistent across jobs for your user profile:

CALL dbsdk_v1.openai_compatible_setserverforme('127.0.0.1');
CALL dbsdk_v1.openai_compatible_setportforme(8080);
CALL dbsdk_v1.openai_compatible_setprotocolforme('http');
CALL dbsdk_v1.openai_compatible_setmodelforme('gpt-3.5-turbo');  -- Optional
CALL dbsdk_v1.openai_compatible_setapikeyforme('your-api-key');  -- Optional
CALL dbsdk_v1.openai_compatible_setbasepathforme('/v1');         -- Optional

Basic Usage

Text Generation

To generate text using the OpenAI-compatible API:

SELECT dbsdk_v1.openai_compatible_generate('What is Db2 for i?')
FROM sysibm.sysdummy1;

Set `max_tokens`:

it is recommended to specify a value for max_tokens depending on your task:

SELECT dbsdk_v1.openai_compatible_generate(
  'Why are armadillos so cute?',
  '{"max_tokens": 128}'
) FROM sysibm.sysdummy1;

Specifying a Model

You can specify which model to use (overriding the default):

SELECT dbsdk_v1.openai_compatible_generate(
  'Explain SQL in 100 words',
  '{"model_id": "gpt-3.5-turbo"}'
) FROM sysibm.sysdummy1;

Optional Parameters

The openai_compatible_generate function supports a wide range of optional parameters through a JSON options object:

SELECT dbsdk_v1.openai_compatible_generate(
  'Tell me a story about a dragon',
  '{
    "model_id": "llama3",
    "max_tokens": 150,
    "temperature": 0.7,
    "top_p": 0.9,
    "frequency_penalty": 0.1,
    "presence_penalty": 0.1,
    "seed": 42
  }'
) FROM sysibm.sysdummy1;

Supported Parameters

see official OpenAI API Reference for full breakdown of optional parameters.

Parameter	Type	Default	Description
`model_id`	string	(configured model)	The model to use for generation
`max_tokens`	integer	16	Maximum number of tokens to generate
`temperature`	decimal	1.0	Controls randomness (0-2). Lower is more deterministic
`top_p`	decimal	1.0	Nucleus sampling probability cutoff
`n`	integer	1	Number of completions to generate
`stream`	boolean	false	Whether to stream back partial progress
`logprobs`	integer	null	Include log probabilities on most likely tokens
`echo`	boolean	false	Echo back the prompt in the completion
`stop`	string	null	Sequences where generation should stop
`presence_penalty`	decimal	0.0	Penalty between -2.0 and 2.0 for tokens based on presence
`frequency_penalty`	decimal	0.0	Penalty between -2.0 and 2.0 for tokens based on frequency
`logit_bias`	object	null	JSON object mapping tokens to bias values
`user`	string	null	Unique identifier for the end user
`seed`	integer	null	Seed for deterministic sampling
`suffix`	string	null	Suffix after completion insertion (for compatible models only)

Here are some key parameters explained:

temperature: Controls randomness in token selection. Lower values (e.g., 0.2) make the output more focused and deterministic, while higher values (e.g., 0.8) make it more creative.
max_tokens: Sets the maximum length of the generated text. Increase for longer responses, decrease for shorter ones.
top_p: An alternative to temperature for controlling randomness. Sets a probability threshold for token selection.
frequency_penalty: Reduces repetition by penalizing tokens that have already appeared in the text. Higher values (0.8) strongly discourage repetition.
presence_penalty: Similar to frequency_penalty, but penalizes tokens based on their presence regardless of how many times they’ve appeared.
seed: When provided, attempts to make outputs deterministic. Good for reproducible results.

JSON Generation

For structured data, you can use the JSON-specific function:

SELECT dbsdk_v1.openai_compatible_generate_json(
  'Generate a JSON with 3 names and ages of fictional people',
  '{
    "model_id": "llama3",
    "temperature": 0.5,
    "max_tokens": 100
  }'
) FROM sysibm.sysdummy1;

The generate_json function returns the complete API response as a JSON object, allowing you to access not only the generated content but also metadata such as token usage statistics.

API Response Format

The OpenAI-compatible API response follows this structure:

{
  "choices": [
    {
      "text": "** GENERATED TEXT HERE **",
      "index": 0,
      "logprobs": null,
      "finish_reason": "length"
    }
  ],
  "created": 1746039098,
  "model": "llama3",
  "system_fingerprint": "b5043-c262bedd",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 100,
    "prompt_tokens": 13,
    "total_tokens": 113
  },
  "id": "chatcmpl-OW6vxJUGFMwUyg6HPGHYO1Uaza3VeXRj",
  "timings": {
    "prompt_n": 13,
    "prompt_ms": 418.129,
    "prompt_per_token_ms": 32.16376923076923,
    "prompt_per_second": 31.090883435494785,
    "predicted_n": 100,
    "predicted_ms": 14705.362,
    "predicted_per_token_ms": 147.05362,
    "predicted_per_second": 6.8002406197140886
  }
}

The regular generate function extracts just the text content from choices[0].text, while the generate_json function returns the entire JSON response.

Authentication

The OpenAI-compatible endpoints can work with both authenticated and unauthenticated services:

Commercial Services (like OpenAI): Typically require an API key
Self-hosted Models: Often don’t require authentication, especially local deployments

If your service doesn’t require authentication, you can skip setting the API key, and the system will automatically make unauthenticated requests.

Configuration Functions

Server Configuration

openai_compatible_getserver() - Gets the current server hostname/IP
openai_compatible_setserverforjob() - Sets the server for the current job
openai_compatible_setserverforme() - Sets the server persistently for your user profile

Port Configuration

openai_compatible_getport() - Gets the current port
openai_compatible_setportforjob() - Sets the port for the current job
openai_compatible_setportforme() - Sets the port persistently for your user profile

Protocol Configuration

openai_compatible_getprotocol() - Gets the current protocol (http/https)
openai_compatible_setprotocolforjob() - Sets the protocol for the current job
openai_compatible_setprotocolforme() - Sets the protocol persistently for your user profile

Model Configuration

openai_compatible_getmodel() - Gets the current model
openai_compatible_setmodelforjob() - Sets the model for the current job
openai_compatible_setmodelforme() - Sets the model persistently for your user profile

API Key Configuration (Optional)

openai_compatible_getapikey() - Gets the current API key
openai_compatible_setapikeyforjob() - Sets the API key for the current job
openai_compatible_setapikeyforme() - Sets the API key persistently for your user profile

Base Path Configuration (Optional)

openai_compatible_getbasepath() - Gets the current base path
openai_compatible_setbasepathforjob() - Sets the base path for the current job
openai_compatible_setbasepathforme() - Sets the base path persistently for your user profile

OpenAI-compatible endpoints main functionality

Function: `openai_compatible_generate`

Description: Uses OpenAI-compatible API to generate a reply to the given prompt

Input parameters:

PROMPT (required): The input prompt for the LLM.
options (optional): JSON object containing optional parameters:
- model_id: The model identifier to use for generation.
- max_tokens: Maximum number of tokens to generate. Default 16.
- temperature: Sampling temperature between 0 and 2. Default 1.
- top_p: Nucleus sampling probability mass. Default 1.
- n: Number of completions to generate. Default 1.
- stream: Whether to stream back partial progress. Default false.
- logprobs: Include log probabilities on most likely tokens, max 5.
- echo: Echo back the prompt in the completion. Default false.
- stop: Up to 4 sequences where generation should stop.
- presence_penalty: Penalty between -2.0 and 2.0 for tokens based on presence. Default 0.
- frequency_penalty: Penalty between -2.0 and 2.0 for tokens based on frequency. Default 0.
- best_of: Generate best_of completions server-side. Default 1.
- logit_bias: JSON object mapping tokens to bias values.
- user: Unique identifier for the end user.
- seed: Seed for deterministic sampling.
- suffix: Suffix after completion insertion (for compatible models only).
apikey (optional): API key for authentication. If not provided, uses configured key.
base_url (optional): Base URL for the API endpoint. If not provided, uses configured endpoint settings.

Return type:

clob(2G) ccsid 1208

Return value:

The generated reply.

Unsupported options:

set best_of = coalesce(json_value(options, ‘$.best_of’), 1);

Function: `openai_compatible_generate_json`

Description: Uses OpenAI-compatible API to process a prompt and return structured JSON

Input parameters:

PROMPT (required): The input prompt for the LLM.
options (optional): JSON object containing optional parameters:
- model_id: The model identifier to use for generation.
- max_tokens: Maximum number of tokens to generate. Default 16.
- temperature: Sampling temperature between 0 and 2. Default 1.
- top_p: Nucleus sampling probability mass. Default 1.
- n: Number of completions to generate. Default 1.
- stream: Whether to stream back partial progress. Default false.
- logprobs: Include log probabilities on most likely tokens, max 5.
- echo: Echo back the prompt in the completion. Default false.
- stop: Up to 4 sequences where generation should stop.
- presence_penalty: Penalty between -2.0 and 2.0 for tokens based on presence. Default 0.
- frequency_penalty: Penalty between -2.0 and 2.0 for tokens based on frequency. Default 0.
- best_of: Generate best_of completions server-side. Default 1.
- logit_bias: JSON object mapping tokens to bias values.
- user: Unique identifier for the end user.
- seed: Seed for deterministic sampling.
- suffix: Suffix after completion insertion (for compatible models only).
api_key_ (optional): API key for authentication. If not provided, uses configured key.
base_url (optional): Base URL for the API endpoint. If not provided, uses configured endpoint settings.

Return type:

clob(2G) ccsid 1208

Return value:

The complete JSON response from the API.

Unsupported options:

set best_of = coalesce(json_value(options, ‘$.best_of’), 1);

OpenAI-compatible endpoints utility functions

procedure: `openai_compatible_setserverforjob`

Description: sets the OpenAI-compatible server to be used for this job Input parameters:

HOSTNAME (required): The IP address or hostname of the OpenAI-compatible server.

procedure: `openai_compatible_setserverforme`

Description: sets the OpenAI-compatible server to be used for this user profile (persists across jobs) Input parameters:

HOSTNAME (required): The IP address or hostname of the OpenAI-compatible server.

procedure: `openai_compatible_setportforjob`

Description: sets the OpenAI-compatible server port to be used for this job Input parameters:

PORT (required): The OpenAI-compatible server port.

procedure: `openai_compatible_setportforme`

Description: sets the OpenAI-compatible server port to be used for this user profile (persists across jobs). Input parameters:

PORT (required): The OpenAI-compatible server port.

procedure: `openai_compatible_setmodelforjob`

Description: sets the large language model (LLM) to be used for this job Input parameters:

MODEL (required): The model identifier to use.

procedure: `openai_compatible_setmodelforme`

Description: sets the large language model (LLM) to be used for this user profile (persists across jobs) Input parameters:

MODEL (required): The model identifier to use.

procedure: `openai_compatible_setprotocolforjob`

Description: sets the protocol to be used for this job Input parameters:

PROTOCOL (required): http/https

procedure: `openai_compatible_setprotocolforme`

Description: sets the protocol to be used for this user profile (persists across jobs)

Input parameters:

PROTOCOL (required): http/https Function for API key management

procedure: `openai_compatible_setapikeyforjob`

Description: sets the API key to be used for this job Input parameters:

API_KEY (required): The API key for authentication.

procedure: `openai_compatible_setapikeyforme`

Description: sets the API key to be used for this user profile (persists across jobs)

Input parameters:

API_KEY (required): The API key for authentication. Function for base path configuration (for servers with non-standard API paths)

procedure: `openai_compatible_setbasepathforjob`

Description: sets the API base path to be used for this job Input parameters:

BASE_PATH (required): The base path for API endpoints.

procedure: `openai_compatible_setbasepathforme`

Description: sets the API base path to be used for this user profile (persists across jobs)

Input parameters:

BASE_PATH (required): The base path for API endpoints.

JSON Object Update Function

Description: Merges two JSON objects into one

Input parameters:

base_object (required): The base JSON object
update_object (required): The object to merge into the base object

Return type:

clob(64K) ccsid 1208

Return value:

A merged JSON object

Usage

Setup

Basic Usage

Text Generation

Set max_tokens:

Specifying a Model

Optional Parameters

Supported Parameters

JSON Generation

API Response Format

Authentication

Configuration Functions

Server Configuration

Port Configuration

Protocol Configuration

Model Configuration

API Key Configuration (Optional)

Base Path Configuration (Optional)

OpenAI-compatible endpoints main functionality

Function: openai_compatible_generate

Function: openai_compatible_generate_json

OpenAI-compatible endpoints utility functions

procedure: openai_compatible_setserverforjob

procedure: openai_compatible_setserverforme

procedure: openai_compatible_setportforjob

procedure: openai_compatible_setportforme

procedure: openai_compatible_setmodelforjob

procedure: openai_compatible_setmodelforme

procedure: openai_compatible_setprotocolforjob

procedure: openai_compatible_setprotocolforme

procedure: openai_compatible_setapikeyforjob

procedure: openai_compatible_setapikeyforme

procedure: openai_compatible_setbasepathforjob

procedure: openai_compatible_setbasepathforme

JSON Object Update Function

Set `max_tokens`:

Function: `openai_compatible_generate`

Function: `openai_compatible_generate_json`

procedure: `openai_compatible_setserverforjob`

procedure: `openai_compatible_setserverforme`

procedure: `openai_compatible_setportforjob`

procedure: `openai_compatible_setportforme`

procedure: `openai_compatible_setmodelforjob`

procedure: `openai_compatible_setmodelforme`

procedure: `openai_compatible_setprotocolforjob`

procedure: `openai_compatible_setprotocolforme`

procedure: `openai_compatible_setapikeyforjob`

procedure: `openai_compatible_setapikeyforme`

procedure: `openai_compatible_setbasepathforjob`

procedure: `openai_compatible_setbasepathforme`