# Batch Service

The `BatchService` provides functionality to submit and manage batch jobs using the IBM watsonx.ai Batches API. A batch job processes multiple requests from a JSONL input file.

## Relationship with FileService

`BatchService` and `FileService` work in tandem. Input files must be uploaded to IBM Cloud Object Storage before a batch job can reference them, and output files must be retrieved from COS once the job completes. `BatchService` handles both steps automatically when a `FileService` instance is provided:

- **File upload** — when submitting via `Path`, `File`, or `InputStream`, `BatchService` calls `FileService.upload()` internally and assigns the resulting `file_id` to the request.
- **Output retrieval** — when using `submitAndFetch()`, `BatchService` calls `FileService.retrieve()` internally once the job completes and deserializes each output line into the requested type.

If you already have a `file_id` from a previous upload, you can submit directly without these automatic steps using `BatchCreateRequest.inputFileId()`.

## Quick Start
```java
FileService fileService = FileService.builder()
    .apiKey(WATSONX_API_KEY)
    .projectId(WATSONX_PROJECT_ID)
    .baseUrl(CloudRegion.DALLAS)
    .build();

BatchService batchService = BatchService.builder()
    .apiKey(WATSONX_API_KEY)
    .projectId(WATSONX_PROJECT_ID)
    .baseUrl(CloudRegion.DALLAS)
    .endpoint("/v1/chat/completions")
    .fileService(fileService)
    .build();

List<BatchResult<ChatResponse>> results = batchService.submitAndFetch(
    Path.of("requests.jsonl"),
    ChatResponse.class
);

results.forEach(r ->
    System.out.println(r.customId() + ": " + r.response().body().toAssistantMessage().content())
);
```
## Overview

The `BatchService` enables you to:

- Submit batch jobs from a `Path`, `File`, `InputStream`, or a pre-uploaded `file_id`.
- Wait for job completion and retrieve deserialized results with `submitAndFetch()`.
- Automatically clean up input and output files after `submitAndFetch()` completes.
- List, retrieve, and cancel batch jobs.

## Service Configuration

### Basic Setup
```java
BatchService batchService = BatchService.builder()
    .apiKey(WATSONX_API_KEY)
    .projectId(WATSONX_PROJECT_ID)
    .baseUrl("https://us-south.ml.cloud.ibm.com") // or use CloudRegion
    .endpoint("/v1/chat/completions")
    .fileService(fileService)
    .build();
```
### Builder Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| `apiKey` | `String` | Conditional | API key for IBM Cloud authentication |
| `authenticator` | `Authenticator` | Conditional | Custom authentication (alternative to `apiKey`) |
| `projectId` | `String` | Conditional | Project ID where batch jobs will run |
| `spaceId` | `String` | Conditional | Space ID (alternative to `projectId`) |
| `baseUrl` | `String`/`CloudRegion` | Yes | watsonx.ai service base URL |
| `endpoint` | `String` | Yes | Default API endpoint for batch inference (e.g., `/v1/chat/completions`) |
| `fileService` | `FileService` | Conditional | Required when submitting via file upload or using `submitAndFetch()` |
| `timeout` | `Duration` | No | Request and polling timeout (default: 60 seconds) |
| `logRequests` | `Boolean` | No | Enable request logging (default: `false`) |
| `logResponses` | `Boolean` | No | Enable response logging (default: `false`) |
| `httpClient` | `HttpClient` | No | Custom HTTP client |
| `verifySsl` | `Boolean` | No | SSL certificate verification (default: `true`) |
| `version` | `String` | No | API version override |
Either `apiKey` or `authenticator` must be provided. Either `projectId` or `spaceId` must be specified.
## Input File Format

Input files must be in JSONL format — one JSON object per line. Each line represents a single inference request and must include a `custom_id` to correlate results with inputs.
{"custom_id": "a", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "ibm/granite-4-h-small", "messages": [{"role": "user", "content": [{"type": "text", "text": "Capital of Italy"}]}], "max_completion_tokens": 0, "time_limit": 30000, "temperature": 0}}
{"custom_id": "b", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "ibm/granite-4-h-small", "messages": [{"role": "user", "content": [{"type": "text", "text": "Capital of France"}]}], "max_completion_tokens": 0, "time_limit": 30000, "temperature": 0}}
{"custom_id": "c", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "ibm/granite-4-h-small", "messages": [{"role": "user", "content": [{"type": "text", "text": "Capital of Germany"}]}], "max_completion_tokens": 0, "time_limit": 30000, "temperature": 0}}
## Examples

### Submit and Fetch Results

`submitAndFetch()` uploads the file, submits the job, polls until completion, and returns the deserialized results. It requires a `FileService` to be configured on the builder.

From a `Path`:
```java
var results = batchService.submitAndFetch(Path.of("requests.jsonl"), ChatResponse.class);

results.forEach(r ->
    System.out.println(r.customId() + ": " + r.response().body().toAssistantMessage().content())
);
// → a: Rome
// → b: Paris
// → c: Berlin
```
From an `InputStream`:
```java
try (InputStream is = new FileInputStream("requests.jsonl")) {
    List<BatchResult<ChatResponse>> results = batchService.submitAndFetch(is, "requests.jsonl", ChatResponse.class);
}
```
From a pre-uploaded `file_id` — when you have already uploaded the file via `FileService`:
```java
FileData fileData = fileService.upload(Path.of("requests.jsonl"));

List<BatchResult<ChatResponse>> results = batchService.submitAndFetch(
    BatchCreateRequest.builder()
        .inputFileId(fileData.id())
        .build(),
    ChatResponse.class
);
```
### Batch Chat Requests

`submitChatRequestsAndFetch()` is a higher-level convenience method that accepts a list of `ChatRequest` objects directly, builds the JSONL input internally, and returns results in the same order as the input list.
```java
var parameters = ChatParameters.builder()
    .modelId("ibm/granite-4-h-small")
    .temperature(0.0)
    .maxCompletionTokens(0)
    .build();

List<ChatRequest> requests = Stream.of("Italy", "France", "Germany")
    .map(country -> ChatRequest.builder()
        .parameters(parameters)
        .messages(List.of(
            SystemMessage.of("You are a helpful assistant."),
            UserMessage.text("What is the capital of " + country + "? Answer with only the city name.")
        ))
        .build())
    .toList();

List<BatchResult<ChatResponse>> results = batchService.submitChatRequestsAndFetch(requests);

System.out.println(results.get(0).response().body().toAssistantMessage().content()); // → Rome
System.out.println(results.get(1).response().body().toAssistantMessage().content()); // → Paris
System.out.println(results.get(2).response().body().toAssistantMessage().content()); // → Berlin
```
Unlike `submitAndFetch()`, the results returned by `submitChatRequestsAndFetch()` are guaranteed to be in the same order as the input requests.
### Submit Without Waiting

Use `submit()` to start the job and return immediately with the job metadata, without blocking for completion:
```java
BatchData batchData = batchService.submit(Path.of("requests.jsonl"));

System.out.println("Job ID: " + batchData.id());
System.out.println("Status: " + batchData.status());
// → Job ID: batch-AQIDkP4L...
// → Status: validating
```
Then poll manually and retrieve the output once complete:
```java
while (true) {
    batchData = batchService.retrieve(batchData.id());
    if (batchData.status().equals(Status.COMPLETED.value())) {
        String output = fileService.retrieve(batchData.outputFileId());
        System.out.println(output);
        break;
    } else if (batchData.status().equals(Status.FAILED.value())) {
        System.err.println("Batch failed: " + batchData.errors());
        break;
    }
    Thread.sleep(2000);
}
```
### Customizing a Request

Use `BatchCreateRequest` to override the endpoint, set a custom completion window, or attach metadata:
```java
BatchData batchData = batchService.submit(
    BatchCreateRequest.builder()
        .inputFileId(fileData.id())
        .endpoint("/v1/chat/completions") // overrides the service-level default
        .completionWindow("1h")           // defaults to "24h" if not set
        .metadata(Map.of("job", "nightly-run"))
        .projectId("override-project-id")
        .transactionId("my-transaction-id")
        .build()
);
```
### Listing Batch Jobs

```java
// All jobs (default limit: 20)
BatchListResponse response = batchService.list();

// With a custom limit
response = batchService.list(
    BatchListRequest.builder()
        .limit(10)
        .build()
);

response.data().forEach(b -> System.out.println(b.id() + " – " + b.status()));
System.out.println("Has more: " + response.hasMore());
```
### Cancelling a Batch Job

Cancelling transitions the job to `cancelling` and eventually to `cancelled`. Partial results, if available, are preserved in the output file.
```java
BatchData batchData = batchService.cancel("batch-abc123");
System.out.println(batchData.status()); // → cancelling
```
## BatchData

Returned by `submit()`, `retrieve()`, `cancel()`, and contained in `BatchListResponse.data()`.
| Field | Type | Description |
|---|---|---|
| `id()` | `String` | Unique batch job identifier |
| `object()` | `String` | Always `"batch"` |
| `endpoint()` | `String` | API endpoint used for inference (e.g., `/v1/chat/completions`) |
| `inputFileId()` | `String` | Identifier of the uploaded input file |
| `completionWindow()` | `String` | Time window for completion (e.g., `"24h"`) |
| `status()` | `String` | Current job status — see Status |
| `outputFileId()` | `String` | Identifier of the output file; available once completed |
| `errorFileId()` | `String` | Identifier of the error file, if any requests failed |
| `errors()` | `FileErrors` | Validation or processing errors, if any |
| `requestCounts()` | `RequestCounts` | Summary of total, completed, and failed request counts |
| `metadata()` | `Map<String, String>` | User-defined key-value metadata |
| `createdAt()` | `Long` | Unix timestamp when the job was created |
| `inProgressAt()` | `Long` | Unix timestamp when processing started |
| `finalizingAt()` | `Long` | Unix timestamp when the job entered the finalizing phase |
| `completedAt()` | `Long` | Unix timestamp when the job completed successfully |
| `failedAt()` | `Long` | Unix timestamp when the job failed |
| `expiresAt()` | `Long` | Unix timestamp when the job will expire |
| `expiredAt()` | `Long` | Unix timestamp when the job expired |
| `cancellingAt()` | `Long` | Unix timestamp when cancellation was requested |
| `cancelledAt()` | `Long` | Unix timestamp when the job was fully cancelled |
## BatchResult

Returned per-item by `submitAndFetch()`. Each entry corresponds to one line in the input JSONL file.
| Field | Type | Description |
|---|---|---|
| `id()` | `String` | Unique identifier of this result entry |
| `customId()` | `String` | The `custom_id` from the original input line — use this to correlate results with inputs |
| `response()` | `Response<T>` | HTTP response wrapper containing status code, request ID, and deserialized body |
| `processedAt()` | `Long` | Unix timestamp when this request was processed |
`response().statusCode()` contains the HTTP status for this individual request. `response().body()` is deserialized into the class passed to `submitAndFetch()` (e.g., `ChatResponse.class`).
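Since `submitAndFetch()` does not guarantee output order, results are typically matched back to inputs by `custom_id`. The correlation step can be sketched with a hypothetical stand-in record in place of `BatchResult<T>` (only the fields needed here):

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CorrelateResults {

    // Hypothetical stand-in for BatchResult<T>, reduced to the correlation fields.
    record Result(String customId, String body) {}

    // Indexes results by custom_id so each one can be matched to its input line,
    // regardless of the order in which the batch produced them.
    static Map<String, Result> byCustomId(List<Result> results) {
        return results.stream()
                .collect(Collectors.toMap(Result::customId, Function.identity()));
    }

    public static void main(String[] args) {
        // Results arrive out of input order.
        List<Result> results = List.of(new Result("b", "Paris"), new Result("a", "Rome"));
        Map<String, Result> index = byCustomId(results);
        System.out.println(index.get("a").body()); // → Rome
    }
}
```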
## Status

The `Status` enum covers the terminal states used internally for polling. The full set of status values returned by the API:

| Value | Description |
|---|---|
| `validating` | Input file is being validated |
| `in_progress` | Job is actively processing requests |
| `finalizing` | Processing complete, output is being assembled |
| `completed` | Job finished successfully — output file is available |
| `failed` | Job failed — check `errors()` and `errorFileId()` |
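For manual polling loops like the one in Submit Without Waiting, a terminal-state check can be sketched as a small helper. This is a hypothetical utility, not part of the SDK; the `cancelled` and `expired` values are inferred from the cancellation section and the `cancelledAt()`/`expiredAt()` fields on `BatchData`:

```java
import java.util.Set;

public class BatchStatusUtil {

    // Hypothetical helper: statuses after which polling should stop.
    static final Set<String> TERMINAL = Set.of("completed", "failed", "cancelled", "expired");

    static boolean isTerminal(String status) {
        return TERMINAL.contains(status);
    }

    public static void main(String[] args) {
        System.out.println(isTerminal("in_progress")); // → false
        System.out.println(isTerminal("completed"));   // → true
    }
}
```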