Deployment Service

The DeploymentService allows you to interact with models deployed in IBM watsonx.ai deployment spaces. Instead of referencing a modelId, every request targets a deploymentId — the identifier of an already-deployed asset. The service supports the same operations as ChatService and TimeSeriesService (chat, streaming chat, text generation, streaming text generation, time series forecasting), plus the ability to inspect a deployment’s metadata via findById.

What is a Deployment Space?

A deployment space is an IBM watsonx.ai workspace that contains deployable assets, their deployments, and associated environments. Assets (foundation models, prompt-tuned models, prompt templates) are promoted from projects into a deployment space before they can be deployed. A single asset can be deployed to multiple spaces — for example, a test space and a production space.

Once deployed, each deployment is identified by a unique deploymentId. You use this ID in every DeploymentService request instead of a modelId.

Quick Start

DeploymentService deploymentService = DeploymentService.builder()
    .baseUrl(CloudRegion.DALLAS)
    .apiKey(WATSONX_API_KEY)
    .build();

var chatRequest = ChatRequest.builder()
    .deploymentId(DEPLOYMENT_ID)
    .messages(UserMessage.text("Hello!"))
    .build();

var response = deploymentService.chat(chatRequest);
System.out.println(response.toAssistantMessage().content());

Overview

The DeploymentService enables you to:

Send synchronous and streaming chat requests to a deployed model.
Run time series forecasting against a deployed TTM model, with optional futureData for exogenous features.
Retrieve deployment metadata (findById) including status, inference endpoints, asset type, and hardware configuration.

Service Configuration

Basic Setup

DeploymentService deploymentService = DeploymentService.builder()
    .baseUrl(CloudRegion.DALLAS)   // or use a URL string
    .apiKey(WATSONX_API_KEY)
    .build();

No projectId, spaceId, or modelId is required — all routing is done through the deploymentId in each request.

Builder Parameters

Parameter	Type	Required	Description
`apiKey`	String	Conditional	API key for IBM Cloud authentication
`authenticator`	Authenticator	Conditional	Custom authentication (alternative to `apiKey`)
`baseUrl`	String/CloudRegion	Yes	watsonx.ai ML endpoint
`timeout`	Duration	No	Default request timeout (default: 60 seconds)
`logRequests`	Boolean	No	Enable request logging (default: false)
`logResponses`	Boolean	No	Enable response logging (default: false)
`httpClient`	HttpClient	No	Custom HTTP client
`verifySsl`	Boolean	No	SSL certificate verification (default: true)
`defaultParameters`	ChatParameters	No	Fallback chat parameters applied to every chat request
`messageInterceptor`	MessageInterceptor	No	Post-processing hook for the assistant’s text content
`toolInterceptor`	ToolInterceptor	No	Post-processing hook for function call arguments

Either apiKey or authenticator must be provided. projectId, spaceId, and modelId are ignored — if set on a request’s parameters object, a warning is logged.

Chat

Synchronous Chat

var chatRequest = ChatRequest.builder()
    .deploymentId(DEPLOYMENT_ID)
    .messages(
        SystemMessage.of("You are a helpful assistant."),
        UserMessage.text("Hello, how are you?"),
    ).build();

ChatResponse response = deploymentService.chat(chatRequest);

Streaming Chat

var chatRequest = ChatRequest.builder()
    .deploymentId(DEPLOYMENT_ID)
    .messages(UserMessage.text("Tell me a joke."))
    .build();

CompletableFuture<ChatResponse> future = deploymentService.chatStreaming(chatRequest,
    new ChatHandler() {
        @Override
        public void onPartialResponse(String partial, PartialChatResponse partialResponse) {
            System.out.print(partial);
        }

        @Override
        public void onCompleteResponse(ChatResponse response) {
            System.out.println("\n[Done]");
        }

        @Override
        public void onError(Throwable error) {
            error.printStackTrace();
        }
    }
);

future.join(); // wait for completion

Time Series Forecasting

The DeploymentService supports time series forecasting via forecast(), with one key addition over TimeSeriesService: futureData — exogenous features known in advance for the forecast horizon (e.g. holidays, scheduled events).

InputSchema schema = InputSchema.builder()
    .timestampColumn("date")
    .addIdColumn("ID1")
    .build();

ForecastData historicalData = ForecastData.create()
    .addAll("date", "2020-01-01T00:00:00", "2020-01-01T01:00:00", "2020-01-05T01:00:00")
    .addAll("ID1", "D1", "D1", "D1")
    .addAll("TARGET1", 1.46, 2.34, 4.55);

ForecastData futureData = ForecastData.create()
    .add("date", "2021-01-01T00:00:00")
    .add("ID1", "D1")
    .add("TARGET1", 5);

TimeSeriesParameters parameters = TimeSeriesParameters.builder()
    .futureData(futureData)
    .build();

TimeSeriesRequest request = TimeSeriesRequest.builder()
    .deploymentId(DEPLOYMENT_ID)
    .inputSchema(schema)
    .data(historicalData)
    .parameters(parameters)
    .build();

ForecastResponse result = deploymentService.forecast(request);
System.out.println("Output data points: " + result.outputDataPoints());

futureData is only supported by DeploymentService. When using TimeSeriesService directly, it is not available.

Finding a Deployment

Use findById to inspect a deployment’s metadata, status, and inference endpoints:

var request = FindByIdRequest.builder()
    .deploymentId(DEPLOYMENT_ID)
    .spaceId(SPACE_ID)             
    .build();

DeploymentResource resource = deploymentService.findById(request);

FindByIdRequest Parameters

Parameter	Type	Required	Description
`deploymentId`	String	Yes	The unique deployment identifier
`projectId`	String	Conditional	Project ID where the deployment resides
`spaceId`	String	Conditional	Space ID (alternative to `projectId`)
`transactionId`	String	No	Request tracking ID

Either projectId or spaceId must be provided.

DeploymentResource

Field	Type	Description
`metadata().id()`	String	Deployment unique identifier
`metadata().name()`	String	Human-readable name
`metadata().description()`	String	Deployment description
`metadata().createdAt()`	String	Creation timestamp
`metadata().modifiedAt()`	String	Last modification timestamp
`metadata().spaceId()`	String	Space where the deployment lives
`metadata().projectId()`	String	Project where the deployment lives
`metadata().tags()`	List<String>	Tags
`entity().deployedAssetType()`	String	Type of deployed asset (`prompt_tune`, `foundation_model`, `custom_foundation_model`)
`entity().baseModelId()`	String	The underlying foundation model
`entity().status().state()`	String	Deployment state (e.g., `ready`, `failed`)
`entity().status().inference()`	List<Inference>	List of inference endpoints
`entity().status().message()`	Message	Status message with `level()` and `text()`
`entity().status().failure()`	ApiErrorResponse	Error details if state is `failed`
`entity().asset()`	ModelRel	Model asset reference with `id()` and `rev()`
`entity().promptTemplate()`	SimpleRel	Prompt template reference (if applicable)
`entity().hardwareSpec()`	HardwareSpec	Hardware specification (`id`, `name`, `numNodes`)
`entity().online().parameters()`	Map<String, Object>	Online deployment parameters (e.g., `serving_name`)
`entity().custom()`	Map<String, Object>	User-defined metadata