The version date for the API of the form YYYY-MM-DD.
Static
Create a new watsonx.ai deployment.
Create a new deployment; currently the only supported type is online.
If this is a deployment for a prompt tune then the asset object must exist and the id must be the id of the model that was created after the prompt training.
If this is a deployment for a prompt template then the prompt_template object should exist and the id must be the id of the prompt template to be deployed.
The parameters to send to the service.
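To make the two deployment variants above concrete, here is a sketch of the request parameters. The field names are illustrative assumptions derived from the prose (online, asset, prompt_template); check the SDK's actual CreateDeploymentParams type for the authoritative shape.

```typescript
// Sketch of parameters for creating an online deployment of a prompt
// template. Field names are assumptions based on the description above,
// not the verified SDK type.
interface DeploymentCreateSketch {
  name: string;
  spaceId?: string;
  projectId?: string;
  online: Record<string, unknown>;   // "online" is the only supported type
  promptTemplate?: { id: string };   // id of the prompt template to deploy
  asset?: { id: string };            // id of the model produced by a prompt tune
}

const params: DeploymentCreateSketch = {
  name: 'my-prompt-template-deployment',
  spaceId: '<your-space-id>',
  online: {},
  promptTemplate: { id: '<prompt-template-id>' },
};
```

For a prompt-tune deployment, the asset object would be supplied instead of promptTemplate, pointing at the model created after training.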
Delete the deployment.
Delete the deployment with the specified identifier.
The parameters to send to the service.
Infer text.
Infer the next tokens for a given deployed model with a set of parameters. If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.
Note that there is currently a limitation in this operation when using return_options: for input, only input_text will be returned if requested; for output, the input_tokens and generated_tokens will not be returned.
The parameters to send to the service.
Optional
callbacks: WatsonXAI.RequestCallbacks<WatsonXAI.Response<WatsonXAI.TextChatResponse>>
The parameters to send to the service.
Infer text event stream.
Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events. If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.
Note that there is currently a limitation in this operation when using return_options: for input, only input_text will be returned if requested; for output, the input_tokens and generated_tokens will not be returned; the rank and top_tokens will also not be returned.
The parameters to send to the service.
Optional
callbacks: WatsonXAI.RequestCallbacks<WatsonXAI.Response<WatsonXAI.TextChatResponse>>
The parameters to send to the service.
return - Promise resolving to a Stream object. The Stream object is an AsyncIterable-based class and contains an additional property holding an AbortController; read more below.
Optional
callbacks: WatsonXAI.RequestCallbacks<WatsonXAI.Response<WatsonXAI.TextChatResponse>>
Infer text chat.
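The returned Stream can be consumed with for await, and its AbortController can stop generation early. The sketch below mocks the Stream shape described above (an AsyncIterable carrying an AbortController); the exact property name on the real SDK stream object is an assumption to verify.

```typescript
// Mock stand-in for the SDK's Stream object: an AsyncIterable of lines
// plus an attached AbortController, mirroring the description above.
// The "controller" property name is an assumption, not the verified SDK API.
function makeMockStream(lines: string[]) {
  const controller = new AbortController();
  async function* iterate() {
    for (const line of lines) {
      if (controller.signal.aborted) return; // stop yielding once aborted
      yield line;
    }
  }
  return Object.assign(iterate(), { controller });
}

async function readAll(): Promise<string[]> {
  const stream = makeMockStream(['Hello', ' world', '!']);
  const received: string[] = [];
  for await (const line of stream) {
    received.push(line);
    // stream.controller.abort(); // uncomment to stop the stream early
  }
  return received;
}
```

Calling controller.abort() mid-iteration ends the loop after the current chunk, which is the pattern the AbortController property is intended for.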
Infer the next chat message for a given deployment. The deployment must reference a prompt template which has input_mode set to chat. The model for the chat request will be from the deployment base_model_id. Parameters for the chat request will be from the prompt template model_parameters. Related guides: Deployment, Prompt template, Text chat.
If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.
The parameters to send to the service.
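A minimal sketch of what a deployment chat request might contain: only the deployment reference and the conversation messages, since (as described above) the model and parameters come from the deployment's prompt template. The field names here are assumptions, not the verified SDK type.

```typescript
// Hypothetical deployment chat request. "idOrName" and the message shape
// are assumptions to verify against the SDK's DeploymentsTextChatParams type.
const chatParams = {
  idOrName: '<deployment-id-or-serving-name>',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Summarize this document in one sentence.' },
  ],
};
```

Note that no model id or generation parameters appear here; supplying them would be redundant because the deployment's prompt template already fixes them.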
Optional
callbacks: WatsonXAI.RequestCallbacks<WatsonXAI.Response<WatsonXAI.TextChatResponse>>
Infer text chat event stream.
Infer the next chat message for a given deployment. This operation will return the output tokens as a stream of events. The deployment must reference a prompt template which has input_mode set to chat. The model for the chat request will be from the deployment base_model_id. Parameters for the chat request will be from the prompt template model_parameters. Related guides: Deployment, Prompt template, Text chat.
If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.
The parameters to send to the service.
Optional
callbacks: WatsonXAI.RequestCallbacks<WatsonXAI.Response<WatsonXAI.TextChatResponse>>
return - Promise resolving to a Stream object. The Stream object is an AsyncIterable-based class and contains an additional property holding an AbortController; read more below.
Optional
callbacks: WatsonXAI.RequestCallbacks<WatsonXAI.Response<WatsonXAI.TextChatResponse>>
Retrieve the deployment details.
Retrieve the deployment details with the specified identifier.
The parameters to send to the service.
Retrieve the deployments.
Retrieve the list of deployments for the specified space or project.
Optional
params: WatsonXAI.ListDeploymentsParams
The parameters to send to the service.
Update the deployment metadata.
Update the deployment metadata. The following parameters of deployment metadata are supported for the patch operation:
/name
/description
/tags
/custom
/online/parameters
/asset (replace only)
/prompt_template (replace only)
/hardware_spec
/hardware_request
/base_model_id (replace only; applicable only to prompt template deployments referring to IBM base foundation models)
The PATCH operation with path specified as /online/parameters can be used to update the serving_name.
The parameters to send to the service.
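The paths listed above follow the JSON Patch convention (op/path/value operations). A sketch of a patch body that renames a deployment, adds tags, and updates the serving_name via /online/parameters; how the SDK wraps this array into its request parameters is an assumption to verify.

```typescript
// Illustrative JSON Patch body for the update operation described above.
// Each entry targets one of the supported paths; /asset, /prompt_template
// and /base_model_id would only accept op: 'replace'.
const jsonPatch = [
  { op: 'replace', path: '/name', value: 'renamed-deployment' },
  { op: 'add', path: '/tags', value: ['prod'] },
  { op: 'replace', path: '/online/parameters', value: { serving_name: 'my_serving_name' } },
];
```
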
Generate embeddings.
Generate embeddings from text input.
See the documentation for a description of text embeddings.
The parameters to send to the service.
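Once embedding vectors come back from the service, a common next step is comparing two texts by the cosine similarity of their vectors. This helper is independent of the service and assumes nothing about the response shape beyond vectors being number arrays.

```typescript
// Cosine similarity between two embedding vectors: dot product divided by
// the product of the vector norms. Returns 1 for identical directions,
// 0 for orthogonal vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```
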
Optional
callbacks: WatsonXAI.RequestCallbacks<WatsonXAI.Response<WatsonXAI.TextChatResponse>>
The parameters to send to the service.
List the available foundation models.
Retrieve the list of deployed foundation models.
Optional
params: WatsonXAI.ListFoundationModelSpecsParams
The parameters to send to the service.
List the supported tasks.
Retrieve the list of tasks that are supported by the foundation models.
Optional
params: WatsonXAI.ListFoundationModelTasksParams
The parameters to send to the service.
Create a new prompt session.
This creates a new prompt session.
The parameters to send to the service.
Add a new prompt to a prompt session.
This creates a new prompt associated with the given session.
The parameters to send to the service.
Add a new chat item to a prompt session entry.
This adds new chat items to the given entry.
The parameters to send to the service.
Delete a prompt session.
This deletes a prompt session with the given id.
The parameters to send to the service.
Delete a prompt session entry.
This deletes a prompt session entry with the given id.
The parameters to send to the service.
Get a prompt session.
This retrieves a prompt session with the given id.
The parameters to send to the service.
Get a prompt session entry.
This retrieves a prompt session entry with the given id.
The parameters to send to the service.
Get current prompt session lock status.
Retrieves the current locked state of a prompt session.
The parameters to send to the service.
Get entries for a prompt session.
List entries from a given session.
The parameters to send to the service.
Update a prompt session.
This updates a prompt session with the given id.
The parameters to send to the service.
Prompt session lock modifications.
Modifies the current locked state of a prompt session.
The parameters to send to the service.
Create a new prompt / prompt template.
This creates a new prompt with the provided parameters.
The parameters to send to the service.
Add a new chat item to a prompt.
This adds new chat items to the given prompt.
The parameters to send to the service.
Delete a prompt.
This deletes a prompt / prompt template with the given id.
The parameters to send to the service.
Get a prompt.
This retrieves a prompt / prompt template with the given id.
The parameters to send to the service.
Get the inference input string for a given prompt.
Computes the inference input string based on state of a prompt. Optionally replaces template params.
The parameters to send to the service.
Get current prompt lock status.
Retrieves the current locked state of a prompt.
The parameters to send to the service.
Update a prompt.
This updates a prompt / prompt template with the given id.
The parameters to send to the service.
Prompt lock modifications.
Modifies the current locked state of a prompt.
The parameters to send to the service.
Infer text.
Infer the next tokens for a given deployed model with a set of parameters.
The parameters to send to the service.
Optional
callbacks: WatsonXAI.RequestCallbacks<WatsonXAI.Response<WatsonXAI.TextChatResponse>>
The parameters to send to the service.
Infer text event stream.
Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.
Stream<string | WatsonxAiMlVml_v1.ObjectStreamed<WatsonxAiMlVml_v1.TextGenResponse>> represents a source of streaming data. If the request completes successfully, the output streams either line by line as strings, or as a stream of objects of the form { id: 2, event: 'message', data: {data} }. Here is one way to read the streaming output:
const stream = await watsonxAiMlService.generateTextStream(parameters);
for await (const line of stream) { console.log(line); }
The parameters to send to the service.
Optional
callbacks: WatsonXAI.RequestCallbacks<WatsonXAI.Response<WatsonXAI.TextChatResponse>>
The parameters to send to the service.
return - Promise resolving to a Stream object. The Stream object is an AsyncIterable-based class and contains an additional property holding an AbortController; read more below.
Optional
callbacks: WatsonXAI.RequestCallbacks<WatsonXAI.Response<WatsonXAI.TextChatResponse>>
Infer text.
Infer the next tokens for a given deployed model with a set of parameters.
The parameters to send to the service.
Optional
callbacks: WatsonXAI.RequestCallbacks<WatsonXAI.Response<WatsonXAI.TextChatResponse>>
The parameters to send to the service.
Infer text event stream.
Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.
Stream<string | WatsonxAiMlVml_v1.ObjectStreamed<WatsonxAiMlVml_v1.TextGenResponse>> represents a source of streaming data. If the request completes successfully, the output streams either line by line as strings, or as a stream of objects of the form { id, event: 'message', data: {data} }.
Here is one way to read the streaming output:
const stream = await watsonxAiMlService.generateTextStream(parameters);
for await (const line of stream) { console.log(line); }
The parameters to send to the service.
Optional
callbacks: WatsonXAI.RequestCallbacks<WatsonXAI.Response<WatsonXAI.TextChatResponse>>
The parameters to send to the service.
return - Promise resolving to a Stream object. The Stream object is an AsyncIterable-based class and contains an additional property holding an AbortController; read more below.
Optional
callbacks: WatsonXAI.RequestCallbacks<WatsonXAI.Response<WatsonXAI.TextChatResponse>>
Generate rerank.
Rerank texts based on some queries.
The parameters to send to the service.
Optional
callbacks: WatsonXAI.RequestCallbacks<WatsonXAI.Response<WatsonXAI.TextChatResponse>>
The parameters to send to the service.
Text tokenization.
The text tokenize operation allows you to check the conversion of provided input to tokens for a given model. It splits text into words or sub-words, which then are converted to ids through a look-up table (vocabulary). Tokenization allows the model to have a reasonable vocabulary size.
The parameters to send to the service.
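The look-up described above can be illustrated with a toy vocabulary. Real tokenizers split text into sub-word units rather than whole words, but the id-lookup principle is the same; this sketch is purely illustrative and uses no service API.

```typescript
// Toy word-level tokenizer: each known word maps to an id through a
// look-up table; unknown words fall back to the <unk> id. Illustrates
// the text-to-ids conversion the tokenize operation exposes.
const vocabulary: Record<string, number> = { '<unk>': 0, 'hello': 1, 'world': 2 };

function tokenize(text: string): number[] {
  return text
    .toLowerCase()
    .split(/\s+/)
    .map((word) => vocabulary[word] ?? vocabulary['<unk>']);
}
```
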
Create a new watsonx.ai training.
Create a new watsonx.ai training in a project or a space.
The details of the base model and parameters for the training must be provided in the prompt_tuning object.
In order to deploy the tuned model, follow these steps:
Create a WML model asset, in a space or a project, by providing the request.json as shown below:
curl -X POST "https://{cpd_cluster}/ml/v4/models?version=2024-01-29" \
-H "Authorization: Bearer <replace with your token>" \
-H "content-type: application/json" \
--data '{
"name": "replace_with_a_meaningful_name",
"space_id": "replace_with_your_space_id",
"type": "prompt_tune_1.0",
"software_spec": {
"name": "watsonx-textgen-fm-1.0"
},
"metrics": [ from the training job ],
"training": {
"id": "05859469-b25b-420e-aefe-4a5cb6b595eb",
"base_model": {
"model_id": "google/flan-t5-xl"
},
"task_id": "generation",
"verbalizer": "Input: {{input}} Output:"
},
"training_data_references": [
{
"connection": {
"id": "20933468-7e8a-4706-bc90-f0a09332b263"
},
"id": "file_to_tune1.json",
"location": {
"bucket": "wxproject-donotdelete-pr-xeyivy0rx3vrbl",
"path": "file_to_tune1.json"
},
"type": "connection_asset"
}
]
}'
Notes:
If auto_update_model: true was set in the training request then you can skip this step, as the model will have been saved at the end of the training job.
Rather than building the payload by hand, you can reuse the request.json that was stored in the results_reference field; look for the path in the field entity.results_reference.location.model_request_path.
The type must be prompt_tune_1.0.
The software_spec name must be watsonx-textgen-fm-1.0.
Create a tuned model deployment as described in the create deployment documentation.
The parameters to send to the service.
Cancel or delete the training.
Cancel the specified training and remove it.
The parameters to send to the service.
Retrieve the training.
Retrieve the training with the specified identifier.
The parameters to send to the service.
Retrieve the list of trainings.
Retrieve the list of trainings for the specified space or project.
Optional
params: WatsonXAI.TrainingsListParams
The parameters to send to the service.
Static
Cancel the document extraction.
Cancel the specified document extraction and remove it.
The parameters to send to the service.
Cancel the synthetic data generation.
Cancel the synthetic data generation and remove it.
The parameters to send to the service.
Create a document extraction.
Create a document extraction.
The parameters to send to the service.
Create a fine tuning job.
Create a fine tuning job that will fine tune an LLM.
The parameters to send to the service.
Create a synthetic data generation job.
Create a synthetic data generation job.
The parameters to send to the service.
Create a taxonomy job.
Create a taxonomy job.
The parameters to send to the service.
Cancel or delete a fine tuning job.
Delete a fine tuning job if it exists, once deleted all trace of the job is gone.
The parameters to send to the service.
Cancel or delete the taxonomy job.
Cancel or delete the taxonomy job.
The parameters to send to the service.
Get document extraction.
Get document extraction.
The parameters to send to the service.
Get a fine tuning job.
Get the results of a fine tuning job, or details if the job failed.
The parameters to send to the service.
Get synthetic data generation job.
The parameters to send to the service.
Get taxonomy job.
The parameters to send to the service.
Get document extractions.
Get document extractions.
Optional
params: ListDocumentExtractionsParams
The parameters to send to the service.
Retrieve the list of fine tuning jobs.
Retrieve the list of fine tuning jobs for the specified space or project.
Optional
params: FineTuningListParams
The parameters to send to the service.
Get synthetic data generation jobs.
Optional
params: ListSyntheticDataGenerationsParams
The parameters to send to the service.
Get taxonomy jobs.
Optional
params: ListTaxonomiesParams
The parameters to send to the service.
Time series forecast.
Generate forecasts, or predictions for future time points, given historical time series data.
The parameters to send to the service.
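A sketch of what a forecast request could look like: historical observations plus schema information telling the service which column holds timestamps and which hold target values. All field names here are assumptions to verify against the SDK's forecast parameter type.

```typescript
// Hypothetical time series forecast request. Column-oriented historical
// data with a schema identifying the timestamp and target columns;
// every field name is an assumption, not the verified SDK shape.
const forecastParams = {
  modelId: '<time-series-model-id>',
  data: {
    date: ['2024-01-01', '2024-01-02', '2024-01-03'],
    target: [10.0, 12.5, 11.8],
  },
  schema: { timestampColumn: 'date', targetColumns: ['target'] },
  // parameters: { predictionLength: 7 }, // hypothetical: future points to predict
};
```
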
Static
Constructs a service URL by formatting the parameterized service URL.
The parameterized service URL is: 'https://{region}.ml.cloud.ibm.com'
The default variable values are:
The formatted URL with all variable placeholders replaced by values.
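"Formatting the parameterized service URL" amounts to substituting each {variable} placeholder with a supplied or default value. A minimal sketch of that substitution; the real SDK helper may handle missing values differently (here they are simply left as placeholders).

```typescript
// Replace each {name} placeholder in a URL template with the matching
// value; placeholders without a value are left untouched in this sketch.
function formatServiceUrl(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (match, name) => values[name] ?? match);
}

const serviceUrl = formatServiceUrl('https://{region}.ml.cloud.ibm.com', { region: 'us-south' });
```
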
Construct a WatsonxAiMlVml_v1 object.