Tuned Model Inference#
This section shows how to deploy a model and use the ModelInference class with the created deployment.
There are two ways to query generate_text: using the deployments module, or using the ModelInference module.
Working with deployments#
This section describes methods that enable the user to work with deployments. First, create a client and set a default project_id or space_id.
from ibm_watson_machine_learning import APIClient
client = APIClient(credentials)
client.set.default_project("7ac03029-8bdd-4d5f-a561-2c4fd1e40705")
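If you work in a deployment space instead of a project, set a default space analogously (the space ID below is a placeholder):
client.set.default_space("your_space_id")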
To create a deployment with specific parameters, call the following lines.
from datetime import datetime
model_id = prompt_tuner.get_model_id()
meta_props = {
    client.deployments.ConfigurationMetaNames.NAME: "PT DEPLOYMENT SDK - project",
    client.deployments.ConfigurationMetaNames.ONLINE: {},
    client.deployments.ConfigurationMetaNames.SERVING_NAME: f"pt_sdk_deployment_{datetime.utcnow().strftime('%Y_%m_%d_%H%M%S')}"
}
deployment_details = client.deployments.create(model_id, meta_props)
To get the deployment_id from the deployment details, use the id from metadata.
deployment_id = deployment_details['metadata']['id']
print(deployment_id)
'7091629c-f88a-4e90-b7f0-4f414aec9c3a'
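You can also retrieve the stored details of an existing deployment later by its ID (a minimal sketch using the deployment created above):
deployment_details = client.deployments.get_details(deployment_id)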
You can directly query generate_text
using the deployments module.
client.deployments.generate_text(
    prompt="Example prompt",
    deployment_id=deployment_id)
Creating ModelInference instance#
To begin, it is recommended to define the generation parameters (later used by the module).
from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams
generate_params = {
    GenParams.MAX_NEW_TOKENS: 25,
    GenParams.STOP_SEQUENCES: ["\n"]
}
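Other generation parameters can be added in the same way; for example, a sketch with greedy decoding (the values below are illustrative, not defaults):
generate_params = {
    GenParams.DECODING_METHOD: "greedy",
    GenParams.MAX_NEW_TOKENS: 25,
    GenParams.STOP_SEQUENCES: ["\n"]
}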
Create the ModelInference object itself, using either credentials with a project_id / space_id, or the previously initialized APIClient (see APIClient initialization).
from ibm_watson_machine_learning.foundation_models import ModelInference
tuned_model = ModelInference(
    deployment_id=deployment_id,
    params=generate_params,
    credentials=credentials,
    project_id=project_id
)

# OR

tuned_model = ModelInference(
    deployment_id=deployment_id,
    params=generate_params,
    api_client=client
)
You can directly query generate_text using the ModelInference object.
tuned_model.generate_text(prompt="Example prompt")
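Parameters can also be overridden for a single call by passing params to generate_text (a sketch, assuming per-call parameters as with generate() shown later):
tuned_model.generate_text(
    prompt="Example prompt",
    params={GenParams.MAX_NEW_TOKENS: 50}
)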
Importing data#
To use ModelInference, example data may be needed.
import os
import pandas as pd
import wget

filename = 'car_rental_prompt_tuning_testing_data.json'
url = "https://raw.github.com/IBM/watson-machine-learning-samples/master/cloud/data/prompt_tuning/car_rental_prompt_tuning_testing_data.json"

if not os.path.isfile(filename):
    wget.download(url)

data = pd.read_json(filename)
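The loaded DataFrame is expected to contain input and output columns, which are used in the next section; a quick preview:
print(data.head())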
Analyzing satisfaction#
Note
The satisfaction analysis was performed for a specific example (car rental); it may not work for other data sets.
To analyze satisfaction, prepare a batch of prompts, calculate the accuracy of the tuned model, and compare it with the base model.
prompts = list(data.input)
satisfaction = list(data.output)
prompts_batch = ["\n".join([prompt]) for prompt in prompts]
Calculate the accuracy of the base model:
from sklearn.metrics import accuracy_score, f1_score
base_model = ModelInference(
    model_id='google/flan-t5-xl',
    params=generate_params,
    api_client=client
)
base_model_results = base_model.generate_text(prompt=prompts_batch)
print(f'base model accuracy_score: {accuracy_score(satisfaction, [int(x) for x in base_model_results])}, base model f1_score: {f1_score(satisfaction, [int(x) for x in base_model_results])}')
'base model accuracy_score: 0.965034965034965, base model f1_score: 0.9765258215962441'
Calculate the accuracy of the tuned model:
tuned_model_results = tuned_model.generate_text(prompt=prompts_batch)
print(f'accuracy_score: {accuracy_score(satisfaction, [int(x) for x in tuned_model_results])}, f1_score: {f1_score(satisfaction, [int(x) for x in tuned_model_results])}')
'accuracy_score: 0.972027972027972, f1_score: 0.9811320754716981'
Generate methods#
A detailed explanation of the available generate methods, with exact parameters, can be found in the ModelInference class.
With the previously created tuned_model object, it is possible to generate a text stream (generator) using the defined inference and the generate_text_stream() method.
# input_prompt is an example prompt string, e.g. one of the prompts loaded earlier
for token in tuned_model.generate_text_stream(prompt=input_prompt):
    print(token, end="")
'$10 Powerchill Leggings'
You can also receive a more detailed result with generate().
details = tuned_model.generate(prompt=input_prompt, params=generate_params)
print(details)
{
    'model_id': 'google/flan-t5-xl',
    'created_at': '2023-11-17T15:32:57.401Z',
    'results': [
        {
            'generated_text': '$10 Powerchill Leggings',
            'generated_token_count': 8,
            'input_token_count': 73,
            'stop_reason': 'eos_token'
        }
    ],
    'system': {'warnings': []}
}
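The generated text can be extracted from the returned dictionary (a sketch based on the structure shown above):
generated_text = details['results'][0]['generated_text']
print(generated_text)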