``ModelInference`` for Deployments
==================================

This section shows how to use the ModelInference module with a created deployment.

You can infer text in one of two ways:

* the :ref:`deployments` module
* the :ref:`ModelInference` module

Infer text with deployments
---------------------------

You can directly query ``generate_text`` using the deployments module.

.. code-block:: python

    client.deployments.generate_text(
        prompt="Example prompt",
        deployment_id=deployment_id)

Creating ``ModelInference`` instance
------------------------------------

Start by defining the parameters. They will later be used by the module.

.. code-block:: python

    from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams

    generate_params = {
        GenParams.MAX_NEW_TOKENS: 25,
        GenParams.STOP_SEQUENCES: ["\n"]
    }

Create the ``ModelInference`` object by using credentials and ``project_id`` / ``space_id``, or the previously initialized ``APIClient`` (see :ref:`APIClient initialization`).

.. code-block:: python

    from ibm_watsonx_ai.foundation_models import ModelInference

    deployed_model = ModelInference(
        deployment_id=deployment_id,
        params=generate_params,
        credentials=credentials,
        project_id=project_id
    )

    # OR

    deployed_model = ModelInference(
        deployment_id=deployment_id,
        params=generate_params,
        api_client=client
    )

.. _generate_text_ModelInference:

You can directly query ``generate_text`` using the ``ModelInference`` object.

.. code-block:: python

    deployed_model.generate_text(prompt="Example prompt")

Generate methods
----------------

A detailed explanation of the available generate methods, with their exact parameters, can be found in the :ref:`ModelInference class`.

With the previously created ``deployed_model`` object, you can generate a text stream (generator) using a defined inference and the ``generate_text_stream()`` method.

.. code-block:: python

    for token in deployed_model.generate_text_stream(prompt=input_prompt):
        print(token, end="")

    '$10 Powerchill Leggings'

You can also receive a more detailed result with ``generate()``.

.. code-block:: python

    details = deployed_model.generate(prompt=input_prompt, params=gen_params)
    print(details)

    {
        'model_id': 'google/flan-t5-xl',
        'created_at': '2023-11-17T15:32:57.401Z',
        'results': [
            {
                'generated_text': '$10 Powerchill Leggings',
                'generated_token_count': 8,
                'input_token_count': 73,
                'stop_reason': 'eos_token'
            }
        ],
        'system': {'warnings': []}
    }
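The dictionary returned by ``generate()`` can be unpacked programmatically rather than printed. Below is a minimal sketch of extracting the generated text and token counts; the hard-coded ``details`` value mirrors the sample output shown above, and with a live deployment you would use the object returned by ``deployed_model.generate(...)`` instead.

.. code-block:: python

    # Sample response matching the shape of the ``generate()`` output above.
    details = {
        'model_id': 'google/flan-t5-xl',
        'created_at': '2023-11-17T15:32:57.401Z',
        'results': [
            {
                'generated_text': '$10 Powerchill Leggings',
                'generated_token_count': 8,
                'input_token_count': 73,
                'stop_reason': 'eos_token'
            }
        ],
        'system': {'warnings': []}
    }

    # Each entry in ``results`` corresponds to one generated completion;
    # collect the text and token accounting for every result.
    for result in details['results']:
        text = result['generated_text']
        total_tokens = result['input_token_count'] + result['generated_token_count']
        print(f"{text!r} ({total_tokens} tokens, stop_reason={result['stop_reason']})")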