``ModelInference`` for Deployments
==================================

This section shows how to use the ``ModelInference`` module with a created deployment.

You can infer text or chat with a deployed model in one of two ways:

* the :ref:`deployments` module
* the :ref:`ModelInference` module

Infer text with deployments
---------------------------

.. _deploy_generate_text_deployments:

You can directly query ``generate_text`` or ``generate_text_stream`` using the deployments module.

.. code-block:: python

    client.deployments.generate_text(
        prompt="Example prompt",
        deployment_id=deployment_id)

    # OR

    for chunk in client.deployments.generate_text_stream(deployment_id=deployment_id, prompt=input_prompt):
        print(chunk, end="", flush=True)

Chat with deployments
---------------------

You can directly call ``chat`` or ``chat_stream`` using the deployments module.

.. code-block:: python

    messages = [
        {
            "role": "user",
            "content": "How are you?"
        }
    ]

    chat_response = client.deployments.chat(deployment_id=deployment_id, messages=messages)
    print(chat_response["choices"][0]["message"]["content"])

    # OR

    for chunk in client.deployments.chat_stream(deployment_id=deployment_id, messages=messages):
        if chunk["choices"]:
            print(chunk["choices"][0]["delta"].get("content", ""), end="", flush=True)

Creating ``ModelInference`` instance
------------------------------------

.. _deploy_generate_text_ModelInference:

Start by defining the parameters. They will later be used by the module.

.. code-block:: python

    from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams

    generate_params = {
        GenParams.MAX_NEW_TOKENS: 25,
        GenParams.STOP_SEQUENCES: ["\n"]
    }

Create the ``ModelInference`` object by using credentials and ``project_id`` / ``space_id``, or the previously initialized ``APIClient`` (see :ref:`APIClient initialization`).

.. code-block:: python

    from ibm_watsonx_ai.foundation_models import ModelInference

    deployed_model = ModelInference(
        deployment_id=deployment_id,
        params=generate_params,
        credentials=credentials,
        project_id=project_id
    )

    # OR

    deployed_model = ModelInference(
        deployment_id=deployment_id,
        params=generate_params,
        api_client=client
    )

A detailed explanation of the available methods and their parameters can be found in the :ref:`ModelInference class`.

You can directly query ``generate_text`` using the ``ModelInference`` object.

.. code-block:: python

    deployed_model.generate_text(prompt="Example prompt")

You can also ``chat`` using the same instance.

.. code-block:: python

    messages = [
        {
            "role": "user",
            "content": "How are you?"
        }
    ]

    deployed_model.chat(messages=messages)

Generate methods
----------------

With the previously created ``deployed_model`` object, you can generate a text stream (generator) using the ``generate_text_stream()`` method.

.. code-block:: python

    for chunk in deployed_model.generate_text_stream(prompt=input_prompt):
        print(chunk, end="", flush=True)

    '$10 Powerchill Leggings'

You can also receive a more detailed result with ``generate()``.

.. code-block:: python

    details = deployed_model.generate(prompt=input_prompt, params=generate_params)
    print(details)

    {
        'model_id': 'google/flan-t5-xl',
        'created_at': '2023-11-17T15:32:57.401Z',
        'results': [
            {
                'generated_text': '$10 Powerchill Leggings',
                'generated_token_count': 8,
                'input_token_count': 73,
                'stop_reason': 'eos_token'
            }
        ],
        'system': {'warnings': []}
    }
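If you only need the generated string, you can read it straight from the ``results`` list of the response. A minimal sketch, based on the ``details`` output shown above:

.. code-block:: python

    # Pull the generated text out of the response dictionary returned by generate()
    generated_text = details["results"][0]["generated_text"]
    print(generated_text)  # '$10 Powerchill Leggings'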
Chat methods
------------

You can chat with the previously created ``deployed_model`` object using the ``chat()`` method.

.. code-block:: python

    messages = [
        {
            "role": "user",
            "content": "How are you?"
        }
    ]

    chat_response = deployed_model.chat(messages=messages)
    print(chat_response["choices"][0]["message"]["content"])

You can also chat in streaming mode (generator) using the ``chat_stream()`` method.

.. code-block:: python

    messages = [
        {
            "role": "user",
            "content": "How are you?"
        }
    ]

    chat_stream_response = deployed_model.chat_stream(messages=messages)

    for chunk in chat_stream_response:
        if chunk["choices"]:
            print(chunk["choices"][0]["delta"].get("content", ""), end="", flush=True)
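If you need the full streamed reply as a single string, you can accumulate the chunks instead of printing them. A minimal sketch, reusing the ``chat_stream()`` call and the chunk structure shown above:

.. code-block:: python

    # Join the streamed delta fragments into one reply string
    reply = "".join(
        chunk["choices"][0]["delta"].get("content", "")
        for chunk in deployed_model.chat_stream(messages=messages)
        if chunk["choices"]
    )
    print(reply)

Note that this consumes the stream, so the complete text is only available once generation finishes.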