genai.extensions.localserver package#

Extension for running a simplified local inference API that is compatible with the current version of the SDK.

class genai.extensions.localserver.LocalLLMServer[source]#

Bases: object

__init__(models, port=8080, interface='0.0.0.0', api_key=None, insecure_api=False)[source]#
Parameters:
  • models (list[type[LocalModel]]) – The LocalModel subclasses to serve.

  • port (int) – Port for the server to listen on.

  • interface (str) – Network interface to bind to.

  • api_key (str | None) – Optional API key that clients must supply.

  • insecure_api (bool) – Whether to run the API in insecure mode.

get_credentials()[source]#
run_locally()[source]#
start_server()[source]#
stop_server()[source]#
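
A typical workflow registers one or more LocalModel subclasses, starts the server, and points a regular SDK client at it. The sketch below is illustrative rather than authoritative: it assumes start_server() launches the server in the background, that get_credentials() returns credentials accepted by genai.Client, and that the client-side text generation call matches the current SDK; FlanT5Model is the hypothetical LocalModel subclass sketched at the end of this page.

    from genai import Client
    from genai.extensions.localserver import LocalLLMServer

    # FlanT5Model is a hypothetical LocalModel subclass (see the sketch below).
    server = LocalLLMServer(models=[FlanT5Model])
    server.start_server()  # assumed to start serving in the background
    try:
        # Point an ordinary SDK client at the local server.
        client = Client(credentials=server.get_credentials())
        for response in client.text.generation.create(
            model_id="google/flan-t5-base",
            inputs=["Hello! How are you?"],
        ):
            print(response.results[0].generated_text)
    finally:
        server.stop_server()

Alternatively, run_locally() can be used to manage the server lifecycle in one call; see the method list above.
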
class genai.extensions.localserver.LocalModel[source]#

Bases: ABC

abstract generate(input_text, parameters)[source]#

Generate a response from your LLM using the provided input text and parameters.

Parameters:
  • input_text (str) – The input prompt text.

  • parameters (TextGenerationParameters) – The generation parameters requested by the client.

Raises:

NotImplementedError – If you do not implement this function.

Returns:

The result to be sent back to the client

Return type:

TextGenerationResult

property model_id#

Model ID. This is the ID that clients use when specifying which model to run.

Example: “google/flan-t5-base”

Raises:

NotImplementedError – If you do not implement this property.

abstract tokenize(input_text, parameters)[source]#

Tokenize the input text with your model and return the output.

Parameters:
  • input_text (str) – The input text to tokenize.

  • parameters (TextTokenizationParameters) – The tokenization parameters requested by the client.

Raises:

NotImplementedError – If you do not implement this function.

Returns:

The result to be sent back to the client

Return type:

TextTokenizationCreateResults
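
To serve your own model, subclass LocalModel and implement model_id, generate, and tokenize. The following is a minimal sketch that wraps a Hugging Face google/flan-t5-base model via transformers. It assumes that StopReason and TextTokenizationResult are importable from genai.schema alongside the types named above; the result field names (generated_text, stop_reason, token_count, tokens) and the max_new_tokens parameter field are likewise assumptions, so adjust them to your SDK version.

    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    from genai.extensions.localserver import LocalModel
    from genai.schema import (
        StopReason,
        TextGenerationParameters,
        TextGenerationResult,
        TextTokenizationCreateResults,
        TextTokenizationParameters,
        TextTokenizationResult,
    )

    class FlanT5Model(LocalModel):
        def __init__(self):
            self.tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
            self.model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

        @property
        def model_id(self) -> str:
            return "google/flan-t5-base"

        def generate(
            self, input_text: str, parameters: TextGenerationParameters
        ) -> TextGenerationResult:
            encoded = self.tokenizer(input_text, return_tensors="pt")
            # max_new_tokens is an assumed field of TextGenerationParameters.
            max_new_tokens = (parameters.max_new_tokens if parameters else None) or 64
            outputs = self.model.generate(**encoded, max_new_tokens=max_new_tokens)
            generated_text = self.tokenizer.decode(outputs[0], skip_special_tokens=True)
            # Result field names are assumptions based on the documented return type.
            return TextGenerationResult(
                input_token_count=encoded["input_ids"].shape[1],
                generated_token_count=outputs.shape[1],
                generated_text=generated_text,
                stop_reason=StopReason.EOS_TOKEN,
            )

        def tokenize(
            self, input_text: str, parameters: TextTokenizationParameters
        ) -> TextTokenizationCreateResults:
            token_ids = self.tokenizer.encode(input_text)
            tokens = self.tokenizer.convert_ids_to_tokens(token_ids)
            return TextTokenizationCreateResults(
                results=[TextTokenizationResult(token_count=len(tokens), tokens=tokens)]
            )

Register the class itself (not an instance) with the server, e.g. LocalLLMServer(models=[FlanT5Model]), as in the server sketch above; the models parameter takes a list of LocalModel subclasses.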

Submodules#