genai.extensions.localserver package

Extension for running a simplified local inference API compatible with the current version of the SDK

class genai.extensions.localserver.LocalLLMServer[source]

Bases: object

__init__(models, port=8080, interface='0.0.0.0', api_key=None, insecure_api=False)[source]
Parameters:
  • models (list[type[LocalModel]])

  • port (int)

  • interface (str)

  • api_key (str | None)

  • insecure_api (bool)

get_credentials()[source]
run_locally()[source]
start_server()[source]
stop_server()[source]
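
A minimal usage sketch follows. It relies only on the constructor and methods documented above; MyLocalModel is a hypothetical LocalModel subclass (see the implementation sketch after the LocalModel reference below), and the Client import assumes the standard genai client accepts the credentials object returned by get_credentials().

    from genai import Client
    from genai.extensions.localserver import LocalLLMServer

    from my_models import MyLocalModel  # hypothetical LocalModel subclass, sketched below

    # Serve the model class (the class itself, not an instance) on localhost.
    server = LocalLLMServer(models=[MyLocalModel], port=8080, interface="0.0.0.0")
    server.start_server()
    try:
        # Credentials point at this local instance, so a regular SDK client
        # can be used against it unchanged.
        credentials = server.get_credentials()
        client = Client(credentials=credentials)
        # ... call the client as you would against the hosted service ...
    finally:
        server.stop_server()

run_locally() may offer a more convenient entry point for the same lifecycle; the explicit start_server()/stop_server() pair is shown here because its intent follows directly from the method names above.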
class genai.extensions.localserver.LocalModel[source]

Bases: ABC

abstract generate(input_text, parameters)[source]

Generate a response from your LLM using the provided input text and parameters

Parameters:
  • input_text (str) – The input prompt text

  • parameters (TextGenerationParameters) – The generation parameters requested by the calling client code

Raises:

NotImplementedError – If you do not implement this function.

Returns:

The result to be sent back to the client

Return type:

TextGenerationResult

property model_id

Model ID. This is the ID you would use when specifying which model to run

example: "google/flan-t5-base"

Raises:

NotImplementedError – If you do not implement this property.

abstract tokenize(input_text, parameters)[source]

Tokenize the input text with your model and return the output

Parameters:
  • input_text (str) – The input prompt text

  • parameters (TextTokenizationParameters) – The tokenization parameters requested by the calling client code

Raises:

NotImplementedError – If you do not implement this function.

Returns:

The result to be sent back to the client

Return type:

TextTokenizationCreateResults
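
For concreteness, here is a hedged sketch of a LocalModel subclass. The genai.schema imports mirror the parameter and return types named above; the exact constructor fields of TextGenerationResult and TextTokenizationCreateResults (generated_text, token counts, stop_reason, the shape of the results items) are assumptions to verify against your SDK version, and the whitespace tokenizer stands in for real model inference.

    from genai.extensions.localserver import LocalModel
    from genai.schema import (
        TextGenerationParameters,
        TextGenerationResult,
        TextTokenizationCreateResults,
        TextTokenizationParameters,
    )


    class EchoModel(LocalModel):
        """Toy model that echoes its prompt; swap in real inference calls."""

        @property
        def model_id(self) -> str:
            # The ID clients use to select this model, e.g. "google/flan-t5-base".
            return "example/echo-model"

        def generate(self, input_text: str, parameters: TextGenerationParameters) -> TextGenerationResult:
            # Run your model here; this sketch simply echoes the prompt back.
            generated_text = input_text
            return TextGenerationResult(  # field names assumed from the genai schema
                generated_text=generated_text,
                generated_token_count=len(generated_text.split()),
                input_token_count=len(input_text.split()),
                stop_reason="eos_token",
            )

        def tokenize(self, input_text: str, parameters: TextTokenizationParameters) -> TextTokenizationCreateResults:
            # Naive whitespace tokenization, for illustration only.
            tokens = input_text.split()
            return TextTokenizationCreateResults(
                results=[{"tokens": tokens, "token_count": len(tokens)}]  # assumed item shape
            )

Pass the subclass itself (not an instance) in the models list of LocalLLMServer, matching the list[type[LocalModel]] signature above.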

Submodules