genai.extensions.localserver package
Extension for running a simplified local inference API compatible with the current version of the SDK.
- class genai.extensions.localserver.LocalLLMServer
- Bases: object
- __init__(models, port=8080, interface='0.0.0.0', api_key=None, insecure_api=False)
- Parameters:
models (list[type[LocalModel]]) – The model classes to expose through the server
port (int) – The port the server listens on
interface (str) – The network interface to bind to
api_key (str | None) – The API key used to authenticate client requests
insecure_api (bool)
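For orientation, a minimal construction sketch follows. It is illustrative only: `EchoModel` is a hypothetical `LocalModel` subclass (one is sketched at the end of this page), and the `run_locally()` context manager and `get_credentials()` helper come from the SDK's local-server example rather than this page, so verify them against your SDK version.

```python
from genai import Client
from genai.extensions.localserver import LocalLLMServer

# EchoModel is a hypothetical LocalModel subclass,
# sketched at the end of this page.
server = LocalLLMServer(models=[EchoModel], port=8080, interface="127.0.0.1")

# run_locally() and get_credentials() are assumed from the SDK's
# local-server example; they may differ in your SDK version.
with server.run_locally():
    # Point a regular SDK client at the local server and use it
    # as you would the hosted API.
    client = Client(credentials=server.get_credentials())
```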
- class genai.extensions.localserver.LocalModel
- Bases: ABC
- abstract generate(input_text, parameters)
Generate a response from your LLM using the provided input text and parameters.
- Parameters:
input_text (str) – The input prompt text
parameters (TextGenerationParameters) – The generation parameters that the client wishes to be used
- Raises:
NotImplementedError – If you do not implement this method.
- Returns:
The result to be sent back to the client
- Return type:
TextGenerationResult
- property model_id
The model ID. This is the ID that clients use when selecting the model they want to use, for example: "google/flan-t5-base".
- Raises:
NotImplementedError – If you do not implement this property.
- abstract tokenize(input_text, parameters)
Tokenize the input text with your model and return the output.
- Parameters:
input_text (str) – The input prompt text
parameters (TextTokenizationParameters) – The tokenization parameters that the client wishes to be used
- Raises:
NotImplementedError – If you do not implement this method.
- Returns:
The result to be sent back to the client
- Return type:
TextTokenizationCreateResults
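Putting the pieces together, a minimal `LocalModel` implementation might look like the sketch below. It is a sketch under stated assumptions, not a definitive implementation: `EchoModel` is a hypothetical class, the whitespace "tokenizer" is a stand-in for a real one, and the `genai.schema` import path together with the result-field names (`generated_text`, `stop_reason`, `token_count`, `tokens`) are assumptions to verify against your SDK version.

```python
from genai.extensions.localserver import LocalModel
from genai.schema import (  # import path assumed; older SDKs used genai.schemas
    TextGenerationParameters,
    TextGenerationResult,
    TextTokenizationCreateResults,
    TextTokenizationParameters,
)


class EchoModel(LocalModel):
    """Hypothetical model that echoes the prompt back to the client."""

    @property
    def model_id(self) -> str:
        # The ID clients use to select this model.
        return "example/echo-model"

    def generate(
        self, input_text: str, parameters: TextGenerationParameters
    ) -> TextGenerationResult:
        # A real implementation would run inference here; this sketch
        # simply echoes the prompt. Field names are assumptions.
        tokens = input_text.split()
        return TextGenerationResult(
            generated_text=input_text,
            generated_token_count=len(tokens),
            input_token_count=len(tokens),
            stop_reason="eos_token",  # assumed stop-reason value
        )

    def tokenize(
        self, input_text: str, parameters: TextTokenizationParameters
    ) -> TextTokenizationCreateResults:
        # Naive whitespace tokenization stands in for a real tokenizer.
        tokens = input_text.split()
        return TextTokenizationCreateResults(token_count=len(tokens), tokens=tokens)
```

Passing `EchoModel` in the `models` list of `LocalLLMServer` (as in the construction sketch above) would then serve it through the local inference API.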