Text Classification

class ibm_watsonx_ai.foundation_models.classifications.TextClassification(credentials=None, project_id=None, space_id=None, api_client=None)

Bases: WMLResource

Instantiate the text classification service.

Parameters:

credentials (Credentials, optional) – credentials to the watsonx.ai instance
project_id (str, optional) – ID of the project, defaults to None
space_id (str, optional) – ID of the space, defaults to None
api_client (APIClient, optional) – initialized APIClient object with a set project ID or space ID. If passed, credentials and project_id/space_id are not required, defaults to None

Raises:

InvalidMultipleArguments – raised when neither api_client nor credentials (together with project_id or space_id) are provided

from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models.classifications import TextClassification

text_classification = TextClassification(
    credentials=Credentials(
        api_key=IAM_API_KEY,  # placeholder: your IAM API key
        url="https://us-south.ml.cloud.ibm.com",
    ),
    project_id="*****",
)

cancel_job(classification_job_id)

Cancel a text classification job.

Parameters: classification_job_id (str) – ID of the text classification job
Returns: “SUCCESS” if the cancel succeeds
Return type: str

Example:

text_classification.cancel_job(classification_job_id="<CLASSIFICATION_JOB_ID>")

delete_job(classification_job_id)

Delete a text classification job.

Parameters: classification_job_id (str) – ID of the text classification job
Returns: “SUCCESS” if the deletion succeeds
Return type: str

Example:

text_classification.delete_job(classification_job_id="<CLASSIFICATION_JOB_ID>")
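When several jobs need to be removed, the documented "SUCCESS" return value can be checked per job. The helper below is a minimal sketch: `delete_jobs` is a hypothetical name and not part of ibm_watsonx_ai. The reference does not say how failures are reported, so the sketch assumes they may surface either as a different return value or as an exception.

```python
def delete_jobs(client, job_ids):
    """Delete several classification jobs and report the outcome per job.

    Hypothetical helper, not part of ibm_watsonx_ai. `client` is any object
    with a delete_job(classification_job_id=...) method, for example a
    TextClassification instance.
    """
    outcomes = {}
    for job_id in job_ids:
        try:
            result = client.delete_job(classification_job_id=job_id)
            # "SUCCESS" is the documented return value for a successful deletion.
            outcomes[job_id] = result == "SUCCESS"
        except Exception as exc:  # assumption: a failed deletion may raise
            outcomes[job_id] = False
            print(f"could not delete {job_id}: {exc}")
    return outcomes
```

Calling `delete_jobs(text_classification, ["<JOB_ID_1>", "<JOB_ID_2>"])` would return a dict that maps each ID to True or False.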

get_job_details(classification_job_id=None, limit=None)

Return text classification job details. If classification_job_id is None, return the details of all text classification jobs.

Parameters:

classification_job_id (str, optional) – ID of the text classification job, defaults to None
limit (int, optional) – limit number of fetched records, defaults to None

Returns:

details of the text classification job

Return type:

dict

Example:

text_classification.get_job_details(classification_job_id="<CLASSIFICATION_JOB_ID>")

classmethod get_job_id(classification_details)

Get the unique ID of a stored classification request.

Parameters: classification_details (dict) – metadata of the stored classification
Returns: unique ID of the stored classification request
Return type: str

Example:

classification_details = text_classification.run_job(...)
classification_job_id = text_classification.get_job_id(classification_details)

get_results(classification_job_id)

Get the text classification results.

Parameters: classification_job_id (str) – ID of the text classification job
Returns: text classification job results
Return type: dict

Example:

results = text_classification.get_results(classification_job_id="<CLASSIFICATION_JOB_ID>")
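The reference does not state what get_results returns for a job that has not finished, so it may be safer to check the status first. The following is a minimal, non-blocking sketch. `get_results_if_ready` is a hypothetical helper, and it relies only on the "completed" and "failed" status values documented for get_status.

```python
def get_results_if_ready(client, classification_job_id):
    """Return the job results if the job has completed, otherwise None.

    Hypothetical helper, not part of ibm_watsonx_ai. `client` is any object
    exposing get_status and get_results, such as a TextClassification instance.
    """
    status = client.get_status(classification_job_id=classification_job_id)
    if status == "failed":
        raise RuntimeError(f"classification job {classification_job_id} failed")
    if status != "completed":
        # Still submitted, uploading, running, downloading, or downloaded.
        return None
    return client.get_results(classification_job_id=classification_job_id)
```

A caller can invoke this from its own scheduler or event loop and retry later whenever it returns None.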

get_status(classification_job_id)

Get the text classification status.

Parameters: classification_job_id (str) – ID of the text classification job
Returns: text classification job status, possible values: [submitted, uploading, running, downloading, downloaded, completed, failed]
Return type: str

Example:

status = text_classification.get_status(classification_job_id="<CLASSIFICATION_JOB_ID>")
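Because a job moves through several intermediate states, callers usually poll get_status until a final state is reached. The loop below is a sketch under two assumptions: "completed" and "failed" are the only terminal values in the list above, and `wait_for_job`, `poll_interval`, and `timeout` are hypothetical names that do not exist in ibm_watsonx_ai.

```python
import time

# Assumed terminal states, taken from the status values documented for get_status.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(client, classification_job_id, poll_interval=5.0, timeout=600.0):
    """Poll get_status until the job completes or fails; return the final status.

    Hypothetical helper, not part of ibm_watsonx_ai. `client` is any object
    exposing get_status(classification_job_id=...), such as a
    TextClassification instance.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = client.get_status(classification_job_id=classification_job_id)
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"job {classification_job_id} still '{status}' after {timeout} s"
            )
        time.sleep(poll_interval)
```

For example, `wait_for_job(text_classification, "<CLASSIFICATION_JOB_ID>")` blocks until the job finishes or ten minutes pass. A fixed interval keeps the sketch short; a backoff schedule may suit long-running jobs better.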

list_jobs(limit=None)

List text classification jobs. If limit is None, all jobs will be listed.

Parameters: limit (int, optional) – limit number of fetched records, defaults to None
Returns: text classification jobs information as a pandas DataFrame
Return type: pandas.DataFrame

Example:

text_classification.list_jobs()

run_job(document_reference, parameters, custom=None)

Start a request to classify text in the document.

Parameters:

document_reference (DataConnection) – reference to the document in the bucket from which text will be classified
parameters (TextClassificationParameters or dict) – the parameters for the text classification
custom (dict, optional) – user defined properties for the text classification, defaults to None

Returns:

text classification response

Return type:

dict

Example:

from ibm_watsonx_ai.helpers import DataConnection, S3Location
from ibm_watsonx_ai.foundation_models.schema import (
    TextClassificationParameters,
    TextClassificationSemanticConfig,
    SchemasMergeStrategy,
    ClassificationMode,
    OCRMode,
)

document_reference = DataConnection(
    connection_asset_id="<connection_id>",
    location=S3Location(bucket="<bucket_name>", path="path/to/file"),
)

response = text_classification.run_job(
    document_reference=document_reference,
    parameters=TextClassificationParameters(
        ocr_mode=OCRMode.ENABLED,
        classification_mode=ClassificationMode.EXACT,
        auto_rotation_correction=True,
        languages=["en"],
        semantic_config=TextClassificationSemanticConfig(
            schemas_merge_strategy=SchemasMergeStrategy.MERGE,
            schemas=[...],
        ),
    ),
    custom={},
)

Enums

class ibm_watsonx_ai.foundation_models.schema.SchemasMergeStrategy(value)

Bases: StrEnum

Strategy for schemas merge.

MERGE = 'merge'

REPLACE = 'replace'

class ibm_watsonx_ai.foundation_models.schema.OCRMode(value)

Bases: StrEnum

DISABLED = 'disabled'

ENABLED = 'enabled'

FORCED = 'forced'

class ibm_watsonx_ai.foundation_models.schema.ClassificationMode(value)

Bases: StrEnum

BINARY = 'binary'

EXACT = 'exact'
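Since these enums are string-valued, a setting read from a configuration file can be converted to a member by value lookup. The sketch below uses a local stand-in class that mirrors the documented OCRMode values so that it runs without the SDK installed; in real code the class would be imported from ibm_watsonx_ai.foundation_models.schema instead. `parse_ocr_mode` is a hypothetical helper.

```python
from enum import Enum


class OCRMode(str, Enum):
    """Local stand-in mirroring the documented OCRMode values (assumption)."""

    DISABLED = "disabled"
    ENABLED = "enabled"
    FORCED = "forced"


def parse_ocr_mode(raw):
    """Convert a configuration string to an OCRMode, ignoring case and padding."""
    try:
        # Enum lookup by value: OCRMode("forced") returns OCRMode.FORCED.
        return OCRMode(raw.strip().lower())
    except ValueError:
        allowed = ", ".join(member.value for member in OCRMode)
        raise ValueError(
            f"unknown OCR mode {raw!r}; expected one of: {allowed}"
        ) from None
```

The same pattern would apply to ClassificationMode and SchemasMergeStrategy, and it rejects misspelled values before a job is submitted.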

class ibm_watsonx_ai.foundation_models.schema.TextClassificationSemanticConfig(schemas_merge_strategy=None, schemas=None)

Bases: BaseSchema

Semantic configuration for text classification.

Parameters:

schemas_merge_strategy (SchemasMergeStrategy, optional) – strategy for schemas merge
schemas (list[dict], optional) – schemas

schemas = None

schemas_merge_strategy = None

class ibm_watsonx_ai.foundation_models.schema.TextClassificationParameters(ocr_mode=None, classification_mode=None, auto_rotation_correction=None, languages=None, semantic_config=None)

Bases: BaseSchema

Parameters used for text classification.

Parameters:

ocr_mode (OCRMode, optional) – whether OCR should be used when processing a document; if left empty, the service selects the best option for your processing mode
classification_mode (ClassificationMode, optional) – classification mode; exact returns the name of the schema the document is classified to, while binary only reports whether the document matches a known schema
auto_rotation_correction (bool, optional) – whether the service should attempt to fix a rotated page or image
languages (list[str], optional) – languages expected in the document; the language codes follow ISO 639 where possible, see the REST API documentation for the currently supported languages
semantic_config (TextClassificationSemanticConfig, optional) – additional configuration settings for the Semantic KVP model

auto_rotation_correction = None

classification_mode = None

languages = None

ocr_mode = None

semantic_config = None