Text Classification¶
- class ibm_watsonx_ai.foundation_models.classifications.TextClassification(credentials=None, project_id=None, space_id=None, api_client=None)[source]¶
Bases:
WMLResource
Instantiate the text classification service.
- Parameters:
credentials (Credentials, optional) – credentials to the watsonx.ai instance
project_id (str, optional) – ID of the project, defaults to None
space_id (str, optional) – ID of the space, defaults to None
api_client (APIClient, optional) – initialized APIClient object with a set project ID or space ID. If passed, credentials and project_id/space_id are not required, defaults to None
- Raises:
InvalidMultipleArguments – raised when neither api_client nor credentials together with a project_id or space_id is provided
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models.classifications import TextClassification

text_classification = TextClassification(
    credentials=Credentials(
        api_key=IAM_API_KEY,
        url="https://us-south.ml.cloud.ibm.com",
    ),
    project_id="*****",
)
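The constructor rule above (either an api_client, or credentials together with a project or space ID) can be read as a small predicate. The sketch below only illustrates the documented rule; it is not the library's implementation, and `init_args_are_sufficient` is a hypothetical helper name that does not exist in ibm_watsonx_ai.

```python
from typing import Any, Optional


def init_args_are_sufficient(
    credentials: Optional[Any] = None,
    project_id: Optional[str] = None,
    space_id: Optional[str] = None,
    api_client: Optional[Any] = None,
) -> bool:
    """Return True if the arguments satisfy the documented constructor rule.

    Hypothetical helper: it mirrors the documented rule only and never
    calls into ibm_watsonx_ai.
    """
    # An initialized APIClient already carries a project ID or space ID.
    if api_client is not None:
        return True
    # Otherwise credentials must come with a project ID or a space ID.
    has_scope = project_id is not None or space_id is not None
    return credentials is not None and has_scope
```

A pre-check like this could run before constructing TextClassification, so that a configuration mistake surfaces with a message of your own choosing rather than as InvalidMultipleArguments.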
- cancel_job(classification_job_id)[source]¶
Cancel a text classification job.
- Parameters:
classification_job_id (str) – ID of text classification job
- Returns:
“SUCCESS” if the cancel succeeds
- Return type:
str
Example:
text_classification.cancel_job(classification_job_id="<CLASSIFICATION_JOB_ID>")
- delete_job(classification_job_id)[source]¶
Delete a text classification job.
- Parameters:
classification_job_id (str) – ID of text classification job
- Returns:
“SUCCESS” if the deletion succeeds
- Return type:
str
Example:
text_classification.delete_job(classification_job_id="<CLASSIFICATION_JOB_ID>")
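Both cancel_job and delete_job return "SUCCESS" when they succeed, so a cleanup routine can check that value explicitly. The following is a minimal sketch under stated assumptions: `cancel_and_delete` is a hypothetical helper, the set of "active" states is a guess based on the status values listed for get_status, and whether the service needs a cancel before a delete is not confirmed by this reference. The `client` argument is any object exposing the three methods, such as a TextClassification instance.

```python
# Assumption: these documented status values mean the job is still in progress.
ACTIVE_STATES = {"submitted", "uploading", "running", "downloading"}


def cancel_and_delete(client, job_id):
    """Cancel the job if it still looks active, then delete it.

    Returns the status observed before cleanup. Raises RuntimeError if
    either call returns something other than "SUCCESS".
    """
    status = client.get_status(classification_job_id=job_id)

    if status in ACTIVE_STATES:
        result = client.cancel_job(classification_job_id=job_id)
        if result != "SUCCESS":
            raise RuntimeError(f"cancel of job {job_id} returned {result!r}")

    result = client.delete_job(classification_job_id=job_id)
    if result != "SUCCESS":
        raise RuntimeError(f"delete of job {job_id} returned {result!r}")

    return status
```

Usage would look like `cancel_and_delete(text_classification, "<CLASSIFICATION_JOB_ID>")`.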
- get_job_details(classification_job_id=None, limit=None)[source]¶
Return text classification job details. If classification_job_id is None, return the details of all text classification jobs.
- Parameters:
classification_job_id (str, optional) – ID of the text classification job, defaults to None
limit (int, optional) – limit number of fetched records, defaults to None
- Returns:
details of the text classification job
- Return type:
dict
Example:
text_classification.get_job_details(classification_job_id="<CLASSIFICATION_JOB_ID>")
- classmethod get_job_id(classification_details)[source]¶
Get the unique ID of a stored classification request.
- Parameters:
classification_details (dict) – metadata of the stored classification
- Returns:
unique ID of the stored classification request
- Return type:
str
Example:
classification_details = text_classification.run_job(...)
classification_job_id = text_classification.get_job_id(classification_details)
- get_results(classification_job_id)[source]¶
Get the text classification results.
- Parameters:
classification_job_id (str) – ID of text classification job
- Returns:
text classification job results
- Return type:
dict
Example:
results = text_classification.get_results(classification_job_id="<CLASSIFICATION_JOB_ID>")
- get_status(classification_job_id)[source]¶
Get the text classification status.
- Parameters:
classification_job_id (str) – ID of text classification job
- Returns:
text classification job status, possible values: [submitted, uploading, running, downloading, downloaded, completed, failed]
- Return type:
str
Example:
status = text_classification.get_status(classification_job_id="<CLASSIFICATION_JOB_ID>")
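Because a job moves through several states before it finishes, callers typically poll get_status. The sketch below shows one way to do that; `wait_for_job` is a hypothetical helper, and treating only "completed" and "failed" as terminal states is an assumption drawn from the list of possible values above. The `client` argument is any object with a get_status method, such as a TextClassification instance.

```python
import time

# Assumption: of the documented status values, only these two are terminal.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_job(client, job_id, poll_interval=5.0, timeout=600.0):
    """Poll get_status until the job completes, fails, or the timeout expires.

    Returns the terminal status string. Raises TimeoutError if the job is
    still in a non-terminal state once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = client.get_status(classification_job_id=job_id)
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"job {job_id} still in state {status!r} after {timeout} s"
            )
        time.sleep(poll_interval)
```

A typical call would be `wait_for_job(text_classification, classification_job_id)`, followed by get_results once the returned status is "completed".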
- list_jobs(limit=None)[source]¶
List text classification jobs. If limit is None, all jobs will be listed.
- Parameters:
limit (int, optional) – limit number of fetched records, defaults to None
- Returns:
text classification jobs information as a pandas DataFrame
- Return type:
pandas.DataFrame
Example:
text_classification.list_jobs()
- run_job(document_reference, parameters, custom=None)[source]¶
Start a request to classify text in the document.
- Parameters:
document_reference (DataConnection) – reference to the document in the bucket from which text will be classified
parameters (TextClassificationParameters or dict) – the parameters for the text classification
custom (dict, optional) – user defined properties for the text classification, defaults to None
- Returns:
text classification response
- Return type:
dict
Example:
from ibm_watsonx_ai.helpers import DataConnection, S3Location
from ibm_watsonx_ai.foundation_models.schema import (
    TextClassificationParameters,
    TextClassificationSemanticConfig,
    SchemasMergeStrategy,
    ClassificationMode,
    OCRMode,
)

# Reference to the document stored in the bucket
document_reference = DataConnection(
    connection_asset_id="<connection_id>",
    location=S3Location(bucket="<bucket_name>", path="path/to/file"),
)

response = text_classification.run_job(
    document_reference=document_reference,
    parameters=TextClassificationParameters(
        ocr_mode=OCRMode.ENABLED,
        classification_mode=ClassificationMode.EXACT,
        auto_rotation_correction=True,
        languages=["en"],
        semantic_config=TextClassificationSemanticConfig(
            schemas_merge_strategy=SchemasMergeStrategy.MERGE,
            schemas=[...],
        ),
    ),
    custom={},
)
Enums¶
- class ibm_watsonx_ai.foundation_models.schema.SchemasMergeStrategy(value)[source]¶
Bases:
StrEnum
Strategy for schemas merge.
- MERGE = 'merge'¶
- REPLACE = 'replace'¶
- class ibm_watsonx_ai.foundation_models.schema.OCRMode(value)[source]¶
Bases:
StrEnum
- DISABLED = 'disabled'¶
- ENABLED = 'enabled'¶
- FORCED = 'forced'¶
- class ibm_watsonx_ai.foundation_models.schema.ClassificationMode(value)[source]¶
Bases:
StrEnum
- BINARY = 'binary'¶
- EXACT = 'exact'¶
- class ibm_watsonx_ai.foundation_models.schema.TextClassificationSemanticConfig(schemas_merge_strategy=None, schemas=None)[source]¶
Bases:
BaseSchema
Semantic configuration for text classification.
- Parameters:
schemas_merge_strategy (SchemasMergeStrategy, optional) – strategy for schemas merge
schemas (list[dict], optional) – schemas
- schemas = None¶
- schemas_merge_strategy = None¶
- class ibm_watsonx_ai.foundation_models.schema.TextClassificationParameters(ocr_mode=None, classification_mode=None, auto_rotation_correction=None, languages=None, semantic_config=None)[source]¶
Bases:
BaseSchema
Parameters used for text classification.
- Parameters:
ocr_mode (OCRMode, optional) – whether OCR should be used when processing a document. An empty value allows the service to select the best option for your processing mode
classification_mode (ClassificationMode, optional) – classification mode. The value exact returns the name of the schema the document is classified to; the value binary only reports whether the document matches a known schema
auto_rotation_correction (bool, optional) – whether the service should attempt to fix a rotated page or image
languages (list[str], optional) – set of languages expected in the document. The language codes follow ISO 639 where possible; see the REST API documentation for the currently supported languages
semantic_config (TextClassificationSemanticConfig, optional) – additional configuration settings for the Semantic KVP model
- auto_rotation_correction = None¶
- classification_mode = None¶
- languages = None¶
- ocr_mode = None¶
- semantic_config = None¶
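Since run_job accepts its parameters either as a TextClassificationParameters object or as a dict, the parameters can also be assembled as plain data. The sketch below is hedged: `build_parameters` is a hypothetical helper, and the assumption that the dict form uses the same key names as the fields above is not confirmed by this reference. The allowed string values are the ones listed for OCRMode and ClassificationMode.

```python
# String values taken from the OCRMode and ClassificationMode listings.
OCR_MODES = {"disabled", "enabled", "forced"}
CLASSIFICATION_MODES = {"binary", "exact"}


def build_parameters(
    ocr_mode=None,
    classification_mode=None,
    auto_rotation_correction=None,
    languages=None,
):
    """Build a parameters dict, validating the enum-like string values.

    Assumption: dict keys mirror the TextClassificationParameters field names.
    Fields left as None are omitted so the service can pick its defaults.
    """
    if ocr_mode is not None and ocr_mode not in OCR_MODES:
        raise ValueError(f"unknown ocr_mode: {ocr_mode!r}")
    if classification_mode is not None and classification_mode not in CLASSIFICATION_MODES:
        raise ValueError(f"unknown classification_mode: {classification_mode!r}")

    candidate = {
        "ocr_mode": ocr_mode,
        "classification_mode": classification_mode,
        "auto_rotation_correction": auto_rotation_correction,
        "languages": list(languages) if languages is not None else None,
    }
    # Drop unset fields instead of sending explicit nulls.
    return {key: value for key, value in candidate.items() if value is not None}
```

The resulting dict could then be passed as the `parameters` argument of run_job; validating the strings up front catches typos such as "exakt" before a request is sent.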