The parameters for the text extraction.

interface TextClassificationParameters {
    auto_rotation_correction?: boolean;
    classification_mode?: string;
    languages?: string[];
    ocr_mode?: string;
    semantic_config?: TextClassificationSemanticConfig;
}

Properties

auto_rotation_correction?: boolean

Whether the service should attempt to correct a rotated page or image.

classification_mode?: string

The classification mode. The value `exact` gives the exact name of the schema that the document is classified to. The value `binary` only indicates whether the document is classified to a known schema or not.

languages?: string[]

The languages expected in the document. The language codes follow ISO 639 where possible. See the documentation for the currently supported languages.

ocr_mode?: string

Whether OCR should be used when processing a document. An empty value allows the service to select the best option for your processing mode.

  • enabled: OCR is run on embedded images, and only where no programmatic text could be extracted from the area.
  • disabled: OCR is not run; no information is extracted from images or scanned documents.
  • forced: WDU takes a picture of the page and runs OCR across it. This applies to all documents, even purely programmatic ones.
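Because `ocr_mode` is typed as a plain string, a typo would only surface at request time. The following is a minimal sketch of a client-side check against the three values listed above. The names `OcrParameters`, `OCR_MODES`, `OcrMode`, `isOcrMode`, and `withOcrMode` are hypothetical and not part of the documented API, and the sketch assumes that omitting the field is how an "empty value" is expressed.

```typescript
// Local stand-in for the documented interface, reduced to the one field used here.
interface OcrParameters {
    ocr_mode?: string;
}

// The three values documented above. The union type is an illustrative
// assumption; the documented type of ocr_mode is string.
const OCR_MODES = ["enabled", "disabled", "forced"] as const;
type OcrMode = (typeof OCR_MODES)[number];

// Type guard: true only for one of the documented mode names.
function isOcrMode(value: string): value is OcrMode {
    return OCR_MODES.some((mode) => mode === value);
}

// Hypothetical helper: returns parameters with a checked ocr_mode, or with the
// field omitted so that the service can select the mode itself.
function withOcrMode(mode?: string): OcrParameters {
    if (mode === undefined || mode === "") {
        return {};
    }
    if (!isOcrMode(mode)) {
        throw new Error(`Unsupported ocr_mode: ${mode}`);
    }
    return { ocr_mode: mode };
}
```

Calling `withOcrMode("forced")` yields an object with `ocr_mode` set, while `withOcrMode()` yields an empty object and leaves the choice to the service.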

semantic_config?: TextClassificationSemanticConfig

Additional configuration settings for the Semantic KVP model.
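To show how the optional properties fit together, here is a minimal sketch of a parameters object. The interface is redeclared locally so the snippet compiles on its own. The fields of `TextClassificationSemanticConfig` are not listed in this section, so it is modeled as an open record, which is an assumption. The language codes are illustrative and may not match the currently supported set, and JSON serialization stands in for whatever transport the client actually uses.

```typescript
// Assumption: the real shape of the semantic config is not shown in this
// section, so an open record is used as a placeholder.
type TextClassificationSemanticConfig = Record<string, unknown>;

// Local copy of the documented interface.
interface TextClassificationParameters {
    auto_rotation_correction?: boolean;
    classification_mode?: string;
    languages?: string[];
    ocr_mode?: string;
    semantic_config?: TextClassificationSemanticConfig;
}

const params: TextClassificationParameters = {
    // Ask the service to correct rotated pages or images.
    auto_rotation_correction: true,
    // Report the exact schema name rather than a known/unknown answer.
    classification_mode: "exact",
    // Illustrative ISO 639 codes; check the list of supported languages.
    languages: ["en", "de"],
    // ocr_mode is omitted so that the service selects the best option.
};

// Every property is optional, so omitted fields do not appear in the output.
const body: string = JSON.stringify(params);
```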