AWS Textract V3
Extracts data from forms and tables using OCR and machine learning
Slug: aws-textract-model-v3
Version: 1.0.0
Infer: Yes
Overview
AWS Textract V3 Model
The AWS Textract V3 model builds on previous versions by adding support for external data integration. The model can use preprocessing information to rotate bounding boxes so that they align with the document's orientation, which improves the accuracy of text positioning and extraction.
Improvements in V3
- External Data Integration: Ability to use preprocessing data to improve extraction
- Enhanced Rotation Support: Better handling of rotated text in documents
- Pre-processing Awareness: Takes advantage of earlier document analysis results
- Intelligent Coordinate Transformations: Applies rotations based on document orientation
- All V2 Enhancements: Includes all improvements from the V2 model
How It Works
- The model uploads your document to AWS Textract
- Textract analyzes the document using specialized machine learning algorithms
- The model integrates external data (if provided) to inform coordinate transformations
- V3 applies improved bounding box calculations with rotation awareness
- The model processes the results, including:
  - Detected text with orientation-aware positioning
  - Form fields with key-value pairs
  - Table structures with row and column data
- Results are converted into a rich Kodexa document structure with highly accurate spatial information
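The result-processing step can be pictured with the block list that Textract returns. The following is a minimal sketch, assuming a response dictionary shaped like Textract's document analysis output (`Blocks`, `BlockType`, `Geometry.BoundingBox`). The helper names and the sample data are illustrative only, and this is not the model's own code or the Kodexa conversion step.

```python
from collections import defaultdict

def group_blocks_by_type(response):
    """Group the blocks of a Textract-style response by their BlockType."""
    grouped = defaultdict(list)
    for block in response.get("Blocks", []):
        grouped[block["BlockType"]].append(block)
    return dict(grouped)

def line_positions(response):
    """Return (text, (left, top, width, height)) for every LINE block."""
    positions = []
    for block in response.get("Blocks", []):
        if block["BlockType"] != "LINE":
            continue
        box = block["Geometry"]["BoundingBox"]
        positions.append(
            (block.get("Text", ""),
             (box["Left"], box["Top"], box["Width"], box["Height"]))
        )
    return positions

# Made-up sample data, trimmed to the fields used above.
sample_response = {
    "Blocks": [
        {"BlockType": "LINE", "Text": "Invoice 42",
         "Geometry": {"BoundingBox": {"Left": 0.1, "Top": 0.05,
                                      "Width": 0.3, "Height": 0.02}}},
        {"BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"]},
        {"BlockType": "TABLE"},
    ]
}

grouped = group_blocks_by_type(sample_response)
print(sorted(grouped))                  # block types present in the sample
print(line_positions(sample_response))  # text lines with their boxes
```

Bounding box values in this format are fractions of the page width and height, which is why later coordinate transformations can work without knowing the pixel size of the page.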
Options Configuration
Option | Description |
---|---|
ignore_dash_lines | When enabled, removes dash-only lines from the extracted document structure |
apply_skew | When enabled, corrects for document skew in the text positioning calculations |
external_data_key | Key to access external preprocessing data for improved coordinate transformation |
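To make the effect of ignore_dash_lines concrete, here is a rough sketch of what dropping dash-only lines could look like. It assumes that a "dash-only" line consists solely of hyphens, en dashes, or em dashes, optionally separated by whitespace. That definition and the function names are assumptions for illustration; the model's actual rule may differ.

```python
import re

# Assumption: hyphen, en dash (U+2013) and em dash (U+2014) count as dashes.
DASH_ONLY = re.compile(r"^[\s\-\u2013\u2014]+$")

def is_dash_only(text):
    """True when the line has at least one dash and nothing but dashes/whitespace."""
    has_visible_char = any(not ch.isspace() for ch in text)
    return has_visible_char and bool(DASH_ONLY.match(text))

def drop_dash_lines(lines):
    """Return the lines with dash-only entries removed, order preserved."""
    return [line for line in lines if not is_dash_only(line)]

extracted = ["Total", "-----", "- - -", "a-b", "   "]
print(drop_dash_lines(extracted))
```

Lines that merely contain a dash, such as `a-b`, are kept, as are whitespace-only lines, since neither is made up exclusively of dashes.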
Extraction Capabilities
AWS Textract V3 excels at extracting:
- Rotated Text: Better handling of text at various orientations
- Text Content: Words, lines, and paragraphs with orientation-aware positioning
- Form Fields: Automatically identifies key-value pairs in forms
- Tables: Detects tabular structures with row and column relationships
- Handwriting: Identifies and extracts handwritten text
- Document Layout: Preserves the visual structure of the document with the highest fidelity
Use Cases
This model is particularly useful for:
- Complex Document Layouts: Documents with mixed orientations and rotated sections
- Pre-processed Documents: Cases where document analysis has already identified orientation
- High-Precision Layout Analysis: Applications requiring accurate text positioning
- Forms Processing: Extracting data from invoices, applications, and forms
- Table Extraction: Converting tabular information into structured data
- Document Digitization: Converting paper or image-based documents to digital formats
When to Use V3 vs. Previous Versions
Choose Textract V3 when:
- You have preprocessing information about document orientation
- You’re working with documents containing rotated text or sections
- You need the highest possible spatial accuracy
- You’re processing complex documents with mixed orientations
- Your document processing pipeline includes orientation detection steps
External Data Integration
The external_data_key option allows you to specify where to find preprocessing information about the document’s orientation. This data is used to apply appropriate rotations to the bounding boxes extracted from Textract, resulting in more accurate text positioning regardless of the original document orientation.
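The rotation step can be sketched as follows. This example assumes that bounding boxes are normalized to the 0..1 range with the origin at the top-left corner, that the rotation is a clockwise multiple of 90 degrees, and that the external data is a dictionary with a `rotation` entry under the configured key. The `Box` type, the function names, and the layout of the external data are all hypothetical; the source does not describe the actual format.

```python
from typing import NamedTuple

class Box(NamedTuple):
    left: float
    top: float
    width: float
    height: float

def rotate_90_clockwise(box):
    """Rotate a normalized box with the page by 90 degrees clockwise.

    A point (x, y) maps to (1 - y, x), so width and height swap.
    """
    return Box(left=1.0 - box.top - box.height, top=box.left,
               width=box.height, height=box.width)

def rotate_box(box, degrees):
    """Apply a clockwise rotation given in multiples of 90 degrees."""
    if degrees % 90 != 0:
        raise ValueError("this sketch only handles multiples of 90 degrees")
    for _ in range((degrees // 90) % 4):
        box = rotate_90_clockwise(box)
    return box

def rotation_from_external_data(external_data, key):
    """Hypothetical lookup: read the rotation stored under the given key."""
    entry = external_data.get(key) or {}
    return int(entry.get("rotation", 0))

# Hypothetical preprocessing output and option value.
external_data = {"orientation": {"rotation": 90}}
degrees = rotation_from_external_data(external_data, "orientation")
print(rotate_box(Box(0.25, 0.5, 0.25, 0.125), degrees))
```

Arbitrary angles would need the four corner points of each box to be rotated instead, which this sketch leaves out on purpose.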
Inference Options
The following options can be configured when using this model for inference:
Name | Label | Type | Description | Default | Required |
---|---|---|---|---|---|
ignore_dash_lines | Ignore Dash Line | boolean | Remove dash-only lines from the extracted document structure | False | No |
apply_skew | Apply Skew | boolean | Apply skew correction to the document | True | No |
external_data_key | External Data Key | string | Key used to access external preprocessing data for coordinate transformation | - | No |
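As an illustration of how the defaults in the table combine with user-supplied values, the sketch below merges a set of overrides into the documented defaults and checks their types. The option names, types, and defaults come from the table; the `resolve_options` helper is hypothetical, and representing the missing default of external_data_key as `None` is an assumption.

```python
# Defaults as listed in the table; None stands in for "no default" (assumption).
DEFAULT_OPTIONS = {
    "ignore_dash_lines": False,
    "apply_skew": True,
    "external_data_key": None,
}

EXPECTED_TYPES = {
    "ignore_dash_lines": bool,
    "apply_skew": bool,
    "external_data_key": str,
}

def resolve_options(overrides=None):
    """Merge overrides into the defaults, rejecting unknown names and wrong types."""
    options = dict(DEFAULT_OPTIONS)
    for name, value in (overrides or {}).items():
        if name not in DEFAULT_OPTIONS:
            raise KeyError(f"unknown option: {name}")
        if value is not None and not isinstance(value, EXPECTED_TYPES[name]):
            raise TypeError(f"{name} expects {EXPECTED_TYPES[name].__name__}")
        options[name] = value
    return options

print(resolve_options({"external_data_key": "orientation"}))
```

Since none of the options is required, calling `resolve_options()` with no arguments simply returns the defaults.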
Model Details
- Provider: Amazon Web Services