Uses the Kodexa Data Definition to build prompts for labeling documents with Large Language Models.
llm-taxonomy-model
Version: 1.0.0
Infer: Yes
Event Aware: No
Option | Description |
---|---|
taxonomy | The data definition to use for extraction, defining the structure of data to extract |
label_document | When enabled, adds labels to the document based on extracted data |
set_external_data | Sets external data properties based on extracted structure |
apply_guidance | Uses pre-defined guidance to improve extraction accuracy |
external_data_key | Key to use when setting external data |
Name | Label | Type | Description | Default | Required |
---|---|---|---|---|---|
taxonomy | Data Definition | taxonomy | The data definition to use for the model | - | No |
label_document | Label Document | boolean | Label the document | True | No |
set_external_data | Set External Data | boolean | Set the external data to the structure from the data classes | False | No |
apply_guidance | Apply Guidance | boolean | Apply guidance, if found | False | No |
external_data_key | External Data Key | string | Key to use when setting external data | - | No |
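
As a rough illustration, the defaults in the table above can be written as a plain Python dictionary and combined with per-run overrides. This is a minimal sketch, not the Kodexa API: `DEFAULT_OPTIONS` and `resolve_options` are hypothetical names, and the taxonomy reference and external data key are placeholder values.

```python
# Defaults copied from the options table; "-" (no default) is shown as None.
DEFAULT_OPTIONS = {
    "taxonomy": None,
    "label_document": True,
    "set_external_data": False,
    "apply_guidance": False,
    "external_data_key": None,
}


def resolve_options(overrides):
    """Return the defaults updated with overrides, rejecting unknown names."""
    unknown = set(overrides) - set(DEFAULT_OPTIONS)
    if unknown:
        raise KeyError(f"Unknown option(s): {sorted(unknown)}")
    return {**DEFAULT_OPTIONS, **overrides}


# Hypothetical values: the taxonomy reference and key are placeholders.
options = resolve_options({
    "taxonomy": "my-org/my-data-definition",
    "set_external_data": True,
    "external_data_key": "invoice",
})
```

Rejecting unknown names catches misspelled options early, which matters here because every option is optional and a typo would otherwise silently leave the default in place.
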
Name | Label | Type | Description | Default |
---|---|---|---|---|
enable_line_fallback | Enable Line Fallback | boolean | Fall back to line-level labeling when there are multiple lines and the content cannot be found | False |
raise_exception_on_fallback | Raise Exception on Fallback | boolean | Raise an exception if labeling falls back to line level | False |
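
One way to read how the two fallback options interact is sketched below. This is an assumption based only on the descriptions in the table, not the model's actual implementation; `choose_labeling_level` and `LineFallbackError` are hypothetical names introduced for illustration.

```python
class LineFallbackError(Exception):
    """Hypothetical error raised when line-level fallback is not permitted."""


def choose_labeling_level(content_found, multiple_lines,
                          enable_line_fallback=False,
                          raise_exception_on_fallback=False):
    """Return "content", "line", or "none" for one extracted value."""
    if content_found:
        # The content was located, so no fallback is needed.
        return "content"
    if multiple_lines and enable_line_fallback:
        # Assumed behavior: the raise flag turns a fallback into an error.
        if raise_exception_on_fallback:
            raise LineFallbackError("Fell back to line-level labeling")
        return "line"
    # Fallback is disabled or does not apply; nothing is labeled.
    return "none"
```

Under this reading, `raise_exception_on_fallback` only has an effect when `enable_line_fallback` is also set, since both defaults are False.
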
Name | Label | Type | Description | Default |
---|---|---|---|---|
- | N/A | article | N/A | - |
embedded | Embedded | boolean | Treat as embedded | False |
cardinality | Cardinality | string | The cardinality of the data element in the chunk | single |
classificationStrategy | Classification Strategy | string | Should we chunk using the data element for classification | dataElement |
classificationContent | Classification Content | string | What content should we use for classification | text |
maxHits | Max Embedding Hits | number | The maximum number of hits to return from embeddings | 5 |
includeExplanation | Include Explanation | boolean | Try to include an explanation of the classification in the response, useful for debugging | True |
ignoreNonWords | Ignore Non-Words | boolean | Ignore non-word tokens when classifying | True |
restrictClassification | Restrict Classification | boolean | Restrict to only this classification (no mixed classes allowed) | True |
rerank | Rerank Classification Matches | boolean | Should we rerank the classification results | False |
maxPagesFromRerank | Max Pages From Rerank | number | The maximum number of pages to return from the rerank | 5 |
chunkingStrategy | Chunking Strategy | string | How should we chunk the document for the LLM | classifiedContent |
nPages | Number of Pages | number | The number of pages to use for the chunking strategy | 5 |
tagPage | Label Page | boolean | Label the page, if classified | True |
labelDocument | Tag Document | boolean | Add a tag to the document if classified | True |
promptStrategy | Prompt Strategy | string | Which prompt strategy should we use | layout |
image_width | Image Width | number | The width of the image | 350 |
skipExtraction | Skip Extraction | boolean | Should we skip extraction for this data element | False |
includeImages | Include Images | boolean | Should we include images in the prompt even if the strategy doesn’t normally include them | False |
enableThinkingMode | Enable Thinking Mode | boolean | Should the LLM use thinking mode (if it is available for the selected extraction model)? | False |
overrideExtractionModel | Override Extraction Model | boolean | Should we override the extraction model | False |
extractionModel | Extraction Model | cloudModel | Choose a model if you wish to override the extraction model | anthropic.claude-3-haiku-20240307-v1:0 |
enableStructureReview | Enable Structure Review | boolean | Should we enable the structure review | True |
structureReview | Structure Review Model | cloudModel | Choose a model if you wish to review the structure | anthropic.claude-3-haiku-20240307-v1:0 |
merge | Merge | boolean | Merge the objects identified in the chunks | True |
mergeWithAI | Merge with AI | boolean | Use AI to review chunks that are grouped and merge them into a single representation | False |
mergeInstructions | Merge Instructions | string | Additional instructions for merging the results to be included in the merge prompt | - |
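
The per-element properties above can be sketched the same way: start from the documented defaults and override only what a given data element needs. The helper below is hypothetical and covers a subset of the properties; the merge instruction text is an invented example value, and how the properties are actually attached to a data element is not shown here.

```python
# A subset of defaults copied from the table above.
ELEMENT_DEFAULTS = {
    "cardinality": "single",
    "chunkingStrategy": "classifiedContent",
    "nPages": 5,
    "promptStrategy": "layout",
    "overrideExtractionModel": False,
    "extractionModel": "anthropic.claude-3-haiku-20240307-v1:0",
    "merge": True,
    "mergeWithAI": False,
    "mergeInstructions": None,
}


def element_properties(**overrides):
    """Merge overrides into the defaults and check each value's type."""
    props = dict(ELEMENT_DEFAULTS)
    for name, value in overrides.items():
        if name not in props:
            raise KeyError(f"Unknown property: {name}")
        default = props[name]
        # Properties with no default (None) are documented as strings.
        expected = str if default is None else type(default)
        if type(value) is not expected:
            raise TypeError(f"{name} expects {expected.__name__}")
        props[name] = value
    return props


# Example: turn on AI-assisted merging with an extra instruction.
props = element_properties(
    mergeWithAI=True,
    mergeInstructions="Prefer the most recent value when dates conflict.",
)
```
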