Models

Models are a key component in Kodexa. They bring intelligent processing to documents: they support pluggable parsing, transformation, and labeling of documents, and they can also be trained.

Many products have a concept of a model, but models in Kodexa are a bit different. A model is not just the code and processing; it also includes how the user experiences training it. When we develop a model, we therefore consider the user experience, the training data, and the model code together.

Model Training Workflow

While Kodexa can support models running in several ways, we use a standard workflow to train them.

[Figure: model training workflow]

This workflow lets you build a model that can be trained and deployed in several different ways.

Anatomy of a Model

[Figure: anatomy of a model]

A model is made up of a few key concepts:

  • Model Code: the code that parses, transforms, and labels documents
  • Training Options: the options used to train the model
  • Inference Options: the options used to run the model
  • Model Taxonomy: the taxonomy that defines the structure of the labels the model uses to “guide” the user and the training process
  • Additional Taxonomy Options: additional options that the model can add to the taxonomies used for extraction

Together, these parts let you build and deploy models that support flexible training and inference, and that also give you rich ways to capture the user's knowledge about the documents.
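As a rough illustration of how these parts fit together, the sketch below groups them into a single Python structure. It is not the Kodexa API: the `ModelDefinition` class, its field names, and the toy `label_keywords` function are all hypothetical and exist only to make the anatomy concrete.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ModelDefinition:
    """Hypothetical container for the parts of a model (not a Kodexa class)."""

    # Model code: a callable that takes document text and options, returns labels
    model_code: Callable[[str, Dict[str, Any]], List[str]]
    # Options used when training the model
    training_options: Dict[str, Any] = field(default_factory=dict)
    # Options used when running the model
    inference_options: Dict[str, Any] = field(default_factory=dict)
    # Labels that guide the user and the training process
    model_taxonomy: List[str] = field(default_factory=list)
    # Extra options the model contributes to extraction taxonomies
    additional_taxonomy_options: Dict[str, Any] = field(default_factory=dict)


def label_keywords(text: str, options: Dict[str, Any]) -> List[str]:
    """Toy model code: return each label whose keyword appears in the text."""
    keywords = options.get("keywords", {})
    lowered = text.lower()
    return [label for label, word in keywords.items() if word.lower() in lowered]


invoice_model = ModelDefinition(
    model_code=label_keywords,
    inference_options={"keywords": {"invoice_number": "Invoice No", "total": "Total"}},
    model_taxonomy=["invoice_number", "total"],
)

labels = invoice_model.model_code(
    "Invoice No 1234, Total due: 50", invoice_model.inference_options
)
print(labels)
```

Keeping the options and taxonomy next to the code, rather than inside it, mirrors the idea that a model is more than its processing logic.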

Trainable vs Non-Trainable Models

Models in Kodexa are either trainable or “pre-trained”. With a trainable model, the user can label documents and train the model themselves using the UI. A “pre-trained” model is already trained, so the user can use it to label documents straight away. An example of a “pre-trained” model is the Azure Invoice Form Recognizer model: the user can use it to label invoices but cannot train it themselves.

How the user interacts with a model depends on whether it is trainable. To use a trainable model, the user needs to add it to the project. To use a non-trainable model, the user needs to use it in an assistant, but does not need to add it to the project.
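The rule above can be sketched as a small lookup. This is only an illustration of the decision, assuming a simple boolean flag; the `required_setup` function and the step names are hypothetical and do not correspond to any Kodexa call.

```python
from typing import List

# Hypothetical step names, used only for this illustration
ADD_TO_PROJECT = "add the model to the project"
USE_IN_ASSISTANT = "use the model in an assistant"


def required_setup(trainable: bool) -> List[str]:
    """Return the setup step the text above requires for a model."""
    if trainable:
        # Trainable models must be added to the project
        return [ADD_TO_PROJECT]
    # Non-trainable models are used in an assistant and need not be in the project
    return [USE_IN_ASSISTANT]


for flag in (True, False):
    print(flag, required_setup(flag))
```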
