- Document Stores: These specialize in storing files and their corresponding document representations.
- Data Stores: These focus on holding extracted data objects and attributes identified within documents stored in a Document Store.
Concept and Design
Stores in Kodexa are designed to manage both native files and their associated “Document” representations, which contain unstructured data. The process involves defining a Data Structure to label documents, enabling the platform to convert these labeled documents into a structured format.
Detailed Explanation of Store Types
Document Stores
Document Stores play a pivotal role in managing files that are subject to parsing, labeling, and conversion into structured data. The term “document” here implies that upon uploading a file (like a PDF), Kodexa creates a “container.” This container holds:
- The original file (referred to as the native file).
- One or more Kodexa Documents representing the semi-structured version of the native file.
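The container idea can be pictured with a small data model. The sketch below is illustrative only and does not use the Kodexa SDK: the class names `DocumentContainer` and `DocumentRepresentation`, their fields, and the sample file are all assumptions chosen to mirror the description above (one native file, one or more document representations).

```python
from dataclasses import dataclass, field


@dataclass
class DocumentRepresentation:
    """Hypothetical stand-in for one Kodexa Document derived from a native file."""
    label: str     # illustrative name for this version, e.g. "parsed" or "labeled"
    content: dict  # illustrative semi-structured payload


@dataclass
class DocumentContainer:
    """Hypothetical container created when a file is uploaded to a Document Store."""
    native_filename: str
    native_bytes: bytes
    representations: list = field(default_factory=list)

    def add_representation(self, representation: DocumentRepresentation) -> None:
        # A container keeps the native file and accumulates representations of it.
        self.representations.append(representation)


# Uploading a PDF yields one container holding the native file...
container = DocumentContainer("invoice.pdf", b"%PDF-1.7 (sample bytes)")

# ...and one or more representations of that same file.
container.add_representation(DocumentRepresentation("parsed", {"pages": 2}))
container.add_representation(
    DocumentRepresentation("labeled", {"pages": 2, "labels": ["invoice_number"]})
)

print(container.native_filename, [r.label for r in container.representations])
```

Keeping the native file and its representations in one object reflects the point that the representations are always tied back to the file they were derived from.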
Data Stores
Data Stores are engineered to manage structured data extracted from labeled documents stored in a Document Store. They are interconnected with a Data Structure (internally termed a Taxonomy), which:
- Formalizes the data structure into groups and individual data attributes.
- Stores actual data points and their related groups in the Data Store, with a lineage tracing back to the document representation in the Document Store.
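The relationship between a Data Structure, the stored data points, and their lineage can be sketched as follows. This is a minimal illustration under stated assumptions, not the Kodexa API: `TAXONOMY`, `Lineage`, `DataAttribute`, `DataGroup`, `add_attribute`, and the identifier `doc-001` are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field

# Hypothetical Data Structure (Taxonomy): group name -> attribute names it allows.
TAXONOMY = {"invoice": ["invoice_number", "total"]}


@dataclass(frozen=True)
class Lineage:
    """Hypothetical pointer back to the document representation a value came from."""
    document_id: str
    location: str


@dataclass
class DataAttribute:
    name: str
    value: str
    lineage: Lineage


@dataclass
class DataGroup:
    name: str
    attributes: list = field(default_factory=list)


def add_attribute(group: DataGroup, name: str, value: str, lineage: Lineage) -> None:
    """Store a data point only if the taxonomy defines it for this group."""
    if name not in TAXONOMY.get(group.name, []):
        raise ValueError(f"{name!r} is not defined for group {group.name!r}")
    group.attributes.append(DataAttribute(name, value, lineage))


invoice = DataGroup("invoice")
add_attribute(invoice, "invoice_number", "INV-1001", Lineage("doc-001", "page 1"))
add_attribute(invoice, "total", "250.00", Lineage("doc-001", "page 2"))

# Every stored value can be traced back to its source document.
for attribute in invoice.attributes:
    print(attribute.name, attribute.value, attribute.lineage.document_id)
```

Validating against the taxonomy at write time is one possible design choice; it keeps the stored groups and attributes consistent with the declared structure.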