Introduction

Introduction to Kodexa

Kodexa is an extensible platform for data processing, data management, and machine learning. It is designed to handle a broad range of unstructured data and intelligent document processing workloads.

Data Storage Framework

The platform organizes data storage across various locations within its deployment infrastructure:

  1. Object Storage: This serves as the repository for binary data and native files.
  2. Database: Used for storing metadata, configuration details, and structured data.
  3. Search Indexing: An optional component that provides additional indexing of metadata, configuration, and structured data.
  4. Analytics Datamart: Another optional element that holds structured data, audit logs, and extraction metadata.

Efficient Data Movement

Data enters the platform via a secure REST API. Within the core framework, data moves between the platform instance, Object Storage, and the Database. Depending on the deployment configuration, data may also be replicated to search indexes, an analytics datamart, or both, to support analytic reporting.
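The relationship between the deployment configuration and the storage locations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `DeploymentConfig`, its two flags, `storage_targets`, and the location names are hypothetical and are not part of any Kodexa API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentConfig:
    """Hypothetical flags for the two optional storage components."""
    search_index_enabled: bool = False
    analytics_datamart_enabled: bool = False


def storage_targets(config: DeploymentConfig) -> list[str]:
    """Return the storage locations that data can reach in this deployment."""
    # Object Storage and the Database are always part of the core framework.
    targets = ["object_storage", "database"]
    # The optional components only receive replicated data when enabled.
    if config.search_index_enabled:
        targets.append("search_index")
    if config.analytics_datamart_enabled:
        targets.append("analytics_datamart")
    return targets
```

With the defaults, only the two core locations are returned; enabling a flag adds the matching optional location.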

Extension Packs, which add functionality to the platform, can be deployed within the Kubernetes cluster or on cloud-native serverless architectures. In either case, they communicate exclusively with the REST API.
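As a rough illustration of a component that talks only to the REST API, the sketch below builds an authenticated JSON request using the Python standard library. The host, the path, the payload, and the use of an `Authorization: Bearer` header are all assumptions made for the example; they are not taken from the Kodexa API reference, which should be consulted for the real endpoints and header names. The request is built but not sent.

```python
import json
import urllib.request


def build_api_request(base_url: str, path: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    # Join the base URL and path with exactly one slash between them.
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Assumption: a bearer-token header; the real header may differ.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_api_request(
    "https://platform.example.com",  # hypothetical host
    "/api/example",                  # hypothetical path
    "example-token",                 # placeholder token
    {"status": "processed"},         # placeholder payload
)
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is left out here because the endpoint is a placeholder.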

Network Configuration and Access

Typically, the platform is deployed in a Kubernetes environment with ingress enabled. Network access is largely confined to a private network, with options to integrate into an enterprise network using standard cloud capabilities.

Secure Authentication Protocols

Deployments commonly use either an OAuth provider or an internal security system for authentication. User sessions are managed with JSON Web Tokens (JWTs), which are generated upon successful authentication and then carried through the user's session.
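To make the token-based session idea concrete, here is a minimal sketch of issuing and verifying an HS256-signed JWT with only the Python standard library. It illustrates the general JWT mechanism, not how the platform itself issues tokens: the signing algorithm, the claim names, the lifetime, and the helper names `issue_token` and `verify_token` are assumptions for the example. A production system would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(subject: str, secret: bytes, ttl_seconds: int = 3600) -> str:
    """Create a signed token after a (hypothetical) successful login."""
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": subject, "exp": int(time.time()) + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(token: str, secret: bytes) -> dict:
    """Return the claims if the signature and expiry are valid, else raise ValueError."""
    try:
        signing_input, signature = token.rsplit(".", 1)
        header_part, claims_part = signing_input.split(".")
    except ValueError:
        raise ValueError("malformed token")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("invalid signature")
    if json.loads(_b64url_decode(header_part)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(claims_part))
    if claims["exp"] < time.time():
        raise ValueError("token expired")
    return claims
```

A caller would issue the token once at login and run `verify_token` on each subsequent request, rejecting the request whenever a `ValueError` is raised.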

Robust Encryption Standards

The platform supports encryption of data both in transit and at rest, protecting data throughout the deployment.
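On the client side, in-transit encryption usually comes down to connecting over TLS with certificate verification enabled. The sketch below shows one way a Python client could configure this with the standard `ssl` module. The TLS 1.2 minimum and the function name `strict_tls_context` are assumptions for the example, not documented platform requirements.

```python
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Build a client TLS context that verifies certificates and hostnames."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Assumption: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The returned context can be passed to `urllib.request.urlopen(..., context=...)` or to other standard-library clients that accept an `SSLContext`.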