Working with Projects

Understanding the Project Class

The Project class is a fundamental component of the Kodexa SDK, representing a container for your data extraction and processing tasks. Here's the definition of the Project class:

from pydantic import BaseModel, Field, ConfigDict
from typing import Optional
from kodexa import Organization, User, ProjectMetadata, ProjectStatus, ProjectOptions

class Project(BaseModel):
    model_config = ConfigDict(
        populate_by_name=True,
        use_enum_values=True,
        arbitrary_types_allowed=True,
        protected_namespaces=("model_config",),
    )
    id: Optional[str] = Field(None)
    organization: Optional[Organization] = None
    name: Optional[str] = None
    description: Optional[str] = None
    metadata: Optional[ProjectMetadata] = None
    has_image: Optional[bool] = Field(None, alias="hasImage")
    status: Optional[ProjectStatus] = None
    owner: Optional[User] = None
    options: Optional[ProjectOptions] = Field(None, alias="options")

Let's break down the key attributes of the Project class:

  • id: A unique identifier for the project.
  • organization: The organization to which the project belongs.
  • name: The name of the project.
  • description: A brief description of the project.
  • metadata: Additional metadata associated with the project.
  • has_image: Indicates whether the project has an associated image.
  • status: The current status of the project.
  • owner: The user who owns the project.
  • options: Additional options for the project.

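The Project class is built on Pydantic's BaseModel, which validates field values against their declared types. The model_config settings let a field be populated by either its Python name or its alias (populate_by_name), store an enum's value rather than the enum member (use_enum_values), and permit field types that Pydantic cannot validate natively (arbitrary_types_allowed). Setting protected_namespaces to ("model_config",) narrows Pydantic's reserved prefix, so field names that begin with model_ do not trigger a namespace warning.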
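The effect of these settings is easier to see on a small stand-in model. The sketch below assumes Pydantic v2 is installed; DemoProject and DemoStatus are invented for illustration and are not part of the Kodexa SDK (in particular, DemoStatus is not meant to mirror ProjectStatus).

```python
from enum import Enum
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field


class DemoStatus(str, Enum):
    # Hypothetical enum, used only to show use_enum_values
    ACTIVE = "active"


class DemoProject(BaseModel):
    # Hypothetical model that borrows two settings from Project's model_config
    model_config = ConfigDict(populate_by_name=True, use_enum_values=True)

    name: Optional[str] = None
    has_image: Optional[bool] = Field(None, alias="hasImage")
    status: Optional[DemoStatus] = None


# populate_by_name: both the alias and the Python field name are accepted
from_alias = DemoProject(hasImage=True)
from_name = DemoProject(has_image=True)

# use_enum_values: the plain value is stored instead of the enum member
with_status = DemoProject(name="Demo", status=DemoStatus.ACTIVE)

# by_alias=True serializes has_image under its camelCase alias
dumped = with_status.model_dump(by_alias=True)
```

Without populate_by_name, only the alias (hasImage) would be accepted as a constructor argument, which is why the setting matters for a model that is built both from API responses and from Python code.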

Now that we understand the structure of a Project, let's proceed with creating a project and loading a taxonomy.

Step 1: Initialize the Kodexa Client and Organization

First, you need to have a Kodexa client and an organization endpoint:

from kodexa import KodexaClient, OrganizationEndpoint

client = KodexaClient(...)  # Initialize your Kodexa client
organization = ...  # Get your organization endpoint

Step 2: Create the Project

Next, create a new Project object with a name and description:

from kodexa import Project

project_name = "Test Project"
project_description = "Test Project Description"
project = Project(name=project_name, description=project_description, organization=organization.detach())

This builds a local Project object with the given name and description; nothing has been sent to the platform yet. Calling organization.detach() passes a detached copy of the organization into the model rather than the live endpoint object.

Step 3: Create the Project Using a Template

Use the Kodexa client to create the project based on a template:

template_ref = "data-extraction-starter"
project = client.projects.create(project, template_ref=template_ref)

This call creates the project on the Kodexa platform, using the specified template as a starting point. The returned object replaces the local project variable and is the one used in the following steps.

Step 4: Load a Local Taxonomy (Optional)

If you have a local taxonomy file, you can load it and update the project's taxonomy:

from yaml import safe_load
from kodexa import TaxonomyEndpoint

taxonomy = None
taxonomy_yaml = "path/to/your/taxonomy.yaml"  # Set to None if not using a local taxonomy

if taxonomy_yaml is not None:
    # Parse the YAML file into a TaxonomyEndpoint bound to the client
    with open(taxonomy_yaml) as taxonomy_file:
        local_taxonomy = TaxonomyEndpoint.model_validate(safe_load(taxonomy_file)).set_client(client)

    # Replace the taxons of the project's first taxonomy with the local ones
    taxonomy = project.taxonomies.list()[0]
    taxonomy.taxons = local_taxonomy.taxons
    taxonomy.update()

This code block loads a local taxonomy file (if specified), creates a TaxonomyEndpoint object, and updates the project's existing taxonomy with the loaded taxons.
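The update relies on the template having created at least one taxonomy; if the list is empty, indexing it with [0] raises a bare IndexError. A minimal sketch of a guard follows. The helper first_or_fail is hypothetical and not part of the Kodexa SDK, and it assumes the value returned by project.taxonomies.list() behaves like a sequence, as the indexing in the example suggests.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def first_or_fail(items: Sequence[T], what: str) -> T:
    """Return the first item, or raise a descriptive error if there is none."""
    if len(items) == 0:
        raise ValueError(f"No {what} found; check that the template defines one")
    return items[0]


# Intended use in the step above (hypothetical, requires a created project):
# taxonomy = first_or_fail(project.taxonomies.list(), "taxonomy")
```

The only change in behavior is the error message: a template without a taxonomy is reported by name instead of surfacing as an index error.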

Putting It All Together

Here's the complete code that combines all these steps:

from kodexa import KodexaClient, OrganizationEndpoint, Project, TaxonomyEndpoint
from yaml import safe_load

# Initialize client and organization
client = KodexaClient(...)  # Initialize your Kodexa client
organization = ...  # Get your organization endpoint

# Build the local project object
project_name = "Test Project"
project_description = "Test Project Description"
project = Project(name=project_name, description=project_description, organization=organization.detach())

# Create the project on the platform using the template
template_ref = "data-extraction-starter"
project = client.projects.create(project, template_ref=template_ref)

# Load and update taxonomy (if applicable)
taxonomy = None
taxonomy_yaml = "path/to/your/taxonomy.yaml"  # Set to None if not using a local taxonomy

if taxonomy_yaml is not None:
    # Parse the YAML file into a TaxonomyEndpoint bound to the client
    with open(taxonomy_yaml) as taxonomy_file:
        local_taxonomy = TaxonomyEndpoint.model_validate(safe_load(taxonomy_file)).set_client(client)

    # Replace the taxons of the project's first taxonomy with the local ones
    taxonomy = project.taxonomies.list()[0]
    taxonomy.taxons = local_taxonomy.taxons
    taxonomy.update()

# The project and taxonomy (if loaded) are now ready to use
print(f"Project '{project.name}' created successfully.")
if taxonomy:
    print("Local taxonomy loaded and updated.")

These steps create a new project from a template and, optionally, replace the taxons of its taxonomy with those from a local YAML file. Keeping the taxonomy in a file lets you apply the same configuration to each new project you set up.

Remember to replace the placeholder values (like the client initialization and organization endpoint) with your actual Kodexa account details. Also, ensure you have the necessary permissions to create projects and modify taxonomies in your Kodexa environment.
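One way to keep account details out of the script is to read them from environment variables. The sketch below is an assumption-laden illustration, not the SDK's own mechanism: read_kodexa_settings is a hypothetical helper, and the variable names KODEXA_URL and KODEXA_ACCESS_TOKEN are chosen for this example.

```python
import os
from typing import Mapping


def read_kodexa_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read connection settings, failing early if any are missing."""
    # Variable names are assumptions made for this sketch
    names = {"url": "KODEXA_URL", "access_token": "KODEXA_ACCESS_TOKEN"}
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}
```

If the client constructor accepts url and access_token keyword arguments, the result could be passed as KodexaClient(**read_kodexa_settings()); check the SDK reference for the exact parameter names before relying on this.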