Lucd Python Client | Custom Feature Transformation | 1.0.1

Custom Feature Transformation

This documentation describes how to use the Lucd Python libraries to create custom feature engineering operations that process data for model training. Custom operations let a Lucd user apply feature transformations that are not available in the Lucd GUI. A simple example is adjusting a given attribute as a function of the values of several other attributes. Together, these capabilities are intended to support nearly open-ended feature transformation.
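As an illustration of the kind of operation described above, the sketch below derives one attribute from the values of two others. The column names flower.petal_width and flower.petal_area are hypothetical (only flower.petal_length appears in this documentation), and the row is assumed to support dictionary-style access, as a Pandas Series or a plain dict does.

```python
def transform(row):
    # Set one attribute as a function of two other attributes.
    # 'flower.petal_width' and 'flower.petal_area' are hypothetical column names.
    row['flower.petal_area'] = row['flower.petal_length'] * row['flower.petal_width']
    return row


# Quick local check, with a plain dict standing in for a dataframe row.
sample = {
    'flower.petal_length': 2.0,
    'flower.petal_width': 0.5,
    'flower.petal_area': 0.0,
}
result = transform(sample)
print(result['flower.petal_area'])
```

Trying the function on a hand-built row like this is a cheap way to catch misspelled column names before the operation is submitted.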

The Custom Operation Module

The eda.int.custom_operation module is used for sending custom operations to the Lucd backend so that they are selectable in the Lucd EDA section of the GUI. The simple Python script below illustrates how one may define a custom feature engineering operation and send it to the Lucd backend using the Lucd Python libraries.

from eda.int import custom_operation
import lucd

# Custom operation: add 1 to the petal length of the row it receives.
def transform(row):
    row['flower.petal_length'] = row['flower.petal_length'] + 1
    return row

# Connect to the Lucd platform (replace the placeholder values with your own).
client = lucd.LucdClient(domain="https://saas.lucd.ai",
                         username="username",
                         password="password",
                         login_domain="Lucd Platform URL"
                         )

# Metadata displayed in the Lucd GUI, plus the function itself.
data = {
        "operation_name": "simple iris attribute adder",
        "author_name": "J. Branch",
        "author_email": "joel.branch@lucd.ai",
        "operation_description": "Testing custom feature op using simple transform",
        "operation_purpose": "Numeric feature scaling",
        "operation_features": ["flower.petal_length"],
        "operation_function": transform
}

# Submit the function and its metadata to the Lucd backend.
response_json, rv = custom_operation.create(data)

client.close()

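The create function from the eda.int.custom_operation module submits the function and its metadata to Lucd. The required attributes in the dict data are used for display in the Lucd GUI (and may at some point be used for search purposes). Table 1 describes the attributes in detail. Note that the example script passes these attributes under the keys operation_name, operation_purpose, operation_description, operation_features, and operation_function, while Table 1 lists them by shorter names.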

Table 1. Custom Operation Attribute Descriptions

Attribute            Description
name                 String name/label of the custom feature engineering operation
author_name          Name of the developer who wrote the operation
author_email         Author’s email
purpose              Short description of what the operation achieves
description          Longer description of how the operation might achieve its purpose, as well as other notes
feature              List of strings identifying features/facets the operation affects
transform_function   The Python function that implements the logic of the custom operation
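
Because every attribute is required, it can help to check the dict locally before calling create. The sketch below is a hypothetical helper and not part of the Lucd libraries. It assumes the key names used in the example script (operation_name, operation_function, and so on). Whether the Lucd backend performs its own validation is not stated here.

```python
# Key names taken from the example script; treat them as an assumption.
REQUIRED_KEYS = (
    "operation_name",
    "author_name",
    "author_email",
    "operation_description",
    "operation_purpose",
    "operation_features",
    "operation_function",
)


def check_operation_payload(data):
    """Return a list of problems found in the dict (empty if none)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in data]
    if "operation_function" in data and not callable(data["operation_function"]):
        problems.append("operation_function must be a callable")
    features = data.get("operation_features")
    if features is not None and not (
        isinstance(features, list) and all(isinstance(f, str) for f in features)
    ):
        problems.append("operation_features must be a list of strings")
    return problems


def transform(row):
    row['flower.petal_length'] = row['flower.petal_length'] + 1
    return row


data = {
    "operation_name": "simple iris attribute adder",
    "author_name": "J. Branch",
    "author_email": "joel.branch@lucd.ai",
    "operation_description": "Testing custom feature op using simple transform",
    "operation_purpose": "Numeric feature scaling",
    "operation_features": ["flower.petal_length"],
    "operation_function": transform,
}

problems = check_operation_payload(data)
print(problems)
```

An empty list means the dict has every expected key with a plausible type. Anything else can be fixed before the operation is sent.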

The custom transformation function must be defined in the same context in which the dict is built and create is called. This ensures that the function can be de-serialized properly on the Lucd backend.

Notes on Creating Custom Operations

Since Lucd uses the Dask framework for data transformation, the custom functions are expected to be compatible with conventional Dask and Pandas function application mechanisms. Custom functions can be applied to data (dataframes) via the following Dask functions:

As the example code above shows, the user does not specify in the create call which Dask mechanism is used to apply the function to data. The mechanism is selectable in the Lucd GUI when applying custom operations in the EDA section.
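
To see what row-wise application looks like in practice, the sketch below runs the same petal-length transform with Pandas DataFrame.apply and axis=1, so the function receives one row at a time. This is only a local illustration. Dask DataFrame offers an analogous apply with axis=1, as well as map_partitions, but which mechanism Lucd uses for a given operation is chosen in the GUI and is not assumed here. The row copy is a defensive choice of this sketch, since Pandas does not support functions that mutate the object passed to apply.

```python
import pandas as pd


def transform(row):
    # Work on a copy so the caller's data is left untouched.
    row = row.copy()
    row['flower.petal_length'] = row['flower.petal_length'] + 1
    return row


# Small stand-in dataframe; the values are made up for illustration.
df = pd.DataFrame({'flower.petal_length': [1.5, 4.5, 6.0]})

# axis=1 passes each row to transform as a Series.
out = df.apply(transform, axis=1)
print(out['flower.petal_length'].tolist())
```

Because transform returns a full row, the result is a dataframe with the same columns as the input.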
