Data Encoding (DP_3)

This function encodes data into a format that machine learning algorithms can work with. Supported techniques include, but are not limited to, one-hot, label, and categorical encoding.

Encoding converts data into a format that a computer can process, for example by mapping text to numerical values or by grouping data into discrete categories. The goal is to make it possible for a machine learning algorithm to interpret and learn from the data.

There are many different types of encoding techniques, such as one-hot encoding, which converts categorical data into a binary format, and label encoding, which assigns a unique numerical value to each category in a categorical variable. The appropriate encoding technique depends on the type of data and the machine learning algorithm being used.
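To make the difference between the two techniques concrete, here is a minimal pure-Python sketch. It is an illustration only and not the engine's implementation; the function names and the choice to order categories alphabetically are assumptions made for the example.

```python
def label_encode(values):
    """Map each distinct category to an integer (categories sorted alphabetically)."""
    categories = sorted(set(values))
    mapping = {cat: i for i, cat in enumerate(categories)}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Turn each value into a binary row with a single 1 in its category's column."""
    categories = sorted(set(values))
    rows = [[1 if v == cat else 0 for cat in categories] for v in values]
    return rows, categories


colors = ["red", "green", "blue", "green"]
labels, mapping = label_encode(colors)
onehot, columns = one_hot_encode(colors)

print(labels)   # [2, 1, 0, 1]
print(columns)  # ['blue', 'green', 'red']
print(onehot)   # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Label encoding keeps a single column but introduces an ordering between categories, which some algorithms may treat as meaningful. One-hot encoding avoids that at the cost of one column per category.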

Sample Request

This request encodes the categorical values in the X variable using the one-hot method and leaves the Y variable unencoded.

{
    "project_id": 1,
    "parent_id": 2,
    "block_id": 3,
    "function_code": "DP_3",
    "args": {
        "xvalue": {
            "encode": true,
            "encoding_type": "onehot",
            "remainder": "passthrough",
            "index": 0
        },
        "yvalue": {
            "encode": false,
            "encoding_type": "categorical",
            "remainder": "drop",
            "index": 2
        },
        "save_name": "testweights",
        "load_name": "testweights"
    }
}
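The sample request can be assembled and sent from Python roughly as follows. This is a sketch using only the standard library. The helper names `build_encoding_request` and `send` are hypothetical, and because this section does not describe authentication, any required credentials are passed in through an assumed `extra_headers` argument rather than a specific header name.

```python
import json
import urllib.request

ENGINE_URL = "https://autogon.ai/api/v1/engine/start"


def build_encoding_request(project_id, parent_id, block_id, extra_headers=None):
    """Build (but do not send) a POST request mirroring the sample DP_3 payload."""
    payload = {
        "project_id": project_id,
        "parent_id": parent_id,
        "block_id": block_id,
        "function_code": "DP_3",
        "args": {
            "xvalue": {
                "encode": True,
                "encoding_type": "onehot",
                "remainder": "passthrough",
                "index": 0,
            },
            "yvalue": {
                "encode": False,
                "encoding_type": "categorical",
                "remainder": "drop",
                "index": 2,
            },
            "save_name": "testweights",
            "load_name": "testweights",
        },
    }
    headers = {"Content-Type": "application/json"}
    # Assumption: authentication headers, if needed, are supplied by the caller.
    headers.update(extra_headers or {})
    return urllib.request.Request(
        ENGINE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON body of the reply."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_encoding_request(project_id=1, parent_id=2, block_id=3)
# result = send(request)  # uncomment to call the API
```

Serializing the payload with `json.dumps` avoids hand-written JSON mistakes such as a missing comma between the `yvalue` object and `save_name`.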

Encoding Data

Encode categorical values

POST https://autogon.ai/api/v1/engine/start

Encodes categorical data in the specified columns.

Request Body

Fields marked with * are required.

| Name | Type | Description |
| --- | --- | --- |
| project_id* | int | Current project ID |
| parent_id* | int | Parent block ID |
| block_id* | int | Current block ID |
| function_code* | String | Block's function code |
| args | object | Block arguments |
| xvalue/yvalue* | object | Arguments for the X or Y variable |
| encode* | bool | Whether to encode the variable |
| encoding_type* | String | Encoding technique to apply. One-hot encoding: converts categories into binary columns. Label encoding: assigns numbers to categories. Binary encoding: represents categories as binary codes. Target encoding: replaces categories with target statistics. String-to-hash encoding: hashes strings to numbers. Extract-numbers encoding: converts text numbers to digits. |
| remainder* | String | How to handle columns that are not specified for encoding: drop removes them, passthrough leaves them unchanged |
| index* | int | Index of the column to encode |
| save_name | String | Name under which to save the processing models |
| load_name | String | Name of the saved processing models to load. Providing it switches the block to loading mode |
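The interaction between `index` and `remainder` can be illustrated with a small pure-Python sketch. It is not the engine's code: the helper names are hypothetical, and the output layout (encoded columns first, followed by any passed-through columns) is an assumption made for the example.

```python
def one_hot(values):
    """Return one binary row per value, with columns in sorted category order."""
    categories = sorted(set(values))
    return [[1 if v == cat else 0 for cat in categories] for v in values]


def encode_column(rows, index, remainder):
    """One-hot encode column `index`; treat the other columns according to `remainder`."""
    if remainder not in ("drop", "passthrough"):
        raise ValueError(f"unsupported remainder: {remainder!r}")
    encoded = one_hot([row[index] for row in rows])
    result = []
    for row, enc in zip(rows, encoded):
        # Assumed layout: encoded columns first, then any passed-through columns.
        others = [v for i, v in enumerate(row) if i != index]
        result.append(enc + (others if remainder == "passthrough" else []))
    return result


rows = [["France", 44, 72000], ["Spain", 27, 48000], ["France", 30, 54000]]
kept = encode_column(rows, index=0, remainder="passthrough")
dropped = encode_column(rows, index=0, remainder="drop")

print(kept)     # [[1, 0, 44, 72000], [0, 1, 27, 48000], [1, 0, 30, 54000]]
print(dropped)  # [[1, 0], [0, 1], [1, 0]]
```

With `passthrough` the unencoded columns survive alongside the new binary columns, while `drop` returns only the encoded output.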

{
    "status": "true",
    "message": {
        "id": 3,
        "project": 1,
        "block_id": 3,
        "parent_id": 2,
        "dataset_url": "",
        "x_value_url": "",
        "y_value_url": ""
    }
}
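A client might read a response of this shape along the following lines. This is a sketch under assumptions: the helper name `parse_engine_response` is hypothetical, the format of error responses is not documented in this section, and `status` is compared as text because the sample returns it as the string "true".

```python
import json


def parse_engine_response(body):
    """Return the `message` object of a successful response, or raise on failure."""
    data = json.loads(body)
    # The sample returns status as the string "true"; a boolean is accepted as well.
    if str(data.get("status")).lower() != "true":
        raise RuntimeError(f"engine call failed: {data.get('message')!r}")
    return data["message"]


sample_body = """
{
    "status": "true",
    "message": {
        "id": 3,
        "project": 1,
        "block_id": 3,
        "parent_id": 2,
        "dataset_url": "",
        "x_value_url": "",
        "y_value_url": ""
    }
}
"""

message = parse_engine_response(sample_body)
print(message["block_id"])  # 3
```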

