Automated Data Processing (DP_ADP)
This function automatically cleans and encodes supported data.
Automated data cleaning and pre-processing streamline the preparation of data for machine learning training. These techniques involve identifying and addressing missing values, outliers, and inconsistencies in the dataset, as well as standardizing and transforming features. By automating these tasks, data scientists can save time, ensure data quality, and enhance the performance and reliability of machine learning models.
Sample Request
Automated Data Preprocessing
POST https://api.autogon.ai/api/v1/engine/start
Request Body
Name | Type | Description |
---|---|---|
x_slice* | String or array | boundaries for the x dataset |
y_slice* | String or array | boundaries for the y dataset |
strategy_value | String | method of handling missing values; see the Missing Data block for details |
le_thresh | int | unique-value threshold for label encoding |
ohe_thresh | int | unique-value threshold for one-hot encoding |
project_id* | int | current project ID |
block_id* | int | current block ID |
function_code* | String | block's function code |
args* | object | block arguments |
parent_id | int | previous block ID |
excluded_columns* | array | Columns to ignore entirely |
excluded_fillmissing_columns | array | Columns to ignore for filling in missing data only |
excluded_encoding_columns | array | Columns to ignore for encoding only |
excluded_scaling_columns | array | Columns to ignore for scaling only |
save_name* | String | name to save processing models with |
load_name* | String | name to load processing models with. Used to switch to loading mode |
dataset_type | String | type of dataset being processed with loaded weights |
clean | bool | sets whether or not to drop duplicates during loading mode |
Good to know: Unlike other block requests, the Data Input block isn't permitted to have parent blocks, hence its null parent_id value.
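As an illustration of the request body described above, here is a minimal Python sketch that assembles a DP_ADP payload and serializes it to JSON. It makes several assumptions not confirmed by this page: that the processing options are nested under `args` while the IDs and `function_code` sit at the top level, and that the slice, strategy, and threshold values shown are acceptable. The helper name `build_adp_request` and all values are hypothetical placeholders.

```python
import json

# Endpoint as given in this section.
ENGINE_START_URL = "https://api.autogon.ai/api/v1/engine/start"


def build_adp_request(project_id, block_id, parent_id, args):
    """Assemble a request body from the documented top-level fields.

    Hypothetical helper: nesting the processing options under "args"
    is an assumption, not something this page states explicitly.
    """
    return {
        "project_id": project_id,
        "block_id": block_id,
        "function_code": "DP_ADP",
        "parent_id": parent_id,
        "args": args,
    }


# Every value below is a placeholder chosen for illustration only.
payload = build_adp_request(
    project_id=1,
    block_id=2,
    parent_id=1,
    args={
        "x_slice": [0, 5],           # assumed format for the x boundaries
        "y_slice": [5, 6],           # assumed format for the y boundaries
        "strategy_value": "mean",    # assumed; check the Missing Data block
        "le_thresh": 10,
        "ohe_thresh": 5,
        "excluded_columns": [],
        "excluded_encoding_columns": [],
        # load_name is left out here because supplying it switches the
        # block to loading mode, according to the table above.
        "save_name": "adp_example",
    },
)

# JSON string to use as the body of the POST request.
body = json.dumps(payload, indent=2)
print(body)
```

To submit the request, POST `body` to `ENGINE_START_URL` with a JSON content type and whatever authentication your account requires; authentication is not covered in this section. For loading mode, the same structure would presumably carry `load_name`, `dataset_type`, and `clean` in place of `save_name`.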