# Data Split (DP\_4)

Splitting a dataset into a training set and a test set is a standard step in developing and evaluating machine learning models. The model is fit on the training set only, and the held-out test set is then used to estimate how well it generalizes to new, unseen data.

## Sample Request

This request holds out 30% of the dataset as the test set (`test_size` of 0.3) and sets `random_state` to 0 to control the shuffling applied before the split.

```javascript
{
    "project_id": 1,
    "parent_id": 3,
    "block_id": 4,
    "function_code": "DP_4",
    "args": {
        "test_size": 0.3,
        "random_state": 0
    }
}
```

## Splitting Data

<mark style="color:green;">`POST`</mark> `https://autogon.ai/api/v1/engine/start`

Splits the dataset into training and test sets.

#### Request Body

| Name                                             | Type      | Description                                                                                                                                                                                                                                  |
| ------------------------------------------------ | --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| project\_id<mark style="color:red;">\*</mark>    | int       | current project ID                                                                                                                                                                                                                           |
| parent\_id<mark style="color:red;">\*</mark>     | int       | parent block ID                                                                                                                                                                                                                              |
| block\_id<mark style="color:red;">\*</mark>      | int       | current block ID                                                                                                                                                                                                                             |
| function\_code<mark style="color:red;">\*</mark> | string    | block's function code (`DP_4` for this block)                                                                                                                                                                                                |
| args                                             | object    | block arguments; contains `test_size` and `random_state`, described in the next two rows                                                                                                                                                     |
| test\_size                                       | float/int | If `float`, should be between 0.0 and 1.0 and represents the proportion of the dataset to include in the test split. If `int`, represents the absolute number of test samples. If `None`, the value is set to the complement of the train size. |
| random\_state                                    | int       | Controls the shuffling applied to the data before the split is applied.                                                                                                                                                                      |
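
The two ways of reading `test_size` can be illustrated with a small local sketch. It is not the engine's implementation: the helper names `resolve_test_count` and `split_indices` are hypothetical, the rounding rule (round up for a `float`) is an assumption borrowed from scikit-learn's `train_test_split`, whose parameter descriptions this table resembles, and the shuffle uses Python's own `random` module, so the resulting order will not match what the engine produces.

```python
import math
import random


def resolve_test_count(n_samples, test_size):
    """Return how many rows go to the test set for a given test_size.

    Hypothetical helper for illustration only.
    """
    if isinstance(test_size, bool) or not isinstance(test_size, (int, float)):
        raise TypeError("test_size must be a float or an int")
    if isinstance(test_size, float):
        # Assumption: the bounds are exclusive, so neither set is empty.
        if not 0.0 < test_size < 1.0:
            raise ValueError("a float test_size must lie between 0.0 and 1.0")
        # Assumption: a fractional row count is rounded up.
        return math.ceil(test_size * n_samples)
    # An int is taken as the absolute number of test rows.
    if not 0 < test_size < n_samples:
        raise ValueError("an int test_size must be between 1 and n_samples - 1")
    return test_size


def split_indices(n_samples, test_size, random_state):
    """Shuffle row indices with a fixed seed, then cut off the test rows."""
    n_test = resolve_test_count(n_samples, test_size)
    indices = list(range(n_samples))
    # Seeding the generator makes the shuffle repeatable for a given seed.
    random.Random(random_state).shuffle(indices)
    return indices[n_test:], indices[:n_test]  # (train, test)


train_idx, test_idx = split_indices(10, test_size=0.3, random_state=0)
print(len(train_idx), len(test_idx))
```

With 10 rows and a `test_size` of 0.3, the sketch assigns 3 rows to the test set and 7 to the training set; calling it again with the same `random_state` returns the same two index lists.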

{% tabs %}
{% tab title="200: OK Data Split Successful" %}

```javascript
{
    "status": "true",
    "message": {
        "id": 4,
        "project": 1,
        "block_id": 4,
        "parent_id": 3,
        "dataset_url": "",
        "x_value_url": "",
        "y_value_url": "",
        "x_train_url": "",
        "y_train_url": "",
        "x_test_url": "",
        "y_test_url": ""
    }
}
```

{% endtab %}
{% endtabs %}
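
As a rough sketch, the request and response shown above could be handled from Python as follows, using only the standard library. The helper names (`build_split_request`, `start_split`, `split_urls`) are hypothetical, and this page does not describe authentication, so any required headers are left to the caller through the `extra_headers` parameter, which is also an assumption.

```python
import json
import urllib.request

ENGINE_URL = "https://autogon.ai/api/v1/engine/start"


def build_split_request(project_id, parent_id, block_id,
                        test_size=0.3, random_state=0):
    """Build the JSON body for a DP_4 (data split) block."""
    return {
        "project_id": project_id,
        "parent_id": parent_id,
        "block_id": block_id,
        "function_code": "DP_4",
        "args": {"test_size": test_size, "random_state": random_state},
    }


def start_split(payload, extra_headers=None):
    """POST the payload to the engine and return the decoded JSON response.

    extra_headers is a placeholder for whatever authentication headers
    the API requires; they are not specified on this page.
    """
    headers = {"Content-Type": "application/json"}
    headers.update(extra_headers or {})
    request = urllib.request.Request(
        ENGINE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def split_urls(response):
    """Pick the four train/test URLs out of a successful response."""
    message = response["message"]
    keys = ("x_train_url", "y_train_url", "x_test_url", "y_test_url")
    return {key: message[key] for key in keys}


payload = build_split_request(project_id=1, parent_id=3, block_id=4)
print(json.dumps(payload, indent=4))
```

Only `build_split_request` runs here; `start_split(payload, extra_headers=...)` performs the network call, and `split_urls` can then be applied to its result to read the locations of the training and test sets.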



