# Text Vectorizer (DP\_VEC)

Text vectorizers convert textual data into numerical representations suitable for machine learning models. They transform text inputs into feature vectors, which lets the API perform natural language processing and other text-based tasks.

Supported Vectorizers:

1. TF-IDF Vectorizer: Assigns weights to words based on their importance in a document and rarity across the dataset, capturing their significance for modeling.
2. Count Vectorizer: Counts the occurrences of each word in a document, representing it as a sparse matrix with word frequencies.
3. Hashing Vectorizer: Converts words into numerical indices using a hashing trick, providing memory-efficient representations.

These vectorizers prepare text data in the API for tasks such as text classification, sentiment analysis, and other natural language processing work.
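To make the difference between the three vectorizers concrete, here is a minimal pure-Python sketch. It is not the API's implementation: the whitespace tokenizer, the `ln(N / df) + 1` IDF variant, the CRC32 hash, and all function names are assumptions chosen for illustration.

```python
import math
import zlib
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def count_vectorize(documents):
    # Count vectorizer: one column per vocabulary word, cells hold raw counts.
    vocab = sorted({tok for doc in documents for tok in tokenize(doc)})
    index = {tok: i for i, tok in enumerate(vocab)}
    rows = []
    for doc in documents:
        row = [0] * len(vocab)
        for tok, n in Counter(tokenize(doc)).items():
            row[index[tok]] = n
        rows.append(row)
    return vocab, rows


def tfidf_vectorize(documents):
    # TF-IDF: scale each count by how rare the word is across the corpus.
    vocab, counts = count_vectorize(documents)
    n_docs = len(documents)
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / df) + 1.0 for df in doc_freq]
    rows = [[c * w for c, w in zip(row, idf)] for row in counts]
    return vocab, rows


def hashing_vectorize(documents, n_features=8):
    # Hashing trick: no vocabulary is stored; a hash picks the column directly.
    rows = []
    for doc in documents:
        row = [0] * n_features
        for tok in tokenize(doc):
            row[zlib.crc32(tok.encode("utf-8")) % n_features] += 1
        rows.append(row)
    return rows


vocab, counts = count_vectorize(docs)
_, tfidf = tfidf_vectorize(docs)
hashed = hashing_vectorize(docs)
print(vocab)
print(counts[0])
print([round(v, 3) for v in tfidf[0]])
print(hashed[0])
```

The hashing variant keeps a fixed output width regardless of vocabulary size, at the cost of possible collisions between different words.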

## Sample Request

The request below applies the TF-IDF vectorizer to the `x`, `xtrain`, and `xtest` variables, within the boundaries given by `boundariestoscale`.

```javascript
{
    "project_id": 1,
    "parent_id": 5,
    "block_id": 6,
    "function_code": "DP_VEC",
    "args": {
        "vectorizer": "tfidf",
        "boundariestoscale": ":, :",
        "dataset": false,
        "xtrain": true,
        "xtest": true,
        "x": true,
        "ytrain": false,
        "ytest": false,
        "y": false
    }
}
```
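As a rough sketch, a request body like the one above could be assembled and posted from Python using only the standard library. The helper names `build_dp_vec_request` and `send_request` are hypothetical. How credentials are passed is not described on this page, so the `extra_headers` argument is an assumption. The endpoint is the one documented for this block.

```python
import json
import urllib.request

ENGINE_URL = "https://autogon.ai/api/v1/engine/start"
VECTORIZERS = {"tfidf", "count", "hashing"}
TARGETS = ("dataset", "xtrain", "xtest", "x", "ytrain", "ytest", "y")


def build_dp_vec_request(project_id, parent_id, block_id, vectorizer,
                         apply_to, boundaries=":, :"):
    """Build a DP_VEC request body shaped like the sample request."""
    if vectorizer not in VECTORIZERS:
        raise ValueError(f"unsupported vectorizer: {vectorizer!r}")
    unknown = set(apply_to) - set(TARGETS)
    if unknown:
        raise ValueError(f"unknown variables: {sorted(unknown)}")
    args = {"vectorizer": vectorizer, "boundariestoscale": boundaries}
    # Every variable flag is sent explicitly; only the selected ones are true.
    args.update({name: name in apply_to for name in TARGETS})
    return {
        "project_id": project_id,
        "parent_id": parent_id,
        "block_id": block_id,
        "function_code": "DP_VEC",
        "args": args,
    }


def send_request(payload, extra_headers=None):
    """POST the payload as JSON. Not called below because it needs network access."""
    headers = {"Content-Type": "application/json"}
    # Assumption: any credentials the API requires are supplied by the caller.
    headers.update(extra_headers or {})
    request = urllib.request.Request(
        ENGINE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_dp_vec_request(1, 5, 6, "tfidf", apply_to={"x", "xtrain", "xtest"})
print(json.dumps(payload, indent=4))
```

Validating the vectorizer name and variable flags locally catches typos before a request is sent.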

## Parameter Details

## Text Vectorizer

<mark style="color:green;">`POST`</mark> `https://autogon.ai/api/v1/engine/start`

#### Request Body

| Name                                             | Type   | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| ------------------------------------------------ | ------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| project\_id<mark style="color:red;">\*</mark>    | int    | current project ID                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| parent\_id<mark style="color:red;">\*</mark>     | int    | parent block ID                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| block\_id<mark style="color:red;">\*</mark>      | int    | current block ID                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| function\_code<mark style="color:red;">\*</mark> | String | block's function code                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| args<mark style="color:red;">\*</mark>           | object | block arguments                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| boundariestoscale                                | String | Field of `args`: boundaries of the data to vectorize, for example `:, :` |
| dataset/x/y/xtrain/ytrain/xtest/ytest            | bool   | Fields of `args`: set a variable to `true` to apply the vectorizer to it |
| vectorizer                                       | String | <p>Field of <code>args</code>: type of vectorizer to apply.</p><p><code>tfidf</code>: Converts text data into numerical features based on term frequency-inverse document frequency, capturing word importance in documents and across the corpus.</p><p><code>count</code>: Transforms text data into numerical features by counting the occurrences of words.</p><p><code>hashing</code>: Uses a hashing trick to map words into fixed-size feature vectors.</p> |

{% tabs %}
{% tab title="200: OK Text Vectorization Successful" %}

```javascript
{
    "status": "true",
    "message": {
        "id": 3,
        "project": 1,
        "block_id": 7,
        "parent_id": 6,
        "dataset_url": "",
        "x_value_url": "",
        "y_value_url": ""
    }
}
```

{% endtab %}
{% endtabs %}
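A response like the one above could be checked along these lines. This is only a sketch: the failure format is not documented here, so treating any `status` other than `"true"` as an error is an assumption, and `parse_dp_vec_response` is a hypothetical helper name.

```python
import json

SAMPLE_RESPONSE = """
{
    "status": "true",
    "message": {
        "id": 3,
        "project": 1,
        "block_id": 7,
        "parent_id": 6,
        "dataset_url": "",
        "x_value_url": "",
        "y_value_url": ""
    }
}
"""


def parse_dp_vec_response(body):
    """Return the `message` object of a successful response, else raise."""
    data = json.loads(body)
    # The sample encodes status as the string "true" rather than a JSON boolean.
    if str(data.get("status")).lower() != "true":
        raise RuntimeError(f"DP_VEC request failed: {data.get('message')}")
    return data["message"]


message = parse_dp_vec_response(SAMPLE_RESPONSE)
print(message["block_id"], message["parent_id"])
```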

{% tabs %}
{% tab title="Python" %}

```
// Some code
```

{% endtab %}

{% tab title="Node" %}

```
projectId = 1
parentId = 6
blockId = 7

client.array_reshaping(projectId, parentId, blockId, {

```

{% endtab %}
{% endtabs %}


