Neuralk-AI Categorization Expert Module
This example shows how to use the Neuralk-AI categorization module to categorize products.
Unlike examples that rely on the Classifier, which is a generic building
block rather than a full pipeline, here we use an end-to-end workflow available
through the Neuralk API which integrates all the data processing steps.
We show how to interact with the API to create a project and a dataset, and then run the categorization.
We use a subset of the Best Buy dataset.
WARNING
For this example to run, the environment variables NEURALK_USERNAME and
NEURALK_PASSWORD must be defined. They will be used to connect to the
Neuralk API.
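Before running the example, it can help to check that both variables are actually set. The snippet below is a minimal sketch using only the standard library; the helper name missing_credentials is hypothetical and is not part of the neuralk package.

```python
import os

# Names of the environment variables this example expects.
REQUIRED_VARS = ("NEURALK_USERNAME", "NEURALK_PASSWORD")


def missing_credentials(environ=os.environ):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; not part of the neuralk package.
    """
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_credentials()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

If the list is empty, both credentials are available and the rest of the example can proceed.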
The first step is to create a Neuralk client that we will use to
interact with the platform. Note that we chose to make our username and
password available through environment variables, but you can use other
approaches to load them.
import os
from neuralk import Neuralk
user = os.environ['NEURALK_USERNAME']
password = os.environ['NEURALK_PASSWORD']
client = Neuralk(user, password)
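As one possible alternative to reading the environment directly, the credentials could be requested interactively when a variable is not set. This is only a sketch under that assumption: load_credentials is a hypothetical helper, not part of the neuralk package, and it relies solely on the standard-library getpass module.

```python
import getpass
import os


def load_credentials(environ=os.environ, ask_user=input, ask_password=getpass.getpass):
    """Return (user, password), prompting for any value missing from environ.

    Hypothetical helper for illustration; not part of the neuralk package.
    """
    # Prefer the environment variable; fall back to an interactive prompt.
    user = environ.get("NEURALK_USERNAME") or ask_user("Neuralk username: ")
    # getpass hides the typed password instead of echoing it to the terminal.
    password = environ.get("NEURALK_PASSWORD") or ask_password("Neuralk password: ")
    return user, password
```

The returned pair can then be passed to the client in the same way as above, that is, Neuralk(user, password).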
Next, we gather our local dataset, a subset of the Best Buy catalog.
To keep the example fast, we load only a small subsample; pass subsample=False to run on the full dataset.
import polars as pl
import skrub
from neuralk.datasets import best_buy
local_dataset = best_buy(subsample=True)
skrub.TableReport(pl.read_parquet(local_dataset["train_path"]))