Using NICLClassifier with On-Premise Server

NICLClassifier provides a scikit-learn-compatible interface to Neuralk’s In-Context Learning (NICL) model running on an on-premise or self-hosted NICL server. Pass your server URL through the host parameter.

Note

For this example to run, the environment variable HOST must be set with your NICL server URL.
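A missing HOST variable otherwise surfaces as a bare KeyError from os.environ["HOST"]. The sketch below shows one way to fail early with a clearer message. The helper name get_nicl_host is hypothetical and not part of the neuralk package.

```python
import os


def get_nicl_host(env=os.environ):
    """Return the NICL server URL stored in HOST, or raise a descriptive error.

    Hypothetical helper for this example; `env` is injectable for testing.
    """
    host = env.get("HOST", "").strip()
    if not host:
        raise RuntimeError(
            "HOST is not set; export it with your NICL server URL before running."
        )
    return host
```

The returned value can then be passed as host=get_nicl_host() wherever the examples on this page use os.environ["HOST"].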

Simple example on toy data

We start by using the NICLClassifier with a host parameter on simple data that needs no preprocessing.

Generate simple data:

import os

import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

from neuralk import NICLClassifier

X, y = make_classification(n_samples=200, n_features=10, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Ensure data is in the correct format (float32 for features, int64 for labels)
X_train = X_train.astype(np.float32)
X_test = X_test.astype(np.float32)
y_train = y_train.astype(np.int64)

print(f"{X_train.shape=} {y_train.shape=} {X_test.shape=} {y_test.shape=}")

X_train.shape=(160, 10) y_train.shape=(160,) X_test.shape=(40, 10) y_test.shape=(40,)

Now we apply Neuralk’s NICLClassifier with a host parameter.

Note

The host URL is read from the HOST environment variable. If your server requires authentication, you can pass it via the default_headers parameter.

# Initialize the classifier with your NICL server URL from HOST env var
classifier = NICLClassifier(
    host=os.environ["HOST"],
    model="nicl-small",
    timeout_s=300,
)

# Note: nothing actually happens during fit() -- in-context learning models are
# pretrained but require no fitting on our specific dataset. The fit method
# only stores the training data.
classifier = classifier.fit(X_train, y_train)

# Make predictions
predictions = classifier.predict(X_test)
probabilities = classifier.predict_proba(X_test)

accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy}")
print(f"Predictions shape: {predictions.shape}")
print(f"Probabilities shape: {probabilities.shape}")

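In the build that generated this page, the server configured in HOST could not be reached, so predict() raised a timeout instead of printing the results:

Traceback (most recent call last):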
  File "/home/runner/work/neuralk/neuralk/examples/0040_on_premise_classifier.py", line 76, in <module>
    predictions = classifier.predict(X_test)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/neuralk/neuralk/src/neuralk/_base_classifier.py", line 149, in predict
    result = self._remote_predict(X)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/neuralk/neuralk/src/neuralk/_classifier.py", line 182, in _remote_predict
    response = self._client.classifications.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/neuralk/neuralk/src/neuralk/_api.py", line 228, in create
    return self._client._make_request(tar_bytes, headers)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/neuralk/neuralk/src/neuralk/_api.py", line 608, in _make_request
    raise NeuralkException(
neuralk.exceptions.NeuralkException: ('Timeout after 300s (after 8 attempts)', <HTTPStatus.REQUEST_TIMEOUT: 408>, '[Errno 110] Connection timed out')
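The traceback above shows the request failing only after the full 300-second timeout. A quick TCP check before calling predict() can surface an unreachable host in seconds. This is a minimal sketch using only the standard library. The function name server_is_reachable is hypothetical, and a successful TCP connection does not guarantee that the NICL service itself is healthy.

```python
import socket
from urllib.parse import urlsplit


def server_is_reachable(url, timeout_s=5.0):
    """Return True if a TCP connection to the URL's host and port succeeds.

    Hypothetical pre-flight helper; it checks network reachability only.
    """
    parts = urlsplit(url)
    if parts.hostname is None:
        # No scheme or host could be parsed from the URL.
        return False
    try:
        # Fall back to the scheme's default port when the URL has none.
        port = parts.port or (443 if parts.scheme == "https" else 80)
        with socket.create_connection((parts.hostname, port), timeout=timeout_s):
            return True
    except (OSError, ValueError):
        # OSError covers refused and timed-out connections;
        # ValueError covers a malformed port in the URL.
        return False
```

Calling server_is_reachable(os.environ["HOST"]) before fit() and predict() lets a script exit early with its own message rather than waiting through the client's retries.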

Working with authentication

If your NICL server requires authentication, you can pass authentication headers via the default_headers parameter:

Example with authentication headers

classifier_with_auth = NICLClassifier(
    host=os.environ["HOST"],
    model="nicl-small",
    default_headers={"Authorization": "Bearer your-token"},
)
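Hard-coding a token in source is easy to leak. One option is to read it from the environment, in the same way the host is read from HOST. The sketch below assumes a variable named NICL_TOKEN and a helper named bearer_headers; both are hypothetical and chosen for illustration. The Bearer scheme follows the example above, and your server may expect a different one.

```python
import os


def bearer_headers(env=os.environ, var_name="NICL_TOKEN"):
    """Build an Authorization header from an environment variable.

    NICL_TOKEN is a hypothetical variable name chosen for this sketch.
    """
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"{var_name} is not set")
    return {"Authorization": f"Bearer {token}"}


# Usage (requires the neuralk package and a reachable server):
# classifier_with_auth = NICLClassifier(
#     host=os.environ["HOST"],
#     model="nicl-small",
#     default_headers=bearer_headers(),
# )
```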

Advanced configuration

The NICLClassifier supports various configuration options for fine-tuning the connection and request behavior:

Example with advanced configuration

classifier_advanced = NICLClassifier(
    host=os.environ["HOST"],
    dataset_name="my-dataset",
    model="nicl-large",  # Use a different model
    timeout_s=600,  # Longer timeout
    metadata={"source": "example", "version": "1.0"},
    user="user123",
    api_version="v1",
)
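When several classifiers share most of their settings, it may be convenient to keep the common keyword arguments in one place and override only what differs. The sketch below uses only parameter names that appear in the examples on this page. The helper nicl_kwargs, its default values, and the URL are hypothetical.

```python
def nicl_kwargs(host, **overrides):
    """Merge shared connection settings with per-run overrides.

    Hypothetical helper; the defaults mirror the simple example above.
    """
    settings = {
        "host": host,
        "model": "nicl-small",
        "timeout_s": 300,
    }
    settings.update(overrides)
    return settings


large_run = nicl_kwargs(
    "https://nicl.example.internal",  # placeholder URL, not a real server
    model="nicl-large",
    timeout_s=600,
    metadata={"source": "example", "version": "1.0"},
)

# Usage (requires the neuralk package and a reachable server):
# classifier_advanced = NICLClassifier(**large_run)
```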

Integration with scikit-learn pipelines

NICLClassifier can be integrated into scikit-learn pipelines:

# Create a pipeline with preprocessing and NICLClassifier
pipeline = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),
    NICLClassifier(
        host=os.environ["HOST"],
        model="nicl-small",
    ),
)

# Fit and predict
pipeline.fit(X_train, y_train)
pipeline_predictions = pipeline.predict(X_test)
pipeline_accuracy = accuracy_score(y_test, pipeline_predictions)
print(f"Pipeline accuracy: {pipeline_accuracy}")
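To put the pipeline accuracy in context, it can help to run the same preprocessing with a local estimator as a reference point. The sketch below swaps NICLClassifier for scikit-learn's LogisticRegression, so it runs without a NICL server. The choice of baseline model is an assumption made for illustration and is not part of the original example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Recreate the same toy data and split used earlier on this page.
X, y = make_classification(n_samples=200, n_features=10, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Same preprocessing steps as the NICL pipeline, with a local final estimator.
baseline = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),
    LogisticRegression(),
)
baseline.fit(X_train, y_train)
baseline_predictions = baseline.predict(X_test)
baseline_accuracy = accuracy_score(y_test, baseline_predictions)
print(f"Baseline accuracy: {baseline_accuracy}")
```

Because both pipelines see the same five PCA components, any gap between the two scores reflects the final estimator and not the preprocessing.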
