Using the OnPremiseClassifier
The OnPremiseClassifier provides a scikit-learn compatible interface
for using Neuralk’s In-Context Learning model with an on-premise or self-hosted
NICL server. It is intended for users who have deployed NICL on their own
infrastructure, or who otherwise need to work with a self-hosted deployment.
NOTE
For this example to run, you need access to a running NICL server.
The server URL should be specified via the host parameter when
initializing the classifier.
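Before running the example, you may want to confirm that the server is reachable. The snippet below is a minimal sketch using the requests package; the plain GET against the host root is only an assumption, since the exact health-check route (if any) depends on how your NICL server is deployed.
import requests

NICL_HOST = "http://localhost:8000"  # replace with your NICL server URL

try:
    # Plain GET against the host root; adjust the path if your deployment
    # exposes a dedicated health-check endpoint.
    response = requests.get(NICL_HOST, timeout=5)
    print(f"Server reachable, status code: {response.status_code}")
except requests.ConnectionError:
    print("Could not reach the NICL server -- check the URL and that the server is running.")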
Simple example on toy data
We start by using the OnPremiseClassifier on simple data that needs no preprocessing.
Generate simple data:
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
X, y = make_classification(n_samples=200, n_features=10, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Ensure data is in the correct format (float32 for features, int64 for labels)
X_train = X_train.astype(np.float32)
X_test = X_test.astype(np.float32)
y_train = y_train.astype(np.int64)
print(f"{X_train.shape=} {y_train.shape=} {X_test.shape=} {y_test.shape=}")
X_train.shape=(160, 10) y_train.shape=(160,) X_test.shape=(40, 10) y_test.shape=(40,)
Now we apply Neuralk’s OnPremiseClassifier.
NOTE
Replace "http://localhost:8000" with the actual URL of your NICL server.
If your server requires authentication, you can pass it via the
default_headers parameter.
from sklearn.metrics import accuracy_score
from neuralk import OnPremiseClassifier
# Initialize the classifier with your NICL server URL
# Replace with your actual server URL
classifier = OnPremiseClassifier(
host="http://localhost:8000", # Replace with your NICL server URL
model="nicl-small",
timeout_s=300,
)
# Note: no training happens during fit() -- in-context learning models are
# pretrained and require no fitting on our specific dataset. The fit method
# only stores the training data.
classifier = classifier.fit(X_train, y_train)
# Make predictions
predictions = classifier.predict(X_test)
probabilities = classifier.predict_proba(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy}")
print(f"Predictions shape: {predictions.shape}")
print(f"Probabilities shape: {probabilities.shape}")
Accuracy: 0.875
Predictions shape: (40,)
Probabilities shape: (40, 2)
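As a quick sanity check, and assuming the usual scikit-learn convention that predict returns the class with the highest predicted probability, the hard labels should match the argmax of the probability matrix:
# Recover hard labels from the probabilities and compare with predict().
# With binary labels 0 and 1, column 1 corresponds to class 1.
recovered = np.argmax(probabilities, axis=1)
print(f"predict() matches argmax of predict_proba(): {np.array_equal(recovered, predictions)}")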
Working with authentication
If your NICL server requires authentication, you can pass authentication
headers via the default_headers parameter:
Example with authentication headers
classifier_with_auth = OnPremiseClassifier(
host="http://localhost:8000",
model="nicl-small",
default_headers={"Authorization": "Bearer your-token"},
)
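Avoid hard-coding credentials in scripts. One option is to read the token from an environment variable; NICL_API_TOKEN below is only an illustrative name, not something defined by the library.
import os

# Read the token from the environment rather than embedding it in the source.
token = os.environ.get("NICL_API_TOKEN", "")
classifier_with_auth = OnPremiseClassifier(
    host="http://localhost:8000",
    model="nicl-small",
    default_headers={"Authorization": f"Bearer {token}"},
)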
Advanced configuration
The OnPremiseClassifier supports various configuration options for fine-tuning the connection and request behavior:
Example with advanced configuration
classifier_advanced = OnPremiseClassifier(
host="http://localhost:8000",
dataset_name="my-dataset",
model="nicl-large", # Use a different model
timeout_s=600, # Longer timeout
metadata={"source": "example", "version": "1.0"},
user="user123",
api_version="v1",
)
Integration with scikit-learn pipelines
Like the cloud-based Classifier, OnPremiseClassifier can be integrated into scikit-learn pipelines:
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
# Create a pipeline with preprocessing and OnPremiseClassifier
pipeline = make_pipeline(
StandardScaler(),
PCA(n_components=5),
OnPremiseClassifier(
host="http://localhost:8000",
model="nicl-small",
),
)
# Fit and predict
pipeline.fit(X_train, y_train)
pipeline_predictions = pipeline.predict(X_test)
pipeline_accuracy = accuracy_score(y_test, pipeline_predictions)
print(f"Pipeline accuracy: {pipeline_accuracy}")
Pipeline accuracy: 0.85
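Because the pipeline follows the scikit-learn estimator API, standard utilities such as cross-validation also work. Keep in mind that every fold sends its own requests to the NICL server, so runtime depends on your deployment; the snippet below is a minimal sketch.
from sklearn.model_selection import cross_val_score

# 3-fold cross-validation of the full pipeline; each fold triggers its own
# fit/predict round-trips to the NICL server.
scores = cross_val_score(pipeline, X_train, y_train, cv=3)
print(f"Cross-validation scores: {scores}")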
Total running time of the script: (0 minutes 0.610 seconds)