Training & Fine-tuning

Achiral AI is a privacy-first AI platform for businesses, built on secure, self-hosted, enterprise-grade infrastructure. Train and fine-tune LLMs on your dedicated Chiro instance with a complete training stack: custom fine-tuning workflows, human-in-the-loop capabilities, and full model versioning.

Overview

Your Chiro instance supports:

  • Fine-tuning: Adapt pre-trained models to your specific use case
  • Human-in-the-Loop Training: Guide model behavior with human feedback
  • Custom Training: Train models from scratch or continue pre-training
  • Model Versioning: Track and manage model iterations
  • Distributed Training: Multi-GPU training on the Growth plan

Quick Start

Basic Fine-tuning

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://abc123xyz.nano.achiral.ai/v1"
)

# Create a fine-tuning job
job = client.fine_tuning.jobs.create(
    training_file="file-abc123",
    model="mistral-7b",
    hyperparameters={
        "n_epochs": 3,
        "batch_size": 4,
        "learning_rate_multiplier": 1.0
    }
)

print(f"Job ID: {job.id}")

Monitor Training

# Get job status
status = client.fine_tuning.jobs.retrieve(job.id)
print(f"Status: {status.status}")
print(f"Progress: {status.trained_tokens}/{status.training_file.tokens}")

# List all jobs
jobs = client.fine_tuning.jobs.list()
for job in jobs.data:
    print(f"{job.id}: {job.status}")

Training Data Format

Chat Format (JSONL)

For instruction-following and chat models:

{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is AI?"}, {"role": "assistant", "content": "AI stands for Artificial Intelligence..."}]}
{"messages": [{"role": "user", "content": "Explain machine learning"}, {"role": "assistant", "content": "Machine learning is a subset of AI..."}]}

Completion Format (JSONL)

For text completion models:

{"prompt": "Translate to French: Hello", "completion": " Bonjour"}
{"prompt": "Translate to French: Goodbye", "completion": " Au revoir"}

Preparing Training Data

import json

# Create training examples
examples = [
    {
        "messages": [
            {"role": "user", "content": "What is Python?"},
            {"role": "assistant", "content": "Python is a high-level programming language..."}
        ]
    },
    {
        "messages": [
            {"role": "user", "content": "What is JavaScript?"},
            {"role": "assistant", "content": "JavaScript is a scripting language..."}
        ]
    }
]

# Save to JSONL (one JSON object per line)
with open('training_data.jsonl', 'w') as f:
    for example in examples:
        f.write(json.dumps(example) + '\n')

Uploading Training Files

# Upload file to Chiro instance
file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune"
)

print(f"File ID: {file.id}")

Data Quality Guidelines

  • Minimum examples: 50-100 for basic fine-tuning
  • Recommended: 500-1,000 examples for production models
  • Maximum size: Depends on GPU tier (1 GB, 10 GB, or 100 GB for the D32, D128, or D256 tier, respectively)
  • Format validation: Use the built-in validation tool (a local pre-check sketch follows the example below):

curl -X POST https://abc123xyz.nano.achiral.ai/v1/files/validate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@training_data.jsonl"

Human-in-the-Loop Training

Achiral AI's unique HITL training allows you to guide model behavior through human feedback during and after training.

Feedback Collection

# Submit feedback on model output
client.feedback.create(
    model="my-fine-tuned-model",
    prompt="What is machine learning?",
    completion="Machine learning uses algorithms...",
    rating=4,  # 1-5 scale
    corrections="Add mention of neural networks",
    metadata={
        "user_id": "user123",
        "session_id": "sess456"
    }
)

Continuous Learning

Enable continuous improvement based on feedback:

# Create HITL training job
job = client.fine_tuning.jobs.create(
    model="my-fine-tuned-model",
    training_mode="hitl",
    feedback_source="last_30_days",
    auto_retrain=True,
    retrain_threshold=100  # Retrain after 100 feedback items
)

Feedback Dashboard

Monitor feedback and model improvements:

  1. Navigate to Training → Feedback
  2. View feedback analytics and trends
  3. Filter by rating, user, or time period
  4. Approve or reject corrections
  5. Trigger manual retraining

HITL Best Practices

  • Collect diverse feedback: Ensure feedback covers different use cases
  • Set quality thresholds: Only use high-quality feedback (rating ≥ 3); a filtering sketch follows this list
  • Regular retraining: Retrain weekly or after significant feedback volume
  • A/B testing: Compare HITL-trained vs baseline models
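
To make the quality threshold concrete, here is a minimal sketch that filters collected feedback before triggering a retrain. Note that client.feedback.list is an assumed endpoint for illustration; only client.feedback.create and the HITL job parameters above appear in this guide:

# Hypothetical sketch: keep only high-quality feedback (rating >= 3)
# before retraining. `client.feedback.list` is an assumed endpoint.
feedback = client.feedback.list(model="my-fine-tuned-model")
high_quality = [item for item in feedback.data if item.rating >= 3]

# Mirror the retrain_threshold used by the HITL job above.
if len(high_quality) >= 100:
    job = client.fine_tuning.jobs.create(
        model="my-fine-tuned-model",
        training_mode="hitl",
        feedback_source="last_30_days"
    )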

Custom Fine-tuning

Advanced Hyperparameters

job = client.fine_tuning.jobs.create(
    model="mistral-7b",
    training_file="file-abc123",
    validation_file="file-def456",
    hyperparameters={
        "n_epochs": 5,
        "batch_size": 8,
        "learning_rate_multiplier": 0.5,
        "prompt_loss_weight": 0.1,
        "gradient_accumulation_steps": 4,
        "warmup_steps": 100,
        "weight_decay": 0.01,
        "max_grad_norm": 1.0
    },
    suffix="my-custom-model"
)

Hyperparameter Reference

Parameter                     Description              Default   Range
n_epochs                      Training epochs          3         1-20
batch_size                    Samples per batch        4         1-128
learning_rate_multiplier      LR scaling factor        1.0       0.01-10.0
prompt_loss_weight            Weight for prompt loss   0.0       0.0-1.0
gradient_accumulation_steps   Accumulation steps       1         1-64
warmup_steps                  LR warmup steps          0         0-1000
weight_decay                  L2 regularization        0.01      0.0-0.1
max_grad_norm                 Gradient clipping        1.0       0.1-10.0
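
Two of these parameters interact directly: the effective batch size per optimizer step is batch_size × gradient_accumulation_steps, and learning_rate_multiplier scales the trainer's base learning rate. A small worked example; the 2e-5 base rate is an illustrative assumption, not a documented default:

# Effective batch size: samples contributing to each optimizer step.
batch_size = 8
gradient_accumulation_steps = 4
effective_batch = batch_size * gradient_accumulation_steps  # 32

# learning_rate_multiplier scales the trainer's base learning rate.
base_lr = 2e-5  # illustrative assumption, not a documented default
effective_lr = base_lr * 0.5  # multiplier of 0.5 gives 1e-5

print(effective_batch, effective_lr)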

Training Techniques

LoRA (Low-Rank Adaptation)

Efficient fine-tuning for large models:

job = client.fine_tuning.jobs.create(
    model="llama-70b",
    training_file="file-abc123",
    method="lora",
    lora_config={
        "r": 8,  # Rank of the low-rank update matrices
        "lora_alpha": 16,
        "lora_dropout": 0.05,
        "target_modules": ["q_proj", "v_proj"]
    }
)

QLoRA (Quantized LoRA)

Memory-efficient training with quantization:

job = client.fine_tuning.jobs.create(
    model="llama-70b",
    training_file="file-abc123",
    method="qlora",
    quantization="4bit",
    lora_config={
        "r": 16,
        "lora_alpha": 32
    }
)

Full Fine-tuning

Update all model parameters:

job = client.fine_tuning.jobs.create(
    model="mistral-7b",
    training_file="file-abc123",
    method="full",
    hyperparameters={
        "n_epochs": 3,
        "batch_size": 2,  # Smaller batch to fit full-parameter updates in memory
        "learning_rate_multiplier": 0.1
    }
)

Model Versioning

Automatic Versioning

Every training job creates a new model version:

# List model versions
versions = client.models.versions.list("my-model")

for version in versions.data:
    print(f"Version {version.version}: {version.created_at}")
    print(f"  Training job: {version.training_job_id}")
    print(f"  Metrics: {version.metrics}")

Version Metadata

# Get version details
version = client.models.versions.retrieve("my-model", version=3)

print(f"Created: {version.created_at}")
print(f"Base model: {version.base_model}")
print(f"Training samples: {version.training_samples}")
print(f"Validation loss: {version.validation_loss}")

Version Management

# Set active version
client.models.versions.set_active("my-model", version=3)

# Compare versions
comparison = client.models.versions.compare(
    "my-model",
    versions=[2, 3],
    test_file="file-test123"
)

print(f"Version 2 accuracy: {comparison.results[0].accuracy}")
print(f"Version 3 accuracy: {comparison.results[1].accuracy}")

# Rollback to previous version
client.models.versions.set_active("my-model", version=2)

Version Tags

# Tag versions
client.models.versions.update(
    "my-model",
    version=3,
    tags=["production", "v1.2.0", "stable"]
)

# List by tag
production_version = client.models.versions.list(
    "my-model",
    tags=["production"]
)

Training Monitoring

Real-time Metrics

# Stream training metrics
for event in client.fine_tuning.jobs.stream_events(job.id):
    if event.type == "metrics":
        print(f"Step {event.step}: loss={event.loss:.4f}")
    elif event.type == "checkpoint":
        print(f"Checkpoint saved at step {event.step}")
    elif event.type == "completed":
        print("Training completed!")

Training Dashboard

Access via web interface:

  1. Navigate to Training → Jobs
  2. Click on job ID
  3. View real-time metrics:
    • Training loss
    • Validation loss
    • Learning rate
    • GPU utilization
    • Estimated time remaining

Metrics Export

import pandas as pd

# Export training metrics
metrics = client.fine_tuning.jobs.export_metrics(job.id)

df = pd.DataFrame(metrics)
df.to_csv('training_metrics.csv')

Checkpoints

Automatic Checkpointing

Checkpoints are saved automatically:

  • Every epoch
  • Every 1000 steps
  • On best validation loss (see the rollback sketch after this list)
  • On training completion
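
To roll back to the best checkpoint programmatically, here is a minimal sketch using the list_checkpoints call shown in the next subsection; the validation_loss and id attributes on checkpoint objects are assumptions for illustration:

# Pick the checkpoint with the lowest validation loss.
# `cp.validation_loss` and `cp.id` are assumed attributes.
checkpoints = client.fine_tuning.jobs.list_checkpoints(job.id)
best = min(checkpoints, key=lambda cp: cp.validation_loss)
print(f"Best checkpoint: {best.id} (step {best.step})")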

Manual Checkpoints

# Create checkpoint
checkpoint = client.fine_tuning.jobs.create_checkpoint(
    job.id,
    description="Before hyperparameter change"
)

# List checkpoints
checkpoints = client.fine_tuning.jobs.list_checkpoints(job.id)

for cp in checkpoints:
    print(f"Step {cp.step}: {cp.description}")

Resume from Checkpoint

# Resume training from checkpoint
job = client.fine_tuning.jobs.create(
    model="mistral-7b",
    training_file="file-abc123",
    resume_from_checkpoint="checkpoint-12345"
)

Training Best Practices

Data Preparation

  1. Clean data: Remove duplicates, errors, formatting issues
  2. Balanced dataset: Include diverse examples
  3. Validation split: Use 10-20% for validation (a split sketch follows this list)
  4. Test separately: Keep separate test set for final evaluation
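
A minimal sketch of the 10-20% validation split, assuming the JSONL format used throughout this guide:

import random

# Shuffle and split a JSONL dataset: 90% train, 10% validation.
with open('training_data.jsonl') as f:
    examples = [line for line in f if line.strip()]

random.seed(42)  # reproducible split
random.shuffle(examples)
split = int(len(examples) * 0.9)

with open('train.jsonl', 'w') as f:
    f.writelines(examples[:split])
with open('validation.jsonl', 'w') as f:
    f.writelines(examples[split:])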

Hyperparameter Tuning

  1. Start conservative: Use default hyperparameters first
  2. Learning rate: Most important - start with 1e-5 to 5e-5 (a sweep sketch follows this list)
  3. Batch size: Increase until GPU memory is ~80% used
  4. Epochs: Start with 3-5, increase if underfitting
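
One conservative way to tune the learning rate is a small sweep that launches one job per candidate multiplier and compares validation loss afterwards. Here is a sketch using the job API shown earlier; the multiplier values and suffix naming are illustrative:

# Launch one fine-tuning job per candidate learning-rate multiplier;
# compare validation loss across the finished jobs and keep the best.
for lr_mult in [0.1, 0.5, 1.0]:
    job = client.fine_tuning.jobs.create(
        model="mistral-7b",
        training_file="file-abc123",
        validation_file="file-def456",
        hyperparameters={
            "n_epochs": 3,
            "batch_size": 4,
            "learning_rate_multiplier": lr_mult
        },
        suffix=f"lr-sweep-{lr_mult}"  # illustrative naming
    )
    print(f"Started {job.id} with multiplier {lr_mult}")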

Monitoring

  1. Watch validation loss: Stop if it increases (overfitting)
  2. GPU utilization: Should be >80% during training
  3. Training time: Estimate as (samples × epochs) / (batch_size × throughput), with throughput in optimizer steps per second; see the worked example after this list
  4. Cost tracking: Monitor GPU hours and storage
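
Plugging illustrative numbers into the estimate from item 3, with throughput measured in optimizer steps per second:

# Rough training-time estimate:
#   steps = (samples * epochs) / batch_size
#   time  = steps / throughput (steps per second)
samples = 1000
epochs = 3
batch_size = 4
throughput = 2.0  # steps/second, illustrative

steps = samples * epochs / batch_size  # 750 steps
seconds = steps / throughput           # 375 s
print(f"~{steps:.0f} steps, ~{seconds / 60:.1f} minutes")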

Evaluation

# Evaluate fine-tuned model
evaluation = client.fine_tuning.jobs.evaluate(
    model="my-fine-tuned-model",
    test_file="file-test123",
    metrics=["perplexity", "accuracy", "f1"]
)

print(f"Perplexity: {evaluation.perplexity}")
print(f"Accuracy: {evaluation.accuracy}")

Multi-GPU Training

Available on Growth plan (D256 GPU tier):

job = client.fine_tuning.jobs.create(
    model="llama-70b",
    training_file="file-abc123",
    distributed_training={
        "strategy": "ddp",  # Distributed Data Parallel
        "num_gpus": 4
    },
    hyperparameters={
        "batch_size": 32,  # Total batch size across all GPUs
        "gradient_accumulation_steps": 8
    }
)
