Training & Fine-tuning
Achiral AI is a privacy-first AI platform for businesses, built on enterprise-grade, self-hosted, secure infrastructure. Train and fine-tune LLMs on your dedicated Chiro instance with complete training infrastructure, human-in-the-loop capabilities, custom fine-tuning workflows, and enterprise-grade model versioning.
Overview
Your Chiro instance supports:
- Fine-tuning: Adapt pre-trained models to your specific use case
- Human-in-the-Loop Training: Guide model behavior with human feedback
- Custom Training: Train models from scratch or continue pre-training
- Model Versioning: Track and manage model iterations
- Distributed Training: Multi-GPU training on the Growth plan
Quick Start
Basic Fine-tuning
from openai import OpenAI
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://abc123xyz.nano.achiral.ai/v1"
)
# Create fine-tuning job
job = client.fine_tuning.jobs.create(
    training_file="file-abc123",
    model="mistral-7b",
    hyperparameters={
        "n_epochs": 3,
        "batch_size": 4,
        "learning_rate_multiplier": 1.0
    }
)
print(f"Job ID: {job.id}")
Monitor Training
# Get job status
status = client.fine_tuning.jobs.retrieve(job.id)
print(f"Status: {status.status}")
print(f"Progress: {status.trained_tokens}/{status.training_file.tokens}")
# List all jobs
jobs = client.fine_tuning.jobs.list()
for job in jobs.data:
    print(f"{job.id}: {job.status}")
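For scripts that need to block until training finishes, a short polling loop on top of `jobs.retrieve` works well. The terminal status names below are assumptions; check them against the job states your instance actually reports.

```python
import time

def wait_for_job(client, job_id, poll_seconds=30):
    """Poll a fine-tuning job until it reaches a terminal state."""
    # Terminal status names are assumptions; verify against your instance.
    terminal = {"succeeded", "failed", "cancelled"}
    while True:
        job = client.fine_tuning.jobs.retrieve(job_id)
        if job.status in terminal:
            return job
        time.sleep(poll_seconds)
```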
Training Data Format
Chat Format (JSONL)
For instruction-following and chat models:
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is AI?"}, {"role": "assistant", "content": "AI stands for Artificial Intelligence..."}]}
{"messages": [{"role": "user", "content": "Explain machine learning"}, {"role": "assistant", "content": "Machine learning is a subset of AI..."}]}
Completion Format (JSONL)
For text completion models:
{"prompt": "Translate to French: Hello", "completion": " Bonjour"}
{"prompt": "Translate to French: Goodbye", "completion": " Au revoir"}
Preparing Training Data
import json
# Create training examples
examples = [
    {
        "messages": [
            {"role": "user", "content": "What is Python?"},
            {"role": "assistant", "content": "Python is a high-level programming language..."}
        ]
    },
    {
        "messages": [
            {"role": "user", "content": "What is JavaScript?"},
            {"role": "assistant", "content": "JavaScript is a scripting language..."}
        ]
    }
]
# Save to JSONL
with open('training_data.jsonl', 'w') as f:
    for example in examples:
        f.write(json.dumps(example) + '\n')
Uploading Training Files
# Upload file to Chiro instance
file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune"
)
print(f"File ID: {file.id}")
Data Quality Guidelines
- Minimum examples: 50-100 for basic fine-tuning
- Recommended: 500-1000 examples for production models
- Maximum size: Depends on GPU tier (1GB/10GB/100GB for D32/D128/D256)
- Format validation: Use the built-in validation tool
curl -X POST https://abc123xyz.nano.achiral.ai/v1/files/validate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@training_data.jsonl"
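A quick local sanity check before uploading can catch malformed lines early. This sketch only covers the chat format shown above:

```python
import json

def check_chat_jsonl(path):
    """Return a list of problems found in a chat-format JSONL file."""
    problems = []
    valid_roles = {"system", "user", "assistant"}
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append(f"line {lineno}: invalid JSON ({exc})")
                continue
            messages = record.get("messages")
            if not isinstance(messages, list) or not messages:
                problems.append(f"line {lineno}: missing 'messages' list")
                continue
            for msg in messages:
                if msg.get("role") not in valid_roles:
                    problems.append(f"line {lineno}: unknown role {msg.get('role')!r}")
                if not isinstance(msg.get("content"), str):
                    problems.append(f"line {lineno}: 'content' must be a string")
    return problems
```

The server-side validator remains the source of truth; this check just avoids uploading obviously broken files.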
Human-in-the-Loop Training
Achiral AI's unique HITL training allows you to guide model behavior through human feedback during and after training.
Feedback Collection
# Submit feedback on model output
client.feedback.create(
    model="my-fine-tuned-model",
    prompt="What is machine learning?",
    completion="Machine learning uses algorithms...",
    rating=4,  # 1-5 scale
    corrections="Add mention of neural networks",
    metadata={
        "user_id": "user123",
        "session_id": "sess456"
    }
)
Continuous Learning
Enable continuous improvement based on feedback:
# Create HITL training job
job = client.fine_tuning.jobs.create(
    model="my-fine-tuned-model",
    training_mode="hitl",
    feedback_source="last_30_days",
    auto_retrain=True,
    retrain_threshold=100  # Retrain after 100 feedback items
)
Feedback Dashboard
Monitor feedback and model improvements:
- Navigate to Training → Feedback
- View feedback analytics and trends
- Filter by rating, user, or time period
- Approve or reject corrections
- Trigger manual retraining
HITL Best Practices
- Collect diverse feedback: Ensure feedback covers different use cases
- Set quality thresholds: Only use high-quality feedback (rating ≥ 3)
- Regular retraining: Retrain weekly or after significant feedback volume
- A/B testing: Compare HITL-trained vs baseline models
Custom Fine-tuning
Advanced Hyperparameters
job = client.fine_tuning.jobs.create(
    model="mistral-7b",
    training_file="file-abc123",
    validation_file="file-def456",
    hyperparameters={
        "n_epochs": 5,
        "batch_size": 8,
        "learning_rate_multiplier": 0.5,
        "prompt_loss_weight": 0.1,
        "gradient_accumulation_steps": 4,
        "warmup_steps": 100,
        "weight_decay": 0.01,
        "max_grad_norm": 1.0
    },
    suffix="my-custom-model"
)
Hyperparameter Reference
| Parameter | Description | Default | Range |
|---|---|---|---|
| n_epochs | Training epochs | 3 | 1-20 |
| batch_size | Samples per batch | 4 | 1-128 |
| learning_rate_multiplier | LR scaling factor | 1.0 | 0.01-10.0 |
| prompt_loss_weight | Weight for prompt loss | 0.0 | 0.0-1.0 |
| gradient_accumulation_steps | Accumulation steps | 1 | 1-64 |
| warmup_steps | LR warmup steps | 0 | 0-1000 |
| weight_decay | L2 regularization | 0.01 | 0.0-0.1 |
| max_grad_norm | Gradient clipping | 1.0 | 0.1-10.0 |
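Note that batch_size and gradient_accumulation_steps interact: gradients are accumulated over several micro-batches before each optimizer update, so the effective batch size the optimizer sees is their product (times the GPU count on distributed jobs). A quick sketch:

```python
def effective_batch_size(batch_size, gradient_accumulation_steps, num_gpus=1):
    # Samples contributing to each optimizer update.
    return batch_size * gradient_accumulation_steps * num_gpus

# The "Advanced Hyperparameters" example above: 8 * 4 = 32 samples per update.
print(effective_batch_size(8, 4))  # 32
```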
Training Techniques
LoRA (Low-Rank Adaptation)
Efficient fine-tuning for large models:
job = client.fine_tuning.jobs.create(
    model="llama-70b",
    training_file="file-abc123",
    method="lora",
    lora_config={
        "r": 8,  # Rank
        "lora_alpha": 16,
        "lora_dropout": 0.05,
        "target_modules": ["q_proj", "v_proj"]
    }
)
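LoRA's efficiency comes from training only the low-rank factors: each adapted weight matrix of shape d_out × d_in gains an A (r × d_in) and a B (d_out × r) pair, so trainable parameters scale linearly with the rank r. The dimensions and layer count below are hypothetical, for illustration only:

```python
def lora_trainable_params(d_in, d_out, r, num_target_matrices):
    # Each adapted matrix gains A (r x d_in) plus B (d_out x r) parameters.
    per_matrix = r * d_in + d_out * r
    return per_matrix * num_target_matrices

# Hypothetical: 8192x8192 q_proj/v_proj in each of 80 layers, r=8.
# 2 matrices/layer * 80 layers = 160 target matrices.
print(lora_trainable_params(8192, 8192, 8, 160))  # 20971520, about 21M parameters
```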
QLoRA (Quantized LoRA)
Memory-efficient training with quantization:
job = client.fine_tuning.jobs.create(
    model="llama-70b",
    training_file="file-abc123",
    method="qlora",
    quantization="4bit",
    lora_config={
        "r": 16,
        "lora_alpha": 32
    }
)
Full Fine-tuning
Update all model parameters:
job = client.fine_tuning.jobs.create(
    model="mistral-7b",
    training_file="file-abc123",
    method="full",
    hyperparameters={
        "n_epochs": 3,
        "batch_size": 2,  # Smaller batch for memory
        "learning_rate_multiplier": 0.1
    }
)
Model Versioning
Automatic Versioning
Every training job creates a new model version:
# List model versions
versions = client.models.versions.list("my-model")
for version in versions.data:
    print(f"Version {version.version}: {version.created_at}")
    print(f"  Training job: {version.training_job_id}")
    print(f"  Metrics: {version.metrics}")
Version Metadata
# Get version details
version = client.models.versions.retrieve("my-model", version=3)
print(f"Created: {version.created_at}")
print(f"Base model: {version.base_model}")
print(f"Training samples: {version.training_samples}")
print(f"Validation loss: {version.validation_loss}")
Version Management
# Set active version
client.models.versions.set_active("my-model", version=3)
# Compare versions
comparison = client.models.versions.compare(
    "my-model",
    versions=[2, 3],
    test_file="file-test123"
)
print(f"Version 2 accuracy: {comparison.results[0].accuracy}")
print(f"Version 3 accuracy: {comparison.results[1].accuracy}")
# Rollback to previous version
client.models.versions.set_active("my-model", version=2)
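Since the comparison results pair positionally with the versions you pass in, promoting the stronger version can be scripted. This helper assumes each result exposes the accuracy field used in the example above:

```python
def pick_best_version(versions, results):
    """Return the version whose comparison result has the highest accuracy."""
    scored = zip(versions, results)
    best_version, _ = max(scored, key=lambda pair: pair[1].accuracy)
    return best_version
```

The winner could then be promoted with set_active as shown above.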
Version Tags
# Tag versions
client.models.versions.update(
    "my-model",
    version=3,
    tags=["production", "v1.2.0", "stable"]
)
# List by tag
production_version = client.models.versions.list(
    "my-model",
    tags=["production"]
)
Training Monitoring
Real-time Metrics
# Stream training metrics
for event in client.fine_tuning.jobs.stream_events(job.id):
    if event.type == "metrics":
        print(f"Step {event.step}: loss={event.loss:.4f}")
    elif event.type == "checkpoint":
        print(f"Checkpoint saved at step {event.step}")
    elif event.type == "completed":
        print("Training completed!")
Training Dashboard
Access via web interface:
- Navigate to Training → Jobs
- Click on job ID
- View real-time metrics:
  - Training loss
  - Validation loss
  - Learning rate
  - GPU utilization
  - Estimated time remaining
Metrics Export
# Export training metrics
metrics = client.fine_tuning.jobs.export_metrics(job.id)
import pandas as pd
df = pd.DataFrame(metrics)
df.to_csv('training_metrics.csv')
Checkpoints
Automatic Checkpointing
Checkpoints are saved automatically:
- Every epoch
- Every 1000 steps
- On best validation loss
- On training completion
Manual Checkpoints
# Create checkpoint
checkpoint = client.fine_tuning.jobs.create_checkpoint(
    job.id,
    description="Before hyperparameter change"
)
# List checkpoints
checkpoints = client.fine_tuning.jobs.list_checkpoints(job.id)
for cp in checkpoints:
    print(f"Step {cp.step}: {cp.description}")
Resume from Checkpoint
# Resume training from checkpoint
job = client.fine_tuning.jobs.create(
    model="mistral-7b",
    training_file="file-abc123",
    resume_from_checkpoint="checkpoint-12345"
)
Training Best Practices
Data Preparation
- Clean data: Remove duplicates, errors, formatting issues
- Balanced dataset: Include diverse examples
- Validation split: Use 10-20% for validation
- Test separately: Keep separate test set for final evaluation
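A simple shuffle-and-slice over the JSONL file covers the 10-20% validation split; the fixed seed makes the split reproducible. File paths and the split fraction are illustrative:

```python
import random

def split_jsonl(path, train_path, val_path, val_fraction=0.1, seed=0):
    """Shuffle a JSONL file and write train/validation splits."""
    with open(path) as f:
        lines = [line for line in f if line.strip()]
    random.Random(seed).shuffle(lines)
    n_val = max(1, int(len(lines) * val_fraction))
    with open(val_path, "w") as f:
        f.writelines(lines[:n_val])
    with open(train_path, "w") as f:
        f.writelines(lines[n_val:])
    return len(lines) - n_val, n_val
```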
Hyperparameter Tuning
- Start conservative: Use default hyperparameters first
- Learning rate: Most important hyperparameter; start with an effective rate of 1e-5 to 5e-5
- Batch size: Increase until GPU memory is ~80% used
- Epochs: Start with 3-5, increase if underfitting
Monitoring
- Watch validation loss: Stop if it increases (overfitting)
- GPU utilization: Should be >80% during training
- Training time: Estimate as (samples × epochs) / (batch_size × throughput), with throughput in steps per second
- Cost tracking: Monitor GPU hours and storage
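The training-time estimate above works out as follows, with throughput measured in optimizer steps per second (all numbers are illustrative):

```python
def estimate_training_hours(samples, epochs, batch_size, steps_per_second):
    # Total optimizer steps, then convert steps -> seconds -> hours.
    total_steps = (samples * epochs) / batch_size
    return total_steps / steps_per_second / 3600

# e.g. 1000 samples, 3 epochs, batch size 4, 0.5 steps/s:
print(f"{estimate_training_hours(1000, 3, 4, 0.5):.2f} hours")  # 0.42 hours
```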
Evaluation
# Evaluate fine-tuned model
evaluation = client.fine_tuning.jobs.evaluate(
    model="my-fine-tuned-model",
    test_file="file-test123",
    metrics=["perplexity", "accuracy", "f1"]
)
print(f"Perplexity: {evaluation.perplexity}")
print(f"Accuracy: {evaluation.accuracy}")
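Perplexity relates directly to the loss curves reported during training: it is exp of the mean per-token cross-entropy loss (in nats), so a drop in validation loss from 2.0 to 1.5 means perplexity falls from roughly 7.4 to 4.5:

```python
import math

def perplexity_from_loss(mean_token_loss):
    # Perplexity = exp(mean cross-entropy loss in nats per token).
    return math.exp(mean_token_loss)

print(round(perplexity_from_loss(2.0), 2))  # 7.39
print(round(perplexity_from_loss(1.5), 2))  # 4.48
```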
Multi-GPU Training
Available on the Growth plan (D256 GPU tier):
job = client.fine_tuning.jobs.create(
    model="llama-70b",
    training_file="file-abc123",
    distributed_training={
        "strategy": "ddp",  # Distributed Data Parallel
        "num_gpus": 4
    },
    hyperparameters={
        "batch_size": 32,  # Total batch size across GPUs
        "gradient_accumulation_steps": 8
    }
)
Next Steps
- API Reference - Training API endpoints
- Configuration - GPU and storage configuration
- Security & Compliance - Data security during training
Learn more
- Explore Features: https://achiral.ai/features
- View Pricing: https://achiral.ai/pricing