Reviewed 2026-06-03 · Version 3.9.0

Training & Fine-tuning

Model adaptation, fine-tuning jobs, evaluation, deployment, and training infrastructure.

Train and fine-tune LLMs on your dedicated Chiro instance. The instance provides training infrastructure, human-in-the-loop feedback, custom fine-tuning workflows, and model versioning.

Overview

Your Chiro instance supports:

  • Fine-tuning: Adapt pre-trained models to your specific use case
  • Human-in-the-Loop Training: Guide model behavior with human feedback
  • Custom Training: Train models from scratch or continue pre-training
  • Model Versioning: Track and manage model iterations
  • Distributed Training: Multi-GPU training when included in the deployment

Quick Start

Basic Fine-tuning

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://abc123xyz.nano.achiral.ai/v1"
)

# Create fine-tuning job
job = client.fine_tuning.jobs.create(
    training_file="file-abc123",
    model="mistral-7b",
    hyperparameters={
        "n_epochs": 3,
        "batch_size": 4,
        "learning_rate_multiplier": 1.0
    }
)

print(f"Job ID: {job.id}")

Monitor Training

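# Get job status
status = client.fine_tuning.jobs.retrieve(job.id)
print(f"Status: {status.status}")
# training_file is a file ID string, so report the trained token count on its own;
# trained_tokens may be None until the job reports progress
print(f"Trained tokens: {status.trained_tokens}")

# List all jobs (a separate loop variable keeps `job` from being overwritten)
jobs = client.fine_tuning.jobs.list()
for listed_job in jobs.data:
    print(f"{listed_job.id}: {listed_job.status}")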
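
To wait for a job to finish rather than checking it by hand, the status call above can be wrapped in a polling loop. The sketch below is one way to do that. The helper name `wait_for_job`, the polling interval, and the set of terminal statuses (borrowed from the OpenAI fine-tuning API) are assumptions, not documented Chiro behavior.

```python
import time

# Assumed terminal states, borrowed from the OpenAI fine-tuning API.
# A Chiro instance may report different values.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(client, job_id, poll_seconds=30.0, max_polls=1000, sleep=time.sleep):
    """Poll a fine-tuning job until it reaches a terminal status.

    `client` is any object exposing client.fine_tuning.jobs.retrieve(job_id).
    `sleep` is injectable so the loop can be exercised without real delays.
    """
    for _ in range(max_polls):
        job = client.fine_tuning.jobs.retrieve(job_id)
        if job.status in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} still running after {max_polls} polls")
```

Typical use would be `final = wait_for_job(client, job.id)` followed by a check of `final.status`.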

Training Data Format

Chat Format (JSONL)

For instruction-following and chat models:

{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is AI?"}, {"role": "assistant", "content": "AI stands for Artificial Intelligence..."}]}
{"messages": [{"role": "user", "content": "Explain machine learning"}, {"role": "assistant", "content": "Machine learning is a subset of AI..."}]}

Completion Format (JSONL)

For text completion models:

{"prompt": "Translate to French: Hello", "completion": " Bonjour"}
{"prompt": "Translate to French: Goodbye", "completion": " Au revoir"}
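
Before uploading, it can help to check locally that every line matches one of the two formats above. The following is a minimal sketch of such a check, not the instance's own validator. The function names are hypothetical, and the rule that a chat example should end with an assistant message is an assumption.

```python
import json

VALID_ROLES = {"system", "user", "assistant"}


def validate_record(record):
    """Return a list of problems found in one parsed JSONL record."""
    problems = []
    if "messages" in record:
        messages = record["messages"]
        if not isinstance(messages, list) or not messages:
            return ["'messages' must be a non-empty list"]
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict):
                problems.append(f"message {i} is not an object")
                continue
            if msg.get("role") not in VALID_ROLES:
                problems.append(f"message {i} has unknown role {msg.get('role')!r}")
            if not isinstance(msg.get("content"), str):
                problems.append(f"message {i} is missing string 'content'")
        # Assumption: a chat example should end with the assistant's reply.
        last = messages[-1]
        if isinstance(last, dict) and last.get("role") != "assistant":
            problems.append("last message should come from the assistant")
    elif "prompt" in record or "completion" in record:
        for key in ("prompt", "completion"):
            if not isinstance(record.get(key), str):
                problems.append(f"missing string '{key}'")
    else:
        problems.append("expected 'messages' or 'prompt'/'completion'")
    return problems


def validate_jsonl(lines):
    """Map 1-based line numbers to problems; clean lines are omitted."""
    report = {}
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            report[number] = [f"invalid JSON: {exc.msg}"]
            continue
        if not isinstance(record, dict):
            report[number] = ["record must be a JSON object"]
            continue
        problems = validate_record(record)
        if problems:
            report[number] = problems
    return report
```

Passing an open file works as well, for example `validate_jsonl(open("training_data.jsonl", encoding="utf-8"))`; an empty result means no problems were found.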

Preparing Training Data

import json

# Create training examples
examples = [
    {
        "messages": [
            {"role": "user", "content": "What is Python?"},
            {"role": "assistant", "content": "Python is a high-level programming language..."}
        ]
    },
    {
        "messages": [
            {"role": "user", "content": "What is JavaScript?"},
            {"role": "assistant", "content": "JavaScript is a scripting language..."}
        ]
    }
]

# Save to JSONL
with open('training_data.jsonl', 'w') as f:
    for example in examples:
        f.write(json.dumps(example) + '\n')

Uploading Training Files

# Upload file to Chiro instance (the with-block closes the file handle after upload)
with open("training_data.jsonl", "rb") as training_data:
    file = client.files.create(
        file=training_data,
        purpose="fine-tune"
    )

print(f"File ID: {file.id}")

Data Quality Guidelines

  • Minimum examples: 50-100 for basic fine-tuning
  • Recommended: 500-1000 examples for production models
  • Maximum size: Depends on GPU tier (1GB/10GB/100GB for D32/D128/D256)
  • Format validation: Use the built-in validation tool:

curl -X POST https://abc123xyz.nano.achiral.ai/v1/files/validate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@training_data.jsonl"
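
The size guidelines above can also be checked locally before an upload. This is a rough sketch: the helper names are hypothetical, the thresholds use the lower bound of each range (50 and 500 examples), and it assumes 1 GB means 10^9 bytes, which the guidelines do not specify.

```python
import os

# Upload limits per GPU tier from the guidelines above.
# Assumption: 1 GB is taken as 10**9 bytes.
TIER_LIMIT_BYTES = {"D32": 1 * 10**9, "D128": 10 * 10**9, "D256": 100 * 10**9}

MIN_EXAMPLES = 50           # lower bound for basic fine-tuning
RECOMMENDED_EXAMPLES = 500  # lower bound recommended for production models


def count_examples(path):
    """Count non-blank lines in a JSONL file (one example per line)."""
    with open(path, "r", encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def check_dataset(path, tier):
    """Summarize example count and file size against the guidelines."""
    n = count_examples(path)
    size = os.path.getsize(path)
    if n < MIN_EXAMPLES:
        readiness = "too small"
    elif n < RECOMMENDED_EXAMPLES:
        readiness = "basic"
    else:
        readiness = "production"
    return {
        "examples": n,
        "bytes": size,
        "fits_tier": size <= TIER_LIMIT_BYTES[tier],
        "readiness": readiness,
    }
```

For example, `check_dataset("training_data.jsonl", "D32")` returns a small dictionary that can be logged before the upload step.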

Human-in-the-Loop Training

Achiral AI's HITL training lets you guide model behavior through human feedback, both during and after training.

Feedback Collection

# Submit feedback on model output
client.feedback.create(
    model="my-fine-tuned-model",
    prompt="What is machine learning?",
    completion="Machine learning uses algorithms...",
    rating=4,  # 1-5 scale
    corrections="Add mention of neural networks",
    metadata={
        "user_id": "user123",
        "session_id": "sess456"
    }
)

Continuous Learning

Enable continuous improvement based on feedback:

# Create HITL training job
job = client.fine_tuning.jobs.create(
    model="my-fine-tuned-model",
    training_mode="hitl",
    feedback_source="last_30_days",
    auto_retrain=True,
    retrain_threshold=100  # Retrain after 100 feedback items
)

Feedback Dashboard

Monitor feedback and model improvements:

  1. Navigate to Training → Feedback
  2. View feedback analytics and trends
  3. Filter by rating, user, or time period
  4. Approve or reject corrections
  5. Trigger manual retraining

HITL Best Practices

  • Collect diverse feedback: Ensure feedback covers different use cases
  • Set quality thresholds: Only use high-quality feedback (rating ≥ 3)
  • Regular retraining: Retrain weekly or after significant feedback volume
  • A/B testing: Compare HITL-trained vs baseline models
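
The quality threshold and retraining trigger described above can be expressed as two small helpers. This is an illustrative sketch only: the function names and the dictionary shape of a feedback item are hypothetical, and it assumes that only feedback passing the rating filter counts toward the retraining threshold.

```python
def select_feedback(items, min_rating=3):
    """Keep feedback items rated at or above min_rating (1-5 scale)."""
    return [item for item in items if item["rating"] >= min_rating]


def should_retrain(items, min_rating=3, retrain_threshold=100):
    """True once enough qualifying feedback has accumulated.

    Assumption: only items that pass the rating filter count toward
    the threshold.
    """
    return len(select_feedback(items, min_rating)) >= retrain_threshold
```

With the threshold of 100 used in the continuous-learning example, 120 items of which 90 are rated 3 or higher would not yet trigger retraining under this assumption.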

Custom Fine-tuning

Advanced Hyperparameters

job = client.fine_tuning.jobs.create(
    model="mistral-7b",
    training_file="file-abc123",
    validation_file="file-def456",
    hyperparameters={
        "n_epochs": 5,
        "batch_size": 8,
        "learning_rate_multiplier": 0.5,
        "prompt_loss_weight": 0.1,
        "gradient_accumulation_steps": 4,
        "warmup_steps": 100,
        "weight_decay": 0.01,
        "max_grad_norm": 1.0
    },
    suffix="my-custom-model"
)

Hyperparameter Reference

Parameter                    Description             Default  Range
n_epochs                     Training epochs         3        1-20
batch_size                   Samples per batch       4        1-128
learning_rate_multiplier     LR scaling factor       1.0      0.01-10.0
prompt_loss_weight           Weight for prompt loss  0.0      0.0-1.0
gradient_accumulation_steps  Accumulation steps      1        1-64
warmup_steps                 LR warmup steps         0        0-1000
weight_decay                 L2 regularization       0.01     0.0-0.1
max_grad_norm                Gradient clipping       1.0      0.1-10.0
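
The ranges in the table can be checked client-side before a job is submitted. The sketch below copies the ranges from the table; the helper `validate_hyperparameters` is hypothetical and not part of the API.

```python
# Allowed ranges (inclusive), copied from the reference table.
HYPERPARAMETER_RANGES = {
    "n_epochs": (1, 20),
    "batch_size": (1, 128),
    "learning_rate_multiplier": (0.01, 10.0),
    "prompt_loss_weight": (0.0, 1.0),
    "gradient_accumulation_steps": (1, 64),
    "warmup_steps": (0, 1000),
    "weight_decay": (0.0, 0.1),
    "max_grad_norm": (0.1, 10.0),
}


def validate_hyperparameters(hyperparameters):
    """Return a list of error strings; an empty list means all values are in range."""
    errors = []
    for name, value in hyperparameters.items():
        if name not in HYPERPARAMETER_RANGES:
            errors.append(f"{name}: unknown hyperparameter")
            continue
        low, high = HYPERPARAMETER_RANGES[name]
        if not low <= value <= high:
            errors.append(f"{name}: {value} is outside {low}-{high}")
    return errors
```

Calling it on the dictionary that will be passed as `hyperparameters` catches out-of-range values without spending a job submission.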

Training Techniques

LoRA (Low-Rank Adaptation)

Efficient fine-tuning for large models:

job = client.fine_tuning.jobs.create(
    model="llama-70b",
    training_file="file-abc123",
    method="lora",
    lora_config={
        "r": 8,  # Rank
        "lora_alpha": 16,
        "lora_dropout": 0.05,
        "target_modules": ["q_proj", "v_proj"]
    }
)
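
To see why LoRA is efficient, it helps to count the weights it trains. For one weight matrix of shape d_out × d_in, LoRA adds two low-rank factors with r × (d_in + d_out) weights in total, and the update is scaled by lora_alpha / r in the standard formulation. The hidden size of 4096 below is a hypothetical value chosen for illustration, not a property of any model named in this page.

```python
def lora_trainable_params(d_in, d_out, r):
    """Weights added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def lora_scaling(lora_alpha, r):
    """Scaling applied to the low-rank update in standard LoRA."""
    return lora_alpha / r


hidden = 4096  # hypothetical projection size, for illustration only
full = hidden * hidden
adapter = lora_trainable_params(hidden, hidden, r=8)

print(f"Full matrix weights: {full}")
print(f"LoRA adapter weights: {adapter}")
print(f"Fraction trained: {adapter / full:.4%}")
print(f"Scaling (alpha=16, r=8): {lora_scaling(16, 8)}")
```

Doubling `r` doubles the adapter size, which is the main trade-off between capacity and memory when choosing the rank.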

QLoRA (Quantized LoRA)

Memory-efficient training with quantization:

job = client.fine_tuning.jobs.create(
    model="llama-70b",
    training_file="file-abc123",
    method="qlora",
    quantization="4bit",
    lora_config={
        "r": 16,
        "lora_alpha": 32
    }
)
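
The memory saving from 4-bit quantization can be estimated with simple arithmetic on the base weights. The sketch below assumes a model with 70 billion parameters and counts only the frozen weights; quantization constants, adapter weights, activations, and optimizer state are not included, so real usage will be higher.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for model weights alone, in GB (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


num_params = 70e9  # assumption: a 70-billion-parameter model

fp16_gb = weight_memory_gb(num_params, 16)
int4_gb = weight_memory_gb(num_params, 4)

print(f"16-bit weights: {fp16_gb:.0f} GB")
print(f"4-bit weights:  {int4_gb:.0f} GB")
```

Under these assumptions the frozen weights shrink by a factor of four, which is what makes it feasible to fine-tune large models on smaller GPU tiers.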

Full Fine-tuning

Update all model parameters:

job = client.fine_tuning.jobs.create(
    model="mistral-7b",
    training_file="file-abc123",
    method="full",
    hyperparameters={
        "n_epochs": 3,
        "batch_size": 2,  # Smaller batch for memory
        "learning_rate_multiplier": 0.1
    }
)

Model Versioning

Automatic Versioning

Every training job creates a new model version:

# List model versions
versions = client.models.versions.list("my-model")

for version in versions.data:
    print(f"Version {version.version}: {version.created_at}")
    print(f"  Training job: {version.training_job_id}")
    print(f"  Metrics: {version.metrics}")

Version Metadata

# Get version details
version = client.models.versions.retrieve("my-model", version=3)

print(f"Created: {version.created_at}")
print(f"Base model: {version.base_model}")
print(f"Training samples: {version.training_samples}")
print(f"Validation loss: {version.validation_loss}")

Version Management

# Set active version
client.models.versions.set_active("my-model", version=3)

# Compare versions
comparison = client.models.versions.compare(
    "my-model",
    versions=[2, 3],
    test_file="file-test123"
)

print(f"Version 2 accuracy: {comparison.results[0].accuracy}")
print(f"Version 3 accuracy: {comparison.results[1].accuracy}")

# Rollback to previous version
client.models.versions.set_active("my-model", version=2)
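
A promotion rule can make the compare-then-activate step less ad hoc. The sketch below is a hypothetical helper that works on plain dictionaries rather than the comparison object, and the minimum gain of 0.01 accuracy is an arbitrary illustrative threshold.

```python
def pick_version(results, current_version, min_gain=0.01):
    """Choose which version should be active.

    `results` is a list of dicts with 'version' and 'accuracy' keys
    (a hypothetical shape, for illustration). The best version is
    promoted only if it beats the current one by at least `min_gain`.
    """
    current = next(r for r in results if r["version"] == current_version)
    best = max(results, key=lambda r: r["accuracy"])
    if best["accuracy"] - current["accuracy"] >= min_gain:
        return best["version"]
    return current_version
```

The returned number could then be passed to `set_active`, keeping the current version whenever the improvement is too small to justify a switch.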

Version Tags

# Tag versions
client.models.versions.update(
    "my-model",
    version=3,
    tags=["production", "v1.2.0", "stable"]
)

# List by tag
production_version = client.models.versions.list(
    "my-model",
    tags=["production"]
)

Training Monitoring

Real-time Metrics

# Stream training metrics
for event in client.fine_tuning.jobs.stream_events(job.id):
    if event.type == "metrics":
        print(f"Step {event.step}: loss={event.loss:.4f}")
    elif event.type == "checkpoint":
        print(f"Checkpoint saved at step {event.step}")
    elif event.type == "completed":
        print("Training completed!")

Training Dashboard

Access via web interface:

  1. Navigate to Training → Jobs
  2. Click on job ID
  3. View real-time metrics:
    • Training loss
    • Validation loss
    • Learning rate
    • GPU utilization
    • Estimated time remaining

Metrics Export

import pandas as pd

# Export training metrics
metrics = client.fine_tuning.jobs.export_metrics(job.id)

# Load the exported metrics into a DataFrame and save them as CSV
df = pd.DataFrame(metrics)
df.to_csv('training_metrics.csv')

Checkpoints

Automatic Checkpointing

Checkpoints are saved automatically:

  • Every epoch
  • Every 1000 steps
  • On best validation loss
  • On training completion
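
The schedule above can be previewed for a planned run. The sketch below lists the steps at which the step-based, epoch-based, and completion checkpoints would fall; checkpoints triggered by the best validation loss depend on the run itself and are not predicted. The function name is hypothetical, and it assumes "every epoch" means the last step of each epoch.

```python
def scheduled_checkpoint_steps(steps_per_epoch, n_epochs, interval=1000):
    """Steps at which automatic checkpoints are expected, in ascending order."""
    total_steps = steps_per_epoch * n_epochs
    # Every `interval` steps
    steps = set(range(interval, total_steps + 1, interval))
    # End of every epoch (assumed to be the epoch's final step)
    steps.update(steps_per_epoch * epoch for epoch in range(1, n_epochs + 1))
    # Training completion
    steps.add(total_steps)
    return sorted(steps)


print(scheduled_checkpoint_steps(steps_per_epoch=1500, n_epochs=2))
```

Triggers that coincide, such as an epoch boundary landing on a multiple of 1000, appear once in the list.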

Manual Checkpoints

# Create checkpoint
checkpoint = client.fine_tuning.jobs.create_checkpoint(
    job.id,
    description="Before hyperparameter change"
)

# List checkpoints
checkpoints = client.fine_tuning.jobs.list_checkpoints(job.id)

for cp in checkpoints:
    print(f"Step {cp.step}: {cp.description}")

Resume from Checkpoint

# Resume training from checkpoint
job = client.fine_tuning.jobs.create(
    model="mistral-7b",
    training_file="file-abc123",
    resume_from_checkpoint="checkpoint-12345"
)

Training Best Practices

Data Preparation

  1. Clean data: Remove duplicates, errors, formatting issues
  2. Balanced dataset: Include diverse examples
  3. Validation split: Use 10-20% for validation
  4. Test separately: Keep separate test set for final evaluation
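
The preparation steps above can be sketched as a single function that removes exact duplicates and then splits the data. The function name and default fractions are illustrative choices; duplicates are detected by comparing the canonical JSON form of each example, which only catches exact matches.

```python
import json
import random


def split_dataset(examples, validation_fraction=0.1, test_fraction=0.1, seed=0):
    """Deduplicate examples, shuffle them, and split into train/validation/test."""
    seen = set()
    unique = []
    for example in examples:
        key = json.dumps(example, sort_keys=True)  # canonical form for comparison
        if key not in seen:
            seen.add(key)
            unique.append(example)

    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(unique)

    n_validation = round(len(unique) * validation_fraction)
    n_test = round(len(unique) * test_fraction)
    validation = unique[:n_validation]
    test = unique[n_validation:n_validation + n_test]
    train = unique[n_validation + n_test:]
    return train, validation, test
```

Each split can then be written to its own JSONL file and uploaded separately, with the test file held back until final evaluation.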

Hyperparameter Tuning

  1. Start conservative: Use default hyperparameters first
  2. Learning rate: The most influential setting; start with 1e-5 to 5e-5
  3. Batch size: Increase until GPU memory is ~80% used
  4. Epochs: Start with 3-5, increase if underfitting

Monitoring

  1. Watch validation loss: Stop if it increases (overfitting)
  2. GPU utilization: Should be >80% during training
  3. Training time: Estimate as (samples × epochs) / (batch_size × throughput)
  4. Cost tracking: Monitor GPU hours and storage
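
Two of the monitoring rules above lend themselves to a short worked example. In the sketch below, throughput is assumed to be measured in optimizer steps per second so that the estimate comes out in seconds, and the `patience` parameter is an added assumption that tolerates a few noisy evaluations before stopping. Both function names are hypothetical.

```python
def estimate_training_seconds(samples, epochs, batch_size, steps_per_second):
    """(samples x epochs) / (batch_size x throughput), with throughput in steps/s."""
    return (samples * epochs) / (batch_size * steps_per_second)


def should_stop_early(validation_losses, patience=2):
    """True when the last `patience` evaluations did not beat the earlier best loss."""
    if len(validation_losses) <= patience:
        return False
    best_before = min(validation_losses[:-patience])
    return min(validation_losses[-patience:]) >= best_before


# 10,000 samples, 3 epochs, batch size 4, 2 steps per second
seconds = estimate_training_seconds(10_000, 3, 4, 2.0)
print(f"Estimated training time: {seconds / 60:.1f} minutes")
```

With `patience=2`, a loss history of 1.0, 0.8, 0.7, 0.75, 0.78 signals a stop because neither of the last two evaluations improved on 0.7.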

Evaluation

# Evaluate fine-tuned model
evaluation = client.fine_tuning.jobs.evaluate(
    model="my-fine-tuned-model",
    test_file="file-test123",
    metrics=["perplexity", "accuracy", "f1"]
)

print(f"Perplexity: {evaluation.perplexity}")
print(f"Accuracy: {evaluation.accuracy}")

Multi-GPU Training

Available when the deployment includes the required GPU tier:

job = client.fine_tuning.jobs.create(
    model="llama-70b",
    training_file="file-abc123",
    distributed_training={
        "strategy": "ddp",  # Distributed Data Parallel
        "num_gpus": 4
    },
    hyperparameters={
        "batch_size": 32,  # Total batch size across GPUs
        "gradient_accumulation_steps": 8
    }
)
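
The batch settings in this example interact. Following the comment in the example, the sketch below treats `batch_size` as the total across all GPUs per step, so each GPU processes an equal share, and assumes gradient accumulation multiplies that total to give the effective batch per optimizer update. The helper names are hypothetical.

```python
def per_gpu_batch_size(total_batch_size, num_gpus):
    """Share of the total batch handled by each GPU under data parallelism."""
    if total_batch_size % num_gpus != 0:
        raise ValueError("total batch size must divide evenly across GPUs")
    return total_batch_size // num_gpus


def effective_batch_size(total_batch_size, gradient_accumulation_steps):
    """Samples contributing to each optimizer update."""
    return total_batch_size * gradient_accumulation_steps


print(per_gpu_batch_size(32, 4))    # samples per GPU per step
print(effective_batch_size(32, 8))  # samples per optimizer update
```

Under these assumptions, the configuration above places 8 samples on each GPU per step and updates the weights once every 256 samples.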
