Operator Guides

These guides are for platform administrators who install, configure, and operate Kubeflow Trainer on production Kubernetes clusters.


Installation & Migration

Installation

Install Kubeflow Trainer using kubectl or Helm

Installation

Migration from v1

Migrate from Kubeflow Training Operator v1 to Trainer v2

Migrating to Kubeflow Trainer v2

Configuration

Training Runtimes

Configure TrainingRuntime and ClusterTrainingRuntime resources

Runtime Guide

ML Policies

Define ML-specific policies for training workloads

ML Policy

Job Templates

Customize job templates for different frameworks

Job Template

Runtime Patches

Customize training runtime configuration with RuntimePatches

Runtime Patches

Advanced Configuration

Extension Framework

Understand the plugin-based extension architecture

Kubeflow Trainer Extension Framework

Job Scheduling

Integrate with Volcano, Kueue, Coscheduling, and KAI Scheduler

Job Scheduling