Learning how to incorporate 🤗 Accelerate features quickly!
Please use the interactive tool below to get started learning about a particular feature of 🤗 Accelerate and how to utilize it! It will provide you with a code diff, an explanation of what is going on, and some useful links to explore further within the documentation!
Most code examples start from the following Python training loop before integrating 🤗 Accelerate in some way:
for batch in dataloader:
    optimizer.zero_grad()
    inputs, targets = batch
    inputs = inputs.to(device)
    targets = targets.to(device)
    outputs = model(inputs)
    loss = loss_function(outputs, targets)
    loss.backward()
    optimizer.step()
    scheduler.step()
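
As a minimal sketch of the kind of code diff the tool produces, here is roughly what that loop looks like after the most basic 🤗 Accelerate integration (assuming model, optimizer, dataloader, scheduler, and loss_function are already defined as above; see the Migrating to 🤗 Accelerate tutorial for the full walkthrough):

from accelerate import Accelerator

accelerator = Accelerator()

# prepare() wraps each object for the current distributed setup
# and moves it to the right device automatically
model, optimizer, dataloader, scheduler = accelerator.prepare(
    model, optimizer, dataloader, scheduler
)

for batch in dataloader:
    optimizer.zero_grad()
    inputs, targets = batch
    # no .to(device) calls needed; prepare() handles device placement
    outputs = model(inputs)
    loss = loss_function(outputs, targets)
    # accelerator.backward() replaces loss.backward() so gradients are
    # handled correctly (e.g. scaled for mixed precision) on any setup
    accelerator.backward(loss)
    optimizer.step()
    scheduler.step()

The same script can then be run unchanged on a single GPU, multiple GPUs, or TPUs via the accelerate launch command.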