Utilities for Fully Sharded Data Parallelism
class accelerate.FullyShardedDataParallelPlugin
( sharding_strategy: typing.Any = None, backward_prefetch: typing.Any = None, mixed_precision_policy: typing.Any = None, auto_wrap_policy: Optional = None, cpu_offload: typing.Any = None, ignored_modules: Optional = None, state_dict_type: typing.Any = None, state_dict_config: typing.Any = None, optim_state_dict_config: typing.Any = None, limit_all_gathers: bool = True, use_orig_params: bool = True, param_init_fn: Optional = None, sync_module_states: bool = True, forward_prefetch: bool = False, activation_checkpointing: bool = False )
This plugin is used to enable fully sharded data parallelism via PyTorch FSDP.
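A minimal sketch of how the plugin is typically constructed and handed to the Accelerator. The state-dict configuration values shown here are illustrative choices, not required defaults:

```python
from accelerate import Accelerator, FullyShardedDataParallelPlugin
from torch.distributed.fsdp.fully_sharded_data_parallel import (
    FullOptimStateDictConfig,
    FullStateDictConfig,
)

# Configure how model and optimizer state dicts are gathered when saving.
fsdp_plugin = FullyShardedDataParallelPlugin(
    state_dict_config=FullStateDictConfig(offload_to_cpu=False, rank0_only=False),
    optim_state_dict_config=FullOptimStateDictConfig(offload_to_cpu=False, rank0_only=False),
)

# The Accelerator picks up the plugin; FSDP wrapping then happens inside
# accelerator.prepare(...) as usual.
accelerator = Accelerator(fsdp_plugin=fsdp_plugin)
```

Unset fields fall back to values read from the environment (e.g. those written by `accelerate config`), so only the options you want to override need to be passed explicitly.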
get_module_class_from_name
( module, name )
Gets a class from a module by its name, searching the module and its children recursively.
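A short illustration, assuming `get_module_class_from_name` is importable from `accelerate.utils`. The toy `Block` and `Model` classes below are placeholders for, e.g., a transformer model whose decoder-layer class name you want to resolve for an FSDP auto-wrap policy:

```python
import torch.nn as nn

from accelerate.utils import get_module_class_from_name


# Hypothetical stand-ins for a real model hierarchy.
class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)


class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.blocks = nn.ModuleList([Block(), Block()])


model = Model()

# Resolve the class object from its name by walking the module tree.
block_cls = get_module_class_from_name(model, "Block")
print(block_cls)  # <class '__main__.Block'>
```

The returned class is what FSDP's transformer auto-wrap policy expects, which is why this helper is useful when wrapping layers identified only by name (such as `"LlamaDecoderLayer"`) in a config file.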