docs: fix removed ref to `deepspeed.initialize` (#20353)
* docs: fix removed ref to `deepspeed.initialize`
* fix links
parent af19dda05c
commit 0e1e14f815
@@ -52,7 +52,7 @@ Example:
     model = WeightSharingModule()
     trainer = Trainer(max_epochs=1, accelerator="tpu")
 
-See `XLA Documentation <https://github.com/pytorch/xla/blob/master/TROUBLESHOOTING.md#xla-tensor-quirks>`_
+See `XLA Documentation <https://github.com/pytorch/xla/blob/v2.5.0/TROUBLESHOOTING.md#xla-tensor-quirks>`_
 
 ----
 
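For reference, the documentation example this hunk sits in constructs a weight-sharing module and trains it on TPU. A minimal sketch is below; only the last two lines appear in the diff, and the layer sizes, forward pass, and optimizer are assumptions for illustration:

```python
import torch
import torch.nn as nn
from lightning.pytorch import LightningModule, Trainer


class WeightSharingModule(LightningModule):
    """Sketch of a module with tied weights (layout is an assumption)."""

    def __init__(self):
        super().__init__()
        self.layer_1 = nn.Linear(32, 10, bias=False)
        self.layer_2 = nn.Linear(10, 32, bias=False)
        self.layer_3 = nn.Linear(32, 10, bias=False)
        # Tie layer_3's weight to layer_1's: the "weight sharing" the docs discuss,
        # which behaves differently once parameters are moved to the XLA device.
        self.layer_3.weight = self.layer_1.weight

    def forward(self, x):
        return self.layer_3(self.layer_2(self.layer_1(x)))

    def training_step(self, batch, batch_idx):
        return self.forward(batch).sum()

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


model = WeightSharingModule()
trainer = Trainer(max_epochs=1, accelerator="tpu")
```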
@@ -61,4 +61,4 @@ XLA
 XLA is the library that interfaces PyTorch with the TPUs.
 For more information check out `XLA <https://github.com/pytorch/xla>`_.
 
-Guide for `troubleshooting XLA <https://github.com/pytorch/xla/blob/master/TROUBLESHOOTING.md>`_
+Guide for `troubleshooting XLA <https://github.com/pytorch/xla/blob/v2.5.0/TROUBLESHOOTING.md>`_
@@ -108,7 +108,7 @@ There are cases in which training on TPUs is slower when compared with GPUs, for
 - XLA Graph compilation during the initial steps `Reference <https://github.com/pytorch/xla/issues/2383#issuecomment-666519998>`_
 - Some tensor ops are not fully supported on TPU, or not supported at all. These operations will be performed on CPU (context switch).
 
-The official PyTorch XLA `performance guide <https://github.com/pytorch/xla/blob/master/TROUBLESHOOTING.md#known-performance-caveats>`_
+The official PyTorch XLA `performance guide <https://github.com/pytorch/xla/blob/v2.5.0/TROUBLESHOOTING.md#known-performance-caveats>`_
 has more detailed information on how PyTorch code can be optimized for TPU. In particular, the
-`metrics report <https://github.com/pytorch/xla/blob/master/TROUBLESHOOTING.md#get-a-metrics-report>`_ allows
+`metrics report <https://github.com/pytorch/xla/blob/v2.5.0/TROUBLESHOOTING.md#get-a-metrics-report>`_ allows
 one to identify operations that lead to context switching.
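The metrics report this hunk links to comes from torch_xla's debug helpers. A small sketch, assuming torch_xla is installed and the script already runs on an XLA device:

```python
import torch_xla.debug.metrics as met

# After a few training steps, print the counters and timers the troubleshooting
# guide describes (CompileTime, ExecuteTime, aten::* CPU-fallback ops, ...).
print(met.metrics_report())
```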
@@ -78,7 +78,7 @@ A lot of PyTorch operations aren't lowered to XLA, which could lead to significa
 These operations are moved to the CPU memory and evaluated, and then the results are transferred back to the XLA device(s).
 By using the `xla_debug` Strategy, users could create a metrics report to diagnose issues.
 
-The report includes things like (`XLA Reference <https://github.com/pytorch/xla/blob/master/TROUBLESHOOTING.md#troubleshooting>`_):
+The report includes things like (`XLA Reference <https://github.com/pytorch/xla/blob/v2.5.0/TROUBLESHOOTING.md#troubleshooting>`_):
 
 * how many times we issue XLA compilations and time spent on issuing.
 * how many times we execute and time spent on execution
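For context, the `xla_debug` strategy named in this hunk can be selected through the Trainer. A sketch, assuming the registered strategy alias matches the docs (some Lightning versions expose the same thing as `XLAStrategy(debug=True)`):

```python
from lightning.pytorch import Trainer

# Select the debug variant of the XLA strategy so a metrics report is emitted.
trainer = Trainer(accelerator="tpu", devices=8, strategy="xla_debug", max_epochs=1)
```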
@@ -598,7 +598,7 @@ class DeepSpeedStrategy(DDPStrategy, _Sharded):
     ) -> Tuple["DeepSpeedEngine", Optimizer]:
         """Initialize one model and one optimizer with an optional learning rate scheduler.
 
-        This calls :func:`deepspeed.initialize` internally.
+        This calls ``deepspeed.initialize`` internally.
 
         """
         import deepspeed
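For context, the call the docstring refers to looks roughly like this; the toy model and the config values below are placeholders, not the strategy's actual defaults:

```python
import deepspeed
import torch.nn as nn

model = nn.Linear(32, 2)  # placeholder model
ds_config = {"train_micro_batch_size_per_gpu": 8, "zero_optimization": {"stage": 2}}

# deepspeed.initialize returns (engine, optimizer, dataloader, lr_scheduler);
# the strategy keeps the engine and the optimizer from this tuple.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```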
@@ -56,7 +56,7 @@ class XLAFSDPStrategy(ParallelStrategy, _Sharded):
 
     .. warning:: This is an :ref:`experimental <versioning:Experimental API>` feature.
 
-    For more information check out https://github.com/pytorch/xla/blob/master/docs/fsdp.md
+    For more information check out https://github.com/pytorch/xla/blob/v2.5.0/docs/fsdp.md
 
     Args:
         auto_wrap_policy: Same as ``auto_wrap_policy`` parameter in
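A sketch of constructing this strategy through Fabric. Passing a set of layer classes as ``auto_wrap_policy`` is an assumed convenience form; the canonical value is whatever callable ``XlaFullyShardedDataParallel`` accepts for its own ``auto_wrap_policy``:

```python
import torch.nn as nn
from lightning.fabric import Fabric
from lightning.fabric.strategies import XLAFSDPStrategy

# Wrap every nn.Linear submodule in its own FSDP unit (assumed convenience form).
strategy = XLAFSDPStrategy(auto_wrap_policy={nn.Linear})
fabric = Fabric(accelerator="tpu", devices=8, strategy=strategy)
```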
@@ -414,7 +414,7 @@ class DeepSpeedStrategy(DDPStrategy):
     ) -> Tuple["deepspeed.DeepSpeedEngine", Optimizer]:
         """Initialize one model and one optimizer with an optional learning rate scheduler.
 
-        This calls :func:`deepspeed.initialize` internally.
+        This calls ``deepspeed.initialize`` internally.
 
         """
         import deepspeed
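The Trainer-level counterpart of the same docstring; a sketch in which the device count, ZeRO stage, and precision are illustrative choices only:

```python
from lightning.pytorch import Trainer
from lightning.pytorch.strategies import DeepSpeedStrategy

# Configuring the strategy this way ends in the deepspeed.initialize call the
# docstring describes once the Trainer sets up the model and optimizer.
trainer = Trainer(
    accelerator="gpu",
    devices=2,
    strategy=DeepSpeedStrategy(stage=2),
    precision="16-mixed",
)
```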