docs: fix removed ref to `deepspeed.initialize` (#20353)

* docs: fix removed ref to `deepspeed.initialize`

* fix links
Jirka Borovec 2024-10-21 15:47:30 +02:00 committed by GitHub
parent af19dda05c
commit 0e1e14f815
6 changed files with 8 additions and 8 deletions

View File

@@ -52,7 +52,7 @@ Example:
model = WeightSharingModule()
trainer = Trainer(max_epochs=1, accelerator="tpu")
-See `XLA Documentation <https://github.com/pytorch/xla/blob/master/TROUBLESHOOTING.md#xla-tensor-quirks>`_
+See `XLA Documentation <https://github.com/pytorch/xla/blob/v2.5.0/TROUBLESHOOTING.md#xla-tensor-quirks>`_
----
@@ -61,4 +61,4 @@ XLA
XLA is the library that interfaces PyTorch with the TPUs.
For more information check out `XLA <https://github.com/pytorch/xla>`_.
-Guide for `troubleshooting XLA <https://github.com/pytorch/xla/blob/master/TROUBLESHOOTING.md>`_
+Guide for `troubleshooting XLA <https://github.com/pytorch/xla/blob/v2.5.0/TROUBLESHOOTING.md>`_

View File

@@ -108,7 +108,7 @@ There are cases in which training on TPUs is slower when compared with GPUs, for
- XLA Graph compilation during the initial steps `Reference <https://github.com/pytorch/xla/issues/2383#issuecomment-666519998>`_
- Some tensor ops are not fully supported on TPU, or not supported at all. These operations will be performed on CPU (context switch).
-The official PyTorch XLA `performance guide <https://github.com/pytorch/xla/blob/master/TROUBLESHOOTING.md#known-performance-caveats>`_
+The official PyTorch XLA `performance guide <https://github.com/pytorch/xla/blob/v2.5.0/TROUBLESHOOTING.md#known-performance-caveats>`_
has more detailed information on how PyTorch code can be optimized for TPU. In particular, the
-`metrics report <https://github.com/pytorch/xla/blob/master/TROUBLESHOOTING.md#get-a-metrics-report>`_ allows
+`metrics report <https://github.com/pytorch/xla/blob/v2.5.0/TROUBLESHOOTING.md#get-a-metrics-report>`_ allows
one to identify operations that lead to context switching.

View File

@@ -78,7 +78,7 @@ A lot of PyTorch operations aren't lowered to XLA, which could lead to significa
These operations are moved to the CPU memory and evaluated, and then the results are transferred back to the XLA device(s).
By using the `xla_debug` Strategy, users could create a metrics report to diagnose issues.
-The report includes things like (`XLA Reference <https://github.com/pytorch/xla/blob/master/TROUBLESHOOTING.md#troubleshooting>`_):
+The report includes things like (`XLA Reference <https://github.com/pytorch/xla/blob/v2.5.0/TROUBLESHOOTING.md#troubleshooting>`_):
* how many times we issue XLA compilations and time spent on issuing.
* how many times we execute and time spent on execution

View File

@@ -598,7 +598,7 @@ class DeepSpeedStrategy(DDPStrategy, _Sharded):
) -> Tuple["DeepSpeedEngine", Optimizer]:
"""Initialize one model and one optimizer with an optional learning rate scheduler.
-This calls :func:`deepspeed.initialize` internally.
+This calls ``deepspeed.initialize`` internally.
"""
import deepspeed

View File

@@ -56,7 +56,7 @@ class XLAFSDPStrategy(ParallelStrategy, _Sharded):
.. warning:: This is an :ref:`experimental <versioning:Experimental API>` feature.
-For more information check out https://github.com/pytorch/xla/blob/master/docs/fsdp.md
+For more information check out https://github.com/pytorch/xla/blob/v2.5.0/docs/fsdp.md
Args:
auto_wrap_policy: Same as ``auto_wrap_policy`` parameter in

View File

@@ -414,7 +414,7 @@ class DeepSpeedStrategy(DDPStrategy):
) -> Tuple["deepspeed.DeepSpeedEngine", Optimizer]:
"""Initialize one model and one optimizer with an optional learning rate scheduler.
-This calls :func:`deepspeed.initialize` internally.
+This calls ``deepspeed.initialize`` internally.
"""
import deepspeed