From 0192f0ce403f1d62414c15c54b91392da5b7f0b2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Adrian=20W=C3=A4lchli?=
Date: Mon, 11 Jan 2021 14:12:38 +0100
Subject: [PATCH] Add a performance section to TPU docs to address FAQ (#5445)

* header

* update docs

* punctuation

* adding another note

* some more notes

* Update docs/source/tpu.rst

Co-authored-by: Rohit Gupta

* punctuation

Co-authored-by: Lezwon Castelino
Co-authored-by: Rohit Gupta
Co-authored-by: chaton
---
 docs/source/tpu.rst | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/docs/source/tpu.rst b/docs/source/tpu.rst
index 5f4c48076d..549a3a1cd2 100644
--- a/docs/source/tpu.rst
+++ b/docs/source/tpu.rst
@@ -40,7 +40,7 @@ To access TPUs, there are three main ways.
 ----------------
 
 Colab TPUs
------------
+----------
 Colab is like a jupyter notebook with a free GPU or TPU
 hosted on GCP.
 
@@ -129,8 +129,7 @@ That's it! Your model will train on all 8 TPU cores.
 ----------------
 
 TPU core training
-
-------------------------
+-----------------
 
 Lightning supports training on a single TPU core or 8 TPU cores.
 
@@ -177,7 +176,7 @@ on how to set up the instance groups and VMs needed to run TPU Pods.
 ----------------
 
 16 bit precision
------------------
+----------------
 Lightning also supports training in 16-bit precision with TPUs.
 By default, TPU training will use 32-bit precision. To enable 16-bit,
 set the 16-bit flag.
@@ -194,6 +193,28 @@ Under the hood the xla library will use the `bfloat16 type
 `_
+- XLA Graph compilation during the initial steps `Reference `_
+- Some tensor ops are not fully supported on TPU, or not supported at all. These operations will be performed on CPU (context switch).
+- PyTorch integration is still experimental. Some performance bottlenecks may simply be the result of unfinished implementation.
+
+The official PyTorch XLA `performance guide `_
+has more detailed information on how PyTorch code can be optimized for TPU. In particular, the
+`metrics report `_ allows
+one to identify operations that lead to context switching.
+
+
 About XLA
 ----------
 XLA is the library that interfaces PyTorch with the TPUs.
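
For readers landing on this patch from the FAQ, the two knobs the new text refers to are the Trainer's ``precision=16`` flag and the XLA metrics report from ``torch_xla.debug.metrics``. Below is a minimal sketch of how they fit together; it assumes a TPU runtime with ``torch_xla`` installed and a Lightning release from roughly the same era as this patch, and the ``LitModel`` module plus the random dataset are illustrative placeholders rather than anything taken from the docs.

.. code-block:: python

    # Sketch only: requires a TPU runtime (e.g. Colab or GCP) with torch_xla
    # installed and pytorch-lightning ~1.1. ``LitModel`` and the random
    # dataset below are illustrative placeholders, not part of the patch.
    import torch
    from torch.utils.data import DataLoader, TensorDataset
    import pytorch_lightning as pl


    class LitModel(pl.LightningModule):
        def __init__(self):
            super().__init__()
            self.layer = torch.nn.Linear(32, 2)

        def forward(self, x):
            return self.layer(x)

        def training_step(self, batch, batch_idx):
            x, y = batch
            return torch.nn.functional.cross_entropy(self(x), y)

        def configure_optimizers(self):
            return torch.optim.SGD(self.parameters(), lr=0.1)

        def on_train_end(self):
            # Runs inside the TPU process: print the XLA metrics report to spot
            # ops that fell back to CPU (the "context switches" the docs mention).
            import torch_xla.debug.metrics as met
            print(met.metrics_report())


    if __name__ == "__main__":
        data = TensorDataset(torch.randn(256, 32), torch.randint(0, 2, (256,)))
        loader = DataLoader(data, batch_size=32)

        # tpu_cores=8 trains on all 8 cores; precision=16 selects the 16-bit
        # (bfloat16 under XLA) path described in the patched section.
        trainer = pl.Trainer(tpu_cores=8, precision=16, max_epochs=1)
        trainer.fit(LitModel(), loader)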