added val loop options
@@ -0,0 +1,31 @@
---

#### Display metrics in progress bar

``` {.python}
# DEFAULT
trainer = Trainer(progress_bar=True)
```

---

#### Print which gradients are NaN

This option prints a list of tensors with NaN gradients.

``` {.python}
# DEFAULT
trainer = Trainer(print_nan_grads=False)
```
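
For intuition, here is a minimal sketch of the kind of check this option performs, written in plain PyTorch rather than Lightning's internal code; the helper name is made up for illustration.

``` {.python}
import torch

def find_nan_grads(model):
    # Illustrative helper (not Lightning's implementation): return the names
    # of parameters whose gradient currently contains NaN values.
    return [
        name
        for name, p in model.named_parameters()
        if p.grad is not None and torch.isnan(p.grad).any()
    ]
```

Something like this could be called right after `loss.backward()` to see exactly which layers are producing NaNs.
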
---

#### Process position

When running multiple models on the same machine, we need to decide which progress bar each one uses.

Lightning will stack progress bars according to this value.

``` {.python}
# DEFAULT
trainer = Trainer(process_position=0)

# if this is the second model on the node, show its progress bar below the first
trainer = Trainer(process_position=1)
```

@@ -0,0 +1,48 @@
These flags are useful to help debug a model.

---

#### Fast dev run

This flag is meant for debugging the full train/val/test loop. It activates callbacks and everything else as usual, but runs only 1 training batch and 1 validation batch.

Use this to debug a full run of your program quickly.

``` {.python}
# DEFAULT
trainer = Trainer(fast_dev_run=False)
```

---

#### Inspect gradient norms

Looking at grad norms can help you figure out where training might be going wrong.

``` {.python}
# DEFAULT (-1 doesn't track norms)
trainer = Trainer(track_grad_norm=-1)

# track the Lp norm (here p=2)
trainer = Trainer(track_grad_norm=2)
```
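
Conceptually, the tracked quantity is the p-norm of each parameter's gradient. A minimal illustrative sketch in plain PyTorch (not Lightning's internal implementation):

``` {.python}
def grad_norms(model, norm_type=2):
    # Illustrative only: compute the p-norm of every parameter's gradient.
    return {
        name: p.grad.norm(norm_type).item()
        for name, p in model.named_parameters()
        if p.grad is not None
    }
```
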
---

#### Make model overfit on subset of data

A useful debugging trick is to make your model overfit a tiny fraction of the data.

``` {.python}
# DEFAULT: don't overfit (i.e. normal training)
trainer = Trainer(overfit_pct=0.0)

# overfit on 1% of the data
trainer = Trainer(overfit_pct=0.01)
```
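
Conceptually, this is equivalent to training on a small random subset of the dataset. A minimal illustrative sketch with plain PyTorch (the helper name and fraction are placeholders, not the Lightning API):

``` {.python}
import torch
from torch.utils.data import Subset

def tiny_subset(dataset, pct=0.01):
    # Illustrative only: keep a random `pct` fraction of the dataset.
    n = max(1, int(len(dataset) * pct))
    indices = torch.randperm(len(dataset))[:n].tolist()
    return Subset(dataset, indices)
```
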
---

#### Print the parameter count by layer

By default Lightning prints a list of parameters *and submodules* when it starts training.
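
A summary of this kind can be reproduced with plain PyTorch; a minimal sketch (the helper name is hypothetical, not part of the Lightning API):

``` {.python}
def parameter_counts(model):
    # Illustrative only: number of trainable parameters per top-level submodule.
    return {
        name: sum(p.numel() for p in module.parameters() if p.requires_grad)
        for name, module in model.named_children()
    }
```
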
---

#### Print which gradients are NaN

This option prints a list of tensors with NaN gradients.

``` {.python}
# DEFAULT
trainer = Trainer(print_nan_grads=False)
```

---

#### Log GPU usage

Lightning automatically logs GPU usage to the test tube logs. It only does this at the metric logging interval, so it doesn't slow down training.
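
For reference, memory stats of the kind being logged can be queried with plain PyTorch; a minimal illustrative sketch (not Lightning's internal code):

``` {.python}
import torch

def gpu_memory_mb():
    # Illustrative only: memory currently allocated by PyTorch
    # on each visible GPU, in MB.
    return {
        f"gpu_{i}": torch.cuda.memory_allocated(i) / 1024 ** 2
        for i in range(torch.cuda.device_count())
    }
```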