* Set precision=16 when use_amp is passed as True (see the sketch below)
* Update CHANGELOG.md
* add use_amp to deprecated API
* Update trainer.py
* move the use_amp attribute to deprecated API
* move use_amp deprecation back to Trainer's __init__
* drop unused
* drop deprecated
* reorder imports
* typing
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
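A minimal sketch of the mapping these commits describe, assuming the deprecation shim sits alongside the Trainer's `__init__` (the function name and warning text below are illustrative):

```python
import warnings

def resolve_precision(use_amp=None, precision=32):
    # Deprecated path: a truthy `use_amp` is translated into
    # 16-bit precision; otherwise the given `precision` is kept.
    if use_amp is not None:
        warnings.warn("`use_amp` is deprecated, use `precision=16` instead.",
                      DeprecationWarning)
        precision = 16 if use_amp else 32
    return precision

assert resolve_precision(use_amp=True) == 16
assert resolve_precision() == 32
```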
* Make training_epoch_end behave like validation_epoch_end + minor fixes in docstrings (see the sketch below).
* Minor fixes (Borda's comments).
* Detach tensors in batch_output (to avoid possible memory leak) + doc fix.
Co-authored-by: Jean-Baptiste SCHIRATTI <jean-baptisteschiratti@MacBook-Pro-de-Jean-Baptiste.local>
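A sketch of the behavior these commits describe, assuming step outputs are dicts of tensors: each `training_step` output is detached before being collected for `training_epoch_end`, mirroring how validation outputs feed `validation_epoch_end`:

```python
import torch

def collect_for_epoch_end(batch_output):
    # Detach any tensors so the stored outputs do not keep the
    # autograd graph alive across the whole epoch (the memory
    # leak the commit above guards against).
    return {
        key: val.detach() if isinstance(val, torch.Tensor) else val
        for key, val in batch_output.items()
    }

outputs = [collect_for_epoch_end({'loss': torch.ones(1, requires_grad=True)})]
assert not outputs[0]['loss'].requires_grad
```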
* show progress bar depending on refresh_rate (see the sketch below)
* test that progress_bar_refresh_rate controls showing the bar
* remove show_progress_bar from other tests
* borda fixes
* flake8 fix
* changelog update prog bar refresh rate
* move show_progress_bar to deprecated 0.9 api
* rm show_progress_bar references, test deprecated
* Update pytorch_lightning/trainer/__init__.py
* fix test
* changelog
* minor CHANGELOG.md format
* Update pytorch_lightning/trainer/__init__.py
* Update pytorch_lightning/trainer/trainer.py
Co-authored-by: Gerard Bentley <gbkh2015@mymail.pomona.edu>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
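Illustrative usage of the replacement flag; the values are arbitrary examples:

```python
from pytorch_lightning import Trainer

# Redraw the progress bar every 20 batches.
trainer = Trainer(progress_bar_refresh_rate=20)

# A refresh rate of 0 shows no bar at all, replacing the
# deprecated show_progress_bar=False.
trainer = Trainer(progress_bar_refresh_rate=0)
```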
* fixed extra dataloader bug
* Update pytorch_lightning/trainer/training_loop.py
Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
* updated CHANGELOG
* Small non-repetition change
replaced `self.get_model()` with `model`, which was already defined
* Update CHANGELOG.md
* changed argument name to reload_train_dataloader_every_epoch
* fixed doc underline too short
* reverted to `reload_dataloaders_every_epoch` (see the sketch below)
* fixed val and test reloading
Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
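Illustrative usage of the flag after the revert above: with it enabled, the train (and, per the fixes, val and test) dataloaders are re-requested from the model's `*_dataloader()` hooks every epoch rather than once at the start of fitting:

```python
from pytorch_lightning import Trainer

# Rebuild dataloaders from the LightningModule hooks each epoch.
trainer = Trainer(reload_dataloaders_every_epoch=True)
```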
* fix RunningMean (see the sketch below)
* changelog
* fix none
* Update supporters.py
just needed to multiply by zero for init
* Revert "Update supporters.py"
This reverts commit 7e0da6c6
* fix NaN
* formatting
Co-authored-by: William Falcon <waf2107@columbia.edu>
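A minimal sketch of a fixed-window running mean consistent with the fixes above: the buffer is initialized with zeros and the mean is taken only over filled slots, so it is never NaN before the window fills (class and attribute names are illustrative, not the actual `supporters.py` code):

```python
import torch

class RunningMean:
    def __init__(self, window_length: int):
        self.memory = torch.zeros(window_length)  # zero init, never NaN
        self.window_length = window_length
        self.idx = 0      # next slot to overwrite
        self.filled = 0   # how many slots hold real values

    def append(self, value: torch.Tensor) -> None:
        self.memory[self.idx] = value
        self.idx = (self.idx + 1) % self.window_length
        self.filled = min(self.filled + 1, self.window_length)

    def mean(self):
        # Average only the populated slots; return None when empty
        # instead of a NaN from dividing by zero.
        return self.memory[:self.filled].mean() if self.filled else None
```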
* pylint
* model API
* update test
* formatting
* disable logger
* fix checking overwrite
* fix test
* typo
* deprecated model
* fix for DDP
* drop Flake8 in GH actions
* Update pytorch_lightning/trainer/evaluation_loop.py
* fix imports
Co-authored-by: Nic Eggert <nic@eggert.io>
* check for nan values
* test nan detection on loss
* sys.exit
* whitespace
* detect nan and inf values in loss and params (see the sketch below)
* update
* added documentation
* moved detect nan to training loop, remove flag for print
* blank line
* test
* rename
* deprecate print_nan_grads
* deprecated print_nan_grads
* remove unused imports
* update changelog
* fix line too long
* correct deprecated version
Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
* raise exception instead of sysexit
Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
* raise exception instead of sysexit
Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/trainer/training_tricks.py
Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/trainer/training_tricks.py
Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
* fix test
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
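A sketch of the detection described above, assuming it runs inside the training loop after each backward pass; it raises an exception (not `sys.exit`) when the loss or any parameter becomes NaN or inf:

```python
import torch

def detect_nan_tensors(loss: torch.Tensor, model: torch.nn.Module) -> None:
    # Check the loss first, then every parameter tensor.
    if not torch.isfinite(loss).all():
        raise ValueError('The loss returned in `training_step` is nan or inf.')
    for name, param in model.named_parameters():
        if not torch.isfinite(param).all():
            raise ValueError(f'Detected nan or inf values in parameter `{name}`.')
```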
* training_end renamed to training_step_end (see the sketch below)
* fix lost model reference
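A sketch of the renamed hook, omitting optimizer and data hooks for brevity: under dp/ddp2, `training_step` runs once per batch split and `training_step_end` (formerly `training_end`) reduces the gathered split outputs on the main process:

```python
import torch
from torch.nn import functional as F
import pytorch_lightning as pl

class Model(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        # Runs once per split of the batch under dp/ddp2.
        x, y = batch
        return {'logits': self.layer(x), 'target': y}

    def training_step_end(self, outputs):
        # Formerly `training_end`: receives the gathered split outputs
        # and computes the loss once on the main process.
        loss = F.cross_entropy(outputs['logits'], outputs['target'])
        return {'loss': loss}
```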
* enabled early stopping/checkpoint even without val step (see the sketch below)
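An illustrative configuration for the change above: without a validation step, early stopping and checkpointing can monitor a metric produced during training (the `loss` key and the Trainer arguments are era-specific assumptions):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint

early_stop = EarlyStopping(monitor='loss', patience=3)
checkpoint = ModelCheckpoint(monitor='loss')

trainer = Trainer(early_stop_callback=early_stop,
                  checkpoint_callback=checkpoint)
```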
* name formatting
* version
* testing
* add test
* fix test
* Update model_checkpoint.py
* doctests
* pylint
* tests
* debug
* enabled early stopping/checkpoint even without val step
* fix MNIST download (#1044)
* fix MNIST download
* simple
* name formatting
* version
* testing
* add test
* fix test
* doctests
* tests
* debug
* rebased 1041
* tests
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* consolidate callbacks and hooks
* ensure callbacks receive proper arg types
* remove model from init callback events
* clean up early stopping event
* update changelog
* remove on_fit_start and on_fit_end
* fix args for on_init_start and on_init_end
* handle case where early stopping is not used
* show all callback methods
* wrap checkpoint callback logic into proper class
* fix check for main process in checkpoint callback
* move callbacks test to separate file
* refactor arg checks
* get model and call hook on same line
* define trainer_options dict in one call
* add more asserts to callback test
* Add callback system + associated test
* Add trainer and pl_module args to callback methods (see the sketch below)
* typing
* typo in docstring
* Switch to on_.*_start()
* fix on_test_start
* fix the mess after rebasing
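A sketch of the consolidated callback interface described above: init hooks fire before a model exists and therefore only receive the trainer, while the remaining hooks receive both the trainer and the LightningModule:

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import Callback

class PrintingCallback(Callback):
    def on_init_start(self, trainer):
        # No model exists yet, so only the trainer is passed.
        print('trainer is initializing')

    def on_train_start(self, trainer, pl_module):
        print('training starts')

    def on_train_end(self, trainer, pl_module):
        print('training ends')

trainer = Trainer(callbacks=[PrintingCallback()])
```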
* added get dataloaders directly using a getter
* deleted decorator
* added prepare_data hook
* refactored dataloader init
* added dataloader reset flag and main loop
* made changes
* fixed bad loaders
* fixed error in .fit with loaders (see the sketch below)
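Illustrative usage of the fixed code path, assuming `model` is any LightningModule: dataloaders are passed directly to `.fit()` instead of being defined on the model (the tensor data below is a stand-in):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning import Trainer

train_loader = DataLoader(
    TensorDataset(torch.randn(64, 32), torch.zeros(64, dtype=torch.long)),
    batch_size=32)
val_loader = DataLoader(
    TensorDataset(torch.randn(16, 32), torch.zeros(16, dtype=torch.long)),
    batch_size=16)

trainer = Trainer()
trainer.fit(model, train_dataloader=train_loader, val_dataloaders=val_loader)
```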
* fixes #909
* bug fix
* Fixes #902
* Added max number of steps in Trainer (see the sketch below)
* Added docstring
* Fix flake8 errors
* Clarified docstrings
* Fixed flake8 error
* Added min_steps to Trainer
* Added steps and epochs test
* flake8
* minor fix
* fix steps test in test_trainer
* Split steps test into 2 tests
* Refactor steps test
* Update test_trainer.py
* Minor in test_trainer.py
* Update test_trainer.py
* Address PR comments
* Minor
Co-authored-by: William Falcon <waf2107@columbia.edu>
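Illustrative usage of the step-based limits added above; both values are arbitrary examples:

```python
from pytorch_lightning import Trainer

# Stop after at most 1000 optimizer steps, but take at least 100
# even if an epoch-based condition would end training earlier.
trainer = Trainer(max_steps=1000, min_steps=100)
```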
* added tpu docs
* added tpu flags
* add tpu docs + init training call
* amp
* optimizer step
* added auto data transfer to TPU
* fix test pkg create (#873)
* added test return and print
* Update pytorch_lightning/trainer/trainer.py
Co-Authored-By: Luis Capelo <luiscape@gmail.com>
* Fix segmentation example (#876)
* removed torchvision model and added custom model
* minor fix
* Fixed relative imports issue
* Fix/typo (#880)
* Update greetings.yml
* Changelog (#869)
* Create CHANGELOG.md
* Update CHANGELOG.md
* Update CHANGELOG.md
* Update PULL_REQUEST_TEMPLATE.md
* Update PULL_REQUEST_TEMPLATE.md
* Add PR links to Version 0.6.0 in CHANGELOG.md
* Add PR links for Unreleased in CHANGELOG.md
* Update PULL_REQUEST_TEMPLATE.md
* Fixing Function Signatures (#871)
* added tpu docs
* added tpu flags
* add tpu docs + init training call
* amp
* optimizer step
* added auto data transfer to TPU (see the sketch below)
* added test return and print
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Luis Capelo <luiscape@gmail.com>
Co-authored-by: Akshay Kulkarni <akshayk.vnit@gmail.com>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Shikhar Chauhan <xssChauhan@users.noreply.github.com>
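Illustrative usage of the TPU support added above; the flag name is era-specific, and per the auto data transfer commits the trainer moves batches to the XLA device itself:

```python
from pytorch_lightning import Trainer

# Train on 8 TPU cores; data transfer to the TPU device is handled
# automatically by the trainer.
trainer = Trainer(num_tpu_cores=8)
```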
* initial implementation
* formatting, pass through profiler, docstring
* call profiler during training
* add initial tests
* report stats when training is done
* fix formatting
* error handling, bugfix in PassThroughProfiler
* finish documenting profiler arg in Trainer
* relax required precision for profiling tests
* option to dump cProfiler results to text file
* use logging, format with black
* include profiler in docs
* improved logging and better docs
* appease the linter
* better summaries, wrapper for iterables
* fix typo
* allow profiler=True creation (see the sketch below)
* more documentation
* add tests for advanced profiler
* Update trainer.py
* make profilers accessible in pl.utilities
* reorg profiler files
* change import for profiler tests
Co-authored-by: William Falcon <waf2107@columbia.edu>
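Illustrative usage of the profiler work above; the import path follows the file reorg in these commits, and the output filename is an example:

```python
from pytorch_lightning import Trainer
from pytorch_lightning.profiler import AdvancedProfiler

# profiler=True enables the default simple profiler, which reports
# stats on the standard training hooks when training is done.
trainer = Trainer(profiler=True)

# The advanced profiler wraps cProfile and can dump its results
# to a text file.
profiler = AdvancedProfiler(output_filename='profile_report.txt')
trainer = Trainer(profiler=profiler)
```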