Commit Graph

1874 Commits

Author SHA1 Message Date
William Falcon f86dd55145
fixes tpu data loader bug (#957)
* fixes tpu data loader bug

* fixes tpu data loader bug
2020-02-26 19:29:03 -05:00
Ethan Harris b2e9607362
Refactor dataloading (#955)
* Refactor dataloading

* Refactor dataloading

* Refactor dataloading

* Add shuffle to test
2020-02-26 16:55:18 -05:00
Hadrien Mary be244560b2
Callbacks [wip] (#889)
* Add callback system + associated test

* Add trainer and pl_module args to callback methods

* typing

* typo in docstring

* Switch to on_.*_start()

* fix on_test_start

* fix the mess after rebasing
2020-02-25 23:17:27 -05:00
William Falcon 96b058c5fa
added docs (#944) 2020-02-25 15:05:56 -05:00
Ir1dXD be83e7515b
feat(trainer): add enable_benchmarking option (#803)
* feat(trainer): add enable_benchmarking option

closes #370

* Update trainer.py

* add test

* try to make the lint work

* fix typo

* add test, verify torch.backends.cudnn.benchmark

* make lint happy

* make lint happy

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-02-25 15:05:41 -05:00
Ethan Harris a5f159b2c7
Add support for multiple loggers (#903)
* Add support for multiple loggers

* Fix PEP

* Cleanup

* Cleanup

* Add typing to loggers

* Update base.py

* Replace duck typing with isinstance check

* Update CHANGELOG.md

* Update comet experiment type, Switch to abstractmethod in logging.py

* Fix test

* Add passes to LightningLoggerBase

* Update experiment_logging.rst
2020-02-25 14:52:39 -05:00
William Falcon 5d89fed2a6
use log no print (#940) 2020-02-25 13:06:48 -05:00
Jirka Borovec 5dd2afeab1
Fixing tests (#936)
* abs import

* rename test model

* update trainer

* revert test_step check

* move tags

* fix test_step

* clean tests

* fix template

* update dataset path

* fix parent order
2020-02-25 13:06:24 -05:00
Adrian Wälchli 20d15c8023
relax hparams (#919)
relax model loading hparams


test wip


wip


fix warning


finish test


remove unused import
2020-02-25 10:36:44 -05:00
baeseongsu 932770771b
add epoch option (#933) 2020-02-25 09:46:01 -05:00
Chirag Raman 4d36e76cbc
Update tests README to point to tests/requirements.txt (#935)
* Update tests README

Point to tests/requirements.txt as part of instructions

* Update `requirements` to `dependencies`
2020-02-25 09:45:34 -05:00
Donal Byrne 9854084136
Caching MNIST dataset for testing (#917)
* Caching MNIST dataset for testing

* Added MNIST dataset to the tests directory

* Caches dataset based off hash of the test.pt file

* Cleaned Up yml file

* Cleaned Up yml file

* Removed MNIST Data from framework

* Set cache key for dataset to 'mnist'

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-02-25 09:20:41 -05:00
William Falcon ceec51d96c
fix tests (#938)
* fix tests

* fix tests
2020-02-25 08:53:33 -05:00
Matt Painter 6b667b1237
Fix/test pass overrides (#918)
* Fix test requiring both test_step and test_end

* Add test

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-02-24 22:33:11 -05:00
William Falcon 2b5293ddfc
Tpu features (#932)
* added guide

* added self.print()

* added self.print()
2020-02-24 22:30:53 -05:00
William Falcon 1015a00506
Clean up dataloader logic (#926)
* added get dataloaders directly using a getter

* deleted decorator

* added prepare_data hook

* refactored dataloader init

* refactored dataloader init

* added dataloader reset flag and main loop

* added dataloader reset flag and main loop

* added dataloader reset flag and main loop

* made changes

* fixed bad loaders

* fixed error in .fit with loaders

* fixes #909

* fixes #909

* bug fix

* Fixes #902
2020-02-24 22:23:25 -05:00
Adrian Wälchli c56ee8bdee
Update docs for map_location (#920)
* update docs for map location

* update return description
2020-02-23 15:01:08 -05:00
srush 5778a4131c
Add tags to the rendezvous calls for TPU. (#921)
* Update data_loading.py

* Update training_io.py

* Update trainer.py
2020-02-23 15:00:32 -05:00
Hadrien Mary 89d5772f55
Split callbacks (#849)
* add .vscode in .gitignore

* Split callbacks in individual files + add a  property to Callback for easy trainer instance access

* formatting

* Add a conda env file for quick and easy env setup to develop on PL

* Address comments

* add fix to kth_best_model

* add some typing to callbacks

* fix typo

* add autopep8 config to pyproject.toml

* format again

* format

* fix toml

* fix toml again

* consistent max line length in all config files

* remove conda env file

* Update pytorch_lightning/callbacks/early_stopping.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* docstring

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* fix logic error

* format

* simplify if/else

* format

* fix linting issue in changelog

* edit changelog about new callback mechanism

* fix remaining formatting issue on CHANGELOG

* remove lambda function because it's incompatible with pickle (used during ddp)

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-02-22 21:45:34 -05:00
Adrian Wälchli da2f11a9c4
Type Hints for Trainer (#912)
* typehints for trainer 

fix type links in docs


fix types in docs


type hints for trainer methods


fix fit docs


switch to comments


readability


added sphinx typehints extension


wip


remove typehints from docstring


more type annotations


fix spaces

* Update trainer.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-02-22 21:23:30 -05:00
Jeremy Jordan e05586c4b2
extract training teardown into method, catch KeyboardInterrupt (#856)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-02-22 17:06:48 -05:00
William Falcon 446a1e23d7
Update training_loop.py (#913) 2020-02-22 05:15:36 -05:00
fdelrio89 4ac9925dad
Fix comet logger to log after train (#892)
* Fix comet logger to log after train

* Add clarifying comment to CometLogger code

Explains the need to use CometExistingExperiment in the CometLogger class after
CometLogger.finalize.
2020-02-21 20:47:48 -05:00
William Falcon c00a8a10dd
finished dist (#911) 2020-02-21 20:39:12 -05:00
Matt Painter 6e7dc9c236
Fixes resuming checkpoints rerunning last epoch (#866)
* Properly restore current epoch and global step on resume

* Add test

* Move increment to saving rather than loading

* Fix other tests that refer to current epoch

* Formatting

* Add warning for mid-epoch resuming

* Formatting

* Fix warning check for accumulated batches

* Add variable to init

* Formatting

* Add check for 0 training steps

* Make check more readable
2020-02-21 20:27:19 -05:00
Jirka Borovec 2b5458e852
add Sphinx Check (#844)
* add sphinx bot

* source

* typo

* Make a change to the docs (#2)

* Make a change to the docs
* Introduce an error
* Install git before building docs
* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update docs/source/apex.rst

* Update docs/source/apex.rst

Co-authored-by: Ammar Askar <ammar_askar@hotmail.com>
2020-02-21 15:55:03 -05:00
Hadrien Mary 5c5a241e01
Add conda env setup (#898)
* add a conda env file for easy PL conda env setup

* Update environment.yml

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update environment.yml

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update environment.yml

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update environment.yml

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update environment.yml

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update environment.yml

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-02-21 15:31:40 -05:00
Jirka Borovec f7e9700aae
update Loggers (#818)
* add warnings

* fix link
2020-02-21 13:39:37 -05:00
Aljoscha Steffens 9eb1907151
separate requirements for logger dependencies (#792)
* added file that contains information on the minimal versions needed for the supported loggers

* copied minimal version, combined files, deleted duplicates

* sorted functions in tests/test_loggers.py to be consistent

* expanded wandb logging test; added minimal versions for requirements-extra.txt; increased the amount of training data that is used for tests

* formatting

* added requirements-extra.txt to MANIFEST.in

* reverted wandb test; ensured minimal version for dependencies in requirements-extra.txt in ci-testing.yml
2020-02-21 13:30:27 -05:00
Tullie Murrell 897def2cac
Fix backwards compatibility for optional logging dependencies (#900) 2020-02-21 13:18:27 -05:00
Jirka Borovec b933b23d5c
add Stale action (#905) 2020-02-21 11:46:42 -05:00
Jirka Borovec 56dddf9708
update CHANGELOG (#897)
add info about TPU and segmentation
2020-02-19 09:08:43 -05:00
Jirka Borovec b5e9fd0b2c
typo JB
typo in my name lol
2020-02-19 14:56:46 +01:00
William Falcon b1040523b2
update contributors (#895)
* updated governance docs

* added maintainers to readme

* added governance docs

* added governance docs
2020-02-19 07:49:22 -05:00
Jeremy Jordan ea8878bc14
clean up tests/test_profiler.py (#867)
* cleanup docstrings, _get_total_cprofile_duration in module

* relax profiler overhead tolerance
2020-02-19 07:09:28 -05:00
Nicki Skafte c58aab0b00
remove deprecated args to learning rate step function (#890) 2020-02-19 06:37:35 -05:00
William Falcon c4b0693a4d
update governance docs (#894)
* updated governance docs

* added maintainers to readme

* added governance docs
2020-02-19 06:26:23 -05:00
Nicki Skafte ffd6e693de
new way of passing dataloaders (#759)
* new way of passing dataloaders

* fixed docs

* fixed codestyle to follow flake8

* allow val/test to be a list of dataloaders and smarter checking

* added test

* fix flake error

* fix linking to new test model

* split into multiple test

* fix naming and typo

* minor documentation changes

* remove random file

* Update trainer.py

* better error/warning message

* final adjustments

* update CHANGELOG.md

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-02-19 06:00:08 -05:00
Luis Capelo b9b5a93f0f
Updates theme Sphinx configuration (#893)
I am updating the project's Sphinx documentation to fix (#819). The issue is caused by RequireJS, a library that the Sphinx extension `nbsphinx` (used to load Jupyter Notebooks) loads into the docs context. That library conflicts with other theme libraries and prevents them from loading. This resulted in several crashes, the most obvious being the lack of anchors.

The fix above solves all errors -- and now anchors work.
2020-02-19 04:48:33 -05:00
Vadim Bereznyuk dfbb50cd6a
Fix docs for early stopping (#865)
* updated docs

* updated docs

* upd
2020-02-18 11:25:39 -05:00
Peter Izsak 054a35312d
Added max number of steps in Trainer (#728)
* Added max number of steps in Trainer

* Added docstring

* Fix flake8 errors

* Clarified docstrings

* Fixed flake8 error

* Added min_steps to Trainer

* Added steps and epochs test

* flake8

* minor fix

* fix steps test in test_trainer

* Split steps test into 2 tests

* Refactor steps test

* Update test_trainer.py

* Minor in test_trainer.py

* Update test_trainer.py

* Address PR comments

* Minor

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-02-18 11:23:22 -05:00
William Falcon 9571de8757
fix tpu docs (#886) 2020-02-17 17:52:42 -05:00
William Falcon 3562aa5aae fix tpu transfer bug 2 2020-02-17 17:47:16 -05:00
William Falcon 919a26fe41 fix tpu transfer bug 2020-02-17 17:46:46 -05:00
William Falcon d4a31f02e0
Enable TPU support (#868)
* added tpu docs

* added tpu flags

* add tpu docs + init training call

* amp

* optimizer step

* added auto data transfer to TPU

* fix test pkg create (#873)

* added auto data transfer to TPU

* added auto data transfer to TPU

* added auto data transfer to TPU

* added test return and print

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Luis Capelo <luiscape@gmail.com>

* Fix segmentation example (#876)

* removed torchvision model and added custom model

* minor fix

* Fixed relative imports issue

* Fix/typo (#880)

* Update greetings.yml

* Update greetings.yml

* Changelog (#869)

* Create CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update PULL_REQUEST_TEMPLATE.md

* Update PULL_REQUEST_TEMPLATE.md

* Add PR links to Version 0.6.0 in CHANGELOG.md

* Add PR links for Unreleased in CHANGELOG.md

* Update PULL_REQUEST_TEMPLATE.md

* Fixing Function Signatures (#871)

* added tpu docs

* added tpu flags

* add tpu docs + init training call

* amp

* optimizer step

* added auto data transfer to TPU

* added test return and print

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Luis Capelo <luiscape@gmail.com>
Co-authored-by: Akshay Kulkarni <akshayk.vnit@gmail.com>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Shikhar Chauhan <xssChauhan@users.noreply.github.com>
2020-02-17 16:01:20 -05:00
Akshay Kulkarni e38b18e9eb
updated fast training docs with latest usage (#884) 2020-02-17 15:47:07 -05:00
Akshay Kulkarni 0ad3e8b8e9
changed to absolute imports and added docs (#881) 2020-02-17 11:05:59 -05:00
Shikhar Chauhan f44dfb3e7a
Fixing Function Signatures (#871) 2020-02-17 08:10:10 -05:00
Ethan Harris a33beb6ebf
Changelog (#869)
* Create CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update PULL_REQUEST_TEMPLATE.md

* Update PULL_REQUEST_TEMPLATE.md

* Add PR links to Version 0.6.0 in CHANGELOG.md

* Add PR links for Unreleased in CHANGELOG.md

* Update PULL_REQUEST_TEMPLATE.md
2020-02-17 08:09:11 -05:00
Ethan Harris 93e8ad1aa7
Fix/typo (#880)
* Update greetings.yml

* Update greetings.yml
2020-02-17 08:04:16 -05:00