Commit Graph

1500 Commits

Author SHA1 Message Date
William Falcon ef98931d18 flake8 2019-10-05 16:56:24 -04:00
William Falcon a59f351ef8 updated readme 2019-10-05 16:52:58 -04:00
William Falcon 5e41159b16 updated readme 2019-10-05 16:47:31 -04:00
William Falcon 07c5d22ae3 cleaning up demos (#313)
* cleaning up demos
2019-10-05 16:39:05 -04:00
William Falcon f7d762416c cleaning up demos (#312)
* cleaning up demos

* Update job_submit.sh

* Update README.md
2019-10-05 14:48:22 -04:00
William Falcon cdfcb01073 Fixes #234 (#311)
* Fixes #234

* default logger version is now slurm job id

* default logger version is now slurm job id
2019-10-05 14:45:37 -04:00
William Falcon ed86bf96c5 cleaned up demos 2019-10-05 14:30:12 -04:00
William Falcon 8c2adf6250 cleaned up demos 2019-10-05 14:28:08 -04:00
William Falcon e739c79819 cleaned up demos 2019-10-05 14:21:12 -04:00
William Falcon 94f89e8e10 cleaned up demos 2019-10-05 14:15:09 -04:00
William Falcon 4d3a8c25d2 cleaned up demos 2019-10-05 14:13:55 -04:00
William Falcon 9fc01e3fd3 cleaned up demos 2019-10-05 14:13:32 -04:00
William Falcon c86524b0cc Update single_gpu_node_ddp_template.py 2019-10-05 13:55:05 -04:00
William Falcon 0e2b0e39b5 Update single_gpu_node_16bit_template.py 2019-10-05 13:54:07 -04:00
William Falcon d03d7a2440 Update single_cpu_template.py 2019-10-05 13:52:25 -04:00
William Falcon cdc6e6a4bb Update .run_local_tests.sh 2019-10-05 13:47:33 -04:00
William Falcon 6cc3f1757f decouple returns from each step (#307)
* decoupled training metrics from logging metrics

* decoupled validation metrics from log metrics

* updated docs

* updated docs

* updated docs

* Fixed test

* merged master
2019-10-05 13:35:20 -04:00
William Falcon 8f5a06bfb8 Gpu mem (#308)
* Fixes #289

* Fixes #289

* added lbfgs support

* Fixes #280 (#309)

* added test seeds (#306)

* added test seeds

* added test seeds

* updated docs

* added lbfgs support (#310)

* added lbfgs support

* added lbfgs support

* added lbfgs support

* Fixes #280 (#309)

* added test seeds (#306)

* added test seeds

* added test seeds

* updated docs

* added lbfgs support

* Fixes #289

* Fixes #289

* merged master

* merged master
2019-10-05 11:29:34 -04:00
William Falcon 75fd89106f added lbfgs support (#310)
* added lbfgs support

* added lbfgs support

* added lbfgs support

* Fixes #280 (#309)

* added test seeds (#306)

* added test seeds

* added test seeds

* updated docs

* added lbfgs support
2019-10-05 11:10:21 -04:00
William Falcon c9786cdef1 added test seeds (#306)
* added test seeds

* added test seeds

* updated docs
2019-10-05 10:56:52 -04:00
William Falcon 2ac9f1aea7 Fixes #280 (#309) 2019-10-05 10:55:50 -04:00
William Falcon 967957e55c added lbfgs support 2019-10-05 10:47:18 -04:00
William Falcon bf09060fef Fixes #292 (#303)
* early stopping callback is not default

* added a default logger

* added default checkpoint callback

* added default checkpoint/loggers

* added default checkpoint/loggers

* updated docs

* cleaned demos

* cleaned demos

* cleaned demos

* clean up docs around loggers
2019-10-04 19:48:57 -04:00
William Falcon a578de511d clean up docs around loggers (#304) 2019-10-04 18:53:38 -04:00
William Falcon a8ccb88163 Merge branch 'master' of https://github.com/williamFalcon/pytorch-lightning 2019-10-04 17:32:59 -04:00
William Falcon 033be9e9b4 tests fix 2019-10-04 17:32:52 -04:00
William Falcon 9ffd64bd60 Gpu load (#302)
* Update root_module.py

* Update root_module.py

* Update root_module.py

* tests fix

* tests fix

* tests fix
2019-10-04 17:21:11 -04:00
William Falcon 3a3ac73963 Merge branch 'master' of https://github.com/williamFalcon/pytorch-lightning 2019-10-04 16:56:05 -04:00
William Falcon cf07c153e9 tests fix 2019-10-04 16:55:51 -04:00
William Falcon a60a24d11b disable auto gpu loading when restoring weights to avoid OOM (#242)
* Update root_module.py

* Update root_module.py

* Update root_module.py

* tests fix

* tests fix
2019-10-04 16:18:43 -04:00
William Falcon af1456a051 Update .run_local_tests.sh 2019-10-04 15:58:54 -04:00
William Falcon 73a7cf3c99 Mem crash (#299)
* fixes memory crash

* fixes memory crash
2019-10-04 15:53:44 -04:00
Hendrik Schröter 36f0b5bbd0 Use getter instead of python property for the dataloaders (#275)
* Use getter instead of python property for the dataloaders

* Fix lint

* Update trainer.py
2019-10-04 15:35:02 -04:00
William Falcon 32e74b8f36 Ddp2 (#261)
* adds ddp2 option where on each node a single process uses all gpus

* added ddp2 test

* added ddp2 docs

* Update Distributed training.md

* delete ref to old update_training_log_metrics

* delete ref to old update_training_log_metrics

* debug

* banana pancakes

* debug

* cheesecake
2019-10-04 15:07:54 -04:00
William Falcon 2d335c664c Update multi_node_cluster_auto_slurm.py 2019-10-04 13:07:22 -04:00
William Falcon 5fdfad5766 Update multi_node_cluster_auto_slurm.py 2019-10-04 13:05:52 -04:00
Hendrik Schröter 42764d18c7 Better error message if no loss was returned from model.training_step() (#294) 2019-10-04 07:15:19 -04:00
Wouter van Amsterdam 63c475c600 tiny spelling error (#295) 2019-10-04 07:14:30 -04:00
kvhooreb 41236c7bbb WIP: Moved grad_norm tracking code to __run_tng_batch (#278)
* Moved grad_norm tracking code to __run_tng_batch + added norms to tqdm_metrics

* Update trainer.py

* Update trainer.py

* Update trainer.py

* Update trainer.py
2019-10-02 11:11:08 -04:00
Nic Eggert 614cb3c03b Initialize loggers only once (#270)
* Create underlying loggers lazily

This avoids creating duplicate experiments or runs in multi-node DDP.

* Save hyperparameters automatically

* Update docs for snapshotting hyperparams

* Fix test tube

* Fix test tube pickling
2019-10-02 11:10:40 -04:00
Anton Bakhtin 222d7d2d5d Hacky fix for mlflow logger (#277)
* Hacky fix for mlflow logger

It dies when "created_at" is logged

* Log warning
2019-10-01 21:32:52 -04:00
William Falcon 133d6b3ec1 updated docs 2019-10-01 06:38:10 -04:00
William Falcon fbc2cfd513 updated docs 2019-10-01 06:29:12 -04:00
Hendrik Schröter dd45896e78 Allow newer torch versions (#269) 2019-10-01 05:24:49 -04:00
Hendrik Schröter 8a2472269a Make test_tube optional (#274) 2019-10-01 05:19:47 -04:00
William Falcon 324c28eb5e Update README.md 2019-09-27 12:09:18 -04:00
William Falcon 970d032d80 Update README.md 2019-09-27 12:08:36 -04:00
Nic Eggert 480eed5cb6 Enable any ML experiment tracking framework (#223)
* Implement generic loggers for experiment tracking

* Add tests for loggers

* Get model tests passing

* Test and fix logger pickling

* Expand pickle test and fix bug

* Missed exp -> logger conversion

* Remove commented code

* Add docstrings

* Update logging docs

* Add mlflow to test requirements

* Make linter happy

* Fix mlflow timestamp

* Update Logging.md

* Update test_models.py

* Update test_models.py

* Update test_models.py

* Update properties.md

* Fix tests

* Line length
2019-09-27 12:05:29 -04:00
William Falcon e9c5aff7ba Update .run_local_tests.sh 2019-09-26 18:44:27 -04:00
William Falcon 1d7ffd11da delete ref to old update_training_log_metrics (#262) 2019-09-26 17:53:15 -04:00