138 Commits

Author SHA1 Message Date
Jirka Borovec 853c4c1e7b update deprecated messages (#810)
* update deprecated messages

* formatting

* fix docs tags
2020-02-11 07:41:15 -05:00
Jirka Borovec a804755e6e update logger init (#727)
* update logger init

* formatting
2020-01-23 11:36:40 -05:00
Jirka Borovec db6b404748 CI pass (#671)
* fix pillow in test

* test acc

* update version in deprecated msg
2020-01-13 22:09:47 -05:00
Jirka Borovec ab4fea0b55 fix deprecation warnings (#570)
* fix deprecation warnings

* flake8

* update deprecations
2019-12-04 06:59:19 -05:00
Jirka Borovec 9785a3e78e Refactor: name modules (#548)
* refactor: rename some modules

* add deprecation warnings

* fix paths
2019-11-26 22:39:18 -05:00
William Falcon 3e38005a61 Ddp2 fix (#448)
* added training_end

* allow ddp and apex to be configured

* bananas

* added eval and train for redundancy
2019-11-05 10:01:52 -05:00
William Falcon 8fbaccddae added eval and train for redundancy (#464) 2019-11-05 09:14:33 -05:00
Yongrae Jo 32dd803b1e Fix min_max gpu memory logging bug (#453)
* #452 Fix ValueError

* #452 Use subprocess.run

* #452 Simplify code for gpu_memory_map

* #452 Simplify code for min max memory

* #452 Add test for get_memory_profile

* #452 Use os.sep

* #452 Use os.linesep
2019-11-05 08:55:44 -05:00
Ir1dXD 5a9afb11cc change print to logging (#457)
* change print to logging

* always use logging.info

* use f-strings

* update code style

* set logging configs

* remove unused code
2019-11-05 08:43:21 -05:00
Tullie Murrell 248495b1d1 Add tbptt (#429)
* Add truncated bptt

* Fix rebase error

* AutoPep8

* Address comments, incl default bptt_split impl

* Add tbptt test

* Add default split for lists/tuples

* Add tbptt docs

* Fix trainer spacing

* Update RequiredTrainerInterface.md
2019-10-31 06:45:28 -04:00
William Falcon d5ca464cc6 Back hook (#424)
* Fixes #356
2019-10-24 07:56:56 -04:00
Nic Eggert 05cea3ff8b Save / Load Hyperparameters with checkpoint (#415)
* Save and load hparams from checkpoints

* Update docs

* Add warning when not saving hparams

* Missing import

* Update .run_local_tests.sh

* Update lm_test_module_mixins.py

* Update lightning_module_template.py
2019-10-23 04:48:24 -04:00
Jirka Borovec f18aee30a5 Minor imports cleaning (#402)
* code cleaning

* drop unused imports

* optimize imports
2019-10-22 11:32:40 +03:00
tamyiuchau 4103a5ca73 Provide backward compatibility for #124 (#400)
* Provide backward compatibility for e681253

* typo fix
2019-10-21 08:16:55 +02:00
Adrian Wälchli 6e3e740a7f Param printing (#336)
* print thousands as K, M, B, T, ...

* add option to print top-level modules only

* added doc string and added spacing

* do not print summary if neither "full" nor "top"

* updated docs showing summary print options

* fix line length for travis
2019-10-08 15:30:06 -04:00
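The first bullet of the commit above (printing thousands as K, M, B, T) could be sketched roughly as follows. This is an illustration only, not the code from the commit: the function name `human_readable_count` and the one-decimal output format are assumptions made for this example.

```python
def human_readable_count(number):
    """Abbreviate a non-negative integer with K, M, B, T suffixes.

    Hypothetical helper for illustration; not taken from the commit.
    """
    labels = ["", "K", "M", "B", "T"]
    index = 0
    value = float(number)
    # Divide by 1000 until the value fits under the current suffix,
    # stopping at the largest suffix available.
    while value >= 1000 and index < len(labels) - 1:
        value /= 1000.0
        index += 1
    if index == 0:
        # Small counts are printed unchanged.
        return str(int(number))
    return f"{value:.1f} {labels[index]}"


if __name__ == "__main__":
    for n in (123, 1234, 2_000_000, 3_000_000_000):
        print(n, "->", human_readable_count(n))
```

Counts beyond the trillions keep the `T` suffix in this sketch, since it is the last label in the list.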
William Falcon 07c5d22ae3 cleaning up demos (#313)
* cleaning up demos
2019-10-05 16:39:05 -04:00
William Falcon 6cc3f1757f decouple returns from each step (#307)
* decoupled training metrics from logging metrics

* decoupled validation metrics from log metrics

* updated docs

* Fixed test

* merged master
2019-10-05 13:35:20 -04:00
William Falcon 8f5a06bfb8 Gpu mem (#308)
* Fixes #289

* added lbfgs support

* Fixes #280 (#309)

* added test seeds (#306)

* added test seeds

* updated docs

* added lbfgs support (#310)

* merged master
2019-10-05 11:29:34 -04:00
William Falcon 967957e55c added lbfgs support 2019-10-05 10:47:18 -04:00
William Falcon bf09060fef Fixes #292 (#303)
* early stopping callback is not default

* added a default logger

* added default checkpoint callback

* added default checkpoint/loggers

* updated docs

* cleaned demos

* clean up docs around loggers
2019-10-04 19:48:57 -04:00
William Falcon a578de511d clean up docs around loggers (#304) 2019-10-04 18:53:38 -04:00
William Falcon a60a24d11b disable auto gpu loading when restoring weights to avoid OOM (#242)
* Update root_module.py

* tests fix
2019-10-04 16:18:43 -04:00
William Falcon 73a7cf3c99 Mem crash (#299)
* fixes memory crash
2019-10-04 15:53:44 -04:00
Hendrik Schröter 36f0b5bbd0 Use getter instead of python property for the dataloaders (#275)
* Use getter instead of python property for the dataloaders

* Fix lint

* Update trainer.py
2019-10-04 15:35:02 -04:00
William Falcon 32e74b8f36 Ddp2 (#261)
* adds ddp2 option where on each node a single process uses all gpus

* added ddp2 test

* added ddp2 docs

* Update Distributed training.md

* delete ref to old update_training_log_metrics

* debug

* banana pancakes

* cheesecake
2019-10-04 15:07:54 -04:00
Alok Singh b0a0a47a0b Rename variables (#124)
-   data_batch → batch
-   batch_i → batch_idx
-   dataloader_i → dataloader_idx
-   tng → training
-   training_dataloader → train_dataloader
-   add_log_row_interval → row_log_interval
-   gradient_clip → gradient_clip_val
-   prog → progress
-   tqdm_dic → tqdm_dict
2019-09-25 19:05:06 -04:00
William Falcon 55e7322747 Metrics load (#228)
* load from metrics defaults to CPU
2019-09-16 10:47:19 -04:00
William Falcon 7099f8dbfb split trainer mixins (#209)
* split trainer mixins

* Update multi_node_cluster_template.py

* Update single_cpu_template.py

* Update single_gpu_node_16bit_template.py

* Update single_gpu_node_ddp_template.py

* Update single_gpu_node_dp_template.py

* Update trainer_cpu_template.py

* Update trainer_io.py

* deconflicted
2019-09-06 14:11:07 -04:00
William Falcon 60633eaa32 Moves hpc auto-resubmit to trainer from test-tube (#207)
* added slurm signal handler

* added restore weight functions

* set slurm signal handling inside process

* added resubmit docs

* fixed missing param

* Update trainer.py

* debugging tests
2019-09-06 11:54:51 -04:00
Verena Haunschmid 25d5b25792 Expectopatronum implement #89 (#182)
* rename validate -> evaluate; implement test logic; allow multiple test_loaders

* add test_step and test_end to LightningModule

* add in_test_mode to pretraining to implement case 2 (test pretrained model)

* fix code style issues

* LightningTestModel: add optional second test set, implement test_step and test_end

* implemented test for multiple test_dataloaders; fixed typo

* add two test cases for #89

* add documentation for test_step, test_end; fix computation of loss in validation_step example

* Update trainer.py

* Added proper dp ddp routing calls for test mode

* Update test_models.py

* Update override_data_parallel.py

* debug

* Update debug.py

* Update lm_test_module.py
2019-09-02 07:15:27 -04:00
William Falcon 4104a0fc47 cleaned up progbar (#165)
* cleaned up progbar

* updated base files

* flake 8
2019-08-23 21:23:27 -04:00
Sebastian Præsius b31539f62e Guard against AttributeError in dataloaders. (#161)
A solution for https://github.com/williamFalcon/pytorch-lightning/issues/142.
Since hasattr works by "calling getattr(object, name) and seeing whether it raises an AttributeError or not", I replaced it with a single call to getattr.
See also https://stackoverflow.com/questions/24971061/python-hasattr-vs-getattr
2019-08-23 08:21:39 -04:00
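The substitution described in the commit above (one `getattr` call instead of `hasattr` followed by `getattr`) can be illustrated with a small sketch. It is not the code from the commit: `CountingModule`, `lookup_twice`, and `lookup_once` are hypothetical names, and the property stands in for a dataloader that is expensive to build.

```python
class CountingModule:
    """Hypothetical module whose `loader` property counts its evaluations."""

    def __init__(self):
        self.calls = 0

    @property
    def loader(self):
        # Stand-in for an expensive dataloader construction.
        self.calls += 1
        return [1, 2, 3]


def lookup_twice(obj, name):
    # hasattr evaluates the attribute once, then getattr evaluates it again.
    if hasattr(obj, name):
        return getattr(obj, name)
    return None


def lookup_once(obj, name):
    # A single getattr with a default evaluates the attribute only once
    # and returns None when the lookup raises AttributeError.
    return getattr(obj, name, None)


if __name__ == "__main__":
    a = CountingModule()
    lookup_twice(a, "loader")
    b = CountingModule()
    lookup_once(b, "loader")
    print(a.calls, b.calls)
```

Both forms return `None` for a missing attribute; the difference shown here is only that the two-step form evaluates the property twice.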
William Falcon 7f53e7bfb3 Val idx optional in validation_step (#108)
* made dataset_i only available with multiple datasets

* updated interface signature

* updated tests
2019-08-13 11:37:37 -04:00
William Falcon 905a2e5a12 allow user to control optimizer step for every optimizer
* added custom hook for user defined optimizer step

* refactored to allow multiple optimizers different training_step

* pep8
2019-08-13 09:32:45 -04:00
William Falcon e5805bf8ff val and test are optional now (#95)
* made validation step optional

* added no val model

* val_step can be implemented but not validation_end

* added no val end model

* added tests

* remove class

* updated docs

* updated test

* fix pep8
2019-08-11 10:01:57 -04:00
William Falcon 10e4b18452 made imports absolute 2019-08-07 10:14:59 -04:00
William Falcon 35f23bbc82 Merge pull request #55 from williamFalcon/continue
add training restore
2019-08-07 09:02:16 -04:00
William Falcon cdbcbad352 added hook on_sanity_check_start 2019-08-07 07:51:55 -04:00
William Falcon 5c398d7a4e removed bad hook call 2019-08-07 07:39:41 -04:00
William Falcon a931ded310 removed bad hook call 2019-08-07 07:35:02 -04:00
William Falcon 95ec072d1e removed bad hook call 2019-08-07 07:30:02 -04:00
William Falcon d3f19c8321 added auto restore 2019-08-07 06:55:05 -04:00
Jiri BOROVEC d9bfe964f9 update by flake8 2019-08-06 22:45:46 +02:00
Jiri BOROVEC 632d07b490 fix prints for py3.5 2019-08-06 22:45:46 +02:00
Jiri BOROVEC c44966a8bf apply PEP8 2019-08-06 22:45:27 +02:00
Jiri BOROVEC 469941a528 pkg relative imports
* split requirements.txt
* pytest verbose
2019-08-05 10:52:09 +02:00
William Falcon 019b4d16d0 formatting 2019-08-04 13:08:14 -05:00
William Falcon f2ef367f7d removing unused imports 2019-08-04 13:07:50 -05:00
William Falcon ef6d5a412c proc 0 only for save hpc. all procs for hpc load 2019-08-01 16:19:04 -04:00
williamFalcon 27660b8a96 running tests 2019-07-28 05:57:37 -07:00