Carlos Mocholí
f0c5479de9
Remove legacy `Result` parameters ( #6016 )
2021-03-28 11:55:08 +02:00
thomas chaton
0e45220263
[warning] Add warning when values are not being reduced ( #6417 )
...
* add warning non reduced
* add test
* update test
* update changelog
* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
* update
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2021-03-26 18:33:11 +00:00
Carlos Mocholí
b730a5a281
Do not describe when there's no summary ( #6681 )
2021-03-26 14:58:05 +00:00
Carlos Mocholí
bc613611e2
Do not add return dict items to callback_metrics ( #6682 )
2021-03-26 14:05:20 +01:00
Ethan Harris
6b990f3fa5
Add artifact_location arg to MLFlow logger ( #6677 )
...
* Add artifact_location arg to MLFlow logger
* Add CHANGELOG URL
* Update test
2021-03-26 00:12:03 +01:00
thomas chaton
0ea8f39841
Resolve schedule step bug for PyTorch Profiler ( #6674 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-03-25 17:03:06 +01:00
Jirka Borovec
217c12a4e7
Simplify deprecations ( #6620 )
...
* use external deprecate
* simplify
* simplify
* simplify
* flake8
* .
* others
* .
2021-03-25 15:26:38 +01:00
Rohit Gupta
9be092dbdb
Add on_epoch_start to run at the beginning of every loop irrespective of train/val/test ( #6498 )
...
* update docs
* add hook and update docs
* update tests
* chlog
* Update CHANGELOG.md
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* chlog
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-03-25 14:20:49 +01:00
ananthsub
40976e4eba
Support teardown hook on DataModule ( #4673 )
...
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
2021-03-25 07:51:55 -05:00
Kaushik B
2cbdc01256
Fix checkpoint callback & Trainer.test(_) issue for TPUs ( #6654 )
...
* Fix checkpoint callback issue for TPUs
* update changelog
* add barrier
* apply code suggestions
* update trainer test
* remove spaces
* fix tpu tests
* Apply suggestions from code review
* add comment
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-03-25 10:37:37 +00:00
Shengyao Zhuang
b8ef52baa1
Match the number of outputs of backward with forward for AllGatherGrad ( #6625 )
2021-03-25 15:07:58 +05:30
Carlos Mocholí
2dd6f9e09d
`MetricsHolder` clean-up + typing ( #6645 )
...
* Metrics holder cleanup and better error message
* Update pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py
* _VALUE -> _METRIC_TYPE
2021-03-24 20:34:46 +01:00
Jirka Borovec
d471fa30b3
add copyr ( #6661 )
2021-03-24 14:29:46 +01:00
ananthsub
ab4c838ba3
Remove ModelSummary validation from train loop on_trainer_init ( #6610 )
2021-03-24 13:54:41 +01:00
Akihiro Nitta
ac60536818
Follow E231 [flake8] ( #6110 )
...
* Remove E231 from ignore list
* Follow E231
* Update pytorch_lightning/trainer/data_loading.py
2021-03-24 12:50:50 +01:00
Ethan Harris
d02fe342c1
Feature/double precision ( #6595 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2021-03-24 15:47:58 +05:30
Jirka Borovec
5733889203
Docs/robots ( #6658 )
2021-03-24 10:46:56 +01:00
Jirka Borovec
70beddfc13
Prune metrics: others 11/DoNe ( #6659 )
...
* classif
* grad_img
* nlp
* ssl
* format
2021-03-24 09:16:28 +01:00
Ethan Harris
741c452551
Fix disabled grads after call to predict ( #6657 )
2021-03-23 23:07:48 +01:00
thomas chaton
fd5cb7fcc3
Add PyTorch 1.8 Profiler 5/5 ( #6618 )
...
* Refactor profilers
* Update PassThrough
* WIP - This is broken and will change
* Update pytorch_lightning/profiler/pytorch.py
Co-authored-by: thomas chaton <thomas@grid.ai>
* resolve tests
* resolve tests
* find output
* try something
* update
* add support for test and predict
* update
* update
* use getattr
* test
* test
* update
* tests
* update
* update
* update
* update
* update
* remove file
* update
* update
* update
* update
* update
* test
* update
* update
* update tests
* update
* add support for 1.8
* rename records
* add support for 1.8
* update
* resolve flake8
* resolve test
* Refactor basic profilers
* Fixes
* Unused import
* Introduce setup
* Profile on all ranks. Print to stdout on 0
* Introduce dirpath + filename
* CHANGELOG
* Add tests. Address comments
* add `on_run_stage_setup`
* add on_run_stage_setup function
* update
* add test for RegisterRecordFunction
* update lightning flow direction
* move variable to private
* remove trace
* Undo code that should be in 3/4
* Multi-stage multi-rank
* 2/5 changes
* Pass stage in __del__
* Remove TODOs
* Describe on_evaluation_end. Add tests
* Typo
* Address comments
* deepcopy tests
* Advanced teardown
* Fix teardown test
* Fix tests
* Minor change
* Update CHANGELOG.md
* Fix test
* Quick fixes
* Fix 6522
* resolve ddp tests
* resolve tests
* resolve some tests
* update tests
* resolve tests
* update
* resolve tests
* resolve some tests
* Missed fixes from 3/5
* Fixes
* resolve some tests
* resolve test for 1.7.1
* Broken refactor
* Missed stage
* Minor changes
* resolve tests
* Update CHANGELOG
* resolve bug
* remove print
* Typo
* Cleanup
* resolve ddp test
* remove barrier
* update profiler
* update
* Smaller model
* update
* resolve tests
* update
* Minor changes. CHANGELOG
* Minimize diff
* update to 1.8.1
* RunIf. Extra code. Check segfault
* resolve tests
* Typo. Bad merge
* Fixing a bad merge
* replace for kineto
* Update pytorch_lightning/profiler/pytorch.py
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* Update pytorch_lightning/profiler/pytorch.py
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* Minor changes
* Bad merge
* Use lists for flexibility
* Use sets
* predict_step
* Ananth's suggestion
* update
* Docs
* Update pl_examples/basic_examples/profiler_example.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update example
* update example
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-03-23 20:43:21 +00:00
Carlos Mocholí
51b10f78f4
Refactor PyTorch profiler 4/5 ( #6349 )
...
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-03-23 18:13:29 +01:00
Jirka Borovec
3cf0c3117a
fix back-compatibility for Accel ( #6655 )
2021-03-23 17:41:36 +01:00
thomas chaton
0995d30fab
Flash predict step ( #6577 )
...
* add predict_step
* Update predict_loop.py
* Update trainer.py
* Update trainer.py
* resolve bugs
* update
* update
* update
* resolve bug
* resolve some failing tests
* update tests
* update
* resolve tests
* add a test
* remove typo
* add a test for attachment
* update
* changed to on_train_dataloader
* remove __flash_special_attr__
* resolve tests
* update
* update
* update
* update on comments
* Update pytorch_lightning/trainer/data_loading.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-03-23 11:13:13 -04:00
Jirka Borovec
a74909affa
prune metrics: info retrieval ( #6649 )
2021-03-23 15:05:32 +00:00
Carlos Mocholí
36d180e532
Refactor base profilers 3/5 ( #6621 )
...
Co-authored-by: tchaton <thomas@grid.ai>
2021-03-23 10:07:35 +00:00
Jirka Borovec
f93414d085
Prune metrics: regression 9/n ( #6637 )
...
* psnr
* r2score
* ssim
* chlog
2021-03-23 10:01:25 +00:00
Jirka Borovec
efce2b7777
Prune metrics: regression 8/n ( #6636 )
...
* explained_variance
* tests
* mean_absolute_error
* mean_squared_error
* mean_relative_error
* mean_squared_log_error
* chlog
2021-03-23 09:35:51 +01:00
Jirka Borovec
8cd75a4dd5
fix comparing versions ( #6434 )
...
* fix comparing versions
* chlog
* .
* ...
* datasets
2021-03-23 07:51:45 +00:00
thomas chaton
2064ece582
[refactor] Add setup to profilers + _run_stage_setup to trainer 2/5 ( #6633 )
...
* add setup
* update
* updates on comment
* Minor changes
* Extra import
* Docs
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-03-22 14:32:31 -04:00
Jirka Borovec
1fae10a2dc
refactoring setup ( #6590 )
...
* refactoring setup
* .
* docs
* flake8
2021-03-22 08:39:19 -04:00
camruta
e2e1de0fb7
Add teardown method to BaseProfiler. ( #6370 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2021-03-22 11:49:06 +00:00
Sean Naren
58c9fa7edb
Allow training type plugin to delay optimizer creation (FSDP 2/n) ( #6331 )
...
* Allow training_type_plugin to delay optimizer configure
* Add missing references to trainer, add a CPU accelerator based test
2021-03-22 11:43:53 +00:00
Ethan Harris
853523ee64
Clean utilities/argparse and add missing tests ( #6607 )
2021-03-22 08:53:51 +00:00
Kaushik B
37f22c99ff
Add trainer.predict config validation ( #6543 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-03-21 21:07:54 +00:00
Justus Schock
634d83134f
Add AMP for validation, prediction and testing ( #6565 )
...
* Add Tests for val and test-steps
* Add native AMP
* pep8 tests
* pep8 plugin
* changelog
2021-03-20 23:15:49 +00:00
Jirka Borovec
3a56a6024e
Prune metrics: other classification 7/n ( #6584 )
...
* confusion_matrix
* iou
* f_beta
* hamming_distance
* stat_scores
* tests
* flake8
* chlog
2021-03-20 03:18:52 +05:30
Amog Kamsetty
3b72bccdf2
Automatically set sync_batchnorm for training_type_plugin ( #6536 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
Co-authored-by: Kaushik Bokka <kaushikbokka@gmail.com>
2021-03-19 21:38:49 +00:00
Kaushik B
87c03b1038
Update Gradient Clipping for TPU Accelerator ( #6576 )
2021-03-20 01:02:57 +05:30
Ethan Harris
983a888f49
Fix all_gather for tpu_cores=8 ( #6587 )
2021-03-19 21:56:58 +05:30
Sean Naren
4e9b453854
[Fix] Move init dist connection into the setup function ( #6506 )
...
* Move connection setup into the setup function. Call setup hook after we set up the accelerator
* Added CHANGELOG.md
* fix setup order in callback test
* fix input arguments in test
* Mock distributed function, remove protection to turn into training type hook
* Remove import
* Add missing mock, ensure custom plugin does not create children process
* Skip test on windows
* Update deepspeed to init connection in setup
* Do not initialize distributed module
* Move DeepSpeed tests to special tests since dist communication is being set up
* Special the test to see if this fixes CI
* Delete accelerator connector test to see if its causing build to fail
* Delete deepspeed test
* Revert "Delete accelerator connector test to see if its causing build to fail"
This reverts commit edde60b8
* Revert "Delete deepspeed test"
This reverts commit 9d317429
* Reverse hook
* Reverse setup hooks to debug again
* Add todo so i know where i left off
* For single device move in pre_dispatch after setup function
* Add additional model to device hook if any additional parameters have been set
* See if we can enable deepspeed tests
* Revert "See if we can enable deepspeed tests"
This reverts commit b5450def
* See if this hook approach works
* Introduce new granular hooks
* Remove import, fix tpu spawn by moving the function to setup
* Added missing special test
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-03-18 14:33:39 -07:00
Jirka Borovec
38a2119359
Prune metrics: precision & recall 6/n ( #6573 )
...
* avg precision
* precision
* recall
* curve
* tests
* chlog
* isort
* fix
2021-03-18 13:21:59 -04:00
Jirka Borovec
9e35f979ea
Prune metrics: AUC & AUROC ( #6572 )
...
* class: AUC AUROC
* func: auc auroc
* format
* tests
2021-03-18 10:38:56 +01:00
Jirka Borovec
2f6ce1ae7f
prune metric: accuracy 4/n ( #6515 )
...
* prune accuracy
* chlog
* flake8
* Apply suggestions from code review
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
* wrap
* test
* test
* fix
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2021-03-17 11:37:10 +00:00
Jirka Borovec
297e438153
fix deprecation wrapper & tests ( #6553 )
...
* fix deprecation wrapper & tests
* flake8
2021-03-17 10:41:08 +00:00
Kaushik B
b190403e28
Add outputs param for `on_val/test_epoch_end` hooks ( #6120 )
...
* add outputs param for on_val/test_epoch_end hooks
* update changelog
* fix warning message
* add custom call hook
* cache logged metrics
* add args to docstrings
* use warning cache
* add utility method for param in sig check
* Update CHANGELOG.md
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update docstring
* add test for eval epoch end hook
* add types and replace model ref
* add deprecation test
* fix test fx name
* add model hooks warning
* add old signature model to tests
* add clear warning cache
* support args param
* update tests
* add tests for model hooks
* code suggestions
* add signature utils
* fix pep8 issues
* fix pep8 issues
* fix outputs issue
* fix tests
* code fixes
* fix validate test
* test
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-03-16 12:15:16 -04:00
Jirka Borovec
555a6fea21
prune warning & deprecation wrapper ( #6540 )
...
* docs
* wrapper
* test
* count
* flake8
2021-03-16 14:55:31 +00:00
Jirka Borovec
a312219d42
Prune metric: helpers and inputs 3/n ( #6547 )
...
* _basic_input_validation
* _check_shape_and_type_consistency
* _check_num_classes_binary
* _check_num_classes_mc
* _check_num_classes_ml
* _check_top_k
* _check_classification_inputs
* _input_format_classification
* _reduce_stat_scores
* DataType
* rest
* flake8
* chlog
2021-03-16 13:54:06 +01:00
Jirka Borovec
0f07eaf51a
refactor reading env defaults ( #6510 )
...
* change tests
* fix
* test
* _defaults_from_env_vars
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-03-16 10:10:17 +00:00
Amog Kamsetty
6a14146811
Custom Plugin is_distributed ( #6537 )
...
* return from plugin
* dont return for tpu
2021-03-15 19:38:30 +00:00
Jirka Borovec
6453091b8a
Prune metrics base classes 2/n ( #6530 )
...
* base class
* extensions
* chlog
* _stable_1d_sort
* _check_same_shape
* _input_format_classification_one_hot
* utils
* to_onehot
* select_topk
* to_categorical
* get_num_classes
* reduce
* class_reduce
* tests
2021-03-15 19:28:18 +00:00