Commit Graph

2557 Commits

Author SHA1 Message Date
Carlos Mocholí f0c5479de9
Remove legacy `Result` parameters (#6016) 2021-03-28 11:55:08 +02:00
thomas chaton 0e45220263
[warning] Add warning when values are not being reduced (#6417)
* add warning non reduced

* add test

* update test

* update changelog

* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* update

Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2021-03-26 18:33:11 +00:00
Carlos Mocholí b730a5a281
Do not describe when there's no summary (#6681) 2021-03-26 14:58:05 +00:00
Carlos Mocholí bc613611e2
Do not add return dict items to callback_metrics (#6682) 2021-03-26 14:05:20 +01:00
Ethan Harris 6b990f3fa5
Add artifact_location arg to MLFlow logger (#6677)
* Add artifact_location arg to MLFlow logger

* Add CHANGELOG URL

* Update test
2021-03-26 00:12:03 +01:00
thomas chaton 0ea8f39841
Resolve schedule step bug for PyTorch Profiler (#6674)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-03-25 17:03:06 +01:00
Jirka Borovec 217c12a4e7
Simplify deprecations (#6620)
* use external deprecate

* simplify

* simplify

* simplify

* flake8

* .

* others

* .
2021-03-25 15:26:38 +01:00
Rohit Gupta 9be092dbdb
Add on_epoch_start to run at the beginning of every loop irrespective of train/val/test (#6498)
* update docs

* add hook and update docs

* update tests

* chlog

* Update CHANGELOG.md

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* chlog

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-03-25 14:20:49 +01:00
ananthsub 40976e4eba
Support teardown hook on DataModule (#4673)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
2021-03-25 07:51:55 -05:00
Kaushik B 2cbdc01256
Fix checkpoint callback & Trainer.test(_) issue for TPUs (#6654)
* Fix checkpoint callback issue for TPUs

* update changelog

* add barrier

* apply code suggestions

* update trainer test

* remove spaces

* fix tpu tests

* Apply suggestions from code review

* add comment

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-03-25 10:37:37 +00:00
Shengyao Zhuang b8ef52baa1
Match the number of outputs of backward with forward for AllGatherGrad (#6625) 2021-03-25 15:07:58 +05:30
Carlos Mocholí 2dd6f9e09d
`MetricsHolder` clean-up + typing (#6645)
* Metrics holder cleanup and better error message

* Update pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py

* _VALUE -> _METRIC_TYPE
2021-03-24 20:34:46 +01:00
Jirka Borovec d471fa30b3
add copyr (#6661) 2021-03-24 14:29:46 +01:00
ananthsub ab4c838ba3
Remove ModelSummary validation from train loop on_trainer_init (#6610) 2021-03-24 13:54:41 +01:00
Akihiro Nitta ac60536818
Follow E231 [flake8] (#6110)
* Remove E231 from ignore list

* Follow E231

* Update pytorch_lightning/trainer/data_loading.py
2021-03-24 12:50:50 +01:00
Ethan Harris d02fe342c1
Feature/double precision (#6595)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2021-03-24 15:47:58 +05:30
Jirka Borovec 5733889203
Docs/robots (#6658) 2021-03-24 10:46:56 +01:00
Jirka Borovec 70beddfc13
Prune metrics: others 11/DoNe (#6659)
* classif

* grad_img

* nlp

* ssl

* format
2021-03-24 09:16:28 +01:00
Ethan Harris 741c452551
Fix disabled grads after call to predict (#6657) 2021-03-23 23:07:48 +01:00
thomas chaton fd5cb7fcc3
Add PyTorch 1.8 Profiler 5/5 (#6618)
* Refactor profilers

* Update PassThrough

* WIP - This is broken and will change

* Update pytorch_lightning/profiler/pytorch.py

Co-authored-by: thomas chaton <thomas@grid.ai>

* resolve tests

* resolve tests

* find output

* try something

* update

* add support for test and predict

* update

* update

* use getattr

* test

* test

* update

* tests

* update

* update

* update

* update

* update

* remove file

* update

* update

* update

* update

* update

* test

* update

* update

* update tests

* update

* add support for 1.8

* rename records

* add support for 1.8

* update

* resolve flake8

* resolve test

* Refactor basic profilers

* Fixes

* Unused import

* Introduce setup

* Profile on all ranks. Print to stdout on 0

* Introduce dirpath + filename

* CHANGELOG

* Add tests. Address comments

* add `on_run_stage_setup`

* add on_run_stage_setup function

* update

* add test for RegisterRecordFunction

* update lightning flow direction

* move variable to private

* remove trace

* Undo code that should be in 3/4

* Multi-stage multi-rank

* 2/5 changes

* Pass stage in __del__

* Remove TODOs

* Describe on_evaluation_end. Add tests

* Typo

* Address comments

* deepcopy tests

* Advanced teardown

* Fix teardown test

* Fix tests

* Minor change

* Update CHANGELOG.md

* Fix test

* Quick fixes

* Fix 6522

* resolve ddp tests

* resolve tests

* resolve some tests

* update tests

* resolve tests

* update

* resolve tests

* resolve some tests

* Missed fixes from 3/5

* Fixes

* resolve some tests

* resolve test for 1.7.1

* Broken refactor

* Missed stage

* Minor changes

* resolve tests

* Update CHANGELOG

* resolve bug

* remove print

* Typo

* Cleanup

* resolve ddp test

* remove barrier

* update profiler

* update

* Smaller model

* update

* resolve tests

* update

* Minor changes. CHANGELOG

* Minimize diff

* update to 1.8.1

* RunIf. Extra code. Check segfault

* resolve tests

* Typo. Bad merge

* Fixing a bad merge

* replace for kineto

* Update pytorch_lightning/profiler/pytorch.py

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>

* Update pytorch_lightning/profiler/pytorch.py

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>

* Minor changes

* Bad merge

* Use lists for flexibility

* Use sets

* predict_step

* Ananth's suggestion

* update

* Docs

* Update pl_examples/basic_examples/profiler_example.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update example

* update example

Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-03-23 20:43:21 +00:00
Carlos Mocholí 51b10f78f4
Refactor PyTorch profiler 4/5 (#6349)
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-03-23 18:13:29 +01:00
Jirka Borovec 3cf0c3117a
fix back-compatibility for Accel (#6655) 2021-03-23 17:41:36 +01:00
thomas chaton 0995d30fab
Flash predict step (#6577)
* add predict_step

* Update predict_loop.py

* Update trainer.py

* Update trainer.py

* resolve bugs

* update

* update

* update

* resolve bug

* resolve some failing tests

* update tests

* update

* resolve tests

* add a test

* remove typo

* add a test for attachment

* update

* changed to on_train_dataloader

* remove __flash_special_attr__

* resolve tests

* update

* update

* update

* update on comments

* Update pytorch_lightning/trainer/data_loading.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-03-23 11:13:13 -04:00
Jirka Borovec a74909affa
prune metrics: info retrieval (#6649) 2021-03-23 15:05:32 +00:00
Carlos Mocholí 36d180e532
Refactor base profilers 3/5 (#6621)
Co-authored-by: tchaton <thomas@grid.ai>
2021-03-23 10:07:35 +00:00
Jirka Borovec f93414d085
Prune metrics: regression 9/n (#6637)
* psnr

* r2score

* ssim

* chlog
2021-03-23 10:01:25 +00:00
Jirka Borovec efce2b7777
Prune metrics: regression 8/n (#6636)
* explained_variance

* tests

* mean_absolute_error

* mean_squared_error

* mean_relative_error

* mean_squared_log_error

* chlog
2021-03-23 09:35:51 +01:00
Jirka Borovec 8cd75a4dd5
fix comparing versions (#6434)
* fix comparing versions

* chlog

* .

* ...

* datasets
2021-03-23 07:51:45 +00:00
thomas chaton 2064ece582
[refactor] Add setup to profilers + _run_stage_setup to trainer 2/5 (#6633)
* add setup

* update

* updates on comment

* Minor changes

* Extra import

* Docs

Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-03-22 14:32:31 -04:00
Jirka Borovec 1fae10a2dc
refactoring setup (#6590)
* refactoring setup

* .

* docs

* flake8
2021-03-22 08:39:19 -04:00
camruta e2e1de0fb7
Add teardown method to BaseProfiler. (#6370)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2021-03-22 11:49:06 +00:00
Sean Naren 58c9fa7edb
Allow training type plugin to delay optimizer creation (FSDP 2/n) (#6331)
* Allow training_type_plugin to delay optimizer configure

* Add missing references to trainer, add a CPU accelerator based test
2021-03-22 11:43:53 +00:00
Ethan Harris 853523ee64
Clean utilities/argparse and add missing tests (#6607) 2021-03-22 08:53:51 +00:00
Kaushik B 37f22c99ff
Add trainer.predict config validation (#6543)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-03-21 21:07:54 +00:00
Justus Schock 634d83134f
Add AMP for validation, prediction and testing (#6565)
* Add Tests for val and test-steps

* Add native AMP

* pep8 tests

* pep8 plugin

* changelog
2021-03-20 23:15:49 +00:00
Jirka Borovec 3a56a6024e
Prune metrics: other classification 7/n (#6584)
* confusion_matrix

* iou

* f_beta

* hamming_distance

* stat_scores

* tests

* flake8

* chlog
2021-03-20 03:18:52 +05:30
Amog Kamsetty 3b72bccdf2
Automatically set sync_batchnorm for training_type_plugin (#6536)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
Co-authored-by: Kaushik Bokka <kaushikbokka@gmail.com>
2021-03-19 21:38:49 +00:00
Kaushik B 87c03b1038
Update Gradient Clipping for TPU Accelerator (#6576) 2021-03-20 01:02:57 +05:30
Ethan Harris 983a888f49
Fix all_gather for tpu_cores=8 (#6587) 2021-03-19 21:56:58 +05:30
Sean Naren 4e9b453854
[Fix] Move init dist connection into the setup function (#6506)
* Move connection setup into the setup function. Call setup hook after we set up the accelerator

* Added CHANGELOG.md

* fix setup order in callback test

* fix input arguments in test

* Mock distributed function, remove protection to turn into training type hook

* Remove import

* Add missing mock, ensure custom plugin does not create children process

* Skip test on windows

* Update deepspeed to init connection in setup

* Do not initialize distributed module

* Move DeepSpeed tests to special tests since dist communication is being set up

* Special the test to see if this fixes CI

* Delete accelerator connector test to see if its causing build to fail

* Delete deepspeed test

* Revert "Delete accelerator connector test to see if its causing build to fail"

This reverts commit edde60b8

* Revert "Delete deepspeed test"

This reverts commit 9d317429

* Reverse hook

* Reverse setup hooks to debug again

* Add todo so i know where i left off

* For single device move in pre_dispatch after setup function

* Add additional model to device hook if any additional parameters have been set

* See if we can enable deepspeed tests

* Revert "See if we can enable deepspeed tests"

This reverts commit b5450def

* See if this hook approach works

* Introduce new granular hooks

* Remove import, fix tpu spawn by moving the function to setup

* Added missing special test

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-03-18 14:33:39 -07:00
Jirka Borovec 38a2119359
Prune metrics: precision & recall 6/n (#6573)
* avg precision

* precision
* recall

* curve

* tests

* chlog

* isort

* fix
2021-03-18 13:21:59 -04:00
Jirka Borovec 9e35f979ea
Prune metrics: AUC & AUROC (#6572)
* class: AUC AUROC

* func: auc auroc

* format

* tests
2021-03-18 10:38:56 +01:00
Jirka Borovec 2f6ce1ae7f
prune metric: accuracy 4/n (#6515)
* prune accuracy

* chlog

* flake8

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* wrap

* test

* test

* fix

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2021-03-17 11:37:10 +00:00
Jirka Borovec 297e438153
fix deprecation wrapper & tests (#6553)
* fix deprecation wrapper & tests

* flake8
2021-03-17 10:41:08 +00:00
Kaushik B b190403e28
Add outputs param for `on_val/test_epoch_end` hooks (#6120)
* add outputs param for on_val/test_epoch_end hooks

* update changelog

* fix warning message

* add custom call hook

* cache logged metrics

* add args to docstrings

* use warning cache

* add utility method for param in sig check

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update docstring

* add test for eval epoch end hook

* add types and replace model ref

* add deprecation test

* fix test fx name

* add model hooks warning

* add old signature model to tests

* add clear warning cache

* support args param

* update tests

* add tests for model hooks

* code suggestions

* add signature utils

* fix pep8 issues

* fix pep8 issues

* fix outputs issue

* fix tests

* code fixes

* fix validate test

* test

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-03-16 12:15:16 -04:00
Jirka Borovec 555a6fea21
prune warning & deprecation wrapper (#6540)
* docs

* wrapper

* test

* count

* flake8
2021-03-16 14:55:31 +00:00
Jirka Borovec a312219d42
Prune metric: helpers and inputs 3/n (#6547)
* _basic_input_validation

* _check_shape_and_type_consistency

* _check_num_classes_binary

* _check_num_classes_mc

* _check_num_classes_ml

* _check_top_k

* _check_classification_inputs

* _input_format_classification

* _reduce_stat_scores

* DataType

* rest

* flake8

* chlog
2021-03-16 13:54:06 +01:00
Jirka Borovec 0f07eaf51a
refactor reading env defaults (#6510)
* change tests

* fix

* test

* _defaults_from_env_vars

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-03-16 10:10:17 +00:00
Amog Kamsetty 6a14146811
Custom Plugin is_distributed (#6537)
* return from plugin

* dont return for tpu
2021-03-15 19:38:30 +00:00
Jirka Borovec 6453091b8a
Prune metrics base classes 2/n (#6530)
* base class

* extensions

* chlog

* _stable_1d_sort

* _check_same_shape

* _input_format_classification_one_hot

* utils

* to_onehot

* select_topk

* to_categorical

* get_num_classes

* reduce

* class_reduce

* tests
2021-03-15 19:28:18 +00:00