Commit Graph

4604 Commits

Author SHA1 Message Date
Akihiro Nitta ac60536818
Follow E231 [flake8] (#6110)
* Remove E231 from ignore list

* Follow E231

* Update pytorch_lightning/trainer/data_loading.py
2021-03-24 12:50:50 +01:00
Ethan Harris d02fe342c1
Feature/double precision (#6595)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2021-03-24 15:47:58 +05:30
Jirka Borovec 5733889203
Docs/robots (#6658) 2021-03-24 10:46:56 +01:00
Ben Ahlbrand cbca6cd354
fix: update example autoencoder.py to reflect args (#6638)
* fix: update example autoencoder.py to reflect args

* Update pl_examples/basic_examples/autoencoder.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-03-24 08:27:08 +00:00
Jirka Borovec 70beddfc13
Prune metrics: others 11/DoNe (#6659)
* classif

* grad_img

* nlp

* ssl

* format
2021-03-24 09:16:28 +01:00
Eric Rubiel b1e3dcc607
Use `pl.LightningModule` in new-project docs (#6656) 2021-03-24 00:08:57 +01:00
Ethan Harris 741c452551
Fix disabled grads after call to predict (#6657) 2021-03-23 23:07:48 +01:00
Jirka Borovec 64d0fa4472
update coverage config (#6524)
* update coverage config

* parallel

* parallel

* Apply suggestions from code review

* Apply suggestions from code review

* paralel

* paralel

* paralel

* combine

* combine

* .

* ..

* ..

* ..

* rev

* cb

* cb

* drop

* drop

* .

* ..

* ...

* ...

* ...

* .
2021-03-23 23:05:04 +01:00
thomas chaton fd5cb7fcc3
Add PyTorch 1.8 Profiler 5/5 (#6618)
* Refactor profilers

* Update PassThrough

* WIP - This is broken and will change

* Update pytorch_lightning/profiler/pytorch.py

Co-authored-by: thomas chaton <thomas@grid.ai>

* resolve tests

* resolve tests

* find output

* try something

* update

* add support for test and predict

* update

* update

* use getattr

* test

* test

* update

* tests

* update

* update

* update

* update

* update

* remove file

* update

* update

* update

* update

* update

* test

* update#

* update

* update tests

* update

* add suport for 1.8

* rename records

* add support for 1.8

* update

* resolve flake8

* resolve test

* Refactor basic profilers

* Fixes

* Unused import

* Introduce setup

* Profile on all ranks. Print to stdout on 0

* Introduce dirpath + filename

* CHANGELOG

* Add tests. Address comments

* add `on_run_stage_setup`

* add on_run_stage_setup function

* update

* add test for RegisterRecordFunction

* update lightnng flow direction

* move variable to private

* remove trace

* Undo code that should be in 3/4

* Multi-stage multi-rank

* 2/5 changes

* Pass stage in __del__

* Remove TODOs

* Describe on_evaluation_end. Add tests

* Typo

* Address comments

* deepcopy tests

* Advanced teardown

* Fix teardown test

* Fix tests

* Minor change

* Update CHANGELOG.md

* Fix test

* Quick fixes

* Fix 6522

* resolve ddp tests

* resolve tests

* resolve some tests

* update tests

* resolve tests

* update

* resolve tests

* resolve some tests

* Missed fixes from 3/5

* Fixes

* resolve some tests

* resolve test for 1.7.1

* Broken refactor

* Missed stage

* Minor changes

* resolve tests

* Update CHANGELOG

* resolve bug

* remove print

* Typo

* Cleanup

* resolve ddp test

* remove barrier

* update profiler

* update

* Smaller model

* update

* resolve tests

* update

* Minor changes. CHANGELOG

* Minimize diff

* update to 1.8.1

* RunIf. Extra code. Check segfault

* resolve tests

* Typo. Bad merge

* Fixing a bad merge

* replace for kineto

* Update pytorch_lightning/profiler/pytorch.py

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>

* Update pytorch_lightning/profiler/pytorch.py

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>

* Minor changes

* Bad merge

* Use lists for flexibility

* Use sets

* predict_step

* Ananth's suggestion

* update

* Docs

* Update pl_examples/basic_examples/profiler_example.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update example

* update example

Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-03-23 20:43:21 +00:00
Carlos Mocholí 51b10f78f4
Refactor PyTorch profiler 4/5 (#6349)
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-03-23 18:13:29 +01:00
Jirka Borovec 3cf0c3117a
fix back-compatibility for Accel (#6655) 2021-03-23 17:41:36 +01:00
thomas chaton 0995d30fab
Flash predict step (#6577)
* add predict_step

* Update predict_loop.py

* Update trainer.py

* Update trainer.py

* resolve bugs

* update

* update

* update

* resolve bug

* resolve some failing tests

* udpate tests

* update

* resolve tests

* add a test

* remove typo

* add a test for attachement

* update

* changed to on_train_dataloader

* remove __flash_special_attr__

* resolve tests

* update

* update

* update

* update on comments

* Update pytorch_lightning/trainer/data_loading.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-03-23 11:13:13 -04:00
Jirka Borovec a74909affa
prune metrics: info retrieval (#6649) 2021-03-23 15:05:32 +00:00
Carlos Mocholí 36d180e532
Refactor base profilers 3/5 (#6621)
Co-authored-by: tchaton <thomas@grid.ai>
2021-03-23 10:07:35 +00:00
Jirka Borovec f93414d085
Prune metyrics: regression 9/n (#6637)
* psnr

* r2score

* ssim

* chlog
2021-03-23 10:01:25 +00:00
Jirka Borovec efce2b7777
Prune metrics: regression 8/n (#6636)
* explained_variance

* tests

* mean_absolute_error

* mean_squared_error

* mean_relative_error

* mean_squared_log_error

* chlog
2021-03-23 09:35:51 +01:00
Jirka Borovec 8cd75a4dd5
fix comparing versions (#6434)
* fix comparing versions

* chlog

* .

* ...

* datasets
2021-03-23 07:51:45 +00:00
thomas chaton 2064ece582
[refactor] Add setup to profilers + _run_stage_setup to trainer 2/5 (#6633)
* add setup

* update

* updates on comment

* Minor changes

* Extra import

* Docs

Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-03-22 14:32:31 -04:00
Jirka Borovec e62c7c7839
hotfix: mock examples (#6632)
* mock examples

* drop from GA
2021-03-22 16:49:01 +00:00
Jirka Borovec 1fae10a2dc
refactoring setup (#6590)
* refactoring setup

* .

* docs

* flake8
2021-03-22 08:39:19 -04:00
camruta e2e1de0fb7
Add teardown method to BaseProfiler. (#6370)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2021-03-22 11:49:06 +00:00
Sean Naren 58c9fa7edb
Allow training type plugin to delay optimizer creation (FSDP 2/n) (#6331)
* Allow training_type_plugin to delay optimizer configure

* Add missing references to trainer, add a CPU accelerator based test
2021-03-22 11:43:53 +00:00
Ethan Harris 853523ee64
Clean utilities/argparse and add missing tests (#6607) 2021-03-22 08:53:51 +00:00
Carlos Mocholí 870247ffe6
drop mypy from .pre-commit-config.yaml (#6542) 2021-03-22 00:38:10 +00:00
Carlos Mocholí 51c9260fad
Move profiler tests (#6619) 2021-03-21 23:39:55 +00:00
Kaushik B 42a7b70585
Add DDP Spawn being default for Multi GPUs (#6292) 2021-03-21 22:10:54 +01:00
Kaushik B 37f22c99ff
Add trainer.predict config validation (#6543)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-03-21 21:07:54 +00:00
Justus Schock 634d83134f
Add AMP for validation, prediction and testing (#6565)
* Add Tests for val and test-steps

* Add native AMP

* pep8 tests

* pep8 plugin

* changelog
2021-03-20 23:15:49 +00:00
Jirka Borovec cb59039288
fixing examples (#6600)
* try Azure

* -e

* path
2021-03-20 18:58:59 +00:00
Jirka Borovec 3a56a6024e
Prune metrics: other classification 7/n (#6584)
* confusion_matrix

* iou

* f_beta

* hamming_distance

* stat_scores

* tests

* flake8

* chlog
2021-03-20 03:18:52 +05:30
Amog Kamsetty 3b72bccdf2
Automatically set sync_batchnorm for training_type_plugin (#6536)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
Co-authored-by: Kaushik Bokka <kaushikbokka@gmail.com>
2021-03-19 21:38:49 +00:00
Jirka Borovec 5780796931
NGC container PoC (#6187)
* add NVIDIA flows

* push

* pull

* ...

* extras

* ci prune

* fix

* tag

* .

* list
2021-03-20 02:55:46 +05:30
Kaushik B 87c03b1038
Update Gradient Clipping for TPU Accelerator (#6576) 2021-03-20 01:02:57 +05:30
Ethan Harris 983a888f49
Fix all_gather for tpu_cores=8 (#6587) 2021-03-19 21:56:58 +05:30
Sean Naren 4e9b453854
[Fix] Move init dist connection into the setup function (#6506)
* Move connection setup into the setup function. Call setup hook after we set up the accelerator

* Added CHANGELOG.md

* fix setup order in callback test

* fix input arguments in test

* Mock distributed function, remove protection to turn into training type hook

* Remove import

* Add missing mock, ensure custom plugin does not create children process

* Skip test on windows

* Update deepspeed to init connection in setup

* Do not initialize distributed module

* Move DeepSpeed tests to special tests since dist communication is being set up

* Special the test to see if this fixes CI

* Delete accelerator connector test to see if its causing build to fail

* Delete deepspeed test

* Revert "Delete accelerator connector test to see if its causing build to fail"

This reverts commit edde60b8

* Revert "Delete deepspeed test"

This reverts commit 9d317429

* Reverse hook

* Reverse setup hooks to debug again

* Add todo so i know where i left off

* For single device move in pre_dispatch after setup function

* Add additional model to device hook if any additional parameters have been set

* See if we can enable deepspeed tests

* Revert "See if we can enable deepspeed tests"

This reverts commit b5450def

* See if this hook approach works

* Introduce new granular hooks

* Remove import, fix tpu spawn by moving the function to setup

* Added missing special test

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-03-18 14:33:39 -07:00
Kaushik B b606171299
Update Changelog for v1.2.4 (#6581)
* Update changelog for v1.2.4

* lagacy v1.2.4

* prune duplicates from changelog

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
2021-03-18 20:13:54 +00:00
Jirka Borovec 38a2119359
Prune metrics: precision & recall 6/n (#6573)
* avg precision

* precision
* recall

* curve

* tests

* chlog

* isort

* fix
2021-03-18 13:21:59 -04:00
thomas chaton 8853a36d45
[doc] Update Dict Train Loader doc. (#6579)
* update doc

* update example
2021-03-18 17:14:38 +00:00
Jirka Borovec 9e35f979ea
Prune metrics: AUC & AUROC (#6572)
* class: AUC AUROC

* func: auc auroc

* format

* tests
2021-03-18 10:38:56 +01:00
Jirka Borovec 2f6ce1ae7f
prune metric: accuracy 4/n (#6515)
* prune accuracy

* chlog

* flake8

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* wrap

* test

* test

* fix

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2021-03-17 11:37:10 +00:00
Jirka Borovec 297e438153
fix deprecation wrapper & tests (#6553)
* fix deprecation wrapper & tests

* flake8
2021-03-17 10:41:08 +00:00
thomas chaton 00cd918177
[doc] Add Zero Grad `set_to_none=True` trick (#6548)
* add trick to doc

* update

* update path

* Update docs/source/benchmarking/performance.rst

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-03-16 23:40:14 +00:00
Kaushik B b190403e28
Add outputs param for `on_val/test_epoch_end` hooks (#6120)
* add outputs param for on_val/test_epoch_end hooks

* update changelog

* fix warning message

* add custom call hook

* cache logged metrics

* add args to docstrings

* use warning cache

* add utility method for param in sig check

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update docstring

* add test for eval epoch end hook

* add types and replace model ref

* add deprecation test

* fix test fx name

* add model hooks warning

* add old signature model to tests

* add clear warning cache

* sopport args param

* update tests

* add tests for model hooks

* code suggestions

* add signature utils

* fix pep8 issues

* fix pep8 issues

* fix outputs issue

* fix tests

* code fixes

* fix validate test

* test

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-03-16 12:15:16 -04:00
Jirka Borovec 555a6fea21
prune warning & deprecation wrapper (#6540)
* docs

* wrapper

* test

* count

* flake8
2021-03-16 14:55:31 +00:00
Jirka Borovec a312219d42
Prune metric: helpers and inputs 3/n (#6547)
* _basic_input_validation

* _check_shape_and_type_consistency

* _check_num_classes_binary

* _check_num_classes_mc

* _check_num_classes_ml

* _check_top_k

* _check_classification_inputs

* _input_format_classification

* _reduce_stat_scores

* DataType

* rest

* flake8

* chlog
2021-03-16 13:54:06 +01:00
Jirka Borovec 0f07eaf51a
refactor reading env defaults (#6510)
* change tests

* fix

* test

* _defaults_from_env_vars

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-03-16 10:10:17 +00:00
Amog Kamsetty 6a14146811
Custom Plugin is_distributed (#6537)
* return from plugin

* dont return for tpu
2021-03-15 19:38:30 +00:00
Jirka Borovec 6453091b8a
Prune metrics base classes 2/n (#6530)
* base class

* extensions

* chlog

* _stable_1d_sort

* _check_same_shape

* _input_format_classification_one_hot

* utils

* to_onehot

* select_topk

* to_categorical

* get_num_classes

* reduce

* class_reduce

* tests
2021-03-15 19:28:18 +00:00
Carlos Mocholí 9c5973357e
Update hook lifecycle (#6538)
* Update hook lifecycle

* Update docs/source/common/lightning_module.rst
2021-03-15 19:16:31 +00:00
Adrian Wälchli ea36ee30b0
fix attribute access in LightningModule.toggle_optimizer (#6513) 2021-03-15 19:06:17 +01:00