Commit Graph

226 Commits

Author SHA1 Message Date
William Falcon b34c7add23
Fixes #3668, #3887 as a bonus (#3888)
* Fixes #3668, #3887 as a bonus
2020-10-05 21:30:41 -04:00
William Falcon b014223f72
Fixes #2678 - enables training_step to return None (#3862)
* Fixes #2678 - enables training_step to return None
2020-10-05 07:33:46 -04:00
William Falcon d787208e76
Fixes #2792 (#3857)
2020-10-04 23:25:02 -04:00
William Falcon f58c760409
Fixes #2551 (#3858)
2020-10-04 23:02:35 -04:00
William Falcon 97e62b38cf
Fixed #2143 and many more :) (#3855)
2020-10-04 22:18:49 -04:00
William Falcon d9656d166c
fixed model checkpoint frequency (#3852)
* fixed model checkpoint frequency

* merged
2020-10-04 21:49:20 -04:00
William Falcon 2bca89a752
added tbptt test for logging (#3850)
* added tbptt test for logging
2020-10-04 19:38:42 -04:00
William Falcon 00f0d19a61
fixes #3798 (#3849)
* fix #3798

* added tbptt test for logging
2020-10-04 19:36:51 -04:00
Carlos Mocholí 89cc12311f
Fix tbptt_reduce_fx when non-floating tensors are logged (#3796)
* Add failing test

* force all tbptt vals to be floats for reduce

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-10-04 17:10:25 -04:00
Rohit Gupta d3696052cf
Add back sanity checks (#3846)
* Add back sanity checks

* pep
2020-10-04 17:05:26 -04:00
William Falcon 1aa9d39506
Eval epoch can now log independently (#3843)
* ref: routed epoch outputs to logger
2020-10-04 13:36:35 -04:00
Rohit Gupta a628d181ee
Fix val_progress_bar total with num_sanity_val_steps (#3751)
* Fix val_progress_bar total with num_sanity_val_steps

* chlog

* Fix val_progress_bar total with num_sanity_val_steps

* move test

* replaced with sanity flag and suggestions
2020-10-04 08:32:18 -04:00
William Falcon 66aef10239
verified epoch logging (#3830)
* ref: fix epoch logging

* verified epoch logging
2020-10-03 21:17:24 -04:00
William Falcon 3903cf63c6
ref: training flag tests (val_check_interval) (#3825)
* added test_val_check_interval tests
2020-10-03 14:05:01 -04:00
William Falcon d9bc95f83e
ref: bug fix with logging val epoch end + monitor (#3812)
* ref: fix metric err

* ref: merge

* ref: decoupled ddp2

* ref: clean up ddp before final fix
2020-10-03 12:33:29 -04:00
GimmickNG e4e60e9b82
Add datamodule parameter to lr_find() (#3425)
* Add datamodule parameter to lr_find()

* Fixed missing import

* Move datamodule parameter to end

* Add datamodule parameter test with auto_lr_find

* Change test for datamodule parameter

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* Fix lr_find documentation

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* formatting

* Add description to datamodule param in lr_find

* pep8: remove trailing whitespace on line 105

* added changelog

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-10-01 10:33:12 +02:00
Teddy Koker 5ec00ccd28
Added gradient clip test for native AMP (#3754)
* added gradient clip test for fp16

* pep8
2020-10-01 01:36:34 -04:00
Adrian Wälchli c73032e39d
Make ModelCheckpoint(save_top_k=-1) track the best models (#3735)
* fix topk=-1 tracking best

* update test

* clean up

* add changelog

* enable loading best topk in trainer.test()

* make trivial

* return right away

* make windows test path happy
2020-09-30 08:34:02 -04:00
Adrian Wälchli 9405c880af
log/save_interval based on global step (#3667)
* log interval based on global step

* test

* pep

* added changelog

* pep

* merge

* remove unused arg
2020-09-30 12:26:27 +02:00
William Falcon b3be8022bd
tests for val step flow and logging (#3731)
* ref: test val epoch end

* ref: test log dict
2020-09-29 22:12:56 -04:00
William Falcon c14928a72a
ref: test val flow steps (#3723)
* ref: test val epoch end
2020-09-29 11:42:38 -04:00
William Falcon f42ea303c9
ref: enable self.log for eval loop metrics (#3715)
* ref: test val epoch end
2020-09-29 02:00:28 -04:00
Rohit Gupta 783750547d
disable optimizers setup during testing (#3059)
* disable configure_optimizers during testing

* minor changes

* hvd and ddp

* fix precision during testing

* fix ddp

* fix amp

* fix cpu

* update dp

* simplify optimizers

* add test

* codefactor

* ref optimizer setup

* chlog

* suggestions

* isort

* rebased with master
2020-09-29 01:09:04 +02:00
William Falcon 4d5c0fa1bc
ref: separate flow vs log tests (#3704)
2020-09-28 12:01:52 -04:00
William Falcon cdd7266cd8
ref: enable self.log from val step (#3701)
* .log in eval

* ref

* ref: enable self.log in val step
2020-09-28 10:49:07 -04:00
William Falcon 2ecaa2a8be
ref: (2/n) fix no log in epoch end (#3699)
2020-09-28 08:25:44 -04:00
William Falcon ddd11075bd
[WIP] ref: deprecated results obj, added support for simpler comms (1/n) (#3681)
* ref: deprecated results obj, added support for simpler comms. Decouples logging from loops

* fix global step err

* fix typing err

* fix str

* fix typing err
2020-09-27 23:19:46 -04:00
William Falcon ff2bab0996
ref: (results 1/n) enable tracking original metric when step and epoch are both true (#3685)
* enable tracking original metric when step and epoch are both true
2020-09-27 22:08:31 -04:00
Adrian Wälchli f37e9e8a83
Fix global step increment on training_epoch_end (#3673)
* fix

* fix global step err

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-09-27 20:19:51 -04:00
William Falcon d79bce1dff
enable None model checkpoint default (#3669)
* enable None model checkpoint default
2020-09-26 23:14:04 -04:00
William Falcon c591013708
enable any logged metric to be accessible in callbacks (#3598)
* enable any logged or written metric to be accessible in callbacks

* clarify forward
2020-09-22 18:00:23 -04:00
Nicki Skafte 88e6b29bba
faster tests (#3604)
2020-09-22 07:37:34 -04:00
William Falcon 21cfdf6874
ref: result 1/n (make monitor default to checkpoint_on to simplify re… (#3571)
* ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax)

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>

* ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax)

* force crash when max_epochs < epochs in a checkpoint

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2020-09-20 22:58:43 -04:00
William Falcon 9acee67c31
fixes 3549 (#3564)
2020-09-19 20:00:50 -04:00
Carlos Mocholí 580b04b490
Fix ModelCheckpoint's name formatting (#3163)
* Fix ModelCheckpoint's name formatting

* Fix failing tests

* Add dot to CHECKPOINT_SUFFIX

* Set variables to their default values at the end of tests

* Fix logic for filepath='' and filename=None. Add test

* Fix Windows tests

* Fix typo. Remove leading line break and zeroes

* Remove CHECKPOINT_SUFFIX

* Fix typos. Use appropriate f-string format

* Apply suggestions from code review

* Fix broken tests after #3320

* Finish changes suggested by Borda

* Use explicit test var names

* Apply suggestions

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Apply suggestions

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update CHANGELOG

* Apply suggestions from code review

* for

* prepend whitespace in warn msg

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-09-18 23:09:11 +02:00
Abe Botros 76c4afb840
Fix IoU score for classes not present in target or pred (#3098)
* Fix IoU score for classes not present in target or pred

Fixes #3097

- Allow configurable not_present_score for IoU for classes
  not present in target or pred. Defaults to 1.0.
- Also allow passing `num_classes` parameter through from iou
  metric class down to its underlying functional iou
  call.

* Changelog: move IoU not-present score fix to [unreleased]

* IoU: avoid recomputing class presence in target and pred

Use already-computed support, true positives, and false positives to
determine if a class is not present in either target or pred.

* Test IoU against sklearn jaccard_score

Also add TODO to test our IoU's not_present_score against sklearn's
jaccard_score's zero_division when it becomes available.

* IoU: remove_bg -> ignore_index

Fixes #2736

- Rename IoU metric argument from `remove_bg` -> `ignore_index`.
- Accept an optional int class index to ignore, instead of a bool and
  instead of always assuming the background class has index 0.
- If given, ignore the class index when computing the IoU output,
  regardless of reduction method.

* Improve documentation for IoU not_present_score

* Update default IoU not_present_score to 0.0

* Add note about IoU division by zero

* Rename IoU not_present_score -> absent_score

* Update IoU absent score changelog wording

* Condense IoU absent_score argument docstring

* Remove unnecessary IoU ignore_index comment

* docstrings

* isort

* flake8

* Fix test of IoU against sklearn jaccard

Use macro instead of micro averaging in sklearn's jaccard score, to
match multi-class IoU, which conventionally takes per-class scores
before averaging.

Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2020-09-17 10:37:49 +02:00
Phil b5dc6998ae
Disable train dataloader shuffle when overfit_batches is active. (#3501)
* Disable train dataloader shuffle when overfit_batches is active.

* pep8

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-09-15 05:07:27 -04:00
William Falcon 1d7c615d82
cleaning up stale logger tests + flake8 (#3490)
* cleaning up stale logger tests
2020-09-14 00:06:48 -04:00
William Falcon cd16aa9854
ref: checkpoint connector methods 4/n (#3474)
* ref: checkpoint connector methods 4/n
2020-09-12 08:42:27 -04:00
Rohit Gupta a1ea681c47
Fix batch_outputs with optimizer frequencies (#3229)
* Fix batch_outputs with optimizers frequencies

* optimizers

* fix batch_outputs with optimizer frequencies

* clean test

* suggestion

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* chlog

* failing doctest

* update doctest

* chlog

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-09-10 23:01:20 +02:00
William Falcon 5abf7d9123
ref: move lr_finder (#3434)
* ref: move lr_finder
2020-09-09 22:12:27 -04:00
William Falcon b36c5e86d0
ref: trainer argparse 1/n (#3421)
* ref: trainer argparse 1/n
2020-09-09 12:31:17 -04:00
Adrian Wälchli e245065fbc
limit auto scaling batch size to the size of the training dataset (#3271)
* fix

* fix and test

* fix merge error

* test for max dataset size

* changelog

* update docs

* fix merge

* unused imports

* imports
2020-09-09 10:51:43 +02:00
William Falcon ff5f099cb7
ref: remove inner train loop 1/n (#3397)
* ref: remove inner train loop 1/n
2020-09-08 12:05:00 -04:00
William Falcon d438ad8a8d
ensure calling test multiple times does not change results (#3391)
2020-09-07 22:25:12 -04:00
William Falcon b76d9e5dd5
Refa22 (#3388)
* ref: inner train loop (intermediate step) 20/n

* ref: inner train loop (intermediate step) 21/n
2020-09-07 16:45:31 -04:00
William Falcon 0b5b70d6c9
ref: inner train loop (intermediate step) 17/n (#3376)
* ref: inner train loop (intermediate step) 17/n
2020-09-07 09:31:42 -04:00
William Falcon 69e3f904df
ref: inner train loop (intermediate step) 16/n (#3375)
* ref: inner train loop (intermediate step) 16/n
2020-09-06 21:57:20 -04:00
William Falcon 85421466ab
ref: inner train loop (intermediate step) 10/n (#3369)
2020-09-06 08:59:58 -04:00
Adrian Wälchli 48c22c8bad
update batch size in DataModule when auto scaling batch size (#3266)
* fix datamodule hasattr

* fix patch check

* fix setattr

* update docs

* revert patch fix

* changelog

* fix datamodule passed in as fit arg

* docs

* set datamodule batch size in lightning_setattr

* fix merge

* check with has_attr

* access datamodule via trainer

* pass fit args down to tuner

* docs

* fix typos in docs

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-09-03 22:07:49 +02:00