Commit Graph

4276 Commits

Author SHA1 Message Date
Jirka Borovec 8cf185a06d
fix: use standalone tests' exit code (#20430) 2024-11-19 09:50:26 +01:00
Alan Chu c110f4f3f6
Allow callbacks to be restored not just during training (#20403)
* Allow callbacks to be restored not just during training

* add test case

* test test case failure

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix test case

---------

Co-authored-by: Alan Chu <alanchu@Alans-Air.lan>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
2024-11-14 23:46:19 +01:00
Mauricio Villegas bfe3e8ab8f
Change LightningCLI tests to account for future fix in jsonargparse (#20372)
Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
2024-11-13 14:40:23 +01:00
Yuanhong Yu bd5866b295
fix batchsampler does not work correctly (#20327)
* fix batchsampler does not work correctly

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add batch sampler shuffle state test
2024-11-13 14:01:47 +01:00
Luca Antiga 9358898c6e
Ensure restarting from checkpoints leads to consistent internal counters (#20379)
* Fix checkpoint progress for fit loop and batch loop

* Check loss parity

* Rename test

* Fix validation loop handling on restart

* Fix loop reset test

* Avoid skipping to val end if saved mid validation

* Fix type checks in compare state dicts

* Fix edge cases and start from last with and without val

* Clean up

* Formatting

* Avoid running validation when restarting from last

* Fix type annotations

* Fix formatting

* Ensure int max_batch

* Fix condition on batches that stepped

* Remove expected on_train_epoch_start when restarting mid epoch
2024-11-13 11:51:40 +01:00
Jirka Borovec 61a403a512
bump: Torch `2.5` (#20351)
* bump: Torch `2.5.0`

* push docker

* docker

* 2.5.1 and mypy

* update USE_DISTRIBUTED=0 test

* also for pytorch lightning no distributed

* set USE_LIBUV=0 on windows

* try drop pickle warning

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable compiling update_metrics

* bump 2.2.x to bugfix

* disable also log in logger connector (also calls metric)

* more point release bumps

* remove unloved type ignore and print some more on exit

* update checkgroup

* minor versions

* shortened version in build-pl

* pytorch 2.4 is with python 3.11

* 2.1 and 2.3 without patch release

* for 2.4.1: docker with 3.11 test with 3.12

---------

Co-authored-by: Thomas Viehmann <tv.code@beamnet.de>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-11-12 15:59:08 +01:00
Tianshu Wang 474bdd0393
Make RichProgressBar visible for both light and dark background (#20260) 2024-09-30 18:08:45 +02:00
Jirka Borovec d1ca3c6e09
fix(tests): update tests after torch 2.4.1 (#20302)
* update

* test_loggers_pickle_all

* more...

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-09-26 11:52:22 -04:00
Thomas Viehmann 1551a16b94
Add device property to lazy load functionality (#20183) 2024-08-09 08:41:40 -04:00
GdoongMathew 828fd99896
Re-enable passing BytesIO as path in `.to_onnx()` (#20172) 2024-08-07 11:07:02 -04:00
Abhishek Singh be0ae06596
Add `ddp_find_unused_parameters_true` alias in Fabric's DDPStrategy (#20125) 2024-08-07 10:47:36 -04:00
Corwin Joy 631911c004
Add special logic for 'step' in _optimizer_to_device (#20019)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2024-08-05 17:17:06 -04:00
awaelchli 345450b0c3
Fix parameter count in ModelSummary when parameters are DTensors (#20163) 2024-08-05 10:57:31 -04:00
awaelchli d4de8e20e9
Count number of modules in train/eval mode in ModelSummary (#20159) 2024-08-04 15:28:26 -04:00
Jonas Tingeborn e61eafa671
Add ability for TQDMProgressBar to retain prior epoch training bars (#19578)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2024-08-04 03:28:26 -04:00
awaelchli 194c3c31aa
Add simple LSTM example to demo folder (#20143)
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2024-07-31 09:02:15 -04:00
Alexander Zhipa b19eba3ac7
Fix moving keys to device in ResultCollection (#19814)
Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
2024-07-26 14:03:18 -04:00
awaelchli 6c70dd7cf0
Fix attribute error on `_NotYetLoadedTensor` after loading checkpoint into quantized model with `_lazy_load()` (#20121) 2024-07-24 05:39:40 -04:00
awaelchli d0a6b34ea9
Avoid printing the seed info message multiple times (#20108) 2024-07-20 20:25:11 +02:00
awaelchli e214395d31
Remove confusing warning "Missing logger folder" (#20109) 2024-07-20 20:24:38 +02:00
Alexander Zhipa 74470a6dbd
Enable dumping raw prof files in `AdvancedProfiler` (#19703)
Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
2024-07-15 10:40:32 -04:00
awaelchli bdafe5e739
Add Python 3.12 to the CPU test matrix (#20078) 2024-07-13 06:07:35 -04:00
awaelchli 7d1a70752f
Update PyTorch 2.4 tests (#20079) 2024-07-13 05:09:09 -04:00
Abhishek Singh d5ae9ec568
Make numpy an optional dependency in `utilities\seed.py` (#20055)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2024-07-12 17:24:04 -04:00
awaelchli 9987d993a0
Remove support for Python 3.8 (#20071) 2024-07-12 10:33:35 -04:00
awaelchli bf25167bbf
Add testing for PyTorch 2.4 (Trainer) (#20010) 2024-07-11 06:52:56 -04:00
Mauricio Villegas 96b75df41a
Fix LightningCLI saving hyperparameters breaking change (#20068) 2024-07-11 06:38:18 -04:00
PL Ghost 00cf5c90ca
Adding test for legacy checkpoint created with 2.3.3 (#20061) 2024-07-08 18:16:19 -04:00
awaelchli 5829ef8ab3
Set `weights_only` in tests to avoid warnings in PyTorch 2.4 (#20057) 2024-07-08 04:38:27 -04:00
pre-commit-ci[bot] a40affb953
[pre-commit.ci] pre-commit suggestions (#20035)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2024-07-05 12:17:15 -04:00
awaelchli 330af381de
Remove the lightning app code (#20039)
* remove source, tests, docs, workflows

* update checkgroup

* update codeowners

* update workflows

* package setup

* config files

* update

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove entry point

* docs

* __main__

* remove store

* leftover store removals

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-07-05 13:56:29 +02:00
PL Ghost c00ed5cb93
Adding test for legacy checkpoint created with 2.3.2 (#20042) 2024-07-04 05:38:23 -04:00
awaelchli 693c21ac1b
Add testing for PyTorch 2.4 (Fabric) (#20028) 2024-07-02 18:01:03 -04:00
awaelchli 14493c0685
Drop PyTorch 2.0 from the test matrix (#20009) 2024-06-30 18:02:00 -04:00
PL Ghost 2524864b3c
Adding test for legacy checkpoint created with 2.3.1 (#20023) 2024-06-28 14:48:15 +02:00
awaelchli 3f69134479
Fix seed in test to avoid interactions on global random state (#20014) 2024-06-27 15:29:13 +02:00
thomas chaton df0d462738
Add support for batch stop (#20017) 2024-06-26 17:20:10 +01:00
thomas chaton d53e107fb5
Scale mmt (#19984) 2024-06-26 11:53:41 +01:00
awaelchli e330da5870
Fix torch-numpy compatibility conflict in tests (#20004) 2024-06-21 20:20:59 -04:00
Mauricio Villegas 5981aebfcc
Update `test_lightning_cli_help` for future change in jsonargparse (#20002) 2024-06-21 10:38:42 -04:00
Etay Livne 1e83a1bd32
Check if CometLogger experiment is alive (#19915)
Co-authored-by: Etay Livne <etay.livne@mobileye.com>
2024-06-18 13:15:12 -04:00
awaelchli c1af4d0527
Better graceful shutdown for KeyboardInterrupt (#19976) 2024-06-16 10:43:42 -04:00
PL Ghost b16e998a6e
Adding test for legacy checkpoint created with 2.3.0 (#19974) 2024-06-16 09:37:39 -04:00
awaelchli a42484cf8e
Fix failing app tests (#19971) 2024-06-13 20:58:34 +01:00
Alexander Jipa 06ea3a0571
Fix resetting epoch loop restarting flag in LearningRateFinder (#19819) 2024-06-07 10:52:58 -04:00
Björn Barz 5fa32d95e3
Ignore parameters causing ValueError when dumping to YAML (#19804) 2024-06-06 18:36:28 -04:00
Douwe den Blanken 4f96c83ba0
Sanitize argument-free object params before logging (#19771)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2024-06-06 14:51:48 -04:00
Mario Vasilev 812ffdec84
Fix `save_last` type annotation for ModelCheckpoint (#19808) 2024-06-05 20:24:45 -04:00
Liyang90 7668a6bf59
Flexible and easy to use HSDP setting (#19504)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2024-06-05 20:15:03 -04:00
awaelchli 1a6786d682
Destroy process group in atexit handler (#19931) 2024-06-05 19:31:43 -04:00