Commit Graph

10098 Commits

Author SHA1 Message Date
Jirka Borovec 6421dd8d4f
precommit: drop Black in favor of Ruff (#19380) 2024-01-31 17:09:39 +00:00
awaelchli 01f8531c9d
Refactor BoringFabric in tests (#19364) 2024-01-30 23:32:45 +01:00
thomas chaton 28b380610f
StreamingDataloader: Resolve typo (#19370) 2024-01-30 16:52:47 +00:00
thomas chaton 322f474978
JPEGSerializer: Fix serializer io.bytes image (#19369) 2024-01-30 16:52:25 +00:00
thomas chaton 10c3a71dbd
Bump Lightning Cloud 0.5.64 (#19372) 2024-01-30 14:57:11 +00:00
Michael Pilosov, PhD 5361acdcca
Shorten docstring (for CLI compat) (#19356) 2024-01-30 08:11:51 +01:00
awaelchli 6018b0743c
Error message to inform bitsandbytes is only supported on CUDA (#19360) 2024-01-29 19:52:28 -05:00
awaelchli bcc8de8dec
Update Trainer's ckpt_path type for pathlib Path (#19362) 2024-01-30 00:42:18 +01:00
thomas chaton b0e1ee2469
map operator: Add support for nested folders (#19366) 2024-01-29 19:17:28 +00:00
thomas chaton 37a521cad2
map operator: Add weights to evenly distributed works among workers (#19365) 2024-01-29 18:27:37 +00:00
Jirka Borovec 9d35c61f5f
ci: adding missing requirements for generating legacy ckpt (#19353) 2024-01-28 11:22:07 +01:00
awaelchli 1a59097ab2
Drop support for PyTorch 1.12 (#19300)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2024-01-26 11:44:24 -05:00
Jirka Borovec 3bd133b107
CI: enable testing with coming PT 2.2 (#19289)
* ci: build dockers for PT 2.2
* py3.12
* --pre --extra-index-url
* typing-extensions
* bump jsonargparse
* install latest jsonargparse
* Add windows skips for Fabric
* convert to xfail
* add pytorch skips
* skip checkpoint consolidation test
* set max torch

---------

Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-01-26 16:42:09 +01:00
thomas chaton ee9f17eb3c
Downloader: Resolve race condition (#19348) 2024-01-25 15:36:42 +00:00
thomas chaton c10fd22c74
BC: Switch map operator arguments order (#19345)
update

Co-authored-by: thomas <thomas@thomass-MacBook-Pro.local>
2024-01-25 09:37:28 +00:00
thomas chaton 012f68dcfd
StreamingDataloader: Add profiling support (#19338) 2024-01-24 20:30:55 +00:00
thomas chaton 925357d2e9
Streaming Dataset: tiny optimisations (#19342) 2024-01-24 19:43:33 +00:00
thomas chaton 0a75d3b7e6
tiny improvement (#19341) 2024-01-24 17:58:30 +00:00
Andy☼ McSherry☼ 577bd85654
Allow any AWS authentication method in studios (#19336) 2024-01-24 16:20:53 +00:00
awaelchli 71bfdc3c60
Remove `__len__` from CombinedStreamingDataset (#19321) 2024-01-24 11:07:32 -05:00
Carlos Mocholí b446b08be5
Fallback to `ACCELERATOR_TYPE` for TPU flops (#19314) 2024-01-24 16:21:56 +01:00
awaelchli 7cc79fe7ba
Reapply `torch.compile` in Fabric.setup() (#19280) 2024-01-23 21:17:41 -05:00
awaelchli 1faddcb24c
Update Lightning AI multi-node guide (#19324)
Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
2024-01-23 18:23:49 -05:00
Laurits Fredsgaard Larsen 3044e83d11
`_restricted_classmethod`: add wrapper, to allow inspection (#19332) 2024-01-23 18:23:06 -05:00
awaelchli b1127e3608
Utility to consolidate sharded checkpoints (#19213)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2024-01-23 17:15:22 -05:00
thomas chaton ed367ca675
StreamingDataLoader: Resolve fault tolerance with the CombinedStreamingDataset and multiple workers (#19326) 2024-01-23 17:54:10 +00:00
awaelchli e1a6dd9edb
Clarify return type in `training_step` docs in case of manual optimization (#19327)
Co-authored-by: Julien Hauret <53187038+jhauret@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2024-01-23 10:47:58 -05:00
Aditya Singh 40197edc66
Fix the PyTorchProfiler description (#19334) 2024-01-23 16:40:51 +01:00
thomas chaton d08e6cd916
Add walk operator (#19333) 2024-01-23 14:21:08 +00:00
shenmishajing d02009af76
Fix saving relative symlink for ModelCheckpoint callback (#19303)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2024-01-20 09:32:08 -05:00
Victor Prins e89f46a74e
Add `@override` for files in `src/lightning/pytorch/utilities` (#19315) 2024-01-19 21:45:44 +01:00
thomas chaton 75510dd9f8
StreamingDataset: Add intra node shuffling to accelerate second epoch (#19296) 2024-01-19 17:08:32 +00:00
Victor Prins 4004f856ca
Add `@override` for files in `src/lightning/pytorch/overrides` (#19316) 2024-01-19 17:35:55 +01:00
thomas chaton 97d71aba0b
Data Processor: Resolve several bugs found while publishing a Studio (#19309) 2024-01-18 20:46:06 +00:00
awaelchli 93c1ab0653
Dedicated docs page for distributed checkpoints (Trainer) (#19299) 2024-01-17 12:20:12 +01:00
thomas chaton 6655c4d752
Bump Lightning Cloud to 0.5.59 (#19301)
update

Co-authored-by: thomas <thomas@thomass-MacBook-Pro.local>
2024-01-17 09:39:26 +00:00
awaelchli 6dfaebabe5
Avoid deprecated `load_state_dict` for distributed checkpoints in PyTorch 2.2+ (#19298) 2024-01-16 21:09:20 -05:00
Victor Prins dbcfcb9780
Add `@override` for files in `src/lightning/fabric/utilities` (#19293)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-01-16 14:44:37 +01:00
awaelchli a4ecf8d5c8
Dedicated docs page for distributed checkpoints (#19287) 2024-01-16 08:44:10 -05:00
thomas chaton 052c0d5b04
Bump Lightning Cloud 0.5.58 (#19295) 2024-01-16 13:43:18 +00:00
thomas chaton 19d9eabbc5
Enable map over inputs without files input (#19285) 2024-01-16 12:19:01 +00:00
Victor Prins 4996965d11
Add `@override` for `src/lightning/fabric/wrappers.py` (#19292) 2024-01-16 06:30:56 -05:00
awaelchli 628ee0cb61
Handle queue errors in streaming dataset reader (#19167) 2024-01-15 12:04:29 -05:00
Michael Bommarito 661c181c34
Fix typo in kwarg in `SpikeDetection` (#19282) 2024-01-15 16:42:53 +01:00
awaelchli 23c3454edc
Assert job id when requeuing SLURM job (#19283) 2024-01-15 16:25:50 +01:00
awaelchli 75e112f138
Support gradient clipping by value in Fabric FSDP (#19236) 2024-01-11 17:28:30 +01:00
awaelchli 41503fc2ba
Install `bitsandbytes` to run skipped bitsandbytes tests (#19260) 2024-01-11 02:50:43 -05:00
awaelchli 6bc27d54a0
Request `torch.cuda` RNG states only if CUDA is available (#19234) 2024-01-10 16:16:29 -05:00
dependabot[bot] 1a1b989457
Bump follow-redirects from 1.15.3 to 1.15.4 in /src/lightning/app/cli/react-ui-template/ui (#19255)
Bump follow-redirects in /src/lightning/app/cli/react-ui-template/ui

Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.15.3 to 1.15.4.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.3...v1.15.4)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-10 21:04:07 +01:00
Jerry Mannil bb14a979e5
Call `prepare_data()` after `setup_environment()` for XLA (#19181)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2024-01-10 20:05:48 +01:00