25 Commits

Author SHA1 Message Date
thomas chaton e43820a4be
migrate Data subpackage (#19523)
* update

* update

* update

* update

* Update checkgroup.yml

* More

* Add note

* Labeller should be kept as long as we have the stubs

* update

* update

* update

* Apply suggestions from code review

* init

* ci fix

* pin version range

* https://www.neptune.ai/

---------

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2024-02-26 08:25:00 -05:00
thomas chaton b097a4df3f
Improve data processing to enable downloading LAION 400M (#19452) 2024-02-13 13:23:39 +00:00
awaelchli f2f9978377
Update mypy in CI (#19449) 2024-02-13 12:09:35 +01:00
thomas chaton 7dfc279b3f
Add support for parallelizing processing parquet files across workers and nodes. (#19400) 2024-02-05 23:21:25 +00:00
thomas chaton 012f68dcfd
StreamingDataloader: Add profiling support (#19338) 2024-01-24 20:30:55 +00:00
thomas chaton 75510dd9f8
StreamingDataset: Add intra node shuffling to accelerate second epoch (#19296) 2024-01-19 17:08:32 +00:00
Adrian Wälchli 710cac4ce9
Make fsspec requirement the same across subpackages (#19085) 2023-11-29 10:36:28 -05:00
Adrian Wälchli b8a96fe5d6
Fix fsspec local file protocol checks for new fsspec version (#19023)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-11-18 08:51:37 +01:00
Carlos Mocholí 4e72dcc8db
Reduce lightning data's dependencies (#19026) 2023-11-17 16:52:14 -05:00
thomas chaton 792cb73fc6
Remove the LightningDataset relying on un-maintained torchdata (#19019)
Co-authored-by: thomas <thomas@thomass-MacBook-Pro.local>
2023-11-16 16:08:15 -05:00
dependabot[bot] 73f5df0a0a
Bump torch from 2.0.1 to 2.1.0 in /requirements (#18752)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-10-24 14:11:54 +02:00
thomas chaton 1d5851ffe2
Introduce Cache 1/n (#18642)
Co-authored-by: Ethan Harris <ethanwharris@gmail.com>
Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
Co-authored-by: thomas <thomas@thomass-MacBook-Pro.local>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-10-09 16:06:32 +01:00
dependabot[bot] 343f80436d
Bump coverage from 7.3.0 to 7.3.1 in /requirements (#18685)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-10-03 22:51:43 +02:00
dependabot[bot] 74b2ff8196
Update fsspec[http] requirement from <2023.7.0,>2021.06.0 to >2021.06.0,<2023.10.0 in /requirements (#18469)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-10-03 20:52:20 +02:00
Jirka Borovec dbe7ed46a3
replace tests skip with soft xfail (#18486)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-09-12 23:11:03 +02:00
Jirka Borovec 7834bb6377
relax some dependencies from `<=` to `<` (#18435)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2023-08-31 16:47:06 +02:00
dependabot[bot] 2fb90e6a66
Bump coverage from 7.2.7 to 7.3.0 in /requirements (#18300)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-16 17:19:35 +02:00
dependabot[bot] 4da8078c99
Bump pytest-rerunfailures from 10.3 to 12.0 in /requirements (#18302) 2023-08-14 14:36:02 +02:00
dependabot[bot] cee05ea029
Update fsspec[http] requirement from <2023.5.0,>2021.06.0 to >2021.06.0,<2023.7.0 in /requirements (#18248)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-08 11:50:35 +02:00
dependabot[bot] 85d5b14618
Bump pytest-cov from 4.0.0 to 4.1.0 in /requirements (#18036)
Bumps [pytest-cov](https://github.com/pytest-dev/pytest-cov) from 4.0.0 to 4.1.0.
- [Changelog](https://github.com/pytest-dev/pytest-cov/blob/master/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest-cov/compare/v4.0.0...v4.1.0)

---
updated-dependencies:
- dependency-name: pytest-cov
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-01 23:17:19 +00:00
dependabot[bot] 324d90aca7
Bump pytest from 7.3.1 to 7.4.0 in /requirements (#18038)
Bumps [pytest](https://github.com/pytest-dev/pytest) from 7.3.1 to 7.4.0.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/7.3.1...7.4.0)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-26 22:40:36 +00:00
Jirka Borovec 37c244f94b
bump Lit-Utils to 0.9 (#17955) 2023-07-03 17:49:00 +00:00
dependabot[bot] 93c5f999a7
Bump coverage from 7.2.5 to 7.2.7 in /requirements (#17920)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-01 15:36:18 -04:00
dependabot[bot] c8578c7eec
Update s3fs requirement from <2022.11.1,>=2022.5.0 to >=2022.5.0,<2023.6.1 in /requirements (#17861)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-19 14:29:20 +02:00
Noha Alon ca30fd7752
Lightning Dataset (including optimized dataloading of s3 buckets) (#17743)
* Lightning DataLoader

* lightning dataloader

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* init

* example

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* env var

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update src/lightning/pytorch/utilities/data/__init__.py

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* remove unused functions

* extra reqs

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update src/lightning/pytorch/utilities/data/fileio.py

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* imports work now! yay

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* imports

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* missing import

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* error handling

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update creds for local use case

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeowners

* recursive get index

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* index

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* clean up get index

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update imagenet example

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* docstrings

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* docstrings

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* docstrings

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* example cleanup

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* changelog

* reqs

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeowners

* requirements

* expose LightningDataset too

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* expose LightningDataset at top level

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused private methods from init

* remove private imports

* upper bound on extra requirements

* review comments

* loosen req

* deps

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* test updating fabric base req

* remove version pin on s3fs to test

* recover missing function

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* tests

* update

* random

* torchdata >= 0.3.0

* update torchdata version

* remove torchdata version to test

* try rem torch version pin

* req

* update bucket in test

* req

* skips

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* import

* update structure to lightning.data

* base.txt for data reqs

* fix imports

* rename to LightningS3Dataset

* new workflow

* don't need to test warnings

* reqs

* req

* revert data folder in pytorch

* test import

* tests

* req

* req

* req

* torch version

* req

* req

* open dep

* reformatted

* pin strict

* pin strict extra

* req

* modify workflow, no cache

* try

* patch

* import

* fix

* dataset test

* update getattr

* pin everything to test

* remove torch preinstall from workflow

* workflow

* req

* Update .github/workflows/ci-tests-data.yml

Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>

* workflow

* workflow

* req

* Update .github/workflows/ci-tests-data.yml

Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>

* workflow

* print

* skip test for now

* update path join

* revert app dep version bump

* Update .github/workflows/ci-tests-data.yml

Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>

* workflow updates

* app base req

* req

* window test failure

* add data req to assistant

* try

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing comma

* updates

* update

* typo

* requirements

* try widening req

* older torch version

* update

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

* update

* update

* update

* cleanup tests

* typo again

* update

* remove unnecessary line

* Update .github/CODEOWNERS

* Discard changes to requirements/pytorch/base.txt

* Discard changes to requirements/fabric/base.txt

* Discard changes to requirements/app/base.txt

* requirements

* requirements

* one line

* app workflow pick only app reqs

* rename package

* undo

* don't use cache

* examples CI

* pytorch and fabric CI

* try remove cache

* Apply suggestions from code review

* jirka playing

* jirka playing

* jirka playing

* blah

* flatten LightningDataset

* cleans up dataset class

* jirka playing

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* jirka playing

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* extra

* fix dataset test

* update checkgroups

* Luca's review comments

* val error fix

* unskip test

* min

* fix precommit warning

* cpu

* docstrings

* req

* 2.0.1

* add return type

* typing errors

* req

* return types with quotations

* import for type-checking

* no botocore in cloudagnostic code

* exit args

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* backends typing

* remove oldest from data tests

* typing

* typing

* typing

* types

* type

* typing

* typing

* typing

* import fix

* Changelog

---------

Co-authored-by: Noha Alon <nohaalon@Nohas-MacBook-Air.local>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Justus Schock <justus.schock@posteo.de>
2023-06-13 11:44:41 +01:00