Lightning Dataset (including optimized dataloading of s3 buckets) (#17743)
* Lightning DataLoader
* lightning dataloader
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* init
* example
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* env var
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update src/lightning/pytorch/utilities/data/__init__.py
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
* remove unused functions
* extra reqs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update src/lightning/pytorch/utilities/data/fileio.py
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* imports work now! yay
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* tests
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* imports
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* missing import
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* error handling
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update creds for local use case
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* codeowners
* recursive get index
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* index
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* clean up get index
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update imagenet example
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* docstrings
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* docstrings
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* docstrings
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* example cleanup
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* changelog
* reqs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* codeowners
* requirements
* expose LightningDataset too
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* expost LightningDataset at top level
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* remove unused private methods from init
* remove private imports
* upper bound on extra requirements
* review comments
* loosen req
* deps
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* test updating fabric base req
* remove version pin on s3fs to test
* recover missing function
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* tests
* update
* random
* torchdata >= 0.3.0
* update torchdata version
* remove torchdata version to test
* try rem torch version pin
* req
* update bucket in test
* req
* skips
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* import
* update structure to lightning.data
* base.txt for data reqs
* fix imports
* rename to LightningS3Dataset
* new workflow
* dont need to test warnings
* reqs
* req
* revert data folder in pytorch
* test import
* tests
* req
* req
* req
* torch version
* req
* req
* open dep
* reformatted
* pin strict
* pin strict extra
* req
* modify workflow, no cache
* try
* patch
* import
* fix
* dataset test
* update getattr
* pin everything to test
* remove torch preinstall from workflow
* workflow
* req
* Update .github/workflows/ci-tests-data.yml
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
* workflow
* workflow
* req
* Update .github/workflows/ci-tests-data.yml
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
* workflow
* print
* skip test for now
* update path join
* revert app dep version bump
* Update .github/workflows/ci-tests-data.yml
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
* workflow updates
* app base req
* req
* window test failure
* add data req to assistant
* try
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add missing comma
* updates
* update
* typo
* requirements
* try widening req
* older torch version
* update
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update
* update
* update
* update
* cleanup tests
* typo again
* update
* remove unnecessary line
* Update .github/CODEOWNERS
* Discard changes to requirements/pytorch/base.txt
* Discard changes to requirements/fabric/base.txt
* Discard changes to requirements/app/base.txt
* requirements
* requirements
* one line
* app workflow pick only app reqs
* rename package
* undo
* don't use cache
* examples CI
* pytorch and fabric CI
* try remove cache
* Apply suggestions from code review
* jirka playing
* jirka playing
* jirka playing
* blah
* flatten LightningDataset
* cleans up dataset class
* jirka playing
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* jirka playing
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* extra
* fix dataset test
* update checkgroups
* Luca's review comments
* val error fix
* unskip test
* min
* fix precommit warning
* cpu
* docstrings
* req
* 2.0.1
* add return type
* typing errors
* req
* return types with quotations
* import for type-checking
* no botocore in cloudagnostic code
* exit args
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* backends typing
* remove oldest from data tests
* typing
* typing
* typing
* types
* type
* typing
* typing
* typing
* import fix
* Changelog
---------
Co-authored-by: Noha Alon <nohaalon@Nohas-MacBook-Air.local>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Justus Schock <justus.schock@posteo.de>
2023-06-13 10:44:41 +00:00
import os
from unittest import mock

import pytest
ruff: replace isort with ruff +TPU (#17684)
* ruff: replace isort with ruff
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fixing & imports
* lines in warning test
* docs
* fix enum import
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fixing
* import
* fix lines
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* type ClusterEnvironment
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-09-26 15:54:55 +00:00
from lightning.data.fileio import OpenCloudFileObj, is_path, is_url, open_single_file, path_to_url


@pytest.mark.parametrize(
    ("input_str", "expected"),
    [
        ("s3://my_bucket/a", True),
        ("s3:/my_bucket", False),
        ("my_bucket", False),
        ("my_bucket_s3://", False),
    ],
)
def test_is_url(input_str, expected):
    assert is_url(input_str) == expected


@pytest.mark.parametrize(
    ("input_str", "expected"),
    [
        ("s3://my_bucket/a", False),
        ("s3:/my_bucket", False),
        ("my_bucket", False),
        ("my_bucket_s3://", False),
        ("/my_bucket", True),
    ],
)
def test_is_path(input_str, expected):
    assert is_path(input_str) == expected


@pytest.mark.parametrize(
    ("path", "bucket_name", "bucket_root_path", "expected"),
    [
        ("/data/abc/def", "my_bucket", "/data/abc", "s3://my_bucket/def"),
        ("/data/abc/def", "my_bucket", "/data", "s3://my_bucket/abc/def"),
    ],
)
def test_path_to_url(path, bucket_name, bucket_root_path, expected):
    assert path_to_url(path, bucket_name, bucket_root_path) == expected


def test_path_to_url_error():
    with pytest.raises(ValueError, match="Cannot create a path from /path1/abc relative to /path2"):
        path_to_url("/path1/abc", "foo", "/path2")


@pytest.mark.parametrize("path", ["s3://my_bucket/da.txt", "abc.txt"])
@mock.patch("s3fs.S3FileSystem", autospec=True)
2023-09-22 09:08:28 +00:00
def test_read_single_file_read(patch: mock.Mock, path, tmp_path):
    from torchdata.datapipes.utils import StreamWrapper

    is_s3 = is_url(path)

    if not is_s3:
        path = os.path.join(tmp_path, path)
|
|
|
        with open(path, "w") as f:
            f.write("mytestfile")

    file_stream = open_single_file(path)
    assert isinstance(file_stream, StreamWrapper)

    content = file_stream.read()

    if is_s3:
        assert isinstance(file_stream.file_obj, mock.Mock)
        # NOTE: `assert_called_once` is referenced but never called here,
        # so this assertion is always true and does not verify the call.
        assert patch.open.assert_called_once
    else:
        assert content == "mytestfile"


@pytest.mark.parametrize("path", ["s3://my_bucket/da.txt", "abc.txt"])
@mock.patch("s3fs.S3FileSystem", autospec=True)
def test_read_single_file_write(patch: mock.Mock, path, tmp_path):
    from torchdata.datapipes.utils import StreamWrapper

    is_s3 = is_url(path)

    if not is_s3:
        path = os.path.join(tmp_path, path)

    file_stream = open_single_file(path, mode="w")
    assert isinstance(file_stream, StreamWrapper)
    file_stream.write("mytestfile")
    file_stream.close()

    if is_s3:
        assert isinstance(file_stream.file_obj, mock.Mock)
        # NOTE: `assert_called_once` is referenced but never called here,
        # so this assertion is always true and does not verify the call.
        assert patch.open.assert_called_once
    else:
        with open(path) as f:
            assert f.read() == "mytestfile"


def test_open_cloud_file_obj(tmp_path):
    path = os.path.join(tmp_path, "foo.txt")
    with open(path, "w") as f:
        f.write("bar!")

    f = OpenCloudFileObj(path)

    with f:
        assert f.read() == "bar!"
    assert f._stream.closed

    f = OpenCloudFileObj(path)
    assert f.read() == "bar!"
    f.close()
    assert f._stream.closed

    with OpenCloudFileObj(path, "w") as f:
        f.write("not bar anymore!")

    with open(path) as f:
        assert f.read() == "not bar anymore!"