lightning/tests/plugins
shuyingsunshine21 299f2c481b
FSDP with full state dict (#7487)
* Fix some test errors

* checkpoint consolidation

* Update ddp_spawn.py

* Update test_metric_result_integration.py

* Update test_results.py

* Update utils.py

* Update utils.py

* Update test_all_gather_grad.py

* Update test_all_gather_grad.py

* Update test_results.py

* Revert "Update test_results.py"

This reverts commit 9d4a2b891d.

* Revert "Merge pull request #1 from shuyingsunshine21/shuyingsunshine21-checkpoint_consolidate"

This reverts commit c5053da789, reversing
changes made to 0d23d75bc9.

* Revert "Update test_all_gather_grad.py"

This reverts commit 0d23d75bc9.

* Revert "Update utils.py"

This reverts commit 70fe5da9c6.

* Revert "Update utils.py"

This reverts commit a9aae99f6e.

* Revert "Update test_results.py"

This reverts commit ea74906878.

* Revert "Update test_metric_result_integration.py"

This reverts commit bf70e431b3.

* Revert "Update ddp_spawn.py"

This reverts commit f17210183b.

* Revert "checkpoint consolidation"

This reverts commit 536c1323b0.

* Revert "Revert "checkpoint consolidation""

This reverts commit 3a9fde915a.

* Revert "Revert "Revert "checkpoint consolidation"""

This reverts commit 7a369f47e1.

* Revert "Revert "Update ddp_spawn.py""

This reverts commit 8222dc98ea.

* Revert "Revert "Update test_metric_result_integration.py""

This reverts commit 6c095b2370.

* Revert "Revert "Update test_results.py""

This reverts commit 250d0aaaa2.

* Revert "Revert "Update utils.py""

This reverts commit 8651d54d79.

* Revert "Revert "Update test_all_gather_grad.py""

This reverts commit dcdcd29731.

* modify distributed environment to make test pass

* fix version for ddp plugin test

* fix

* fix

* changelog

* Update CHANGELOG.md

* fsdp with full state dict

* fix missing import

* modify unit test

* fix

* fix

* fix typo

* modify test and add changelog

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* limit max_epoch to 1 for testing

* test

* fix

* update

* testing remove special for multi gpu

* assert gpu

* add assertion for gpu

* fix

* Re-enable special test, use ModelCheckpoint

* Fix paths

* Fix path passing

* test

* test

* fix test

* fix

* pre-commit format

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: SeanNaren <sean@grid.ai>
2021-05-24 08:11:45 +01:00
environments Add kubeflow cluster environment (#7300) 2021-05-17 09:05:24 +01:00
__init__.py
test_amp_plugins.py [bugfix] Apex never instantiated. (#7274) 2021-04-30 13:16:28 -04:00
test_cluster_integration.py Set `num_nodes` and `sync_batchnorm` From Trainer for Manually Passed Training Type Plugin (#7026) 2021-05-08 11:25:51 +00:00
test_custom_plugin.py Add typings for evaluation_loop.py and remove some dead code (#7015) 2021-04-15 07:36:04 +00:00
test_ddp_fully_sharded_with_full_state_dict.py FSDP with full state dict (#7487) 2021-05-24 08:11:45 +01:00
test_ddp_plugin.py refactor accelerator teardown -> training type plugin teardown (#7579) 2021-05-22 13:19:24 -07:00
test_ddp_plugin_with_comm_hook.py `TrainerState` refactor [5/5] (#7173) 2021-05-04 12:50:56 +02:00
test_ddp_spawn_plugin.py refactor accelerator teardown -> training type plugin teardown (#7579) 2021-05-22 13:19:24 -07:00
test_deepspeed_plugin.py refactor accelerator teardown -> training type plugin teardown (#7579) 2021-05-22 13:19:24 -07:00
test_double_plugin.py [bugfix] Add set_default_tensor_type to torch.DoubleTensor with precision=64 (#7108) 2021-04-20 15:25:37 +00:00
test_plugins_registry.py Add ddp_find_unused_parameters_false to Registry (#7224) 2021-05-04 22:40:00 +00:00
test_rpc_plugin.py Clean up environment access in plugins (#6941) 2021-04-13 20:07:40 +02:00
test_rpc_sequential_plugin.py Remove legacy support for the magic `log`/`progress_bar` keys in dict returns (#6734) 2021-03-31 00:28:04 +02:00
test_sharded_plugin.py Fix ShardedDataParallel has no attribute require_backward_grad_sync (#6915) 2021-04-10 16:14:37 +00:00
test_single_device_plugin.py refactor accelerator teardown -> training type plugin teardown (#7579) 2021-05-22 13:19:24 -07:00
test_tpu_spawn.py refactor accelerator teardown -> training type plugin teardown (#7579) 2021-05-22 13:19:24 -07:00