tarepan
|
bb366232e7
|
Add non-existing resume_from_checkpoint acceptance for auto-resubmit (#4402)
* Add empty resume_from_checkpoint acceptance #4366
* Fix general error catch with focused file check
* Add fsspec HTTP extras
Add fsspec's HTTPFileSystem support through the http extras.
PL already supports remote HTTP files (e.g. #2925),
so this commit does not add new functionality.
* Fix potentially excessive logging in DDP
* Add PR changelog
* Add well-written argument explanation
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Fix DDP-compatible restore logging
Log where the states are restored from.
This feature was temporarily removed as a result of PR review.
After further review, it was re-added with DDP compatibility.
* Fix utility import paths
* Refactor load step comments
* Refactor hpc ckpt suffix acquisition
* Refactor restore/hpc_load match
* Refactor hpc load trial
* Refactor checkpoint dir check
* Refactor unneeded function nest
* Refactor nested If
* Refactor duplicated cache clear
* Refactor attempt flow with if/elif
* Fix pep8
* Refactor hook commentary
Co-authored-by: chaton <thomas@grid.ai>
* Fix pep8
* Refactor hpc load checkpoint path acquisition
* Fix pep8
* Fix typo
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Fix typo
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Fix doc
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Refactor None Union type with Optional
* Fix build-doc CI failure debugged in #5329
* Fix fsspec import during build-doc #5329
* Fix test epoch
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Fix test with latest test models
* .
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
(cherry picked from commit b0051e8c03)
|
2021-01-06 12:55:38 +01:00 |