* - Added CLI unit tests for help, print_config and submodules.
- Documented the use of subclass help, print_config and submodules in the CLI docs, along with other minor improvements.
- Increased the minimum jsonargparse version required by the newly documented features.
* Improvements to lightning_cli.rst
* Add check for all trainer parameters in test_lightning_cli_help
* Increased minimum jsonargparse version
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Try updating CI to latest fairscale
* Update availability of imports.py
* Remove some of the fairscale custom ci stuff
* Update grad scaler within the new process as reference is incorrect for spawn
* Remove fairscale from mocks
* Install fairscale 0.3.4 into the base container, remove from extra.txt
* Update docs/source/conf.py
* Fix import issues
* Mock fairscale for docs
* Fix DeepSpeed and FairScale to specific versions
* Swap back to greater than
* extras
* Revert "extras"
This reverts commit 7353479f
* ci
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: jirka <jirka.borovec@seznam.cz>
* Add single checkpoint capability
* Fix checkpointing in test, few cleanups
* Add comment
* Change restore logic
* Move vars around, add better explanation, make todo align with DeepSpeed team
* Fix checkpointing
* Remove deepspeed from extra, install in Dockerfile
* push
* pull
* Split to two tests to see if it fixes Deepspeed error
* Add comment
* Add context to call hook to handle all modules defined within the hook
* Expose some additional parameters
* Added docs, exposed parameters
* Make sure we only configure if necessary
* Setup activation checkpointing regardless, saves the user having to do it manually
* Add some tests that fail currently
* update
* update
* update
* add tests
* change docstring
* resolve accumulate_grad_batches
* resolve flake8
* Update DeepSpeed to use latest version, add some comments
* add metrics
* update
* Small formatting fixes, clean up some code
* Few cleanups
* No need for default state
* Fix tests, add some boilerplate that should move eventually
* Add hook removal
* Add a context manager to handle hook
* Small naming cleanup
* wip
* move save_checkpoint responsibility to accelerator
* resolve flake8
* add BC
* Change recommended scale to 16
* resolve flake8
* update test
* update install
* update
* update test
* update
* update
* update test
* resolve flake8
* update
* update
* update on comments
* Push
* pull
* Update pytorch_lightning/plugins/training_type/deepspeed.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update pytorch_lightning/plugins/training_type/deepspeed.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* update
* Apply suggestions from code review
* Swap to using world size defined by plugin
* update
* update todo
* Remove deepspeed from extra, keep it in the base cuda docker install
* Push
* pull
* update
* update
* update
* update
* Minor changes
* duplicate
* format
* format2
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
* xfail if not installed
include mkpatch
fix test
* mock comet
comet mocks
fix test
remove dep
undo merge duplication
* line
* line
* convert doctest
* doctest
* docs
* Use .comet.config file or env var for API key.
* Make CometLogger API key changes backwards compatible.
* Fix line too long.
* Add documentation about loading from ~/.comet.config.
* Update required comet_ml version.
* Comet logger: allow offline experiments with config file.
This adds a new logger argument that controls the online / offline mode explicitly. If you give both an API key and a save_dir (e.g. to control where checkpoints go while keeping ~/.comet.config), you can now specify which mode you want.
* Make CometLogger API key changes backwards compatible.
* Comet logger: change online argument to be offline.
For consistency with other loggers.
* chlog
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
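A minimal sketch of the mode selection described in the Comet logger commits above, assuming the `offline` flag only matters when both an API key and a save_dir are supplied. The helper name `resolve_comet_mode`, the fallback rules for the single-argument cases, and the error type are hypothetical illustrations, not the logger's actual code.

```python
from typing import Optional


def resolve_comet_mode(
    api_key: Optional[str],
    save_dir: Optional[str],
    offline: bool = False,
) -> str:
    """Hypothetical helper: pick "online" or "offline" for an experiment."""
    if api_key is not None and save_dir is not None:
        # Both given: the explicit flag decides the mode.
        return "offline" if offline else "online"
    if api_key is not None:
        # Assumption: only an API key implies an online experiment.
        return "online"
    if save_dir is not None:
        # Assumption: only a save_dir implies an offline experiment.
        return "offline"
    raise ValueError("Provide an API key, a save_dir, or both.")
```

With both arguments present, `resolve_comet_mode("key", "logs", offline=True)` selects the offline mode, while leaving `offline` at its default keeps the experiment online.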
* export model to onnx
* prepare data before exporting
* support for dataloaders and tensors
* added tests
* use example_input_array
add to changelog
* updated docstring
* added onnx inference tests
* temp commit
* removed schema valid test
* add onnxruntime to environment.yml
* moved onnxruntime to environment.yml pip
* add example in doc
* add lines between code block
* added PR to changelog
* is file check
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* remove *
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* infer example outputs
* added doctest for onnx
* fix windows tests
* moved eval within condition block
* self.forward to self
* added docs
* fixed docs error
* added to toctree
* Update CHANGELOG.md
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* metrics: added bleu score and test bleu
* metrics: fixed type hints in bleu
* bleu score moved to metrics/functional/nlp.py
* refactor with torch.Tensor
* Update test_sequence.py
* refactor as Borda requests and nltk==3.2
* locked nltk==3.3
* nltk>=3.3, parametrized smooth argument for test
* fix bleu_score example
* added class BLEUScore metrics and test
* added class BLEUScore metrics and test
* update CHANGELOG
* refactor with torchtext
* torchtext changed to optional import
* fix E501 line too long
* add else: in optional import
* remove pragma: no-cover
* constants changed to CAPITALS
* remove class in tests
* List -> Sequence, conda -> pip, cast with tensor
* add torchtext in test.txt
* remove torchtext from test.txt
* bump torchtext to 0.5.0
* bump torchtext to 0.5.0
* Apply suggestions from code review
* ignore bleu score in doctest, renamed to nlp.py
* back to implementation with torch
* remove --ignore in CI test, proper reference format
* apply Justus's comment
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>