lightning

Commit Graph

Author	SHA1	Message	Date
Jirka Borovec	944ffba305	join coverage (#2460 ) * join coverage * full TPU test * codecov * typo * report * docker * timeout * base * show * cd dir * req * docker * docker * docker * coverage * upload * drop main * report * report * python * upload * drone * drone * drone * drone * drone * drone * drone * drone * drone	2020-07-04 10:22:58 -04:00
William Falcon	e5a979990e	Hang (#2488 ) * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test	2020-07-03 15:16:45 -04:00
Jirka Borovec	fc61c200c0	DDp interpreter (#2482 ) * interpreter * chlog	2020-07-03 13:23:30 -04:00
zcain117	6d9c7bf0b0	Add link to TPU Pods tutorial. (#2477 )	2020-07-03 00:57:17 -04:00
William Falcon	020c332ae9	Clean up (#2467 ) * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * Fixes #2455 * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test * added early stop tpu test	2020-07-03 00:38:29 -04:00
Jirka Borovec	e77add3301	fix gpu example (#2466 ) * fix gpu example * make cpu_template and gpu_template differnt Co-authored-by: Adrian Wälchli <adrian.waelchli@inf.unibe.ch>	2020-07-03 00:17:18 -04:00
William Falcon	0697dd306d	Fixes #2455 (#2463 )	2020-07-02 07:18:58 -04:00
William Falcon	afdfba1dc6	removed auto val reduce (#2462 )	2020-07-02 07:04:18 -04:00
zcain117	1a40963d1d	Add Github Action to run TPU tests. (#2376 ) * Add Github Action to run TPU tests. * Trigger new Github Actions run. * Clean up more comments. * Use different fixed version of ml-testing-accelerators and update config to match. * use cluster in us-central1-a * Run 'gcloud logging read' directly without 'echo' to preserve newlines. * cat coverage.xml on the TPU VM side and upload xml on the Github Action side * Use new commit on ml-testing-accelerators so command runs fully. * Preserve newlines in the xml and use if: always() temporarily to upload codecov * Use pytorch_lightning for coverage instead of pytorch-lightning * Remove the debug cat of coverage xml * Apply suggestions from code review * jsonnet rename * name * add codecov flags * add codecov flags * codecov * codecov * revert codecov * Clean up after apt-get and remove old TODOs. * More codefactor cleanups. * drone * drone * disable codecov * cleaning * docker py versions * docker py 3.7 * readme * bash * docker * freeze conda * py3.6 * Stop using apt-get clean. * Dont rm pytorch-lightning * Update docker/tpu/Dockerfile * Longer timeout in the Github Action to wait for GKE to finish. * job1 * job2 * job3 Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka <jirka@pytorchlightning.ai>	2020-07-01 21:44:19 -04:00
Jirka Borovec	dcd6000be7	continue (#2450 )	2020-07-01 08:35:51 -04:00
Jirka Borovec	7f1eab4cad	try adding coverage (#2441 ) * add coverage, test failing * fix test * badges * typo * freeze conda	2020-07-01 08:00:36 -04:00
Jirka Borovec	695e0514f8	cleaning (#2449 )	2020-07-01 07:56:10 -04:00
Adrian Wälchli	927f305f7e	Warn user when IterableDataset has __len__ defined (#2437 ) * add warning when getting checking len * added test * changelog * pep * do not show warning below 1.4 * try version parse * comments * xfail * Update requirements/base.txt Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/trainer/data_loading.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * version Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka <jirka@pytorchlightning.ai>	2020-07-01 07:53:19 -04:00
William Falcon	325852c6df	enabled no returns from eval (#2446 ) * enabled no returns from eval * fixed docs * fixed docs * fixed docs * fixed docs * fixed docs * fixed docs * fixed docs * fixed docs * fixed docs * fixed docs * fixed docs * fixed docs	2020-07-01 07:38:00 -04:00
Llannelongue	fa2233f56f	Corrected typo `python -m pip pre-commit install` (#2447 )	2020-07-01 07:02:02 -04:00
Jirka Borovec	ded8a56bb3	missing changes in chlog (#2430 ) * missing * miss	2020-06-30 22:45:50 -04:00
Jirka Borovec	e268061614	Pure package & base tests (#2418 ) * base tests * pil * wip * wip * wip * ignore * ignore * win * link * win * cpu * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-06-30 19:35:54 -04:00
Adrian Wälchli	145670f893	fix logging on rank 0 only (#2425 ) * fix and test for ddp block logging rank > 0 * rename * use the dummy logger * dummy logger test * set the logger in model * decorator for rank zero experiment * simplify check * simplify * fix problem with None in checkpoint path * revert configure logger * unused import * offline * try rank 0 decorator in checkpoint * try fix test * imgs * add asserts to make sure log zero only saves checkpoints * add asserts to make sure log zero only saves checkpoints * add asserts to make sure log zero only saves checkpoints * add asserts to make sure log zero only saves checkpoints * add asserts to make sure log zero only saves checkpoints * fix tpu tests * fix tpu tests Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-06-30 18:09:16 -04:00
William Falcon	04e68f022f	fix tpu tests	2020-06-30 17:20:35 -04:00
William Falcon	fc26078e39	fix tpu tests	2020-06-30 17:20:18 -04:00
Oliver Neumann	1a54ed6ad9	Checking ipywidgets is installed for ensure tqdm working (#2417 ) * Adding importing ipywidgets before importing tqdm.auto to make sure ipywidgets is installed. * Updated CHANGELOG.md * Updated ipywidgets importing checks to @awaelchli comments. Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-06-30 16:59:35 -04:00
William Falcon	309ed75c5d	added reduce ddp results on eval (#2434 ) * added reduce ddp results on eval * added reduce ddp results on eval * added reduce ddp results on eval * added reduce ddp results on eval * added reduce ddp results on eval * added reduce ddp results on eval * added reduce ddp results on eval * added reduce ddp results on eval * added reduce ddp results on eval * added reduce ddp results on eval * added reduce ddp results on eval	2020-06-30 16:15:35 -04:00
William Falcon	e8bb4165b7	Fix apex scaling with decoupled backward (#2433 ) * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs * fix outputs	2020-06-30 14:51:39 -04:00
Jirka Borovec	d4a02e3bd8	tests: drop CircleCI (#2412 ) * drop CircleCI * add PT testing * fix * cpu * conda * conda * req * base * conda * conda * conda * conda * conda * conda * conda * name * req * info * tests * pt 1.6 * drop 1.6 * info	2020-06-30 10:56:05 -04:00
William Falcon	a42a0e16dd	Fixes train outputs (#2428 ) * fix outputs * fix outputs	2020-06-30 10:03:49 -04:00
Jirka Borovec	a75398530c	continue (#2416 )	2020-06-29 21:00:52 +02:00
Jirka Borovec	dec074c2e7	typo (#2415 )	2020-06-29 07:36:56 -04:00
Jirka Borovec	02d6045cac	release (#2414 )	2020-06-29 07:21:28 -04:00
William Falcon	33b92557f5	Update __init__.py	2020-06-29 06:59:35 -04:00
William Falcon	92d1e75b26	fix batch typo	2020-06-29 06:54:21 -04:00
William Falcon	593837e1da	fix amp wrong call	2020-06-29 06:46:19 -04:00
Jirka Borovec	3ff695510e	missing changes (#2283 ) * missing * RC1 * RC1 * format	2020-06-29 06:34:19 -04:00
William Falcon	58f03f3076	Update README.md	2020-06-28 22:44:58 -04:00
William Falcon	8f07b77fc0	Update __init__.py	2020-06-28 22:08:51 -04:00
Adrian Wälchli	25ee51bc57	Continue Jeremy's early stopping PR #1504 (#2391 ) * add state_dict for early stopping * move best attr after monitor_op defined * improve early stopping and model checkpoint callbacks * fix formatting * fix attr init order * clean up setting of default_root_dir attr * logger needs default root dir set first * reorg trainer init * remove direct references to checkpoint callback * more fixes * more bugfixes * run callbacks at epoch end * update tests to use on epoch end * PR cleanup * address failing tests * refactor for homogeneity * fix merge conflict * separate tests * tests for early stopping bug regressions * small fixes * revert model checkpoint change * typo fix * fix tests * update train loop * cannot pass an int as default_save_path * refactor log message * fix test case * appease the linter * fix some doctests * move config to callback * fixes from rebase * fixes from rebase * chlog * docs * reformat * formatting * fix * fix * fixes from rebase * add new test for patience * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update tests/callbacks/test_early_stopping.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * fix formatting * remove enable_early_stop attribute * add state_dict for early stopping * move best attr after monitor_op defined * improve early stopping and model checkpoint callbacks * fix formatting * fix attr init order * clean up setting of default_root_dir attr * logger needs default root dir set first * reorg trainer init * remove direct references to checkpoint callback * more fixes * more bugfixes * run callbacks at epoch end * update tests to use on epoch end * PR cleanup * address failing tests * refactor for homogeneity * fix merge conflict * separate tests * tests for early stopping bug regressions * small fixes * revert model checkpoint change * typo fix * fix tests * update train loop * fix test case * appease the linter * fix some doctests * move config to callback * fixes from rebase * fixes from rebase * chlog * docs * reformat * formatting * fix * fix * fixes from rebase * add new test for patience * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update tests/callbacks/test_early_stopping.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * fix formatting * remove enable_early_stop attribute * fix test with new epoch indexing * fix progress bar totals * fix off by one error (see #2289) epoch starts at 0 now * added missing imports * fix hpc_save folderpath * fix formatting * fix tests * small fixes from a rebase * fix * tmpdir * tmpdir * tmpdir * wandb * fix merge conflict * add back evaluation after training * test_resume_early_stopping_from_checkpoint TODO * undo the horovod check * update changelog * remove a duplicate test from merge error * try fix dp_resume test * add the logger fix from master * try remove default_root_dir * try mocking numpy * try import numpy in docs test * fix wandb test * pep 8 fix * skip if no amp * dont mock when doctesting * install extra * fix the resume ES test * undo conf.py changes * revert remove comet pickle from test * Update CHANGELOG.md Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update weights_loading.rst * Update weights_loading.rst * Update weights_loading.rst * renamed flag * renamed flag * revert the None check in logger experiment name/version * add the old comments * _experiment * test chckpointing on DDP * skip the ddp test on windows * cloudpickle * renamed flag * renamed flag * parentheses for clarity * apply suggestion max epochs Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jeremy Jordan <jtjordan@ncsu.edu> Co-authored-by: Jirka <jirka@pytorchlightning.ai> Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-06-28 21:36:46 -04:00
Jirka Borovec	1e16681693	fix loading with hparams (#2403 ) * fix #2386 * extra test * extra case * extra test * chlog * fix test	2020-06-28 20:22:03 -04:00
Adrian Wälchli	058c500300	fix when torchtext not installed (#2402 )	2020-06-28 20:03:51 -04:00
Jirka Borovec	861a73be12	fix loading past checpoints (#2405 ) * fix #2334 * chlog	2020-06-28 17:20:33 -04:00
William Falcon	66ffbaddf5	updates teardown to account for ddp (#2389 ) * remove warnings * remove warnings * added doc lines * added doc lines	2020-06-28 07:01:04 -04:00
Adrian Wälchli	d910cc5200	docs: dont mock imports when running sphinx doctest (#2396 ) * skip if no amp * dont mock when doctesting * install extra	2020-06-27 23:31:06 -04:00
Jirka Borovec	75f0a2062c	move torchtext as optional (#2395 ) * torchtext * Update pytorch_lightning/utilities/apply_func.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Update apply_func.py Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-06-27 20:15:10 -04:00
Jirka Borovec	51711c265a	fix loading model with kwargs (#2387 ) * test * fix * fix	2020-06-27 16:38:03 -04:00
Mateusz Pieniak	e82d9cdb66	Support torchtext on a single GPU (#2379 ) * Handle torchtext.data.Batch on GPU * Update CHANGELOG.md * Apply code review requests * Correct the docs * Change requirements	2020-06-27 16:36:45 -04:00
Jirka Borovec	73a78a13c7	CI: partial move from CircleCI (#2378 ) * move from CircleCI * req * tex * tex * sudo * extra * recom * pic * dvipng	2020-06-27 16:25:33 -04:00
William Falcon	90f641af0d	fixes logger crash on ddp (#2388 ) * remove warnings * remove warnings * remove warnings * remove warnings * remove warnings * remove warnings * remove warnings * remove warnings * remove warnings * remove warnings	2020-06-27 15:08:22 -04:00
Jirka Borovec	41f5df18a4	move Trains logger to Bolts (#2384 ) * move Trains logger * chlog	2020-06-27 09:14:05 -04:00
Jirka Borovec	4e13e419ea	add CLI test for examples (#2285 ) * cli examples * ddp * CI * CI * req * tests * skip DDP Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-06-27 09:13:29 -04:00
Jirka Borovec	6673fc9a0b	fix docker builds (#2383 )	2020-06-27 08:49:19 -04:00
Jirka Borovec	2f739f5977	fix key typo (#2374 )	2020-06-26 21:46:08 -04:00
Kshitij09	20d0f53896	Fix ModelCheckpoint example (#2321 ) `save_top_k` should be an `int` and have been mentioned as `save_top_k=True` in the snippet provided under 'Saving and Loading Weights' docs. Changed it to its default value (1) to make it consistent. Signed-off-by: Kshitij Patil <kshitijpatil98@gmail.com>	2020-06-26 21:45:41 -04:00

2639 Commits, All Branches