Commit Graph

36 Commits (main)

Author SHA1 Message Date
Fredo Guan 81ca323751 Davit update formatting and fix grad checkpointing (#7)
1 year ago
Ross Wightman d5e7d6b27e Merge remote-tracking branch 'origin/main' into refactor-imports
1 year ago
Ross Wightman cda39b35bd Add a deprecation phase to module re-org
1 year ago
Ross Wightman 98047ef5e3 Add EVA FT results, hopefully fix BEiT test failures
1 year ago
Ross Wightman dbe7531aa3 Update scripts to support torch.compile(). Make --results_file arg more consistent across benchmark/validate/inference. Fix #1570
1 year ago
Ross Wightman 9da7e3a799 Add crop_mode for pretraind config / image transforms. Add support for dynamo compilation to benchmark/train/validate
1 year ago
Ross Wightman 87939e6fab Refactor device handling in scripts, distributed init to be less 'cuda' centric. More device args passed through where needed.
2 years ago
Ross Wightman ff6a919cf5 Add --fast-norm arg to benchmark.py, train.py, validate.py
2 years ago
Ross Wightman 0dbd9352ce Add bulk_runner script and updates to benchmark.py and validate.py for better error handling in bulk runs (used for benchmark and validation result runs). Improved batch size decay stepping on retry...
2 years ago
Ross Wightman 4670d375c6 Reorg benchmark.py import
2 years ago
Ross Wightman 28e0152043 Add --no-retry flag to benchmark.py to skip batch_size decay and retry on error. Fix #1226. Update deepspeed profile usage for latest DS releases. Fix # 1333
2 years ago
Ross Wightman 34f382f8f6 move dataconfig before script, scripting killing metadata now (PyTorch 1.12? just nvfuser?)
2 years ago
Ross Wightman 2d7ab06503 Move aot-autograd opt after model metadata used to setup data config in benchmark.py
2 years ago
Xiao Wang ca991c1fa5 add --aot-autograd
2 years ago
Ross Wightman 372ad5fa0d Significant model refactor and additions:
2 years ago
Ross Wightman 95cfc9b3e8 Merge remote-tracking branch 'origin/master' into norm_norm_norm
2 years ago
Ross Wightman cf4334391e Update benchmark and validate scripts to output results in JSON with a fixed delimiter for use in multi-process launcher
2 years ago
kozistr 56a6b38f76 refactor: remove if-condition
2 years ago
Ross Wightman f0f9eccda8 Add --fuser arg to train/validate/benchmark scripts to select jit fuser type
2 years ago
Ross Wightman 683fba7686 Add drop args to benchmark.py
2 years ago
Ross Wightman aaff2d82d0 Add new 50ts attn models to benchmark/meta csv files
3 years ago
Ross Wightman 1e17863b7b Fixed botne*t26 model results, add some 50ts self-attn variants
3 years ago
Ross Wightman 71f00bfe9e Don't run profile if model is torchscripted
3 years ago
Ross Wightman 5882e62ada Add activation count to fvcore based profiling in benchmark.py
3 years ago
Ross Wightman f7325c7b71 Support either deepspeed or fvcore for flop profiling
3 years ago
Ross Wightman 66253790d4 Add `--bench profile` mode for benchmark.py to just run deepspeed detailed profile on model
3 years ago
Ross Wightman 13a8bf7972 Add train size override and deepspeed GMACs counter (if deepspeed installed) to benchmark.py
3 years ago
Ross Wightman ac469b50da Optimizer improvements, additions, cleanup
3 years ago
Ross Wightman 137a374930 Merge pull request #555 from MichaelMonashev/patch-1
3 years ago
Ross Wightman e15e68d881 Fix #566, summary.csv writing to pwd on local_rank != 0. Tweak benchmark mem handling to see if it reduces likelihood of 'bad' exceptions on OOM.
3 years ago
Michael Monashev 0be1fa4793 Argument description fixed
3 years ago
Ross Wightman 37c71a5609 Some further create_optimizer_v2 tweaks, remove some redudnant code, add back safe model str. Benchmark step times per batch.
3 years ago
Ross Wightman 288682796f Update benchmark script to add precision arg. Fix some downstream (DeiT) compat issues with latest changes. Bump version to 0.4.7
3 years ago
Ross Wightman 4445eaa470 Add img_size to benchmark output
3 years ago
Ross Wightman 0706d05d52 Benchmark models listed in txt file. Add more hybrid vit variants for testing
3 years ago
Ross Wightman 0e16d4e9fb Add benchmark.py script, and update optimizer factory to be more friendly to use outside of argparse interface.
3 years ago