30 Commits (0dadb4a6e9e245c30653db2a48752423df98fa44)

Author SHA1 Message Date
Ross Wightman 87939e6fab Refactor device handling in scripts, distributed init to be less 'cuda' centric. More device args passed through where needed.
2 years ago
Ross Wightman ff6a919cf5 Add --fast-norm arg to benchmark.py, train.py, validate.py
2 years ago
Ross Wightman 0dbd9352ce Add bulk_runner script and updates to benchmark.py and validate.py for better error handling in bulk runs (used for benchmark and validation result runs). Improved batch size decay stepping on retry...
2 years ago
Ross Wightman 4670d375c6 Reorg benchmark.py import
2 years ago
Ross Wightman 28e0152043 Add --no-retry flag to benchmark.py to skip batch_size decay and retry on error. Fix #1226. Update deepspeed profile usage for latest DS releases. Fix #1333
2 years ago
Ross Wightman 34f382f8f6 move dataconfig before script, scripting killing metadata now (PyTorch 1.12? just nvfuser?)
2 years ago
Ross Wightman 2d7ab06503 Move aot-autograd opt after model metadata used to setup data config in benchmark.py
2 years ago
Xiao Wang ca991c1fa5 add --aot-autograd
2 years ago
Ross Wightman 372ad5fa0d Significant model refactor and additions:
3 years ago
Ross Wightman 95cfc9b3e8 Merge remote-tracking branch 'origin/master' into norm_norm_norm
3 years ago
Ross Wightman cf4334391e Update benchmark and validate scripts to output results in JSON with a fixed delimiter for use in multi-process launcher
3 years ago
kozistr 56a6b38f76 refactor: remove if-condition
3 years ago
Ross Wightman f0f9eccda8 Add --fuser arg to train/validate/benchmark scripts to select jit fuser type
3 years ago
Ross Wightman 683fba7686 Add drop args to benchmark.py
3 years ago
Ross Wightman aaff2d82d0 Add new 50ts attn models to benchmark/meta csv files
3 years ago
Ross Wightman 1e17863b7b Fixed botne*t26 model results, add some 50ts self-attn variants
3 years ago
Ross Wightman 71f00bfe9e Don't run profile if model is torchscripted
3 years ago
Ross Wightman 5882e62ada Add activation count to fvcore based profiling in benchmark.py
3 years ago
Ross Wightman f7325c7b71 Support either deepspeed or fvcore for flop profiling
3 years ago
Ross Wightman 66253790d4 Add `--bench profile` mode for benchmark.py to just run deepspeed detailed profile on model
3 years ago
Ross Wightman 13a8bf7972 Add train size override and deepspeed GMACs counter (if deepspeed installed) to benchmark.py
3 years ago
Ross Wightman ac469b50da Optimizer improvements, additions, cleanup
3 years ago
Ross Wightman 137a374930 Merge pull request #555 from MichaelMonashev/patch-1
4 years ago
Ross Wightman e15e68d881 Fix #566, summary.csv writing to pwd on local_rank != 0. Tweak benchmark mem handling to see if it reduces likelihood of 'bad' exceptions on OOM.
4 years ago
Michael Monashev 0be1fa4793 Argument description fixed
4 years ago
Ross Wightman 37c71a5609 Some further create_optimizer_v2 tweaks, remove some redundant code, add back safe model str. Benchmark step times per batch.
4 years ago
Ross Wightman 288682796f Update benchmark script to add precision arg. Fix some downstream (DeiT) compat issues with latest changes. Bump version to 0.4.7
4 years ago
Ross Wightman 4445eaa470 Add img_size to benchmark output
4 years ago
Ross Wightman 0706d05d52 Benchmark models listed in txt file. Add more hybrid vit variants for testing
4 years ago
Ross Wightman 0e16d4e9fb Add benchmark.py script, and update optimizer factory to be more friendly to use outside of argparse interface.
4 years ago