Commit Graph

14 Commits (5c5cadfe4c2d14a5f35a71ec73082469fbc03729)

Author SHA1 Message Date
Ross Wightman 5c5cadfe4c Update README.md
3 years ago
Ross Wightman ee2b8f49ee Update README.md
3 years ago
Ross Wightman cc870df7b8 Update README.md
3 years ago
Ross Wightman 6b2d9c2660 Another bits/README.md update
3 years ago
Ross Wightman c3db5f5801 Worker hack for TFDS eval, add TPU env var setting.
3 years ago
Ross Wightman f411724de4 Fix checkpoint delete issue. Add README about bits and initial Pytorch XLA usage on TPU-VM. Add some FIXMEs and fold train_cfg into train_state by default.
3 years ago
Ross Wightman 91ab0b6ce5 Add proper TrainState checkpoint save/load. Some reorg/refactoring and other cleanup. More to go...
4 years ago
Ross Wightman 5b9c69e80a Add basic training resume based on legacy code
4 years ago
Ross Wightman 72ca831dd4 Back to using strings for the enum translation, forgot about import dep
4 years ago
Ross Wightman cbd4ee737f Fix model init for XLA, remove some prints.
4 years ago
Ross Wightman aa92d7b1c5 Major timm.bits update. Updater and DeviceEnv now dataclasses, after_step closure used, metrics base impl w/ distributed reduce, many tweaks/fixes.
4 years ago
Ross Wightman 938716c753 Fix import issue, use devenv for dist info in parser_tfds
4 years ago
Ross Wightman 76de984a5f Fix some bugs with XLA support, logger, add hacky xla dist launch script since torch.dist.launch doesn't work
4 years ago
Ross Wightman 12d9a6d4d2 First timm.bits commit, add initial abstractions, WIP updates to train, val... some of it working
4 years ago