* add polynomial decay 'poly'
* cleanup cycle specific args for cosine, poly, and tanh sched, t_mul -> cycle_mul, decay -> cycle_decay, default cycle_limit to 1 in each opt
* add k-decay for cosine and poly sched as per https://arxiv.org/abs/2004.05909
* change default tanh ub/lb to push inflection to later epochs
* Add some of the trendy new optimizers. Decent results but not clearly better than the standards.
* Can create a None scheduler for constant LR
* ResNet defaults to zero_init of last BN in residual
* add resnet50d config
* reorganize train args
* allow resolve_data_config to be used with dict args, not just arparse
* stop incrementing epoch before save, more consistent naming vs csv, etc
* update resume and start epoch handling to match above
* stop auto-incrementing epoch in scheduler