a85df34993 Update lambda_resnet26rpt weights to 78.9, add better halonet26t weights at 79.1 with tweak to attention dim
Ross Wightman
2021-10-08 17:44:13 -0700
38804c721b Checkpoint clean fn useable stand alone
Ross Wightman
2021-10-08 17:43:53 -0700
b544ad4d3f regnetz model default cfg tweaks
Ross Wightman
2021-10-06 21:14:59 -0700
d80653cb99 Merge branch 'alexander-soare-freeze-functionality'
Ross Wightman
2021-10-06 17:01:41 -0700
e5da481073 Small post-merge tweak for freeze/unfreeze, add to __init__ for utils
Ross Wightman
2021-10-06 17:00:27 -0700
e2b8d44ff0 Halo, bottleneck attn, lambda layer additions and cleanup along w/ experimental model defs
* align interfaces of halo, bottleneck attn and lambda layer
* add qk_ratio to all of above, control q/k dim relative to output dim
* add experimental haloregnetz, and trionet (lambda + halo + bottle) models
Ross Wightman
2021-10-06 16:29:33 -0700
e0b3a3fab3 Make test-pooling flag for validate.py opt in
Ross Wightman
2021-10-06 16:12:05 -0700
431e60c83f Add acknowledgements for freeze_batch_norm inspiration
#876
Alexander Soare
2021-10-06 14:28:49 +0100
fbf59c04ee Change crop ratio on correct resnet50 variant.
Ross Wightman
2021-10-04 22:31:08 -0700
1fdc7af8fd Merge remote-tracking branch 'origin/fixes_bce_regnet' into bits_and_tpu
Ross Wightman
2021-10-01 16:32:04 -0700
d9abfa48df Make broadcast_buffers disable its own flag for now (needs more testing on interaction with dist_bn)
Ross Wightman
2021-10-01 13:43:55 -0700
b1c2e3eb92 Match rel_pos_indices attr rename in conv branch
Ross Wightman
2021-09-30 23:19:05 -0700
b49630a138 Add relative pos embed option to LambdaLayer, fix last transpose/reshape.
Ross Wightman
2021-09-30 22:45:09 -0700
d657e2cc0b Remove dead code line from efficientnet
Ross Wightman
2021-09-30 21:54:42 -0700
0ca687f224 Make 'regnetz' model experiments closer to actual RegNetZ, bottleneck expansion, expand from in_chs, no shortcut on stride 2, tweak model sizes
Ross Wightman
2021-09-30 21:49:38 -0700
Remove a duplicate layer creation in byobnet.py
#898
leondgarse
2021-09-30 18:30:48 +0800
b81e79aae9 Fix bottleneck attn transpose typo, hopefully these train better now..
Ross Wightman
2021-09-28 16:38:41 -0700
80075b0b8a Add worker_seeding arg to allow selecting old vs updated data loader worker seed for (old) experiment repeatability
Ross Wightman
2021-09-28 16:37:45 -0700
25d52ea71d Merge remote-tracking branch 'origin/fixes_bce_regnet' into bits_and_tpu
Ross Wightman
2021-09-24 22:55:38 -0700
0387e6057e Update binary cross ent impl to use thresholding as an option (convert soft targets from mixup/cutmix to 0, 1)
Ross Wightman
2021-09-23 15:45:39 -0700
5d6983c462 Batch validate a list of files if model is a text file with model per line
Ross Wightman
2021-09-23 15:45:17 -0700
f8a63a3b71 Add worker_init_fn to loader for numpy seed per worker
Ross Wightman
2021-09-23 15:44:38 -0700
515121cca1 Use reshape instead of view in std_conv, causing issues in recent PyTorch in channels_last
Ross Wightman
2021-09-23 15:43:48 -0700
da06cc61d4 ResNetV2 seems to work best without zero_init residual
Ross Wightman
2021-09-23 15:43:22 -0700
8e11da0ce3 Add experimental RegNetZ(ish) models for training / perf trials.
Ross Wightman
2021-09-23 15:42:57 -0700
Merge pull request #821 from rwightman/attn_update
Ross Wightman
2021-09-13 17:49:34 -0700
cf5ac2800c BotNet models were still off, remove weights for bad configs. Add good SE-HaloNet33-TS weights.
#821
attn_update
Ross Wightman
2021-09-13 17:18:59 -0700
24720abe3b Merge branch 'master' into attn_update
Ross Wightman
2021-09-13 16:51:10 -0700
1c9284c640 Add BeiT 'finetuned' 1k weights and pretrained 22k weights, pretraining specific (masked) model excluded for now
Ross Wightman
2021-09-13 16:38:23 -0700
f8a215cfe6 A few more crossvit tweaks, fix training w/ no_weight_decay names, add crop option for scaling, adjust default crop_pct for large img size to 1.0 for better results
Ross Wightman
2021-09-13 14:17:34 -0700
7ab2491ab7 Better handling of crossvit for tests / forward_features, fix torchscript regression in my changes
Ross Wightman
2021-09-13 13:01:05 -0700
702982d8af Merge branch 'chunfuchen-feature/crossvit'
Ross Wightman
2021-09-13 11:50:58 -0700
f1808e0970 Post crossvit merge cleanup, change model names to reflect input size, cleanup img size vs scale handling, fix tests
Ross Wightman
2021-09-13 11:49:54 -0700
008f25430b add deterministic flag + functionality
#853
Alexander Soare
2021-09-06 19:06:26 +0100
3581affb77 Update train.py with some flags related to scheduler tweaks, fix best checkpoint bug.
Ross Wightman
2021-09-05 16:05:31 -0700
c2f02b08b8 Merge remote-tracking branch 'origin/attn_update' into bits_and_tpu
Ross Wightman
2021-09-05 16:02:50 -0700
5bd04714e4 Cleanup weight init for byob/byoanet and related
Ross Wightman
2021-09-05 15:34:05 -0700
8642401e88 Swap botnet 26/50 weights/models after realizing a mistake in arch def, now figuring out why they were so low...
Ross Wightman
2021-09-05 15:17:19 -0700
5f12de4875 Add initial AttentionPool2d that's being trialed. Fix comment and still trying to improve reliability of sgd test.
Ross Wightman
2021-09-05 12:29:36 -0700
76881d207b Add baseline resnet26t @ 256x256 weights. Add 33ts variant of halonet with at least one halo in stage 2,3,4
Ross Wightman
2021-09-04 14:52:54 -0700
54e90e82a5 Another attempt at sgd momentum test passing...
Ross Wightman
2021-08-27 10:39:31 -0700
484e61648d Adding the attn series weights, tweaking model names, comments...
Ross Wightman
2021-09-03 18:09:42 -0700
0639d9a591 Fix updated validation_batch_size fallback
Ross Wightman
2021-09-02 14:44:53 -0700
5db057dca0 Fix misnamed arg, tweak other train script args for better defaults.
Ross Wightman
2021-09-02 14:15:49 -0700
fb94350896 Update training script and loader factory to allow use of scheduler updates, repeat augment, and bce loss
Ross Wightman
2021-09-01 17:46:40 -0700
f262137ff2 Add RepeatAugSampler as per DeiT RASampler impl, showing promise for current (distributed) training experiments.
Ross Wightman
2021-09-01 17:40:53 -0700
ba9c1108a1 Add a BCE loss impl that converts dense targets to sparse w/ smoothing as an alternate to CE w/ smoothing. For training experiments.
Ross Wightman
2021-09-01 17:39:28 -0700
29a37e23ee LR scheduler update:
* add polynomial decay 'poly'
* cleanup cycle specific args for cosine, poly, and tanh sched, t_mul -> cycle_mul, decay -> cycle_decay, default cycle_limit to 1 in each opt
* add k-decay for cosine and poly sched as per https://arxiv.org/abs/2004.05909
* change default tanh ub/lb to push inflection to later epochs
Ross Wightman
2021-09-01 17:33:11 -0700
2568ffc5ef Merge branch 'master' into attn_update
Ross Wightman
2021-08-27 09:21:22 -0700
708d87a813 Fix ViT SAM weight compat as weights at URL changed to not use repr layer. Fix #825. Tweak optim test.
Ross Wightman
2021-08-27 09:20:13 -0700
8449ba210c Improve performance of HaloAttn, change default dim calc. Some cleanup / fixes for byoanet. Rename resnet26ts to tfs to distinguish (extra fc).
Ross Wightman
2021-08-26 21:56:44 -0700