<li>New MobileNet-V3 Large weights trained from scratch with this code to 75.77% top-1</li>
<li>IMPORTANT CHANGE - default weight init changed for all MobileNetV3 / EfficientNet / related models<ul>
<li>overall results similar to or a bit better than training from scratch on a few smaller models tried</li>
<li>performance early in training seems consistently improved, but there is less difference by the end</li>
<li>set <code>fix_group_fanout=False</code> in the <code>_init_weight_goog</code> fn if you need to reproduce past behaviour</li>
</ul>
</li>
<li>Experimental LR noise feature added; applies a random perturbation to the LR each epoch within a specified range of training</li>
</ul>
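<p>The weight-init change above can be illustrated with a small, framework-free sketch of the fan-out computation (the function and parameter names here are illustrative, not timm's exact internals): dividing fan-out by the group count gives grouped and depthwise convs a much larger init std.</p>

```python
import math

def conv_init_std(kernel_size, out_channels, groups=1, fix_group_fanout=True):
    """Std dev for a TF-style Kaiming-normal conv weight init.

    With fix_group_fanout=True the fan-out is divided by the group count,
    which changes the init substantially for grouped/depthwise convs.
    """
    fan_out = kernel_size * kernel_size * out_channels
    if fix_group_fanout:
        fan_out //= groups  # the corrected (new default) behaviour
    return math.sqrt(2.0 / fan_out)

# Depthwise 3x3 conv, 64 channels (groups == out_channels):
old_std = conv_init_std(3, 64, groups=64, fix_group_fanout=False)
new_std = conv_init_std(3, 64, groups=64)
```

<p>For a plain (groups=1) conv the two settings agree; the difference only appears for grouped convs, which is why the MobileNetV3 / EfficientNet families are the ones affected.</p>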
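<p>The LR noise idea can likewise be sketched in a few lines (the function and argument names below are illustrative assumptions, not the actual CLI flags or scheduler internals): during a chosen fraction of training, the scheduled LR is multiplied by a random factor each epoch.</p>

```python
import random

def apply_lr_noise(lr, epoch, total_epochs,
                   noise_range=(0.4, 0.9), noise_pct=0.67, seed=42):
    """Perturb the scheduled LR while `epoch` lies inside the given
    fraction of training; outside that window the LR passes through."""
    start, end = (int(r * total_epochs) for r in noise_range)
    if not (start <= epoch < end):
        return lr
    rng = random.Random(seed + epoch)   # reproducible per-epoch noise
    noise = rng.uniform(-noise_pct, noise_pct)
    return lr * (1.0 + noise)
```

<p>Seeding per epoch keeps the perturbation deterministic across restarts, which matters when resuming training from a checkpoint.</p>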
<h3 id="feb-18-2020">Feb 18, 2020</h3>
<ul>
<li>Big refactor of model layers and addition of several attention mechanisms. Several additions motivated by 'Compounding the Performance Improvements...' (<a href="https://arxiv.org/abs/2001.06268">https://arxiv.org/abs/2001.06268</a>):<ul>
<li>Move layer/module impl into <code>layers</code> subfolder/module of <code>models</code> and organize in a more granular fashion</li>
<li>ResNet downsample paths now properly support dilation (output stride != 32) for avg_pool ('D' variant) and 3x3 (SENets) networks</li>
<li>Add Selective Kernel Nets on top of ResNet base, pretrained weights</li>
<li>CBAM attention experiment (not the best results so far, may remove)</li>
<li>Attention factory to allow dynamically selecting one of SE, ECA, CBAM in the <code>.se</code> position for all ResNets</li>
<li>Add DropBlock and DropPath (formerly DropConnect for EfficientNet/MobileNetv3) support to all ResNet variants</li>
</ul>
</li>
<li>Full dataset results updated to include NoisyStudent weights and 2 of the 3 SK weights</li>
</ul>
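<p>The DropPath (stochastic depth) behaviour mentioned above can be sketched without any framework. A real implementation operates per sample on tensors inside the residual block, so treat this purely as the scalar idea, with illustrative names:</p>

```python
import random

def drop_path(branch_out, drop_prob, training=True, rng=random):
    """Stochastic depth: randomly zero a residual branch during training,
    scaling surviving branches so the expected output is unchanged."""
    if not training or drop_prob == 0.0:
        return branch_out
    if rng.random() < drop_prob:
        return 0.0                         # branch dropped this step
    return branch_out / (1.0 - drop_prob)  # rescale survivor
```

<p>At inference (training=False) the branch always passes through unscaled, which is how stochastic depth is normally applied.</p>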
<h3 id="feb-12-2020">Feb 12, 2020</h3>
</ul>
<h3 id="dec-28-2019">Dec 28, 2019</h3>
<ul>
<li>Add new model weights and training hparams (see Training Hparams section)<ul>
<li><code>efficientnet_b3</code> - 81.5 top-1, 95.7 top-5 at default res/crop, 81.9, 95.8 at 320x320 1.0 crop-pct<ul>
<li>trained with RandAugment, ended up with an interesting but less than perfect result (see training section)</li>
</ul>
</ul>
</li>
</ul>
</li>
</ul>
<h3 id="dec-23-2019">Dec 23, 2019</h3>
<ul>
<li>Add RandAugment trained MixNet-XL weights with 80.48 top-1.</li>
</ul>
<h3 id="nov-29-2019">Nov 29, 2019</h3>
<ul>
<li>Brought EfficientNet and MobileNetV3 up to date with my <a href="https://github.com/rwightman/gen-efficientnet-pytorch">https://github.com/rwightman/gen-efficientnet-pytorch</a> code. Torchscript and ONNX export compat excluded.<ul>
<li>AdvProp weights added</li>
<li>Official TF MobileNetv3 weights added</li>
</ul>
</li>
<li>EfficientNet and MobileNetV3 hook based 'feature extraction' classes added. Will serve as basis for using models as backbones in obj detection/segmentation tasks. Lots more to be done here...</li>
<li>HRNet classification models and weights added from <a href="https://github.com/HRNet/HRNet-Image-Classification">https://github.com/HRNet/HRNet-Image-Classification</a></li>
<li>Consistency in global pooling, <code>reset_classifier</code>, and <code>forward_features</code> across models</li>
</ul>
<p>Universal feature extraction, new models, new weights, new test sets.</p>
<ul>
<li>All models support the <code>features_only=True</code> argument for <code>create_model</code> call to return a network that extracts features from the deepest layer at each stride.</li>
<li>New models<ul>
<li>ReXNet</li>
<li>(Aligned) Xception41/65/71 (a proper port of TF models)</li>
</ul>
</li>
<li>New trained weights<ul>
<li>SEResNet50 - 80.3</li>
<li>CSPDarkNet53 - 80.1 top-1</li>
<li>CSPResNeXt50 - 80.0 top-1</li>
<li>DPN68b - 79.2 top-1</li>
<li>EfficientNet-Lite0 (non-TF ver) - 75.5 (submitted by @hal-314)</li>
</ul>
</li>
<li>Add 'real' labels for ImageNet and ImageNet-Renditions test set, see <a href="results/README.md"><code>results/README.md</code></a></li>
</ul>
<h3 id="june-11-2020">June 11, 2020</h3>
<p>Bunch of changes:</p>
<ul>
<li>DenseNet models updated with memory efficient addition from torchvision (fixed a bug), blur pooling and deep stem additions</li>
<li>VoVNet V1 and V2 models added, 39 V2 variant (ese_vovnet_39b) trained to 79.3 top-1</li>
<li>Activation factory added along with new activations:<ul>
<li>select act at model creation time for more flexibility in using activations compatible with scripting or tracing (ONNX export)</li>
<li>hard_mish (experimental) added with memory-efficient grad, along with ME hard_swish</li>
<li>context mgr for setting exportable/scriptable/no_jit states</li>
</ul>
</li>
<li>Norm + Activation combo layers added with initial trial support in DenseNet and VoVNet along with impl of EvoNorm and InplaceAbn wrapper that fit the interface</li>
<li>Torchscript works for all but two of the model types when using PyTorch 1.5+; tests added for this</li>
<li>Some import cleanup and classifier reset changes; all models will have their classifier reset to nn.Identity on a <code>reset_classifier(0)</code> call</li>
</ul>
<h3 id="may-1-2020">May 1, 2020</h3>
<ul>
<li>Merged a number of excellent contributions in the ResNet model family over the past month<ul>
<li>BlurPool2D and resnetblur models initiated by <a href="https://github.com/VRandme">Chris Ha</a>, I trained resnetblur50 to 79.3.</li>
<li>TResNet models and SpaceToDepth, AntiAliasDownsampleLayer layers by <a href="https://github.com/mrT23">mrT23</a></li>
<li>ecaresnet (50d, 101d, light) models and two pruned variants using pruning as per (<a href="https://arxiv.org/abs/2002.08258">https://arxiv.org/abs/2002.08258</a>) by <a href="https://github.com/yoniaflalo">Yonathan Aflalo</a></li>
</ul>
</li>
<li>200 pretrained models in total now with updated results csv in results folder</li>
</ul>
<h3 id="april-5-2020">April 5, 2020</h3>
<ul>
<li>Add some newly trained MobileNet-V2 models trained with latest h-params, rand augment. They compare quite favourably to EfficientNet-Lite<ul>
<li>3.5M param MobileNet-V2 100 @ 73%</li>
<li>4.5M param MobileNet-V2 110d @ 75%</li>
<li>6.1M param MobileNet-V2 140 @ 76.5%</li>
<li>5.8M param MobileNet-V2 120d @ 77.3%</li>
</ul>
</li>
</ul>
<h3 id="march-18-2020">March 18, 2020</h3>
<ul>
<li>Add EfficientNet-Lite models w/ weights ported from <a href="https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet/lite">Tensorflow TPU</a></li>
</ul>
<p>The model architectures included come from a wide variety of sources. Sources, including papers, original impl ("reference code") that I rewrote / adapted, and PyTorch impl that I leveraged directly ("code") are listed below.</p>
<p>Most included models have pretrained weights. The weights are either:</p>
<ol>
<li>from their original sources</li>
<li>ported by myself from their original impl in a different framework (e.g. Tensorflow models)</li>
<li>trained from scratch using the included training script</li>
</ol>
<p>The validation results for the pretrained weights can be found <a href="../results/">here</a></p>