From 8f6d63888727a385a0ff770022f5950ca96a0173 Mon Sep 17 00:00:00 2001
From: Ross Wightman
Date: Wed, 23 Mar 2022 16:16:26 -0700
Subject: [PATCH] Update README.md

---
 README.md | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index ad5bcc45..b779c5cd 100644
--- a/README.md
+++ b/README.md
@@ -23,6 +23,10 @@ I'm fortunate to be able to dedicate significant time and money of my own suppor
 
 ## What's New
 
+### March 23, 2022
+* Add `ParallelBlock` and `LayerScale` option to base vit models to support model configs in [Three things everyone should know about ViT](https://arxiv.org/abs/2203.09795)
+* `convnext_tiny_hnf` (head norm first) weights trained with (close to) A2 recipe, 82.2% top-1, could do better with more epochs.
+
 ### March 21, 2022
 * Merge `norm_norm_norm`. **IMPORTANT** this update for a coming 0.6.x release will likely de-stabilize the master branch for a while. Branch `0.5.x` or a previous 0.5.x release can be used if stability is required.
 * Significant weights update (all TPU trained) as described in this [release](https://github.com/rwightman/pytorch-image-models/releases/tag/v0.1-tpu-weights)
@@ -45,7 +49,8 @@ I'm fortunate to be able to dedicate significant time and money of my own suppor
 * `resnetrs200` - 83.85 @ 256, 84.44 @ 320
 * HuggingFace hub support fixed w/ initial groundwork for allowing alternative 'config sources' for pretrained model definitions and weights (generic local file / remote url support soon)
 * SwinTransformer-V2 implementation added. Submitted by [Christoph Reich](https://github.com/ChristophReich1996). Training experiments and model changes by myself are ongoing so expect compat breaks.
-* MobileViT models w/ weights adapted from https://github.com/apple/ml-cvnets (
+* Swin-S3 (AutoFormerV2) models / weights added from https://github.com/microsoft/Cream/tree/main/AutoFormerV2
+* MobileViT models w/ weights adapted from https://github.com/apple/ml-cvnets
 * PoolFormer models w/ weights adapted from https://github.com/sail-sg/poolformer
 * VOLO models w/ weights adapted from https://github.com/sail-sg/volo
 * Significant work experimenting with non-BatchNorm norm layers such as EvoNorm, FilterResponseNorm, GroupNorm, etc
@@ -344,13 +349,16 @@ A full version of the list below with source links can be found in the [document
 * FBNet-V3 - https://arxiv.org/abs/2006.02049
 * HardCoRe-NAS - https://arxiv.org/abs/2102.11646
 * LCNet - https://arxiv.org/abs/2109.15099
+* MobileViT - https://arxiv.org/abs/2110.02178
 * NASNet-A - https://arxiv.org/abs/1707.07012
 * NesT - https://arxiv.org/abs/2105.12723
 * NFNet-F - https://arxiv.org/abs/2102.06171
 * NF-RegNet / NF-ResNet - https://arxiv.org/abs/2101.08692
 * PNasNet - https://arxiv.org/abs/1712.00559
+* PoolFormer (MetaFormer) - https://arxiv.org/abs/2111.11418
 * Pooling-based Vision Transformer (PiT) - https://arxiv.org/abs/2103.16302
 * RegNet - https://arxiv.org/abs/2003.13678
+* RegNetZ - https://arxiv.org/abs/2103.06877
 * RepVGG - https://arxiv.org/abs/2101.03697
 * ResMLP - https://arxiv.org/abs/2105.03404
 * ResNet/ResNeXt
@@ -367,12 +375,15 @@ A full version of the list below with source links can be found in the [document
 * ReXNet - https://arxiv.org/abs/2007.00992
 * SelecSLS - https://arxiv.org/abs/1907.00837
 * Selective Kernel Networks - https://arxiv.org/abs/1903.06586
+* Swin S3 (AutoFormerV2) - https://arxiv.org/abs/2111.14725
 * Swin Transformer - https://arxiv.org/abs/2103.14030
+* Swin Transformer V2 - https://arxiv.org/abs/2111.09883
 * Transformer-iN-Transformer (TNT) - https://arxiv.org/abs/2103.00112
 * TResNet - https://arxiv.org/abs/2003.13630
 * Twins (Spatial Attention in Vision Transformers) - https://arxiv.org/pdf/2104.13840.pdf
 * Visformer - https://arxiv.org/abs/2104.12533
 * Vision Transformer - https://arxiv.org/abs/2010.11929
+* VOLO (Vision Outlooker) - https://arxiv.org/abs/2106.13112
 * VovNet V2 and V1 - https://arxiv.org/abs/1911.06667
 * Xception - https://arxiv.org/abs/1610.02357
 * Xception (Modified Aligned, Gluon) - https://arxiv.org/abs/1802.02611
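
The headline change in the March 23 entry above combines two mechanisms: CaiT-style LayerScale (a learnable per-channel gain on each residual branch, initialized near zero so deep blocks start close to identity) and the parallel attention/MLP branches studied in "Three things everyone should know about ViT". The sketch below is a minimal PyTorch illustration of those two ideas under stated assumptions, not timm's actual implementation; the `AttnBranch`/`MlpBranch` helpers and the `num_parallel`/`init_values` argument names are hypothetical.

```python
import torch
import torch.nn as nn

class LayerScale(nn.Module):
    # CaiT-style LayerScale: per-channel learnable gain on a residual branch,
    # initialized near zero so each block starts close to an identity map.
    def __init__(self, dim: int, init_values: float = 1e-5):
        super().__init__()
        self.gamma = nn.Parameter(init_values * torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gamma

class AttnBranch(nn.Module):
    # norm -> self-attention -> LayerScale (hypothetical helper, not a timm class)
    def __init__(self, dim: int, num_heads: int, init_values: float):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ls = LayerScale(dim, init_values)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.norm(x)
        y, _ = self.attn(y, y, y, need_weights=False)
        return self.ls(y)

class MlpBranch(nn.Module):
    # norm -> 2-layer MLP -> LayerScale (hypothetical helper, not a timm class)
    def __init__(self, dim: int, hidden: int, init_values: float):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.fc1 = nn.Linear(dim, hidden)
        self.act = nn.GELU()
        self.fc2 = nn.Linear(hidden, dim)
        self.ls = LayerScale(dim, init_values)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.ls(self.fc2(self.act(self.fc1(self.norm(x)))))

class ParallelBlock(nn.Module):
    # Instead of the usual attn -> mlp sequence, sum N parallel attention
    # branches onto the residual stream, then N parallel MLP branches.
    def __init__(self, dim: int, num_heads: int, num_parallel: int = 2,
                 mlp_ratio: float = 4.0, init_values: float = 1e-5):
        super().__init__()
        hidden = int(dim * mlp_ratio)
        self.attns = nn.ModuleList(
            AttnBranch(dim, num_heads, init_values) for _ in range(num_parallel))
        self.mlps = nn.ModuleList(
            MlpBranch(dim, hidden, init_values) for _ in range(num_parallel))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + sum(attn(x) for attn in self.attns)
        x = x + sum(mlp(x) for mlp in self.mlps)
        return x

# Quick shape check: batch of 2, 197 tokens, width 384 (ViT-S-like).
x = torch.randn(2, 197, 384)
print(ParallelBlock(dim=384, num_heads=6)(x).shape)  # torch.Size([2, 197, 384])
```

The appeal of the parallel form in the paper is that widening a block (more parallel branches) replaces depth (more sequential blocks) at roughly equal parameter and FLOP cost, while the branch sums remain friendly to fused execution.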
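The `convnext_tiny_hnf` weights announced in the same entry load through timm's standard model factory. A short usage sketch, assuming a timm install recent enough to include this update (the 0.6.x pre-release the March 21 notes refer to):

```python
import timm
import torch

# Load the head-norm-first ConvNeXt-T weights announced in this patch (82.2% top-1).
model = timm.create_model('convnext_tiny_hnf', pretrained=True)
model.eval()

with torch.no_grad():
    out = model(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 1000])
```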