Model index files and script to generate it

pull/494/head
Guillem Cucurull 4 years ago
parent 5c7d298234
commit 9e74125276

@ -0,0 +1,62 @@
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('{{ model_name }}', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
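The resolved `config` usually carries an input size, a crop percentage, and an interpolation mode, the same quantities listed as `Image Size`, `Crop Pct`, and `Interpolation` in the model metadata. As a rough sketch of how the first two interact, assume the evaluation transform resizes the image to `floor(image_size / crop_pct)` and then center-crops to `image_size`. That rule is an assumption about the default evaluation pipeline, not something this page states, and `eval_resize_size` is a hypothetical helper:

```python
import math

def eval_resize_size(image_size, crop_pct):
    # Assumed rule: resize so that a center crop of `image_size`
    # covers `crop_pct` of the resized image.
    return int(math.floor(image_size / crop_pct))

resize_299 = eval_resize_size(299, 0.875)  # e.g. a 299px model with Crop Pct 0.875
resize_224 = eval_resize_size(224, 0.875)  # e.g. a 224px model with Crop Pct 0.875
print(resize_299, resize_224)
```

A larger crop percentage means less of the resized image is discarded by the center crop.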
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predicted class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
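For intuition about what the softmax and top-k steps compute, here is a dependency-free sketch. The logits and labels are made up for illustration, and `softmax` and `top_k` are hypothetical helpers rather than timm or PyTorch functions:

```python
import math

def softmax(logits):
    # Subtract the maximum first for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, labels, k):
    # Pair each label with its probability and keep the k most likely.
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

logits = [2.0, 0.5, -1.0, 3.0]          # illustrative raw model outputs
labels = ["cat", "dog", "fish", "bird"]  # illustrative class names
probs = softmax(logits)
top2 = top_k(probs, labels, 2)
for name, p in top2:
    print(name, p)
```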
Replace the model name with the variant you want to use, e.g. `{{ model_name }}`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('{{ model_name }}', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.

@ -0,0 +1,89 @@
# Summary
**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements including using [Label Smoothing](https://paperswithcode.com/method/label-smoothing), Factorized 7 x 7 convolutions, and the use of an [auxiliary classifier](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
This particular model was trained for the study of adversarial examples (adversarial training).
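As a small illustration of the label smoothing mentioned above, the sketch below mixes a one-hot target with a uniform distribution over the classes. The smoothing factor of 0.1 and the four-class setup are illustrative assumptions, and `smooth_labels` is a hypothetical helper, not part of timm:

```python
def smooth_labels(true_class, num_classes, epsilon=0.1):
    # Every class receives epsilon / num_classes; the true class
    # additionally keeps the remaining (1 - epsilon) of the mass.
    off_value = epsilon / num_classes
    on_value = 1.0 - epsilon + off_value
    return [on_value if i == true_class else off_value for i in range(num_classes)]

targets = smooth_labels(true_class=2, num_classes=4, epsilon=0.1)
print(targets)
```

The smoothed targets still sum to one, but the model is no longer pushed toward assigning full probability to a single class.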
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/abs-1804-00097,
author = {Alexey Kurakin and
Ian J. Goodfellow and
Samy Bengio and
Yinpeng Dong and
Fangzhou Liao and
Ming Liang and
Tianyu Pang and
Jun Zhu and
Xiaolin Hu and
Cihang Xie and
Jianyu Wang and
Zhishuai Zhang and
Zhou Ren and
Alan L. Yuille and
Sangxia Huang and
Yao Zhao and
Yuzhe Zhao and
Zhonglin Han and
Junjiajia Long and
Yerkebulan Berdibekov and
Takuya Akiba and
Seiya Tokui and
Motoki Abe},
title = {Adversarial Attacks and Defences Competition},
journal = {CoRR},
volume = {abs/1804.00097},
year = {2018},
url = {http://arxiv.org/abs/1804.00097},
archivePrefix = {arXiv},
eprint = {1804.00097},
timestamp = {Thu, 31 Oct 2019 16:31:22 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-1804-00097.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: adv_inception_v3
Metadata:
FLOPs: 7352418880
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Auxiliary Classifier
- Average Pooling
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inception-v3 Module
- Max Pooling
- ReLU
- Softmax
File Size: 95549439
Tasks:
- Image Classification
ID: adv_inception_v3
Crop Pct: '0.875'
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/inception_v3.py#L456
In Collection: Adversarial Inception v3
Collections:
- Name: Adversarial Inception v3
Paper:
title: Adversarial Attacks and Defences Competition
url: https://paperswithcode.com/paper/adversarial-attacks-and-defences-competition
type: model-index
Type: model-index
-->

@ -0,0 +1,384 @@
# Summary
**AdvProp** is an adversarial training scheme which treats adversarial examples as additional training examples, to prevent overfitting. Key to the method is the use of a separate auxiliary batch norm for adversarial examples, as they have a different underlying distribution from normal examples.
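The idea of a separate auxiliary batch norm can be sketched without any framework: keep two sets of running statistics and route each batch to the set that matches its kind. This is a toy one-dimensional illustration under simplifying assumptions (no learnable scale or shift, a single feature); `DualBatchNorm` is a hypothetical class, not the implementation used by these models:

```python
from statistics import fmean, pvariance

class DualBatchNorm:
    """Toy 1-D batch norm with separate statistics for clean and adversarial batches."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        # [running_mean, running_var] per batch kind
        self.running = {"clean": [0.0, 1.0], "adv": [0.0, 1.0]}

    def __call__(self, batch, kind="clean"):
        mean = fmean(batch)
        var = pvariance(batch, mu=mean)
        stats = self.running[kind]
        # Only the statistics belonging to this kind of batch are updated.
        stats[0] = (1 - self.momentum) * stats[0] + self.momentum * mean
        stats[1] = (1 - self.momentum) * stats[1] + self.momentum * var
        return [(x - mean) / (var + self.eps) ** 0.5 for x in batch]

bn = DualBatchNorm()
out_clean = bn([0.0, 1.0, 2.0, 3.0], kind="clean")
out_adv = bn([10.0, 11.0, 12.0, 13.0], kind="adv")  # shifted distribution
print(bn.running)
```

Because the two kinds of batches never share statistics, the shifted adversarial distribution does not contaminate the estimates used for clean inputs.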
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{xie2020adversarial,
title={Adversarial Examples Improve Image Recognition},
author={Cihang Xie and Mingxing Tan and Boqing Gong and Jiang Wang and Alan Yuille and Quoc V. Le},
year={2020},
eprint={1911.09665},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: tf_efficientnet_b1_ap
Metadata:
FLOPs: 883633200
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 31515350
Tasks:
- Image Classification
ID: tf_efficientnet_b1_ap
LR: 0.256
Crop Pct: '0.882'
Momentum: 0.9
Image Size: '240'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1344
In Collection: AdvProp
- Name: tf_efficientnet_b2_ap
Metadata:
FLOPs: 1234321170
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 36800745
Tasks:
- Image Classification
ID: tf_efficientnet_b2_ap
LR: 0.256
Crop Pct: '0.89'
Momentum: 0.9
Image Size: '260'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1354
In Collection: AdvProp
- Name: tf_efficientnet_b3_ap
Metadata:
FLOPs: 2275247568
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 49384538
Tasks:
- Image Classification
ID: tf_efficientnet_b3_ap
LR: 0.256
Crop Pct: '0.904'
Momentum: 0.9
Image Size: '300'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1364
In Collection: AdvProp
- Name: tf_efficientnet_b4_ap
Metadata:
FLOPs: 5749638672
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 77993585
Tasks:
- Image Classification
ID: tf_efficientnet_b4_ap
LR: 0.256
Crop Pct: '0.922'
Momentum: 0.9
Image Size: '380'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1374
In Collection: AdvProp
- Name: tf_efficientnet_b5_ap
Metadata:
FLOPs: 13176501888
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 122403150
Tasks:
- Image Classification
ID: tf_efficientnet_b5_ap
LR: 0.256
Crop Pct: '0.934'
Momentum: 0.9
Image Size: '456'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1384
In Collection: AdvProp
- Name: tf_efficientnet_b6_ap
Metadata:
FLOPs: 24180518488
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 173237466
Tasks:
- Image Classification
ID: tf_efficientnet_b6_ap
LR: 0.256
Crop Pct: '0.942'
Momentum: 0.9
Image Size: '528'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1394
In Collection: AdvProp
- Name: tf_efficientnet_b7_ap
Metadata:
FLOPs: 48205304880
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 266850607
Tasks:
- Image Classification
ID: tf_efficientnet_b7_ap
LR: 0.256
Crop Pct: '0.949'
Momentum: 0.9
Image Size: '600'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1405
In Collection: AdvProp
- Name: tf_efficientnet_b8_ap
Metadata:
FLOPs: 80962956270
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 351412563
Tasks:
- Image Classification
ID: tf_efficientnet_b8_ap
LR: 0.128
Crop Pct: '0.954'
Momentum: 0.9
Image Size: '672'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1416
In Collection: AdvProp
- Name: tf_efficientnet_b0_ap
Metadata:
FLOPs: 488688572
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 21385973
Tasks:
- Image Classification
ID: tf_efficientnet_b0_ap
LR: 0.256
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1334
In Collection: AdvProp
Collections:
- Name: AdvProp
Paper:
title: Adversarial Examples Improve Image Recognition
url: https://paperswithcode.com/paper/adversarial-examples-improve-image
type: model-index
Type: model-index
-->

@ -0,0 +1,255 @@
# Summary
**Big Transfer (BiT)** is a type of pretraining recipe that pre-trains on a large supervised source dataset, and fine-tunes the weights on the target task. Models are trained on the JFT-300M dataset. The finetuned models contained in this collection are finetuned on ImageNet.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{kolesnikov2020big,
title={Big Transfer (BiT): General Visual Representation Learning},
author={Alexander Kolesnikov and Lucas Beyer and Xiaohua Zhai and Joan Puigcerver and Jessica Yung and Sylvain Gelly and Neil Houlsby},
year={2020},
eprint={1912.11370},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: resnetv2_152x4_bitm
Metadata:
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: Cloud TPUv3-512
Architecture:
- 1x1 Convolution
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Group Normalization
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Weight Standardization
File Size: 3746270104
Tasks:
- Image Classification
Training Time: ''
ID: resnetv2_152x4_bitm
Crop Pct: '1.0'
Image Size: '480'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnetv2.py#L465
Config: ''
In Collection: Big Transfer
- Name: resnetv2_152x2_bitm
Metadata:
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Group Normalization
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Weight Standardization
File Size: 945476668
Tasks:
- Image Classification
Training Time: ''
ID: resnetv2_152x2_bitm
Crop Pct: '1.0'
Image Size: '480'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnetv2.py#L458
Config: ''
In Collection: Big Transfer
- Name: resnetv2_50x1_bitm
Metadata:
Epochs: 90
Batch Size: 4096
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: Cloud TPUv3-512
Architecture:
- 1x1 Convolution
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Group Normalization
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Weight Standardization
File Size: 102242668
Tasks:
- Image Classification
Training Time: ''
ID: resnetv2_50x1_bitm
LR: 0.03
Layers: 50
Crop Pct: '1.0'
Momentum: 0.9
Image Size: '480'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnetv2.py#L430
Config: ''
In Collection: Big Transfer
- Name: resnetv2_101x3_bitm
Metadata:
Epochs: 90
Batch Size: 4096
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: Cloud TPUv3-512
Architecture:
- 1x1 Convolution
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Group Normalization
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Weight Standardization
File Size: 1551830100
Tasks:
- Image Classification
Training Time: ''
ID: resnetv2_101x3_bitm
LR: 0.03
Layers: 101
Crop Pct: '1.0'
Momentum: 0.9
Image Size: '480'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnetv2.py#L451
Config: ''
In Collection: Big Transfer
- Name: resnetv2_50x3_bitm
Metadata:
Epochs: 90
Batch Size: 4096
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: Cloud TPUv3-512
Architecture:
- 1x1 Convolution
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Group Normalization
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Weight Standardization
File Size: 869321580
Tasks:
- Image Classification
Training Time: ''
ID: resnetv2_50x3_bitm
LR: 0.03
Layers: 50
Crop Pct: '1.0'
Momentum: 0.9
Image Size: '480'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnetv2.py#L437
Config: ''
In Collection: Big Transfer
- Name: resnetv2_101x1_bitm
Metadata:
Epochs: 90
Batch Size: 4096
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: Cloud TPUv3-512
Architecture:
- 1x1 Convolution
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Group Normalization
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Weight Standardization
File Size: 178256468
Tasks:
- Image Classification
Training Time: ''
ID: resnetv2_101x1_bitm
LR: 0.03
Layers: 101
Crop Pct: '1.0'
Momentum: 0.9
Image Size: '480'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnetv2.py#L444
Config: ''
In Collection: Big Transfer
Collections:
- Name: Big Transfer
Paper:
title: 'Big Transfer (BiT): General Visual Representation Learning'
url: https://paperswithcode.com/paper/large-scale-learning-of-general-visual
type: model-index
Type: model-index
-->

@ -0,0 +1,74 @@
# Summary
**CSPDarknet53** is a convolutional neural network and backbone for object detection that uses [DarkNet-53](https://paperswithcode.com/method/darknet-53). It employs a CSPNet strategy to partition the feature map of the base layer into two parts and then merges them through a cross-stage hierarchy. The use of a split and merge strategy allows for more gradient flow through the network.
This CNN is used as the backbone for [YOLOv4](https://paperswithcode.com/method/yolov4).
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{bochkovskiy2020yolov4,
title={YOLOv4: Optimal Speed and Accuracy of Object Detection},
author={Alexey Bochkovskiy and Chien-Yao Wang and Hong-Yuan Mark Liao},
year={2020},
eprint={2004.10934},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: cspdarknet53
Metadata:
FLOPs: 8545018880
Batch Size: 128
Training Data:
- ImageNet
Training Techniques:
- CutMix
- Label Smoothing
- Mosaic
- Polynomial Learning Rate Decay
- SGD with Momentum
- Self-Adversarial Training
- Weight Decay
Training Resources: 1x NVIDIA RTX 2070 GPU
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Mish
- Residual Connection
- Softmax
File Size: 110775135
Tasks:
- Image Classification
ID: cspdarknet53
LR: 0.1
Layers: 53
Crop Pct: '0.887'
Momentum: 0.9
Image Size: '256'
Warmup Steps: 1000
Weight Decay: 0.0005
Interpolation: bilinear
Training Steps: 8000000
FPS (GPU RTX 2070): 66
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/cspnet.py#L441
In Collection: CSP DarkNet
Collections:
- Name: CSP DarkNet
Paper:
title: 'YOLOv4: Optimal Speed and Accuracy of Object Detection'
url: https://paperswithcode.com/paper/yolov4-optimal-speed-and-accuracy-of-object
type: model-index
Type: model-index
-->

@ -0,0 +1,72 @@
# Summary
**CSPResNet** is a convolutional neural network where we apply the Cross Stage Partial Network (CSPNet) approach to [ResNet](https://paperswithcode.com/method/resnet). The CSPNet partitions the feature map of the base layer into two parts and then merges them through a cross-stage hierarchy. The use of a split and merge strategy allows for more gradient flow through the network.
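The split-and-merge strategy can be sketched as follows. This is a deliberately simplified illustration: the feature map is a plain list of channels, the split is an even half-and-half, the transition layers of the real architecture are omitted, and `cross_stage_partial` and `double` are hypothetical names:

```python
def cross_stage_partial(channels, block):
    # Split the channels into two parts: one bypasses the stage,
    # the other goes through the stage's block.
    split = len(channels) // 2
    bypass, processed = channels[:split], channels[split:]
    # Merge by concatenating along the channel axis.
    return bypass + block(processed)

def double(part):
    # Stand-in for the stage's real computation (e.g. residual blocks).
    return [[2.0 * v for v in channel] for channel in part]

feature_map = [[1.0], [2.0], [3.0], [4.0]]  # four single-value channels
merged = cross_stage_partial(feature_map, double)
print(merged)
```

Only half of the channels pass through the block, while the other half provides a shortcut path to the merge point.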
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{wang2019cspnet,
title={CSPNet: A New Backbone that can Enhance Learning Capability of CNN},
author={Chien-Yao Wang and Hong-Yuan Mark Liao and I-Hau Yeh and Yueh-Hua Wu and Ping-Yang Chen and Jun-Wei Hsieh},
year={2019},
eprint={1911.11929},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: cspresnet50
Metadata:
FLOPs: 5924992000
Batch Size: 128
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- Polynomial Learning Rate Decay
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 86679303
Tasks:
- Image Classification
Training Time: ''
ID: cspresnet50
LR: 0.1
Layers: 50
Crop Pct: '0.887'
Momentum: 0.9
Image Size: '256'
Weight Decay: 0.005
Interpolation: bilinear
Training Steps: 8000000
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/cspnet.py#L415
Config: ''
In Collection: CSP ResNet
Collections:
- Name: CSP ResNet
Paper:
title: 'CSPNet: A New Backbone that can Enhance Learning Capability of CNN'
url: https://paperswithcode.com/paper/cspnet-a-new-backbone-that-can-enhance
type: model-index
Type: model-index
-->

@ -0,0 +1,72 @@
# Summary
**CSPResNeXt** is a convolutional neural network where we apply the Cross Stage Partial Network (CSPNet) approach to [ResNeXt](https://paperswithcode.com/method/resnext). The CSPNet partitions the feature map of the base layer into two parts and then merges them through a cross-stage hierarchy. The use of a split and merge strategy allows for more gradient flow through the network.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{wang2019cspnet,
title={CSPNet: A New Backbone that can Enhance Learning Capability of CNN},
author={Chien-Yao Wang and Hong-Yuan Mark Liao and I-Hau Yeh and Yueh-Hua Wu and Ping-Yang Chen and Jun-Wei Hsieh},
year={2019},
eprint={1911.11929},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: cspresnext50
Metadata:
FLOPs: 3962945536
Batch Size: 128
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- Polynomial Learning Rate Decay
- SGD with Momentum
- Weight Decay
Training Resources: 1x GPU
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 82562887
Tasks:
- Image Classification
Training Time: ''
ID: cspresnext50
LR: 0.1
Layers: 50
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.005
Interpolation: bilinear
Training Steps: 8000000
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/cspnet.py#L430
Config: ''
In Collection: CSP ResNeXt
Collections:
- Name: CSP ResNeXt
Paper:
title: 'CSPNet: A New Backbone that can Enhance Learning Capability of CNN'
url: https://paperswithcode.com/paper/cspnet-a-new-backbone-that-can-enhance
type: model-index
Type: model-index
-->

@ -0,0 +1,261 @@
# Summary
**DenseNet** is a type of convolutional neural network that utilises dense connections between layers, through [Dense Blocks](http://www.paperswithcode.com/method/dense-block), where we connect *all layers* (with matching feature-map sizes) directly with each other. To preserve the feed-forward nature, each layer obtains additional inputs from all preceding layers and passes on its own feature-maps to all subsequent layers.
The **DenseNet Blur** variant in this collection by Ross Wightman employs [Blur Pooling](http://www.paperswithcode.com/method/blur-pooling).
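Because each layer concatenates its output onto everything that came before it, the channel count inside a dense block grows linearly with depth. The sketch below traces that growth, assuming the commonly published DenseNet-121 configuration (64 initial features, growth rate 32, blocks of 6, 12, 24, and 16 layers, and transitions that halve the channel count). These numbers are not stated on this page, and `densenet_channels` is a hypothetical helper:

```python
def densenet_channels(init_features, growth_rate, block_config):
    channels = init_features
    trace = []
    for i, num_layers in enumerate(block_config):
        # Each layer in the block appends `growth_rate` new feature maps.
        channels += num_layers * growth_rate
        trace.append(channels)
        # A transition layer between blocks halves the channel count.
        if i != len(block_config) - 1:
            channels //= 2
    return trace

trace = densenet_channels(64, 32, (6, 12, 24, 16))
print(trace)  # channel count at the end of each dense block
```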
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/HuangLW16a,
author = {Gao Huang and
Zhuang Liu and
Kilian Q. Weinberger},
title = {Densely Connected Convolutional Networks},
journal = {CoRR},
volume = {abs/1608.06993},
year = {2016},
url = {http://arxiv.org/abs/1608.06993},
archivePrefix = {arXiv},
eprint = {1608.06993},
timestamp = {Mon, 10 Sep 2018 15:49:32 +0200},
biburl = {https://dblp.org/rec/journals/corr/HuangLW16a.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
```BibTeX
@misc{rw2019timm,
author = {Ross Wightman},
title = {PyTorch Image Models},
year = {2019},
publisher = {GitHub},
journal = {GitHub repository},
doi = {10.5281/zenodo.4414861},
howpublished = {\url{https://github.com/rwightman/pytorch-image-models}}
}
```
<!--
Models:
- Name: densenetblur121d
Metadata:
FLOPs: 3947812864
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Blur Pooling
- Convolution
- Dense Block
- Dense Connections
- Dropout
- Max Pooling
- ReLU
- Softmax
File Size: 32456500
Tasks:
- Image Classification
ID: densenetblur121d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/densenet.py#L305
In Collection: DenseNet
- Name: tv_densenet121
Metadata:
FLOPs: 3641843200
Epochs: 90
Batch Size: 32
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Block
- Dense Connections
- Dropout
- Max Pooling
- ReLU
- Softmax
File Size: 32342954
Tasks:
- Image Classification
ID: tv_densenet121
LR: 0.1
Crop Pct: '0.875'
LR Gamma: 0.1
Momentum: 0.9
Image Size: '224'
LR Step Size: 30
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/densenet.py#L379
In Collection: DenseNet
- Name: densenet121
Metadata:
FLOPs: 3641843200
Epochs: 90
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Kaiming Initialization
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Block
- Dense Connections
- Dropout
- Max Pooling
- ReLU
- Softmax
File Size: 32376726
Tasks:
- Image Classification
Training Time: ''
ID: densenet121
LR: 0.1
Layers: 121
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/densenet.py#L295
Config: ''
In Collection: DenseNet
- Name: densenet201
Metadata:
FLOPs: 5514321024
Epochs: 90
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Kaiming Initialization
- Nesterov Accelerated Gradient
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Block
- Dense Connections
- Dropout
- Max Pooling
- ReLU
- Softmax
File Size: 81131730
Tasks:
- Image Classification
ID: densenet201
LR: 0.1
Layers: 201
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/densenet.py#L337
In Collection: DenseNet
- Name: densenet169
Metadata:
FLOPs: 4316945792
Epochs: 90
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Kaiming Initialization
- Nesterov Accelerated Gradient
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Block
- Dense Connections
- Dropout
- Max Pooling
- ReLU
- Softmax
File Size: 57365526
Tasks:
- Image Classification
ID: densenet169
LR: 0.1
Layers: 169
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/densenet.py#L327
In Collection: DenseNet
- Name: densenet161
Metadata:
FLOPs: 9931959264
Epochs: 90
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Kaiming Initialization
- Nesterov Accelerated Gradient
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Block
- Dense Connections
- Dropout
- Max Pooling
- ReLU
- Softmax
File Size: 115730790
Tasks:
- Image Classification
ID: densenet161
LR: 0.1
Layers: 161
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/densenet.py#L347
In Collection: DenseNet
Collections:
- Name: DenseNet
Paper:
title: Densely Connected Convolutional Networks
url: https://paperswithcode.com/paper/densely-connected-convolutional-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,482 @@
# Summary
Extending “shallow” skip connections, **Deep Layer Aggregation (DLA)** incorporates more depth and sharing. The authors introduce two structures for deep layer aggregation (DLA): iterative deep aggregation (IDA) and hierarchical deep aggregation (HDA). These structures are expressed through an architectural framework, independent of the choice of backbone, for compatibility with current and future networks.
IDA focuses on fusing resolutions and scales while HDA focuses on merging features from all modules and channels. IDA follows the base hierarchy to refine resolution and aggregate scale stage-by-stage. HDA assembles its own hierarchy of tree-structured connections that cross and merge stages to aggregate different levels of representation.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{yu2019deep,
title={Deep Layer Aggregation},
author={Fisher Yu and Dequan Wang and Evan Shelhamer and Trevor Darrell},
year={2019},
eprint={1707.06484},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: dla60
Metadata:
FLOPs: 4256251880
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 89560235
Tasks:
- Image Classification
Training Time: ''
ID: dla60
LR: 0.1
Layers: 60
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L394
Config: ''
In Collection: DLA
- Name: dla46_c
Metadata:
FLOPs: 583277288
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 5307963
Tasks:
- Image Classification
Training Time: ''
ID: dla46_c
LR: 0.1
Layers: 46
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L369
Config: ''
In Collection: DLA
- Name: dla102x2
Metadata:
FLOPs: 9343847400
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 167645295
Tasks:
- Image Classification
Training Time: ''
ID: dla102x2
LR: 0.1
Layers: 102
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L426
Config: ''
In Collection: DLA
- Name: dla102
Metadata:
FLOPs: 7192952808
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 135290579
Tasks:
- Image Classification
Training Time: ''
ID: dla102
LR: 0.1
Layers: 102
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L410
Config: ''
In Collection: DLA
- Name: dla102x
Metadata:
FLOPs: 5886821352
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 107552695
Tasks:
- Image Classification
Training Time: ''
ID: dla102x
LR: 0.1
Layers: 102
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L418
Config: ''
In Collection: DLA
- Name: dla169
Metadata:
FLOPs: 11598004200
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 216547113
Tasks:
- Image Classification
Training Time: ''
ID: dla169
LR: 0.1
Layers: 169
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L434
Config: ''
In Collection: DLA
- Name: dla46x_c
Metadata:
FLOPs: 544052200
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 4387641
Tasks:
- Image Classification
Training Time: ''
ID: dla46x_c
LR: 0.1
Layers: 46
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L378
Config: ''
In Collection: DLA
- Name: dla60_res2net
Metadata:
FLOPs: 4147578504
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 84886593
Tasks:
- Image Classification
Training Time: ''
ID: dla60_res2net
Layers: 60
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L346
Config: ''
In Collection: DLA
- Name: dla60_res2next
Metadata:
FLOPs: 3485335272
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 69639245
Tasks:
- Image Classification
Training Time: ''
ID: dla60_res2next
Layers: 60
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L354
Config: ''
In Collection: DLA
- Name: dla34
Metadata:
FLOPs: 3070105576
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 63228658
Tasks:
- Image Classification
Training Time: ''
ID: dla34
LR: 0.1
Layers: 34
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L362
Config: ''
In Collection: DLA
- Name: dla60x
Metadata:
FLOPs: 3544204264
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 70883139
Tasks:
- Image Classification
Training Time: ''
ID: dla60x
LR: 0.1
Layers: 60
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L402
Config: ''
In Collection: DLA
- Name: dla60x_c
Metadata:
FLOPs: 593325032
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 5454396
Tasks:
- Image Classification
Training Time: ''
ID: dla60x_c
LR: 0.1
Layers: 60
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L386
Config: ''
In Collection: DLA
Collections:
- Name: DLA
Paper:
title: Deep Layer Aggregation
url: https://paperswithcode.com/paper/deep-layer-aggregation
type: model-index
Type: model-index
-->

@ -0,0 +1,209 @@
# Summary
A **Dual Path Network (DPN)** is a convolutional neural network which presents a new topology of connection paths internally. The intuition is that [ResNets](https://paperswithcode.com/method/resnet) enable feature re-usage while DenseNets enable new feature exploration, and both are important for learning good representations. To enjoy the benefits of both path topologies, Dual Path Networks share common features while maintaining the flexibility to explore new features through dual path architectures.
The principal building block is a [DPN Block](https://paperswithcode.com/method/dpn-block).
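As a rough illustration of the two paths, the sketch below splits a block's output between a residual path (addition) and a dense path (concatenation). It assumes PyTorch is installed; the function name `dual_path_merge` and the channel sizes are hypothetical and chosen only for the example, so this is not timm's DPN block.

```python
import torch


def dual_path_merge(residual, dense, block_out, res_channels):
    """Toy illustration: route a block's output into the two paths."""
    # Residual path: element-wise addition re-uses existing features (ResNet-style).
    new_residual = residual + block_out[:, :res_channels]
    # Dense path: concatenation appends newly explored features (DenseNet-style).
    new_dense = torch.cat([dense, block_out[:, res_channels:]], dim=1)
    return new_residual, new_dense


residual = torch.zeros(1, 64, 8, 8)
dense = torch.zeros(1, 16, 8, 8)
block_out = torch.randn(1, 64 + 16, 8, 8)  # assumed split: 64 residual + 16 dense channels
new_residual, new_dense = dual_path_merge(residual, dense, block_out, res_channels=64)
print(new_residual.shape, new_dense.shape)
```

In this example the residual path keeps its 64 channels while the dense path grows from 16 to 32 channels, which is the trade-off the summary describes.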
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{chen2017dual,
title={Dual Path Networks},
author={Yunpeng Chen and Jianan Li and Huaxin Xiao and Xiaojie Jin and Shuicheng Yan and Jiashi Feng},
year={2017},
eprint={1707.01629},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: dpn68
Metadata:
FLOPs: 2990567880
Batch Size: 1280
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 40x K80 GPUs
Architecture:
- Batch Normalization
- Convolution
- DPN Block
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
File Size: 50761994
Tasks:
- Image Classification
ID: dpn68
LR: 0.316
Layers: 68
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dpn.py#L270
In Collection: DPN
- Name: dpn68b
Metadata:
FLOPs: 2990567880
Batch Size: 1280
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 40x K80 GPUs
Architecture:
- Batch Normalization
- Convolution
- DPN Block
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
File Size: 50781025
Tasks:
- Image Classification
ID: dpn68b
LR: 0.316
Layers: 68
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dpn.py#L278
In Collection: DPN
- Name: dpn92
Metadata:
FLOPs: 8357659624
Batch Size: 1280
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 40x K80 GPUs
Architecture:
- Batch Normalization
- Convolution
- DPN Block
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
File Size: 151248422
Tasks:
- Image Classification
ID: dpn92
LR: 0.316
Layers: 92
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dpn.py#L286
In Collection: DPN
- Name: dpn131
Metadata:
FLOPs: 20586274792
Batch Size: 960
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 40x K80 GPUs
Architecture:
- Batch Normalization
- Convolution
- DPN Block
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
File Size: 318016207
Tasks:
- Image Classification
ID: dpn131
LR: 0.316
Layers: 131
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dpn.py#L302
In Collection: DPN
- Name: dpn107
Metadata:
FLOPs: 23524280296
Batch Size: 1280
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 40x K80 GPUs
Architecture:
- Batch Normalization
- Convolution
- DPN Block
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
File Size: 348612331
Tasks:
- Image Classification
ID: dpn107
LR: 0.316
Layers: 107
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dpn.py#L310
In Collection: DPN
- Name: dpn98
Metadata:
FLOPs: 15003675112
Batch Size: 1280
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 40x K80 GPUs
Architecture:
- Batch Normalization
- Convolution
- DPN Block
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
File Size: 247021307
Tasks:
- Image Classification
ID: dpn98
LR: 0.4
Layers: 98
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dpn.py#L294
In Collection: DPN
Collections:
- Name: DPN
Paper:
title: Dual Path Networks
url: https://paperswithcode.com/paper/dual-path-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,200 @@
# Summary
An **ECA ResNet** is a variant on a [ResNet](https://paperswithcode.com/method/resnet) that utilises an [Efficient Channel Attention module](https://paperswithcode.com/method/efficient-channel-attention). Efficient Channel Attention is an architectural unit based on [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) that reduces model complexity without dimensionality reduction.
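A minimal sketch of a channel-attention unit in this style, assuming PyTorch is installed. The class name `ECASketch` is hypothetical and the fixed `kernel_size=3` is an assumption for brevity; this is not the timm implementation.

```python
import torch
import torch.nn as nn


class ECASketch(nn.Module):
    """Channel attention via a 1D convolution over pooled channel descriptors."""

    def __init__(self, kernel_size=3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        # One small 1D conv shared across channels; no channel-reducing bottleneck.
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        y = self.pool(x)                      # (N, C, 1, 1)
        y = y.squeeze(-1).transpose(-1, -2)   # (N, 1, C)
        y = torch.sigmoid(self.conv(y))       # (N, 1, C) channel weights in (0, 1)
        y = y.transpose(-1, -2).unsqueeze(-1) # (N, C, 1, 1)
        return x * y                          # rescale each channel


eca = ECASketch(kernel_size=3)
x = torch.randn(2, 8, 4, 4)
print(eca(x).shape, sum(p.numel() for p in eca.parameters()))
```

The only learnable parameters are the `kernel_size` weights of the 1D convolution, which is where the low complexity comes from.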
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{wang2020ecanet,
title={ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks},
author={Qilong Wang and Banggu Wu and Pengfei Zhu and Peihua Li and Wangmeng Zuo and Qinghua Hu},
year={2020},
eprint={1910.03151},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: ecaresnet101d
Metadata:
FLOPs: 10377193728
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 4x RTX 2080Ti GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Efficient Channel Attention
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 178815067
Tasks:
- Image Classification
ID: ecaresnet101d
LR: 0.1
Layers: 101
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1087
In Collection: ECAResNet
- Name: ecaresnet101d_pruned
Metadata:
FLOPs: 4463972081
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Efficient Channel Attention
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 99852736
Tasks:
- Image Classification
Training Time: ''
ID: ecaresnet101d_pruned
Layers: 101
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1097
Config: ''
In Collection: ECAResNet
- Name: ecaresnet50d_pruned
Metadata:
FLOPs: 3250730657
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Efficient Channel Attention
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 79990436
Tasks:
- Image Classification
ID: ecaresnet50d_pruned
Layers: 50
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1055
In Collection: ECAResNet
- Name: ecaresnet50d
Metadata:
FLOPs: 5591090432
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 4x RTX 2080Ti GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Efficient Channel Attention
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 102579290
Tasks:
- Image Classification
ID: ecaresnet50d
LR: 0.1
Layers: 50
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1045
In Collection: ECAResNet
- Name: ecaresnetlight
Metadata:
FLOPs: 5276118784
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Efficient Channel Attention
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 120956612
Tasks:
- Image Classification
ID: ecaresnetlight
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1077
In Collection: ECAResNet
Collections:
- Name: ECAResNet
Paper:
title: 'ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks'
url: https://paperswithcode.com/paper/eca-net-efficient-channel-attention-for-deep
type: model-index
Type: model-index
-->

@ -0,0 +1,123 @@
# Summary
**EfficientNet** is a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth/width/resolution using a *compound coefficient*. Unlike conventional practice, which scales these factors arbitrarily, the EfficientNet scaling method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. For example, if we want to use $2^N$ times more computational resources, then we can simply increase the network depth by $\alpha ^ N$, width by $\beta ^ N$, and image size by $\gamma ^ N$, where $\alpha, \beta, \gamma$ are constant coefficients determined by a small grid search on the original small model. EfficientNet uses a compound coefficient $\phi$ to uniformly scale network width, depth, and resolution in a principled way.
The compound scaling method is justified by the intuition that if the input image is bigger, then the network needs more layers to increase the receptive field and more channels to capture more fine-grained patterns on the bigger image.
The base EfficientNet-B0 network is based on the inverted bottleneck residual blocks of [MobileNetV2](https://paperswithcode.com/method/mobilenetv2), in addition to [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block).
This collection consists of pruned EfficientNet models.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{tan2020efficientnet,
title={EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks},
author={Mingxing Tan and Quoc V. Le},
year={2020},
eprint={1905.11946},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
```BibTeX
@misc{rw2019timm,
author = {Ross Wightman},
title = {PyTorch Image Models},
year = {2019},
publisher = {GitHub},
journal = {GitHub repository},
doi = {10.5281/zenodo.4414861},
howpublished = {\url{https://github.com/rwightman/pytorch-image-models}}
}
```
<!--
Models:
- Name: efficientnet_b1_pruned
Metadata:
FLOPs: 489653114
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 25595162
Tasks:
- Image Classification
ID: efficientnet_b1_pruned
Crop Pct: '0.882'
Image Size: '240'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1208
In Collection: EfficientNet Pruned
- Name: efficientnet_b3_pruned
Metadata:
FLOPs: 1239590641
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 39770812
Tasks:
- Image Classification
ID: efficientnet_b3_pruned
Crop Pct: '0.904'
Image Size: '300'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1230
In Collection: EfficientNet Pruned
- Name: efficientnet_b2_pruned
Metadata:
FLOPs: 878133915
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 33555005
Tasks:
- Image Classification
ID: efficientnet_b2_pruned
Crop Pct: '0.89'
Image Size: '260'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1219
In Collection: EfficientNet Pruned
Collections:
- Name: EfficientNet Pruned
Paper:
title: Knapsack Pruning with Inner Distillation
url: https://paperswithcode.com/paper/knapsack-pruning-with-inner-distillation
type: model-index
Type: model-index
-->

@ -0,0 +1,254 @@
# Summary
**EfficientNet** is a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth/width/resolution using a *compound coefficient*. Unlike conventional practice, which scales these factors arbitrarily, the EfficientNet scaling method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. For example, if we want to use $2^N$ times more computational resources, then we can simply increase the network depth by $\alpha ^ N$, width by $\beta ^ N$, and image size by $\gamma ^ N$, where $\alpha, \beta, \gamma$ are constant coefficients determined by a small grid search on the original small model. EfficientNet uses a compound coefficient $\phi$ to uniformly scale network width, depth, and resolution in a principled way.
The compound scaling method is justified by the intuition that if the input image is bigger, then the network needs more layers to increase the receptive field and more channels to capture more fine-grained patterns on the bigger image.
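The scaling rule can be worked through numerically. The sketch below assumes the coefficient values reported in the EfficientNet paper ($\alpha = 1.2$, $\beta = 1.1$, $\gamma = 1.15$); the function name `compound_scale` is hypothetical, and the FLOPs estimate assumes cost grows linearly with depth and quadratically with width and resolution.

```python
# Assumed coefficients (values reported in the EfficientNet paper).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution bases


def compound_scale(phi):
    """Return depth, width, and resolution multipliers plus a rough FLOPs multiplier."""
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = GAMMA ** phi
    # Cost is roughly linear in depth and quadratic in width and resolution.
    flops = depth * width ** 2 * resolution ** 2
    return depth, width, resolution, flops


for phi in range(4):
    d, w, r, f = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}, FLOPs x{f:.2f}")
```

With these values, $\alpha \cdot \beta^2 \cdot \gamma^2 \approx 1.92$, so each increment of the compound coefficient roughly doubles the compute budget.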
The base EfficientNet-B0 network is based on the inverted bottleneck residual blocks of [MobileNetV2](https://paperswithcode.com/method/mobilenetv2), in addition to [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block).
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{tan2020efficientnet,
title={EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks},
author={Mingxing Tan and Quoc V. Le},
year={2020},
eprint={1905.11946},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
<!--
Models:
- Name: efficientnet_b2a
Metadata:
FLOPs: 1452041554
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 49369973
Tasks:
- Image Classification
ID: efficientnet_b2a
Crop Pct: '1.0'
Image Size: '288'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1029
In Collection: EfficientNet
- Name: efficientnet_b3a
Metadata:
FLOPs: 2600628304
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 49369973
Tasks:
- Image Classification
ID: efficientnet_b3a
Crop Pct: '1.0'
Image Size: '320'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1047
In Collection: EfficientNet
- Name: efficientnet_em
Metadata:
FLOPs: 3935516480
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 27927309
Tasks:
- Image Classification
ID: efficientnet_em
Crop Pct: '0.882'
Image Size: '240'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1118
In Collection: EfficientNet
- Name: efficientnet_lite0
Metadata:
FLOPs: 510605024
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 18820005
Tasks:
- Image Classification
ID: efficientnet_lite0
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1163
In Collection: EfficientNet
- Name: efficientnet_es
Metadata:
FLOPs: 2317181824
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 22003339
Tasks:
- Image Classification
ID: efficientnet_es
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1110
In Collection: EfficientNet
- Name: efficientnet_b3
Metadata:
FLOPs: 2327905920
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 49369973
Tasks:
- Image Classification
ID: efficientnet_b3
Crop Pct: '0.904'
Image Size: '300'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1038
In Collection: EfficientNet
- Name: efficientnet_b0
Metadata:
FLOPs: 511241564
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 21376743
Tasks:
- Image Classification
ID: efficientnet_b0
Layers: 18
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1002
In Collection: EfficientNet
- Name: efficientnet_b1
Metadata:
FLOPs: 909691920
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 31502706
Tasks:
- Image Classification
ID: efficientnet_b1
Crop Pct: '0.875'
Image Size: '240'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1011
In Collection: EfficientNet
- Name: efficientnet_b2
Metadata:
FLOPs: 1265324514
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 36788104
Tasks:
- Image Classification
ID: efficientnet_b2
Crop Pct: '0.875'
Image Size: '260'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1020
In Collection: EfficientNet
Collections:
- Name: EfficientNet
Paper:
title: 'EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks'
url: https://paperswithcode.com/paper/efficientnet-rethinking-model-scaling-for
type: model-index
Type: model-index
-->

@ -0,0 +1,89 @@
# Summary
**Inception-ResNet-v2** is a convolutional neural network architecture that builds on the Inception family of architectures but incorporates [residual connections](https://paperswithcode.com/method/residual-connection) (replacing the filter concatenation stage of the Inception architecture).
This particular model was trained for the study of adversarial examples (adversarial training).
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/abs-1804-00097,
author = {Alexey Kurakin and
Ian J. Goodfellow and
Samy Bengio and
Yinpeng Dong and
Fangzhou Liao and
Ming Liang and
Tianyu Pang and
Jun Zhu and
Xiaolin Hu and
Cihang Xie and
Jianyu Wang and
Zhishuai Zhang and
Zhou Ren and
Alan L. Yuille and
Sangxia Huang and
Yao Zhao and
Yuzhe Zhao and
Zhonglin Han and
Junjiajia Long and
Yerkebulan Berdibekov and
Takuya Akiba and
Seiya Tokui and
Motoki Abe},
title = {Adversarial Attacks and Defences Competition},
journal = {CoRR},
volume = {abs/1804.00097},
year = {2018},
url = {http://arxiv.org/abs/1804.00097},
archivePrefix = {arXiv},
eprint = {1804.00097},
timestamp = {Thu, 31 Oct 2019 16:31:22 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-1804-00097.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: ens_adv_inception_resnet_v2
Metadata:
FLOPs: 16959133120
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Auxiliary Classifier
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inception-v3 Module
- Max Pooling
- ReLU
- Softmax
File Size: 223774238
Tasks:
- Image Classification
ID: ens_adv_inception_resnet_v2
Crop Pct: '0.897'
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/inception_resnet_v2.py#L351
In Collection: Ensemble Adversarial
Collections:
- Name: Ensemble Adversarial
Paper:
title: Adversarial Attacks and Defences Competition
url: https://paperswithcode.com/paper/adversarial-attacks-and-defences-competition
type: model-index
Type: model-index
-->

@ -0,0 +1,77 @@
# Summary
**VoVNet** is a convolutional neural network that seeks to make [DenseNet](https://paperswithcode.com/method/densenet) more efficient by concatenating all features only once, in the last feature map. This keeps the input size of each layer constant and makes it possible to enlarge the number of new output channels.
Read about [one-shot aggregation here](https://paperswithcode.com/method/one-shot-aggregation).
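To make the efficiency argument concrete, the sketch below compares the input width each convolution sees under dense aggregation with the width it sees under one-shot aggregation. It is only an illustration: the function names, the block input width of 64, the growth rate of 32 and the five layers are hypothetical values, not the configuration of any released VoVNet.

```python
def dense_input_channels(in_channels, growth, num_layers):
    """Input width per layer when every layer concatenates all earlier outputs."""
    return [in_channels + i * growth for i in range(num_layers)]


def osa_input_channels(in_channels, width, num_layers):
    """Input width per layer when features are concatenated only once, at the end."""
    # The first layer sees the block input; later layers see only the previous output.
    return [in_channels] + [width] * (num_layers - 1)


def osa_concat_channels(in_channels, width, num_layers):
    """Width of the single concatenation that closes a one-shot aggregation block."""
    return in_channels + num_layers * width


# Hypothetical sizes, chosen only for illustration.
dense = dense_input_channels(in_channels=64, growth=32, num_layers=5)
osa = osa_input_channels(in_channels=64, width=64, num_layers=5)
final_width = osa_concat_channels(in_channels=64, width=64, num_layers=5)

print(dense)        # input width grows with depth
print(osa)          # input width stays constant
print(final_width)  # width of the one concatenation at the end
```

With dense aggregation the input width grows linearly with depth, while with one-shot aggregation it stays fixed and the cost of concatenation is paid once per block.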
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{lee2019energy,
title={An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection},
author={Youngwan Lee and Joong-won Hwang and Sangrok Lee and Yuseok Bae and Jongyoul Park},
year={2019},
eprint={1904.09730},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: ese_vovnet39b
Metadata:
FLOPs: 9089259008
Training Data:
- ImageNet
Architecture:
- Batch Normalization
- Convolution
- Max Pooling
- One-Shot Aggregation
- ReLU
File Size: 98397138
Tasks:
- Image Classification
ID: ese_vovnet39b
Layers: 39
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/vovnet.py#L371
In Collection: ESE VovNet
- Name: ese_vovnet19b_dw
Metadata:
FLOPs: 1711959904
Training Data:
- ImageNet
Architecture:
- Batch Normalization
- Convolution
- Max Pooling
- One-Shot Aggregation
- ReLU
File Size: 26243175
Tasks:
- Image Classification
ID: ese_vovnet19b_dw
Layers: 19
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/vovnet.py#L361
In Collection: ESE VovNet
Collections:
- Name: ESE VovNet
Paper:
title: 'CenterMask : Real-Time Anchor-Free Instance Segmentation'
url: https://paperswithcode.com/paper/centermask-real-time-anchor-free-instance-1
type: model-index
Type: model-index
-->

@ -0,0 +1,69 @@
# Summary
**FBNet** is a family of convolutional neural architectures discovered through [DNAS](https://paperswithcode.com/method/dnas) neural architecture search. It uses a basic image model block inspired by [MobileNetv2](https://paperswithcode.com/method/mobilenetv2) that relies on depthwise convolutions and an inverted residual structure (see components).
The principal building block is the [FBNet Block](https://paperswithcode.com/method/fbnet-block).
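As a rough illustration of the inverted residual structure mentioned above, the following sketch traces channel widths and weight counts through a MobileNetv2-style block. It is a simplification under stated assumptions: `inverted_residual_plan` is a hypothetical helper, biases and normalization parameters are ignored, and the per-layer choices that the architecture search actually makes for FBNet (kernel size, expansion ratio, and so on) are not reproduced here.

```python
def inverted_residual_plan(in_channels, out_channels, expansion, stride, kernel_size=3):
    """Describe a MobileNetv2-style inverted residual block (weights only, no biases)."""
    hidden = in_channels * expansion
    layers = [
        # (name, input channels, output channels, weight count)
        ("1x1 expand", in_channels, hidden, in_channels * hidden),
        (f"{kernel_size}x{kernel_size} depthwise", hidden, hidden,
         kernel_size * kernel_size * hidden),
        ("1x1 project", hidden, out_channels, hidden * out_channels),
    ]
    # The skip connection is only possible when the block preserves the tensor shape.
    use_skip = stride == 1 and in_channels == out_channels
    return layers, use_skip


# Hypothetical block sizes, chosen only for illustration.
layers, use_skip = inverted_residual_plan(in_channels=16, out_channels=16,
                                          expansion=6, stride=1)
total_weights = sum(count for _, _, _, count in layers)

for name, c_in, c_out, count in layers:
    print(f"{name:15s} {c_in:4d} -> {c_out:4d}  weights={count}")
print("skip connection:", use_skip, "total weights:", total_weights)
```

The block is "inverted" because the wide representation lives inside the block, between a narrow input and a narrow output.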
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{wu2019fbnet,
title={FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search},
author={Bichen Wu and Xiaoliang Dai and Peizhao Zhang and Yanghan Wang and Fei Sun and Yiming Wu and Yuandong Tian and Peter Vajda and Yangqing Jia and Kurt Keutzer},
year={2019},
eprint={1812.03443},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: fbnetc_100
Metadata:
FLOPs: 508940064
Epochs: 360
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x GPUs
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Dropout
- FBNet Block
- Global Average Pooling
- Softmax
File Size: 22525094
Tasks:
- Image Classification
ID: fbnetc_100
LR: 0.1
Layers: 22
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0005
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L985
In Collection: FBNet
Collections:
- Name: FBNet
Paper:
title: 'FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural
Architecture Search'
url: https://paperswithcode.com/paper/fbnet-hardware-aware-efficient-convnet-design
type: model-index
Type: model-index
-->

@ -0,0 +1,71 @@
# Summary
**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements, including [Label Smoothing](https://paperswithcode.com/method/label-smoothing), factorized 7 x 7 convolutions, and an [auxiliary classifier](https://paperswithcode.com/method/auxiliary-classifier) that propagates label information lower down the network (along with batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
The weights from this model were ported from Gluon.
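Label smoothing, one of the improvements listed above, can be sketched in a few lines. The example below mixes a one-hot target with a uniform distribution over the classes; the helper name `smooth_labels`, the four classes and the value `epsilon=0.1` are illustrative assumptions rather than settings taken from this page.

```python
def smooth_labels(target_index, num_classes, epsilon=0.1):
    """Mix a one-hot target with a uniform distribution over all classes."""
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform if k == target_index else uniform
        for k in range(num_classes)
    ]


# Hypothetical 4-class problem where the true class is index 2.
smoothed = smooth_labels(target_index=2, num_classes=4, epsilon=0.1)
print(smoothed)
print(sum(smoothed))  # still a probability distribution
```

The true class keeps most of the probability mass, while every other class receives a small non-zero share, which discourages overconfident predictions.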
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/SzegedyVISW15,
author = {Christian Szegedy and
Vincent Vanhoucke and
Sergey Ioffe and
Jonathon Shlens and
Zbigniew Wojna},
title = {Rethinking the Inception Architecture for Computer Vision},
journal = {CoRR},
volume = {abs/1512.00567},
year = {2015},
url = {http://arxiv.org/abs/1512.00567},
archivePrefix = {arXiv},
eprint = {1512.00567},
timestamp = {Mon, 13 Aug 2018 16:49:07 +0200},
biburl = {https://dblp.org/rec/journals/corr/SzegedyVISW15.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: gluon_inception_v3
Metadata:
FLOPs: 7352418880
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Auxiliary Classifier
- Average Pooling
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inception-v3 Module
- Max Pooling
- ReLU
- Softmax
File Size: 95567055
Tasks:
- Image Classification
ID: gluon_inception_v3
Crop Pct: '0.875'
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/inception_v3.py#L464
In Collection: Gloun Inception v3
Collections:
- Name: Gloun Inception v3
Paper:
title: Rethinking the Inception Architecture for Computer Vision
url: https://paperswithcode.com/paper/rethinking-the-inception-architecture-for
type: model-index
Type: model-index
-->

@ -0,0 +1,393 @@
# Summary
**Residual Networks**, or **ResNets**, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Instead of hoping that every few stacked layers directly fit a desired underlying mapping, residual nets let these layers fit a residual mapping. They stack [residual blocks](https://paperswithcode.com/method/residual-block) on top of each other to form a network: e.g. a ResNet-50 has fifty layers using these blocks.
The weights from this model were ported from Gluon.
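The residual mapping described above can be illustrated with a toy example on plain Python lists. This is a sketch, not the timm implementation: `residual_block`, `zero_branch` and `small_branch` are hypothetical names, and the "stacked layers" are stand-in functions rather than real convolutions.

```python
def residual_block(x, residual_fn):
    """Return F(x) + x, where residual_fn plays the role of the stacked layers F."""
    fx = residual_fn(x)
    if len(fx) != len(x):
        raise ValueError("the residual branch must preserve the input shape")
    return [f + xi for f, xi in zip(fx, x)]


def zero_branch(x):
    """A residual branch that has learned nothing: F(x) = 0."""
    return [0.0 for _ in x]


def small_branch(x):
    """A stand-in residual branch: F(x) = 0.1 * x."""
    return [0.1 * v for v in x]


x = [1.0, -2.0, 3.0]
identity_out = residual_block(x, zero_branch)
stacked = residual_block(residual_block(x, small_branch), small_branch)

print(identity_out)  # the block falls back to the identity mapping
print(stacked)       # two stacked blocks, each adding a small correction
```

When the residual branch outputs zero, the block reduces to the identity, so adding more blocks does not have to make the mapping harder to fit.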
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/HeZRS15,
author = {Kaiming He and
Xiangyu Zhang and
Shaoqing Ren and
Jian Sun},
title = {Deep Residual Learning for Image Recognition},
journal = {CoRR},
volume = {abs/1512.03385},
year = {2015},
url = {http://arxiv.org/abs/1512.03385},
archivePrefix = {arXiv},
eprint = {1512.03385},
timestamp = {Wed, 17 Apr 2019 17:23:45 +0200},
biburl = {https://dblp.org/rec/journals/corr/HeZRS15.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: gluon_resnet101_v1b
Metadata:
FLOPs: 10068547584
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 178723172
Tasks:
- Image Classification
ID: gluon_resnet101_v1b
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L89
In Collection: Gloun ResNet
- Name: gluon_resnet101_v1s
Metadata:
FLOPs: 11805511680
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 179221777
Tasks:
- Image Classification
ID: gluon_resnet101_v1s
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L166
In Collection: Gloun ResNet
- Name: gluon_resnet101_v1c
Metadata:
FLOPs: 10376567296
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 178802575
Tasks:
- Image Classification
ID: gluon_resnet101_v1c
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L113
In Collection: Gloun ResNet
- Name: gluon_resnet152_v1c
Metadata:
FLOPs: 15165680128
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 241613404
Tasks:
- Image Classification
ID: gluon_resnet152_v1c
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L121
In Collection: Gloun ResNet
- Name: gluon_resnet152_v1b
Metadata:
FLOPs: 14857660416
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 241534001
Tasks:
- Image Classification
ID: gluon_resnet152_v1b
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L97
In Collection: Gloun ResNet
- Name: gluon_resnet101_v1d
Metadata:
FLOPs: 10377018880
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 178802755
Tasks:
- Image Classification
ID: gluon_resnet101_v1d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L138
In Collection: Gloun ResNet
- Name: gluon_resnet152_v1d
Metadata:
FLOPs: 15166131712
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 241613584
Tasks:
- Image Classification
ID: gluon_resnet152_v1d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L147
In Collection: Gloun ResNet
- Name: gluon_resnet152_v1s
Metadata:
FLOPs: 16594624512
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 242032606
Tasks:
- Image Classification
ID: gluon_resnet152_v1s
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L175
In Collection: Gloun ResNet
- Name: gluon_resnet50_v1b
Metadata:
FLOPs: 5282531328
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 102493763
Tasks:
- Image Classification
ID: gluon_resnet50_v1b
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L81
In Collection: Gloun ResNet
- Name: gluon_resnet18_v1b
Metadata:
FLOPs: 2337073152
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 46816736
Tasks:
- Image Classification
ID: gluon_resnet18_v1b
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L65
In Collection: Gloun ResNet
- Name: gluon_resnet34_v1b
Metadata:
FLOPs: 4718469120
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 87295112
Tasks:
- Image Classification
ID: gluon_resnet34_v1b
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L73
In Collection: Gloun ResNet
- Name: gluon_resnet50_v1c
Metadata:
FLOPs: 5590551040
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 102573166
Tasks:
- Image Classification
ID: gluon_resnet50_v1c
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L105
In Collection: Gloun ResNet
- Name: gluon_resnet50_v1d
Metadata:
FLOPs: 5591002624
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 102573346
Tasks:
- Image Classification
ID: gluon_resnet50_v1d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L129
In Collection: Gloun ResNet
- Name: gluon_resnet50_v1s
Metadata:
FLOPs: 7019495424
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 102992368
Tasks:
- Image Classification
ID: gluon_resnet50_v1s
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L156
In Collection: Gloun ResNet
Collections:
- Name: Gloun ResNet
Paper:
title: Deep Residual Learning for Image Recognition
url: https://paperswithcode.com/paper/deep-residual-learning-for-image-recognition
type: model-index
Type: model-index
-->

@ -0,0 +1,119 @@
# Summary
A **ResNeXt** repeats a [building block](https://paperswithcode.com/method/resnext-block) that aggregates a set of transformations with the same topology. Compared to a [ResNet](https://paperswithcode.com/method/resnet), it exposes a new dimension, *cardinality* (the size of the set of transformations) $C$, as an essential factor in addition to the dimensions of depth and width.
The weights from this model were ported from Gluon.
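One way to see what cardinality buys is to count the weights of a single bottleneck block with and without grouping. The sketch below is an illustration under stated assumptions: `bottleneck_params` is a hypothetical helper, biases and normalization parameters are ignored, and the channel sizes are example values for a 256-channel block in a `32x4d` configuration (32 groups of width 4).

```python
def bottleneck_params(in_channels, width, out_channels, groups=1):
    """Weight count of a 1x1 -> grouped 3x3 -> 1x1 bottleneck (no biases)."""
    reduce = in_channels * width
    # Each of the `groups` paths maps width/groups inputs to width/groups outputs.
    per_group = width // groups
    spatial = 3 * 3 * per_group * per_group * groups
    expand = width * out_channels
    return reduce + spatial + expand


# Plain bottleneck: one path of width 64.
resnet_block = bottleneck_params(256, 64, 256, groups=1)
# Aggregated bottleneck: cardinality C = 32, each path of width 4 (total width 128).
resnext_block = bottleneck_params(256, 32 * 4, 256, groups=32)

print(resnet_block, resnext_block)
```

The two blocks have a similar weight budget, but the grouped version spends it on 32 parallel transformations instead of one wider one.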
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/XieGDTH16,
author = {Saining Xie and
Ross B. Girshick and
Piotr Doll{\'{a}}r and
Zhuowen Tu and
Kaiming He},
title = {Aggregated Residual Transformations for Deep Neural Networks},
journal = {CoRR},
volume = {abs/1611.05431},
year = {2016},
url = {http://arxiv.org/abs/1611.05431},
archivePrefix = {arXiv},
eprint = {1611.05431},
timestamp = {Mon, 13 Aug 2018 16:45:58 +0200},
biburl = {https://dblp.org/rec/journals/corr/XieGDTH16.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: gluon_resnext50_32x4d
Metadata:
FLOPs: 5472648192
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 100441719
Tasks:
- Image Classification
ID: gluon_resnext50_32x4d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L185
In Collection: Gloun ResNeXt
- Name: gluon_resnext101_32x4d
Metadata:
FLOPs: 10298145792
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 177367414
Tasks:
- Image Classification
ID: gluon_resnext101_32x4d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L193
In Collection: Gloun ResNeXt
- Name: gluon_resnext101_64x4d
Metadata:
FLOPs: 19954172928
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 334737852
Tasks:
- Image Classification
ID: gluon_resnext101_64x4d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L201
In Collection: Gloun ResNeXt
Collections:
- Name: Gloun ResNeXt
Paper:
title: Aggregated Residual Transformations for Deep Neural Networks
url: https://paperswithcode.com/paper/aggregated-residual-transformations-for-deep
type: model-index
Type: model-index
-->

@ -0,0 +1,56 @@
# Summary
A **SENet** is a convolutional neural network architecture that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration.
The weights from this model were ported from Gluon.
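The channel-wise recalibration described above follows three steps: squeeze, excite and scale. The toy sketch below implements them on nested Python lists. It is a simplification, not the timm module: `squeeze_excite` is a hypothetical helper, bias terms are omitted, and the weights are hand-picked so the result is easy to check.

```python
import math


def squeeze_excite(feature_map, w_reduce, w_expand):
    """Apply a squeeze-and-excitation step to a [channels][height][width] list.

    w_reduce has shape [hidden][channels]; w_expand has shape [channels][hidden].
    """
    # Squeeze: global average pooling gives one number per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every channel by its gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Two 2x2 channels and hand-picked weights (illustrative values only).
feature_map = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
w_reduce = [[1.0, 1.0]]   # 2 channels -> 1 hidden unit
w_expand = [[0.0], [0.0]]  # zero weights give a gate of sigmoid(0) = 0.5
scaled, gates = squeeze_excite(feature_map, w_reduce, w_expand)

print(gates)
print(scaled)
```

In a trained network the gates differ per channel and per input, which is what makes the recalibration dynamic.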
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{hu2019squeezeandexcitation,
title={Squeeze-and-Excitation Networks},
author={Jie Hu and Li Shen and Samuel Albanie and Gang Sun and Enhua Wu},
year={2019},
eprint={1709.01507},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: gluon_senet154
Metadata:
FLOPs: 26681705136
Training Data:
- ImageNet
Architecture:
- Convolution
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
- Squeeze-and-Excitation Block
File Size: 461546622
Tasks:
- Image Classification
ID: gluon_senet154
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L239
In Collection: Gloun SENet
Collections:
- Name: Gloun SENet
Paper:
title: Squeeze-and-Excitation Networks
url: https://paperswithcode.com/paper/squeeze-and-excitation-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,113 @@
# Summary
**SE ResNeXt** is a variant of [ResNeXt](https://www.paperswithcode.com/method/resnext) that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration.
The weights from this model were ported from Gluon.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{hu2019squeezeandexcitation,
title={Squeeze-and-Excitation Networks},
author={Jie Hu and Li Shen and Samuel Albanie and Gang Sun and Enhua Wu},
year={2019},
eprint={1709.01507},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: gluon_seresnext50_32x4d
Metadata:
FLOPs: 5475179184
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 110578827
Tasks:
- Image Classification
ID: gluon_seresnext50_32x4d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L209
In Collection: Gloun SEResNeXt
- Name: gluon_seresnext101_32x4d
Metadata:
FLOPs: 10302923504
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 196505510
Tasks:
- Image Classification
ID: gluon_seresnext101_32x4d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L219
In Collection: Gloun SEResNeXt
- Name: gluon_seresnext101_64x4d
Metadata:
FLOPs: 19958950640
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 353875948
Tasks:
- Image Classification
ID: gluon_seresnext101_64x4d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L229
In Collection: Gloun SEResNeXt
Collections:
- Name: Gloun SEResNeXt
Paper:
title: Squeeze-and-Excitation Networks
url: https://paperswithcode.com/paper/squeeze-and-excitation-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,57 @@
# Summary
**Xception** is a convolutional neural network architecture that relies solely on [depthwise separable convolution](https://paperswithcode.com/method/depthwise-separable-convolution) layers. The weights from this model were ported from Gluon.
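To show why depthwise separable convolutions are attractive, the sketch below compares their weight count with that of a standard convolution. The helper names and the 128-to-256 channel example are illustrative assumptions, and biases are ignored.

```python
def standard_conv_params(in_channels, out_channels, kernel_size=3):
    """Weights of a regular convolution: every output sees every input channel."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(in_channels, out_channels, kernel_size=3):
    """Weights of a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel_size * kernel_size * in_channels  # one spatial filter per channel
    pointwise = in_channels * out_channels               # 1x1 channel mixing
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
standard = standard_conv_params(128, 256)
separable = depthwise_separable_params(128, 256)

print(standard, separable, round(standard / separable, 2))
```

Spatial filtering and channel mixing are handled by separate, much smaller operations, which is where the saving comes from.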
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{chollet2017xception,
title={Xception: Deep Learning with Depthwise Separable Convolutions},
author={François Chollet},
year={2017},
eprint={1610.02357},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: gluon_xception65
Metadata:
FLOPs: 17594889728
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Depthwise Separable Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
File Size: 160551306
Tasks:
- Image Classification
ID: gluon_xception65
Crop Pct: '0.903'
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_xception.py#L241
In Collection: Gloun Xception
Collections:
- Name: Gloun Xception
Paper:
title: 'Xception: Deep Learning with Depthwise Separable Convolutions'
url: https://paperswithcode.com/paper/xception-deep-learning-with-depthwise
type: model-index
Type: model-index
-->

@ -0,0 +1,303 @@
# Summary
**HRNet**, or **High-Resolution Net**, is a general purpose convolutional neural network for tasks like semantic segmentation, object detection and image classification. It maintains high-resolution representations through the whole process. The network starts from a high-resolution convolution stream, gradually adds high-to-low resolution convolution streams one by one, and connects the multi-resolution streams in parallel. The resulting network consists of several ($4$ in the paper) stages, and the $n$th stage contains $n$ streams corresponding to $n$ resolutions. The authors conduct repeated multi-resolution fusions by exchanging information across the parallel streams.
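The stage and stream layout described above can be sketched as a small table of feature-map sizes. This is only an illustration under assumptions that are not stated on this page: `hrnet_stream_resolutions` is a hypothetical helper, the stem is assumed to reduce the input by a factor of 4, and each added stream is assumed to halve the resolution of the previous one.

```python
def hrnet_stream_resolutions(input_size=224, num_stages=4, stem_stride=4):
    """List the feature-map size of every parallel stream in every stage.

    Assumes the stem downsamples by `stem_stride` and that each new stream
    has half the resolution of the stream before it.
    """
    top = input_size // stem_stride
    return [[top // (2 ** i) for i in range(n)] for n in range(1, num_stages + 1)]


stages = hrnet_stream_resolutions()
for index, streams in enumerate(stages, start=1):
    print(f"stage {index}: {len(streams)} stream(s) at sizes {streams}")
```

The highest-resolution stream is present in every stage, which is how the network keeps a high-resolution representation from start to finish.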
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{sun2019highresolution,
title={High-Resolution Representations for Labeling Pixels and Regions},
author={Ke Sun and Yang Zhao and Borui Jiang and Tianheng Cheng and Bin Xiao and Dong Liu and Yadong Mu and Xinggang Wang and Wenyu Liu and Jingdong Wang},
year={2019},
eprint={1904.04514},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: hrnet_w18_small
Metadata:
FLOPs: 2071651488
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 52934302
Tasks:
- Image Classification
Training Time: ''
ID: hrnet_w18_small
Layers: 18
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L790
Config: ''
In Collection: HRNet
- Name: hrnet_w18_small_v2
Metadata:
FLOPs: 3360023160
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 62682879
Tasks:
- Image Classification
Training Time: ''
ID: hrnet_w18_small_v2
Layers: 18
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L795
Config: ''
In Collection: HRNet
- Name: hrnet_w32
Metadata:
FLOPs: 11524528320
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 165547812
Tasks:
- Image Classification
Training Time: 60 hours
ID: hrnet_w32
Layers: 32
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L810
Config: ''
In Collection: HRNet
- Name: hrnet_w40
Metadata:
FLOPs: 16381182192
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 230899236
Tasks:
- Image Classification
Training Time: ''
ID: hrnet_w40
Layers: 40
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L815
Config: ''
In Collection: HRNet
- Name: hrnet_w44
Metadata:
FLOPs: 19202520264
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 268957432
Tasks:
- Image Classification
Training Time: ''
ID: hrnet_w44
Layers: 44
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L820
Config: ''
In Collection: HRNet
- Name: hrnet_w48
Metadata:
FLOPs: 22285865760
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 310603710
Tasks:
- Image Classification
Training Time: 80 hours
ID: hrnet_w48
Layers: 48
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L825
Config: ''
In Collection: HRNet
- Name: hrnet_w18
Metadata:
FLOPs: 5547205500
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 85718883
Tasks:
- Image Classification
Training Time: ''
ID: hrnet_w18
Layers: 18
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L800
Config: ''
In Collection: HRNet
- Name: hrnet_w64
Metadata:
FLOPs: 37239321984
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 513071818
Tasks:
- Image Classification
Training Time: ''
ID: hrnet_w64
Layers: 64
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L830
Config: ''
In Collection: HRNet
- Name: hrnet_w30
Metadata:
FLOPs: 10474119492
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 151452218
Tasks:
- Image Classification
Training Time: ''
ID: hrnet_w30
Layers: 30
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L805
Config: ''
In Collection: HRNet
Collections:
- Name: HRNet
Paper:
title: Deep High-Resolution Representation Learning for Visual Recognition
url: https://paperswithcode.com/paper/190807919
type: model-index
Type: model-index
-->

@ -0,0 +1,178 @@
# Summary
A **ResNeXt** repeats a [building block](https://paperswithcode.com/method/resnext-block) that aggregates a set of transformations with the same topology. Compared to a [ResNet](https://paperswithcode.com/method/resnet), it exposes a new dimension, *cardinality* (the size of the set of transformations) $C$, as an essential factor in addition to the dimensions of depth and width.
This model was trained on billions of Instagram images, using thousands of distinct hashtags as labels, and exhibits excellent transfer learning performance.
Please note the CC-BY-NC 4.0 license on these weights: non-commercial use only.
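To make the role of cardinality concrete, here is a minimal sketch that counts the weights of a grouped convolution. The helper name `conv_weight_count` and the 256-channel layer are illustrative assumptions, not taken from the timm code; the only fact relied on is that each output channel of a grouped convolution sees `in_channels / groups` input channels.

```python
def conv_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Number of weights in a 2D convolution (bias excluded)."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channels must be divisible by groups")
    # Each output channel only sees in_channels / groups input channels.
    return (in_channels // groups) * out_channels * kernel_size * kernel_size


# Hypothetical 3x3 layer with 256 input and 256 output channels.
dense = conv_weight_count(256, 256, 3)               # cardinality 1
grouped = conv_weight_count(256, 256, 3, groups=32)  # cardinality 32
print(dense, grouped, dense // grouped)
```

At a fixed width, raising the cardinality shrinks the layer; ResNeXt variants spend that saving on wider groups instead, which is presumably what suffixes such as `32x8d` versus `32x48d` encode (32 groups, with a growing width per group).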
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{mahajan2018exploring,
title={Exploring the Limits of Weakly Supervised Pretraining},
author={Dhruv Mahajan and Ross Girshick and Vignesh Ramanathan and Kaiming He and Manohar Paluri and Yixuan Li and Ashwin Bharambe and Laurens van der Maaten},
year={2018},
eprint={1805.00932},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: ig_resnext101_32x32d
Metadata:
FLOPs: 112225170432
Epochs: 100
Batch Size: 8064
Training Data:
- IG-3.5B-17k
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 336x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 1876573776
Tasks:
- Image Classification
ID: ig_resnext101_32x32d
Layers: 101
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Minibatch Size: 8064
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L885
In Collection: IG ResNeXt
- Name: ig_resnext101_32x16d
Metadata:
FLOPs: 46623691776
Epochs: 100
Batch Size: 8064
Training Data:
- IG-3.5B-17k
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 336x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 777518664
Tasks:
- Image Classification
ID: ig_resnext101_32x16d
Layers: 101
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L874
In Collection: IG ResNeXt
- Name: ig_resnext101_32x48d
Metadata:
FLOPs: 197446554624
Epochs: 100
Batch Size: 8064
Training Data:
- IG-3.5B-17k
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 336x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 3317136976
Tasks:
- Image Classification
ID: ig_resnext101_32x48d
Layers: 101
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L896
In Collection: IG ResNeXt
- Name: ig_resnext101_32x8d
Metadata:
FLOPs: 21180417024
Epochs: 100
Batch Size: 8064
Training Data:
- IG-3.5B-17k
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 336x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 356056638
Tasks:
- Image Classification
ID: ig_resnext101_32x8d
Layers: 101
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L863
In Collection: IG ResNeXt
Collections:
- Name: IG ResNeXt
Paper:
title: Exploring the Limits of Weakly Supervised Pretraining
url: https://paperswithcode.com/paper/exploring-the-limits-of-weakly-supervised
type: model-index
Type: model-index
-->

@ -0,0 +1,65 @@
# Summary
**Inception-ResNet-v2** is a convolutional neural architecture that builds on the Inception family of architectures but incorporates [residual connections](https://paperswithcode.com/method/residual-connection) (replacing the filter concatenation stage of the Inception architecture).
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{szegedy2016inceptionv4,
title={Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning},
author={Christian Szegedy and Sergey Ioffe and Vincent Vanhoucke and Alex Alemi},
year={2016},
eprint={1602.07261},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: inception_resnet_v2
Metadata:
FLOPs: 16959133120
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- RMSProp
- Weight Decay
Training Resources: 20x NVIDIA Kepler GPUs
Architecture:
- Average Pooling
- Dropout
- Inception-ResNet-v2 Reduction-B
- Inception-ResNet-v2-A
- Inception-ResNet-v2-B
- Inception-ResNet-v2-C
- Reduction-A
- Softmax
File Size: 223774238
Tasks:
- Image Classification
ID: inception_resnet_v2
LR: 0.045
Dropout: 0.2
Crop Pct: '0.897'
Momentum: 0.9
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/inception_resnet_v2.py#L343
In Collection: Inception ResNet v2
Collections:
- Name: Inception ResNet v2
Paper:
title: Inception-v4, Inception-ResNet and the Impact of Residual Connections on
Learning
url: https://paperswithcode.com/paper/inception-v4-inception-resnet-and-the-impact
type: model-index
Type: model-index
-->

@ -0,0 +1,78 @@
# Summary
**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements, including [Label Smoothing](https://paperswithcode.com/method/label-smoothing), factorized 7 x 7 convolutions, and an [auxiliary classifier](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
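As a rough illustration of label smoothing, the sketch below mixes a one-hot target with a uniform distribution over the classes. The function name `smooth_labels` and the default `epsilon=0.1` are assumptions chosen for the example; the value used for these weights is not stated here.

```python
def smooth_labels(true_class, num_classes, epsilon=0.1):
    """Mix a one-hot target with a uniform distribution over all classes."""
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform if k == true_class else uniform
        for k in range(num_classes)
    ]


# Toy example: 5 classes, the correct class is index 2.
target = smooth_labels(true_class=2, num_classes=5)
print(target)
```

The smoothed target still sums to 1, but the correct class no longer carries all of the probability mass, which discourages over-confident logits.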
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/SzegedyVISW15,
author = {Christian Szegedy and
Vincent Vanhoucke and
Sergey Ioffe and
Jonathon Shlens and
Zbigniew Wojna},
title = {Rethinking the Inception Architecture for Computer Vision},
journal = {CoRR},
volume = {abs/1512.00567},
year = {2015},
url = {http://arxiv.org/abs/1512.00567},
archivePrefix = {arXiv},
eprint = {1512.00567},
timestamp = {Mon, 13 Aug 2018 16:49:07 +0200},
biburl = {https://dblp.org/rec/journals/corr/SzegedyVISW15.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: inception_v3
Metadata:
FLOPs: 7352418880
Training Data:
- ImageNet
Training Techniques:
- Gradient Clipping
- Label Smoothing
- RMSProp
- Weight Decay
Training Resources: 50x NVIDIA Kepler GPUs
Architecture:
- 1x1 Convolution
- Auxiliary Classifier
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inception-v3 Module
- Max Pooling
- ReLU
- Softmax
File Size: 108857766
Tasks:
- Image Classification
ID: inception_v3
LR: 0.045
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/inception_v3.py#L442
In Collection: Inception v3
Collections:
- Name: Inception v3
Paper:
title: Rethinking the Inception Architecture for Computer Vision
url: https://paperswithcode.com/paper/rethinking-the-inception-architecture-for
type: model-index
Type: model-index
-->

@ -0,0 +1,64 @@
# Summary
**Inception-v4** is a convolutional neural network architecture that builds on previous iterations of the Inception family by simplifying the architecture and using more inception modules than [Inception-v3](https://paperswithcode.com/method/inception-v3).
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{szegedy2016inceptionv4,
title={Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning},
author={Christian Szegedy and Sergey Ioffe and Vincent Vanhoucke and Alex Alemi},
year={2016},
eprint={1602.07261},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: inception_v4
Metadata:
FLOPs: 15806527936
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- RMSProp
- Weight Decay
Training Resources: 20x NVIDIA Kepler GPUs
Architecture:
- Average Pooling
- Dropout
- Inception-A
- Inception-B
- Inception-C
- Reduction-A
- Reduction-B
- Softmax
File Size: 171082495
Tasks:
- Image Classification
ID: inception_v4
LR: 0.045
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/inception_v4.py#L313
In Collection: Inception v4
Collections:
- Name: Inception v4
Paper:
title: Inception-v4, Inception-ResNet and the Impact of Residual Connections on
Learning
url: https://paperswithcode.com/paper/inception-v4-inception-resnet-and-the-impact
type: model-index
Type: model-index
-->

@ -0,0 +1,218 @@
# Summary
**SE ResNet** is a variant of a [ResNet](https://www.paperswithcode.com/method/resnet) that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration.
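The channel-wise recalibration can be sketched without any framework. The function below is a toy, pure-Python squeeze-and-excitation step on a `[C][H][W]` nested list; the name `se_recalibrate`, the omission of biases, and the made-up weights are all assumptions for illustration and do not mirror the timm implementation.

```python
import math


def se_recalibrate(feature_map, w_reduce, w_expand):
    """Apply squeeze-and-excitation to a feature map shaped [C][H][W].

    w_reduce is [C_mid][C] and w_expand is [C][C_mid] (biases omitted).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: bottleneck MLP with ReLU, then a sigmoid gate in (0, 1).
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce
    ]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input: 2 channels of 2x2, with made-up weights (reduction to 1 unit).
x = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
w_reduce = [[0.5, 0.5]]
w_expand = [[1.0], [-1.0]]
y, gates = se_recalibrate(x, w_reduce, w_expand)
print(gates)
```

Because the gates are computed from the input itself, the same block can emphasise different channels for different images, which is what "dynamic" refers to above.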
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{hu2019squeezeandexcitation,
title={Squeeze-and-Excitation Networks},
author={Jie Hu and Li Shen and Samuel Albanie and Gang Sun and Enhua Wu},
year={2019},
eprint={1709.01507},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: legacy_seresnet101
Metadata:
FLOPs: 9762614000
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 197822624
Tasks:
- Image Classification
ID: legacy_seresnet101
LR: 0.6
Layers: 101
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L426
In Collection: Legacy SE ResNet
- Name: legacy_seresnet152
Metadata:
FLOPs: 14553578160
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 268033864
Tasks:
- Image Classification
ID: legacy_seresnet152
LR: 0.6
Layers: 152
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L433
In Collection: Legacy SE ResNet
- Name: legacy_seresnet18
Metadata:
FLOPs: 2328876024
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 47175663
Tasks:
- Image Classification
ID: legacy_seresnet18
LR: 0.6
Layers: 18
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L405
In Collection: Legacy SE ResNet
- Name: legacy_seresnet34
Metadata:
FLOPs: 4706201004
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 87958697
Tasks:
- Image Classification
ID: legacy_seresnet34
LR: 0.6
Layers: 34
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L412
In Collection: Legacy SE ResNet
- Name: legacy_seresnet50
Metadata:
FLOPs: 4974351024
Epochs: 100
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 112611220
Tasks:
- Image Classification
ID: legacy_seresnet50
LR: 0.6
Layers: 50
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bilinear
Minibatch Size: 1024
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L419
In Collection: Legacy SE ResNet
Collections:
- Name: Legacy SE ResNet
Paper:
title: Squeeze-and-Excitation Networks
url: https://paperswithcode.com/paper/squeeze-and-excitation-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,144 @@
# Summary
**SE ResNeXt** is a variant of a [ResNeXt](https://www.paperswithcode.com/method/resnext) that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{hu2019squeezeandexcitation,
title={Squeeze-and-Excitation Networks},
author={Jie Hu and Li Shen and Samuel Albanie and Gang Sun and Enhua Wu},
year={2019},
eprint={1709.01507},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: legacy_seresnext101_32x4d
Metadata:
FLOPs: 10287698672
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 196466866
Tasks:
- Image Classification
ID: legacy_seresnext101_32x4d
LR: 0.6
Layers: 101
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L462
In Collection: Legacy SE ResNeXt
- Name: legacy_seresnext26_32x4d
Metadata:
FLOPs: 3187342304
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 67346327
Tasks:
- Image Classification
ID: legacy_seresnext26_32x4d
LR: 0.6
Layers: 26
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L448
In Collection: Legacy SE ResNeXt
- Name: legacy_seresnext50_32x4d
Metadata:
FLOPs: 5459954352
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 110559176
Tasks:
- Image Classification
ID: legacy_seresnext50_32x4d
LR: 0.6
Layers: 50
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L455
In Collection: Legacy SE ResNeXt
Collections:
- Name: Legacy SE ResNeXt
Paper:
title: Squeeze-and-Excitation Networks
url: https://paperswithcode.com/paper/squeeze-and-excitation-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,67 @@
# Summary
A **SENet** is a convolutional neural network architecture that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration.
The weights from this model were ported from Gluon.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{hu2019squeezeandexcitation,
title={Squeeze-and-Excitation Networks},
author={Jie Hu and Li Shen and Samuel Albanie and Gang Sun and Enhua Wu},
year={2019},
eprint={1709.01507},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: legacy_senet154
Metadata:
FLOPs: 26659556016
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- Convolution
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
- Squeeze-and-Excitation Block
File Size: 461488402
Tasks:
- Image Classification
ID: legacy_senet154
LR: 0.6
Layers: 154
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L440
In Collection: Legacy SENet
Collections:
- Name: Legacy SENet
Paper:
title: Squeeze-and-Excitation Networks
url: https://paperswithcode.com/paper/squeeze-and-excitation-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,133 @@
# Summary
**MixNet** is a type of convolutional neural network discovered via AutoML that utilises [MixConvs](https://paperswithcode.com/method/mixconv) instead of regular [depthwise convolutions](https://paperswithcode.com/method/depthwise-convolution).
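A MixConv layer partitions the channels into groups and applies a depthwise kernel of a different size to each group. The sketch below shows one plausible way to do the partitioning and count the resulting weights; the helper names and the choice to give the remainder to the first group are assumptions for illustration.

```python
def split_channels(num_channels, num_groups):
    """Split channels as evenly as possible; the first group takes the remainder."""
    split = [num_channels // num_groups] * num_groups
    split[0] += num_channels - sum(split)
    return split


def mixconv_weight_count(num_channels, kernel_sizes):
    """Weights in a MixConv layer: one depthwise kernel per channel, sized by group."""
    groups = split_channels(num_channels, len(kernel_sizes))
    return sum(c * k * k for c, k in zip(groups, kernel_sizes))


print(split_channels(32, 3))                # [12, 10, 10]
print(mixconv_weight_count(32, [3, 5, 7]))  # 848
print(mixconv_weight_count(32, [7]))        # 1568, plain 7x7 depthwise
```

Mixing 3x3, 5x5 and 7x7 kernels gives the layer several receptive-field sizes at roughly half the weight count of using 7x7 everywhere in this toy case.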
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{tan2019mixconv,
title={MixConv: Mixed Depthwise Convolutional Kernels},
author={Mingxing Tan and Quoc V. Le},
year={2019},
eprint={1907.09595},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: mixnet_xl
Metadata:
FLOPs: 1195880424
Training Data:
- ImageNet
Training Techniques:
- MNAS
Architecture:
- Batch Normalization
- Dense Connections
- Dropout
- Global Average Pooling
- Grouped Convolution
- MixConv
- Squeeze-and-Excitation Block
- Swish
File Size: 48001170
Tasks:
- Image Classification
ID: mixnet_xl
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1678
In Collection: MixNet
- Name: mixnet_m
Metadata:
FLOPs: 454543374
Training Data:
- ImageNet
Training Techniques:
- MNAS
Architecture:
- Batch Normalization
- Dense Connections
- Dropout
- Global Average Pooling
- Grouped Convolution
- MixConv
- Squeeze-and-Excitation Block
- Swish
File Size: 20298347
Tasks:
- Image Classification
ID: mixnet_m
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1660
In Collection: MixNet
- Name: mixnet_s
Metadata:
FLOPs: 321264910
Training Data:
- ImageNet
Training Techniques:
- MNAS
Architecture:
- Batch Normalization
- Dense Connections
- Dropout
- Global Average Pooling
- Grouped Convolution
- MixConv
- Squeeze-and-Excitation Block
- Swish
File Size: 16727982
Tasks:
- Image Classification
ID: mixnet_s
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1651
In Collection: MixNet
- Name: mixnet_l
Metadata:
FLOPs: 738671316
Training Data:
- ImageNet
Training Techniques:
- MNAS
Architecture:
- Batch Normalization
- Dense Connections
- Dropout
- Global Average Pooling
- Grouped Convolution
- MixConv
- Squeeze-and-Excitation Block
- Swish
File Size: 29608232
Tasks:
- Image Classification
ID: mixnet_l
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1669
In Collection: MixNet
Collections:
- Name: MixNet
Paper:
title: 'MixConv: Mixed Depthwise Convolutional Kernels'
url: https://paperswithcode.com/paper/mixnet-mixed-depthwise-convolutional-kernels
type: model-index
Type: model-index
-->

@ -0,0 +1,94 @@
# Summary
**MnasNet** is a type of convolutional neural network optimized for mobile devices that is discovered through mobile neural architecture search, which explicitly incorporates model latency into the main objective so that the search can identify a model that achieves a good trade-off between accuracy and latency. The main building block is an [inverted residual block](https://paperswithcode.com/method/inverted-residual-block) (from [MobileNetV2](https://paperswithcode.com/method/mobilenetv2)).
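One way to fold latency into the search objective is a weighted product of accuracy and a latency penalty. The sketch below follows the form recalled from the MnasNet paper, `ACC(m) * (LAT(m) / T) ** w`; the function name and the default exponents of -0.07 are assumptions here and should be checked against the paper before being relied on.

```python
def mnas_reward(accuracy, latency_ms, target_ms, alpha=-0.07, beta=-0.07):
    """Latency-aware reward: accuracy scaled by a soft latency penalty.

    alpha applies when the model meets the latency target, beta when it does not.
    """
    exponent = alpha if latency_ms <= target_ms else beta
    return accuracy * (latency_ms / target_ms) ** exponent


# Hypothetical candidates with equal accuracy and an 80 ms target.
for latency in (40.0, 80.0, 160.0):
    print(latency, mnas_reward(0.75, latency, target_ms=80.0))
```

With equal exponents the penalty is smooth: a model exactly on target keeps its accuracy as the reward, a slower one is marked down, and a faster one is marked up.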
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{tan2019mnasnet,
title={MnasNet: Platform-Aware Neural Architecture Search for Mobile},
author={Mingxing Tan and Bo Chen and Ruoming Pang and Vijay Vasudevan and Mark Sandler and Andrew Howard and Quoc V. Le},
year={2019},
eprint={1807.11626},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: semnasnet_100
Metadata:
FLOPs: 414570766
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- Global Average Pooling
- Inverted Residual Block
- Max Pooling
- ReLU
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 15731489
Tasks:
- Image Classification
ID: semnasnet_100
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L928
In Collection: MNASNet
- Name: mnasnet_100
Metadata:
FLOPs: 416415488
Batch Size: 4000
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- Global Average Pooling
- Inverted Residual Block
- Max Pooling
- ReLU
- Residual Connection
- Softmax
File Size: 17731774
Tasks:
- Image Classification
ID: mnasnet_100
Layers: 100
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bicubic
RMSProp Decay: 0.9
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L894
In Collection: MNASNet
Collections:
- Name: MNASNet
Paper:
title: 'MnasNet: Platform-Aware Neural Architecture Search for Mobile'
url: https://paperswithcode.com/paper/mnasnet-platform-aware-neural-architecture
type: model-index
Type: model-index
-->

@ -0,0 +1,179 @@
# Summary
**MobileNetV2** is a convolutional neural network architecture that seeks to perform well on mobile devices. It is based on an [inverted residual structure](https://paperswithcode.com/method/inverted-residual-block) where the residual connections are between the bottleneck layers. The intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity. As a whole, the architecture of MobileNetV2 contains an initial fully convolutional layer with 32 filters, followed by 19 residual bottleneck layers.
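The inverted residual structure can be illustrated by counting weights per stage: a 1x1 expansion, a depthwise filter on the widened tensor, and a linear 1x1 projection back to a narrow bottleneck. The helper names, the expansion factor of 6, and the 24-channel example are assumptions for this sketch (biases and batch norm are ignored).

```python
def inverted_residual_weights(in_channels, out_channels, expansion=6, kernel_size=3):
    """Weight count per stage of an inverted residual block (no biases or BN)."""
    hidden = in_channels * expansion
    return {
        "expand_1x1": in_channels * hidden,                # pointwise, widens
        "depthwise": hidden * kernel_size * kernel_size,   # one filter per channel
        "project_1x1": hidden * out_channels,              # linear, narrows
    }


def uses_skip(in_channels, out_channels, stride):
    """The residual add is only shape-compatible for stride 1 and equal widths."""
    return stride == 1 and in_channels == out_channels


weights = inverted_residual_weights(24, 24)
print(weights, sum(weights.values()), uses_skip(24, 24, stride=1))
```

Note that the skip connection joins the two narrow ends of the block, which is the sense in which the residual is "inverted" relative to a classic bottleneck.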
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/abs-1801-04381,
author = {Mark Sandler and
Andrew G. Howard and
Menglong Zhu and
Andrey Zhmoginov and
Liang{-}Chieh Chen},
title = {Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification,
Detection and Segmentation},
journal = {CoRR},
volume = {abs/1801.04381},
year = {2018},
url = {http://arxiv.org/abs/1801.04381},
archivePrefix = {arXiv},
eprint = {1801.04381},
timestamp = {Tue, 12 Jan 2021 15:30:06 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-1801-04381.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: mobilenetv2_100
Metadata:
FLOPs: 401920448
Batch Size: 1536
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 16x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- Inverted Residual Block
- Max Pooling
- ReLU6
- Residual Connection
- Softmax
File Size: 14202571
Tasks:
- Image Classification
ID: mobilenetv2_100
LR: 0.045
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 4.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L955
In Collection: MobileNet V2
- Name: mobilenetv2_110d
Metadata:
FLOPs: 573958832
Batch Size: 1536
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 16x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- Inverted Residual Block
- Max Pooling
- ReLU6
- Residual Connection
- Softmax
File Size: 18316431
Tasks:
- Image Classification
ID: mobilenetv2_110d
LR: 0.045
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 4.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L969
In Collection: MobileNet V2
- Name: mobilenetv2_120d
Metadata:
FLOPs: 888510048
Batch Size: 1536
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 16x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- Inverted Residual Block
- Max Pooling
- ReLU6
- Residual Connection
- Softmax
File Size: 23651121
Tasks:
- Image Classification
ID: mobilenetv2_120d
LR: 0.045
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 4.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L977
In Collection: MobileNet V2
- Name: mobilenetv2_140
Metadata:
FLOPs: 770196784
Batch Size: 1536
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 16x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- Inverted Residual Block
- Max Pooling
- ReLU6
- Residual Connection
- Softmax
File Size: 24673555
Tasks:
- Image Classification
ID: mobilenetv2_140
LR: 0.045
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 4.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L962
In Collection: MobileNet V2
Collections:
- Name: MobileNet V2
Paper:
title: 'MobileNetV2: Inverted Residuals and Linear Bottlenecks'
url: https://paperswithcode.com/paper/mobilenetv2-inverted-residuals-and-linear
type: model-index
Type: model-index
-->

@ -0,0 +1,123 @@
# Summary
**MobileNetV3** is a convolutional neural network that is designed for mobile phone CPUs. The network design includes the use of a [hard swish activation](https://paperswithcode.com/method/hard-swish) and [squeeze-and-excitation](https://paperswithcode.com/method/squeeze-and-excitation-block) modules in the [MBConv blocks](https://paperswithcode.com/method/inverted-residual-block).
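For reference, hard swish replaces the sigmoid in swish with a piecewise-linear ReLU6 term, which is cheaper on mobile CPUs. The scalar sketch below is framework-free and only meant to show the formula; the function names are illustrative.

```python
def relu6(x):
    """Clamp x to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def hard_swish(x):
    """h-swish(x) = x * ReLU6(x + 3) / 6, a piecewise-linear stand-in for swish."""
    return x * relu6(x + 3.0) / 6.0


for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(x, hard_swish(x))
```

The function is zero for inputs at or below -3, the identity for inputs at or above 3, and a smooth-looking quadratic in between.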
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/abs-1905-02244,
author = {Andrew Howard and
Mark Sandler and
Grace Chu and
Liang{-}Chieh Chen and
Bo Chen and
Mingxing Tan and
Weijun Wang and
Yukun Zhu and
Ruoming Pang and
Vijay Vasudevan and
Quoc V. Le and
Hartwig Adam},
title = {Searching for MobileNetV3},
journal = {CoRR},
volume = {abs/1905.02244},
year = {2019},
url = {http://arxiv.org/abs/1905.02244},
archivePrefix = {arXiv},
eprint = {1905.02244},
timestamp = {Tue, 12 Jan 2021 15:30:06 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-1905-02244.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: mobilenetv3_rw
Metadata:
FLOPs: 287190638
Batch Size: 4096
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 4x4 TPU Pod
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Depthwise Separable Convolution
- Dropout
- Global Average Pooling
- Hard Swish
- Inverted Residual Block
- ReLU
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 22064048
Tasks:
- Image Classification
ID: mobilenetv3_rw
LR: 0.1
Dropout: 0.8
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/mobilenetv3.py#L384
In Collection: MobileNet V3
- Name: mobilenetv3_large_100
Metadata:
FLOPs: 287193752
Batch Size: 4096
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 4x4 TPU Pod
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Depthwise Separable Convolution
- Dropout
- Global Average Pooling
- Hard Swish
- Inverted Residual Block
- ReLU
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 22076443
Tasks:
- Image Classification
ID: mobilenetv3_large_100
LR: 0.1
Dropout: 0.8
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/mobilenetv3.py#L363
In Collection: MobileNet V3
Collections:
- Name: MobileNet V3
Paper:
title: Searching for MobileNetV3
url: https://paperswithcode.com/paper/searching-for-mobilenetv3
type: model-index
Type: model-index
-->

@ -0,0 +1,65 @@
# Summary
**NASNet** is a type of convolutional neural network discovered through neural architecture search. The building blocks consist of normal and reduction cells.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{zoph2018learning,
title={Learning Transferable Architectures for Scalable Image Recognition},
author={Barret Zoph and Vijay Vasudevan and Jonathon Shlens and Quoc V. Le},
year={2018},
eprint={1707.07012},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: nasnetalarge
Metadata:
FLOPs: 30242402862
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- RMSProp
- Weight Decay
Training Resources: 50x Tesla K40 GPUs
Architecture:
- Average Pooling
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- ReLU
File Size: 356056626
Tasks:
- Image Classification
Training Time: ''
ID: nasnetalarge
Dropout: 0.5
Crop Pct: '0.911'
Momentum: 0.9
Image Size: '331'
Interpolation: bicubic
Label Smoothing: 0.1
RMSProp $\epsilon$: 1.0
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/nasnet.py#L562
Config: ''
In Collection: NASNet
Collections:
- Name: NASNet
Paper:
title: Learning Transferable Architectures for Scalable Image Recognition
url: https://paperswithcode.com/paper/learning-transferable-architectures-for
type: model-index
Type: model-index
-->

@ -0,0 +1,456 @@
# Summary
**Noisy Student Training** is a semi-supervised learning approach. It extends the idea of self-training and distillation with the use of equal-or-larger student models and noise added to the student during learning. It has three main steps:
1. Train a teacher model on labeled images.
2. Use the teacher to generate pseudo labels on unlabeled images.
3. Train a student model on the combination of labeled images and pseudo-labeled images.
The algorithm is iterated a few times by treating the student as a teacher to relabel the unlabeled data and training a new student.
Noisy Student Training seeks to improve on self-training and distillation in two ways. First, it makes the student larger than, or at least equal to, the teacher so the student can better learn from a larger dataset. Second, it adds noise to the student so the noised student is forced to learn harder from the pseudo labels. To noise the student, it uses input noise such as RandAugment data augmentation, and model noise such as dropout and stochastic depth during training.
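The teacher/student loop described above can be sketched with a deliberately tiny stand-in model. Everything in this example is an assumption made for illustration: the "model" is a 1-D threshold classifier, the "noise" is Gaussian jitter on the inputs (standing in for RandAugment, dropout, and stochastic depth), and the helper names `train_threshold` and `noisy_student` are hypothetical. A toy like this cannot express the equal-or-larger student, so it only shows the control flow.

```python
import random


def train_threshold(examples, noise, rng):
    """Fit a 1-D threshold classifier at the midpoint of the two class means.

    When noise is True, inputs are jittered before fitting; this is a toy
    stand-in for the input and model noise applied to the student.
    """
    sums, counts = [0.0, 0.0], [0, 0]
    for x, y in examples:
        if noise:
            x += rng.gauss(0.0, 0.05)
        sums[y] += x
        counts[y] += 1
    threshold = (sums[0] / counts[0] + sums[1] / counts[1]) / 2.0
    return lambda x: int(x > threshold)


def noisy_student(labeled, unlabeled, iterations=3, seed=0):
    rng = random.Random(seed)
    # Step 1: train a teacher on labeled data only (no noise).
    teacher = train_threshold(labeled, noise=False, rng=rng)
    for _ in range(iterations):
        # Step 2: the teacher generates pseudo labels for the unlabeled data.
        pseudo = [(x, teacher(x)) for x in unlabeled]
        # Step 3: train a noised student on labeled plus pseudo-labeled data.
        student = train_threshold(labeled + pseudo, noise=True, rng=rng)
        # Iterate: the student becomes the teacher for the next round.
        teacher = student
    return teacher


labeled = [(0.0, 0), (0.2, 0), (0.8, 1), (1.0, 1)]
unlabeled = [0.1, 0.3, 0.7, 0.9]
model = noisy_student(labeled, unlabeled)
print(model(0.1), model(0.9))
```

In the real method the teacher and student are full image classifiers (EfficientNets for the models listed here), and only the student is trained with noise.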
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{xie2020selftraining,
title={Self-training with Noisy Student improves ImageNet classification},
author={Qizhe Xie and Minh-Thang Luong and Eduard Hovy and Quoc V. Le},
year={2020},
eprint={1911.04252},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
<!--
Models:
- Name: tf_efficientnet_b3_ns
Metadata:
FLOPs: 2275247568
Epochs: 700
Batch Size: 2048
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- AutoAugment
- FixRes
- Label Smoothing
- Noisy Student
- RMSProp
- RandAugment
- Weight Decay
Training Resources: Cloud TPU v3 Pod
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 49385734
Tasks:
- Image Classification
Training Time: ''
ID: tf_efficientnet_b3_ns
LR: 0.128
Dropout: 0.5
Crop Pct: '0.904'
Momentum: 0.9
Image Size: '300'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Stochastic Depth Survival: 0.8
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1457
Config: ''
In Collection: Noisy Student
- Name: tf_efficientnet_b1_ns
Metadata:
FLOPs: 883633200
Epochs: 700
Batch Size: 2048
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- AutoAugment
- FixRes
- Label Smoothing
- Noisy Student
- RMSProp
- RandAugment
- Weight Decay
Training Resources: Cloud TPU v3 Pod
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 31516408
Tasks:
- Image Classification
Training Time: ''
ID: tf_efficientnet_b1_ns
LR: 0.128
Dropout: 0.5
Crop Pct: '0.882'
Momentum: 0.9
Image Size: '240'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Stochastic Depth Survival: 0.8
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1437
Config: ''
In Collection: Noisy Student
- Name: tf_efficientnet_l2_ns
Metadata:
FLOPs: 611646113804
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- AutoAugment
- FixRes
- Label Smoothing
- Noisy Student
- RMSProp
- RandAugment
- Weight Decay
Training Resources: Cloud TPU v3 Pod
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 1925950424
Tasks:
- Image Classification
Training Time: 6 days
ID: tf_efficientnet_l2_ns
LR: 0.128
Dropout: 0.5
Crop Pct: '0.96'
Momentum: 0.9
Image Size: '800'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Stochastic Depth Survival: 0.8
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1520
Config: ''
In Collection: Noisy Student
- Name: tf_efficientnet_b0_ns
Metadata:
FLOPs: 488688572
Epochs: 700
Batch Size: 2048
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- AutoAugment
- FixRes
- Label Smoothing
- Noisy Student
- RMSProp
- RandAugment
- Weight Decay
Training Resources: Cloud TPU v3 Pod
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 21386709
Tasks:
- Image Classification
Training Time: ''
ID: tf_efficientnet_b0_ns
LR: 0.128
Dropout: 0.5
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Stochastic Depth Survival: 0.8
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1427
Config: ''
In Collection: Noisy Student
- Name: tf_efficientnet_b2_ns
Metadata:
FLOPs: 1234321170
Epochs: 700
Batch Size: 2048
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- AutoAugment
- FixRes
- Label Smoothing
- Noisy Student
- RMSProp
- RandAugment
- Weight Decay
Training Resources: Cloud TPU v3 Pod
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 36801803
Tasks:
- Image Classification
Training Time: ''
ID: tf_efficientnet_b2_ns
LR: 0.128
Dropout: 0.5
Crop Pct: '0.89'
Momentum: 0.9
Image Size: '260'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Stochastic Depth Survival: 0.8
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1447
Config: ''
In Collection: Noisy Student
- Name: tf_efficientnet_b5_ns
Metadata:
FLOPs: 13176501888
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- AutoAugment
- FixRes
- Label Smoothing
- Noisy Student
- RMSProp
- RandAugment
- Weight Decay
Training Resources: Cloud TPU v3 Pod
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 122404944
Tasks:
- Image Classification
Training Time: ''
ID: tf_efficientnet_b5_ns
LR: 0.128
Dropout: 0.5
Crop Pct: '0.934'
Momentum: 0.9
Image Size: '456'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Stochastic Depth Survival: 0.8
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1477
Config: ''
In Collection: Noisy Student
- Name: tf_efficientnet_b6_ns
Metadata:
FLOPs: 24180518488
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- AutoAugment
- FixRes
- Label Smoothing
- Noisy Student
- RMSProp
- RandAugment
- Weight Decay
Training Resources: Cloud TPU v3 Pod
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 173239537
Tasks:
- Image Classification
Training Time: ''
ID: tf_efficientnet_b6_ns
LR: 0.128
Dropout: 0.5
Crop Pct: '0.942'
Momentum: 0.9
Image Size: '528'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Stochastic Depth Survival: 0.8
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1487
Config: ''
In Collection: Noisy Student
- Name: tf_efficientnet_b4_ns
Metadata:
FLOPs: 5749638672
Epochs: 700
Batch Size: 2048
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- AutoAugment
- FixRes
- Label Smoothing
- Noisy Student
- RMSProp
- RandAugment
- Weight Decay
Training Resources: Cloud TPU v3 Pod
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 77995057
Tasks:
- Image Classification
Training Time: ''
ID: tf_efficientnet_b4_ns
LR: 0.128
Dropout: 0.5
Crop Pct: '0.922'
Momentum: 0.9
Image Size: '380'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Stochastic Depth Survival: 0.8
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1467
Config: ''
In Collection: Noisy Student
- Name: tf_efficientnet_b7_ns
Metadata:
FLOPs: 48205304880
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- AutoAugment
- FixRes
- Label Smoothing
- Noisy Student
- RMSProp
- RandAugment
- Weight Decay
Training Resources: Cloud TPU v3 Pod
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 266853140
Tasks:
- Image Classification
Training Time: ''
ID: tf_efficientnet_b7_ns
LR: 0.128
Dropout: 0.5
Crop Pct: '0.949'
Momentum: 0.9
Image Size: '600'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Stochastic Depth Survival: 0.8
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1498
Config: ''
In Collection: Noisy Student
Collections:
- Name: Noisy Student
Paper:
title: Self-training with Noisy Student improves ImageNet classification
url: https://paperswithcode.com/paper/self-training-with-noisy-student-improves
type: model-index
Type: model-index
-->

@ -0,0 +1,64 @@
# Summary
**Progressive Neural Architecture Search**, or **PNAS**, is a method for learning the structure of convolutional neural networks (CNNs). It uses a sequential model-based optimization (SMBO) strategy, where we search the space of cell structures, starting with simple (shallow) models and progressing to complex ones, pruning out unpromising structures as we go.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{liu2018progressive,
title={Progressive Neural Architecture Search},
author={Chenxi Liu and Barret Zoph and Maxim Neumann and Jonathon Shlens and Wei Hua and Li-Jia Li and Li Fei-Fei and Alan Yuille and Jonathan Huang and Kevin Murphy},
year={2018},
eprint={1712.00559},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: pnasnet5large
Metadata:
FLOPs: 31458865950
Batch Size: 1600
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- RMSProp
- Weight Decay
Training Resources: 100x NVIDIA P100 GPUs
Architecture:
- Average Pooling
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- ReLU
File Size: 345153926
Tasks:
- Image Classification
ID: pnasnet5large
LR: 0.015
Dropout: 0.5
Crop Pct: '0.911'
Momentum: 0.9
Image Size: '331'
Interpolation: bicubic
Label Smoothing: 0.1
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/pnasnet.py#L343
In Collection: PNASNet
Collections:
- Name: PNASNet
Paper:
title: Progressive Neural Architecture Search
url: https://paperswithcode.com/paper/progressive-neural-architecture-search
type: model-index
Type: model-index
-->

@ -0,0 +1,397 @@
# Summary
**RegNetX** is a convolutional network design space of simple, regular models parameterised by depth $d$, initial width $w\_{0} > 0$, and slope $w\_{a} > 0$, which together generate a different block width $u\_{j}$ for each block $j < d$. The key restriction for RegNet models is that block widths follow a linear parameterisation (the design space only contains models with this linear structure):
$$ u\_{j} = w\_{0} + w\_{a}\cdot{j} $$
For **RegNetX** we have additional restrictions: we set $b = 1$ (the bottleneck ratio), $12 \leq d \leq 28$, and $w\_{m} \geq 2$ (the width multiplier).
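To make the linear width rule concrete, the following minimal sketch evaluates `u_j = w_0 + w_a * j` for every block. The function name `regnet_block_widths` and the parameter values are hypothetical, chosen only so that the depth falls inside the stated range of 12 to 28; they do not correspond to any particular pretrained model.

```python
def regnet_block_widths(w_0: float, w_a: float, d: int) -> list:
    """Return the linear block widths u_j = w_0 + w_a * j for j = 0 .. d - 1."""
    if w_0 <= 0 or w_a <= 0:
        raise ValueError("w_0 and w_a must both be positive")
    return [w_0 + w_a * j for j in range(d)]


# Hypothetical parameters chosen only for illustration.
widths = regnet_block_widths(w_0=24, w_a=36.0, d=13)
print(widths[:4])
```

In the RegNet paper these continuous widths are subsequently quantised using the width multiplier; that step is omitted from this sketch.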
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{radosavovic2020designing,
title={Designing Network Design Spaces},
author={Ilija Radosavovic and Raj Prateek Kosaraju and Ross Girshick and Kaiming He and Piotr Dollár},
year={2020},
eprint={2003.13678},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: regnetx_040
Metadata:
FLOPs: 5095167744
Epochs: 100
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
File Size: 88844824
Tasks:
- Image Classification
ID: regnetx_040
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L373
In Collection: RegNetX
- Name: regnetx_004
Metadata:
FLOPs: 510619136
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
File Size: 20841309
Tasks:
- Image Classification
ID: regnetx_004
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L343
In Collection: RegNetX
- Name: regnetx_006
Metadata:
FLOPs: 771659136
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
File Size: 24965172
Tasks:
- Image Classification
ID: regnetx_006
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L349
In Collection: RegNetX
- Name: regnetx_002
Metadata:
FLOPs: 255276032
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
File Size: 10862199
Tasks:
- Image Classification
ID: regnetx_002
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L337
In Collection: RegNetX
- Name: regnetx_008
Metadata:
FLOPs: 1027038208
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
File Size: 29235944
Tasks:
- Image Classification
ID: regnetx_008
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L355
In Collection: RegNetX
- Name: regnetx_016
Metadata:
FLOPs: 2059337856
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
File Size: 36988158
Tasks:
- Image Classification
ID: regnetx_016
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L361
In Collection: RegNetX
- Name: regnetx_032
Metadata:
FLOPs: 4082555904
Epochs: 100
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
File Size: 61509573
Tasks:
- Image Classification
ID: regnetx_032
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L367
In Collection: RegNetX
- Name: regnetx_064
Metadata:
FLOPs: 8303405824
Epochs: 100
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
File Size: 105184854
Tasks:
- Image Classification
ID: regnetx_064
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L379
In Collection: RegNetX
- Name: regnetx_080
Metadata:
FLOPs: 10276726784
Epochs: 100
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
File Size: 158720042
Tasks:
- Image Classification
ID: regnetx_080
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L385
In Collection: RegNetX
- Name: regnetx_120
Metadata:
FLOPs: 15536378368
Epochs: 100
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
File Size: 184866342
Tasks:
- Image Classification
ID: regnetx_120
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L391
In Collection: RegNetX
- Name: regnetx_160
Metadata:
FLOPs: 20491740672
Epochs: 100
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
File Size: 217623862
Tasks:
- Image Classification
ID: regnetx_160
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L397
In Collection: RegNetX
- Name: regnetx_320
Metadata:
FLOPs: 40798958592
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
File Size: 431962133
Tasks:
- Image Classification
ID: regnetx_320
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L403
In Collection: RegNetX
Collections:
- Name: RegNetX
Paper:
title: Designing Network Design Spaces
url: https://paperswithcode.com/paper/designing-network-design-spaces
type: model-index
Type: model-index
-->

@ -0,0 +1,411 @@
# Summary
**RegNetY** is a convolutional network design space of simple, regular models parameterised by depth $d$, initial width $w\_{0} > 0$, and slope $w\_{a} > 0$, which together generate a different block width $u\_{j}$ for each block $j < d$. The key restriction for RegNet models is that block widths follow a linear parameterisation (the design space only contains models with this linear structure):
$$ u\_{j} = w\_{0} + w\_{a}\cdot{j} $$
For **RegNetX**, the authors impose additional restrictions: $b = 1$ (the bottleneck ratio), $12 \leq d \leq 28$, and $w\_{m} \geq 2$ (the width multiplier).
For **RegNetY**, the authors make one change, which is to include [Squeeze-and-Excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block).
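For readers unfamiliar with Squeeze-and-Excitation, the sketch below shows the general idea on nested Python lists: pool each channel to a single value, pass the result through a small two-layer gate, and rescale the channels. It is an assumption-laden illustration rather than the timm implementation; biases are omitted, and the name `squeeze_excite` and the weight layout are hypothetical.

```python
import math


def squeeze_excite(x, w1, w2):
    """Apply a Squeeze-and-Excitation gate to a feature map.

    x:  list of C channels, each a flat list of spatial values.
    w1: R x C weights of the reduction layer (biases omitted).
    w2: C x R weights of the expansion layer (biases omitted).
    """
    # Squeeze: global average pool, one scalar per channel.
    squeezed = [sum(channel) / len(channel) for channel in x]
    # Excite: reduction layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Expansion layer followed by a sigmoid gives one gate per channel.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale every value in a channel by that channel's gate.
    return [[g * v for v in channel] for g, channel in zip(gates, x)]


x = [[1.0, 3.0], [2.0, 2.0]]   # 2 channels, 2 spatial positions each
w1 = [[1.0, 1.0]]              # reduce 2 channels to 1 hidden unit
w2 = [[0.0], [0.0]]            # zero weights, so every gate is sigmoid(0)
out = squeeze_excite(x, w1, w2)
print(out)
```

With all-zero expansion weights each gate equals sigmoid(0), so every channel is simply halved, which makes the example easy to check by hand.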
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{radosavovic2020designing,
title={Designing Network Design Spaces},
author={Ilija Radosavovic and Raj Prateek Kosaraju and Ross Girshick and Kaiming He and Piotr Dollár},
year={2020},
eprint={2003.13678},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: regnety_002
Metadata:
FLOPs: 255754236
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
- Squeeze-and-Excitation Block
File Size: 12782926
Tasks:
- Image Classification
ID: regnety_002
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L409
In Collection: RegNetY
- Name: regnety_016
Metadata:
FLOPs: 2070895094
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
- Squeeze-and-Excitation Block
File Size: 45115589
Tasks:
- Image Classification
ID: regnety_016
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L433
In Collection: RegNetY
- Name: regnety_004
Metadata:
FLOPs: 515664568
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
- Squeeze-and-Excitation Block
File Size: 17542753
Tasks:
- Image Classification
ID: regnety_004
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L415
In Collection: RegNetY
- Name: regnety_006
Metadata:
FLOPs: 771746928
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
- Squeeze-and-Excitation Block
File Size: 24394127
Tasks:
- Image Classification
ID: regnety_006
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L421
In Collection: RegNetY
- Name: regnety_008
Metadata:
FLOPs: 1023448952
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
- Squeeze-and-Excitation Block
File Size: 25223268
Tasks:
- Image Classification
ID: regnety_008
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L427
In Collection: RegNetY
- Name: regnety_032
Metadata:
FLOPs: 4081118714
Epochs: 100
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
- Squeeze-and-Excitation Block
File Size: 78084523
Tasks:
- Image Classification
ID: regnety_032
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L439
In Collection: RegNetY
- Name: regnety_080
Metadata:
FLOPs: 10233621420
Epochs: 100
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
- Squeeze-and-Excitation Block
File Size: 157124671
Tasks:
- Image Classification
ID: regnety_080
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L457
In Collection: RegNetY
- Name: regnety_040
Metadata:
FLOPs: 5105933432
Epochs: 100
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
- Squeeze-and-Excitation Block
File Size: 82913909
Tasks:
- Image Classification
ID: regnety_040
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L445
In Collection: RegNetY
- Name: regnety_064
Metadata:
FLOPs: 8167730444
Epochs: 100
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
- Squeeze-and-Excitation Block
File Size: 122751416
Tasks:
- Image Classification
ID: regnety_064
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L451
In Collection: RegNetY
- Name: regnety_120
Metadata:
FLOPs: 15542094856
Epochs: 100
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
- Squeeze-and-Excitation Block
File Size: 207743949
Tasks:
- Image Classification
ID: regnety_120
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L463
In Collection: RegNetY
- Name: regnety_160
Metadata:
FLOPs: 20450196852
Epochs: 100
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
- Squeeze-and-Excitation Block
File Size: 334916722
Tasks:
- Image Classification
ID: regnety_160
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L469
In Collection: RegNetY
- Name: regnety_320
Metadata:
FLOPs: 41492618394
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- ReLU
- Squeeze-and-Excitation Block
File Size: 580891965
Tasks:
- Image Classification
ID: regnety_320
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 5.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/regnet.py#L475
In Collection: RegNetY
Collections:
- Name: RegNetY
Paper:
title: Designing Network Design Spaces
url: https://paperswithcode.com/paper/designing-network-design-spaces
type: model-index
Type: model-index
-->

@ -0,0 +1,213 @@
# Summary
**Res2Net** is an image model that employs a variation on bottleneck residual blocks, [Res2Net Blocks](https://paperswithcode.com/method/res2net-block). The motivation is to be able to represent features at multiple scales. This is achieved through a novel building block for CNNs that constructs hierarchical residual-like connections within one single residual block. This represents multi-scale features at a granular level and increases the range of receptive fields for each network layer.
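
The hierarchical connections described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the timm implementation: each channel group is reduced to a single float, and `toy_conv` is a hypothetical stand-in for the convolution applied to a group. The update rule follows the one given in the Res2Net paper: the first group passes through unchanged, the second is transformed on its own, and each later group is transformed after the previous group's output has been added to it.

```python
def toy_conv(value):
    # Hypothetical stand-in for the 3x3 convolution applied to one channel group.
    return 0.5 * value


def res2net_split_forward(groups):
    """Apply Res2Net-style hierarchical connections to a list of channel groups."""
    outputs = []
    for i, x in enumerate(groups):
        if i == 0:
            y = x                          # first group: identity
        elif i == 1:
            y = toy_conv(x)                # second group: transformed on its own
        else:
            y = toy_conv(x + outputs[-1])  # later groups also see the previous output
        outputs.append(y)
    return outputs


print(res2net_split_forward([1.0, 2.0, 4.0, 8.0]))
# prints: [1.0, 1.0, 2.5, 5.25]
```

Because each later group receives the output of the group before it, the last group's output depends on several stacked transformations, which is how a single block covers a range of receptive field sizes.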
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{Gao_2021,
title={Res2Net: A New Multi-Scale Backbone Architecture},
volume={43},
ISSN={1939-3539},
url={http://dx.doi.org/10.1109/TPAMI.2019.2938758},
DOI={10.1109/tpami.2019.2938758},
number={2},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
publisher={Institute of Electrical and Electronics Engineers (IEEE)},
author={Gao, Shang-Hua and Cheng, Ming-Ming and Zhao, Kai and Zhang, Xin-Yu and Yang, Ming-Hsuan and Torr, Philip},
year={2021},
month={Feb},
pages={652--662}
}
```
<!--
Models:
- Name: res2net101_26w_4s
Metadata:
FLOPs: 10415881200
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 4x Titan Xp GPUs
Architecture:
- Batch Normalization
- Convolution
- Global Average Pooling
- ReLU
- Res2Net Block
File Size: 181456059
Tasks:
- Image Classification
ID: res2net101_26w_4s
LR: 0.1
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/res2net.py#L152
In Collection: Res2Net
- Name: res2net50_26w_6s
Metadata:
FLOPs: 8130156528
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 4x Titan Xp GPUs
Architecture:
- Batch Normalization
- Convolution
- Global Average Pooling
- ReLU
- Res2Net Block
File Size: 148603239
Tasks:
- Image Classification
ID: res2net50_26w_6s
LR: 0.1
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/res2net.py#L163
In Collection: Res2Net
- Name: res2net50_26w_8s
Metadata:
FLOPs: 10760338992
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 4x Titan Xp GPUs
Architecture:
- Batch Normalization
- Convolution
- Global Average Pooling
- ReLU
- Res2Net Block
File Size: 194085165
Tasks:
- Image Classification
ID: res2net50_26w_8s
LR: 0.1
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/res2net.py#L174
In Collection: Res2Net
- Name: res2net50_14w_8s
Metadata:
FLOPs: 5403546768
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 4x Titan Xp GPUs
Architecture:
- Batch Normalization
- Convolution
- Global Average Pooling
- ReLU
- Res2Net Block
File Size: 100638543
Tasks:
- Image Classification
ID: res2net50_14w_8s
LR: 0.1
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/res2net.py#L196
In Collection: Res2Net
- Name: res2net50_26w_4s
Metadata:
FLOPs: 5499974064
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 4x Titan Xp GPUs
Architecture:
- Batch Normalization
- Convolution
- Global Average Pooling
- ReLU
- Res2Net Block
File Size: 103110087
Tasks:
- Image Classification
ID: res2net50_26w_4s
LR: 0.1
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/res2net.py#L141
In Collection: Res2Net
- Name: res2net50_48w_2s
Metadata:
FLOPs: 5375291520
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 4x Titan Xp GPUs
Architecture:
- Batch Normalization
- Convolution
- Global Average Pooling
- ReLU
- Res2Net Block
File Size: 101421406
Tasks:
- Image Classification
ID: res2net50_48w_2s
LR: 0.1
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/res2net.py#L185
In Collection: Res2Net
Collections:
- Name: Res2Net
Paper:
title: 'Res2Net: A New Multi-scale Backbone Architecture'
url: https://paperswithcode.com/paper/res2net-a-new-multi-scale-backbone
type: model-index
Type: model-index
-->

@ -0,0 +1,68 @@
# Summary
**Res2NeXt** is an image model that employs a variation on [ResNeXt](https://paperswithcode.com/method/resnext) bottleneck residual blocks. The motivation is to be able to represent features at multiple scales. This is achieved through a novel building block for CNNs that constructs hierarchical residual-like connections within one single residual block. This represents multi-scale features at a granular level and increases the range of receptive fields for each network layer.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{Gao_2021,
title={Res2Net: A New Multi-Scale Backbone Architecture},
volume={43},
ISSN={1939-3539},
url={http://dx.doi.org/10.1109/TPAMI.2019.2938758},
DOI={10.1109/tpami.2019.2938758},
number={2},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
publisher={Institute of Electrical and Electronics Engineers (IEEE)},
author={Gao, Shang-Hua and Cheng, Ming-Ming and Zhao, Kai and Zhang, Xin-Yu and Yang, Ming-Hsuan and Torr, Philip},
year={2021},
month={Feb},
pages={652--662}
}
```
<!--
Models:
- Name: res2next50
Metadata:
FLOPs: 5396798208
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 4x Titan Xp GPUs
Architecture:
- Batch Normalization
- Convolution
- Global Average Pooling
- ReLU
- Res2NeXt Block
File Size: 99019592
Tasks:
- Image Classification
ID: res2next50
LR: 0.1
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/res2net.py#L207
In Collection: Res2NeXt
Collections:
- Name: Res2NeXt
Paper:
title: 'Res2Net: A New Multi-scale Backbone Architecture'
url: https://paperswithcode.com/paper/res2net-a-new-multi-scale-backbone
type: model-index
Type: model-index
-->

@ -0,0 +1,359 @@
# Summary
A **ResNeSt** is a variant of a [ResNet](https://paperswithcode.com/method/resnet) that instead stacks [Split-Attention blocks](https://paperswithcode.com/method/split-attention). The cardinal group representations are concatenated along the channel dimension: $V = \text{Concat}\{V^{1}, V^{2}, \cdots, V^{K}\}$. As in standard residual blocks, the final output $Y$ of the Split-Attention block is produced using a shortcut connection: $Y = V + X$, if the input and output feature maps share the same shape. For blocks with a stride, an appropriate transformation $\mathcal{T}$ is applied to the shortcut connection to align the output shapes: $Y = V + \mathcal{T}(X)$. For example, $\mathcal{T}$ can be a strided convolution or a combined convolution-with-pooling.
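
The concatenation and shortcut step can be sketched in plain Python. This is a minimal illustration, not the timm code: feature maps are flattened to lists of floats, and the names `concat`, `split_attention_output`, and `shortcut_transform` are hypothetical. It covers only the final combination $Y = V + X$ (or $Y = V + \mathcal{T}(X)$), not the attention computation inside each cardinal group.

```python
def concat(groups):
    # Concatenate cardinal group outputs along the (toy) channel dimension.
    return [value for group in groups for value in group]


def split_attention_output(cardinal_groups, x, shortcut_transform=None):
    """Return Y = V + X, or Y = V + T(X) when a shortcut transform is given."""
    v = concat(cardinal_groups)
    shortcut = x if shortcut_transform is None else shortcut_transform(x)
    if len(shortcut) != len(v):
        raise ValueError("shortcut and V must have the same shape")
    return [a + b for a, b in zip(v, shortcut)]


# K = 2 cardinal groups with two channels each, and an input X of matching shape.
y = split_attention_output([[1.0, 2.0], [3.0, 4.0]], x=[10.0, 20.0, 30.0, 40.0])
print(y)
# prints: [11.0, 22.0, 33.0, 44.0]
```

When the shapes differ, passing a `shortcut_transform` plays the role of $\mathcal{T}$; in a real network that would be a strided convolution or a convolution combined with pooling rather than a Python callable on a list.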
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{zhang2020resnest,
title={ResNeSt: Split-Attention Networks},
author={Hang Zhang and Chongruo Wu and Zhongyue Zhang and Yi Zhu and Haibin Lin and Zhi Zhang and Yue Sun and Tong He and Jonas Mueller and R. Manmatha and Mu Li and Alexander Smola},
year={2020},
eprint={2004.08955},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: resnest50d_4s2x40d
Metadata:
FLOPs: 5657064720
Epochs: 270
Batch Size: 8192
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- DropBlock
- Label Smoothing
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: 64x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
- Split Attention
File Size: 122133282
Tasks:
- Image Classification
Training Time: ''
ID: resnest50d_4s2x40d
LR: 0.1
Layers: 50
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnest.py#L218
Config: ''
In Collection: ResNeSt
- Name: resnest200e
Metadata:
FLOPs: 45954387872
Epochs: 270
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- DropBlock
- Label Smoothing
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: 64x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
- Split Attention
File Size: 193782911
Tasks:
- Image Classification
Training Time: ''
ID: resnest200e
LR: 0.1
Layers: 200
Dropout: 0.2
Crop Pct: '0.909'
Momentum: 0.9
Image Size: '320'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnest.py#L194
Config: ''
In Collection: ResNeSt
- Name: resnest14d
Metadata:
FLOPs: 3548594464
Epochs: 270
Batch Size: 8192
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- DropBlock
- Label Smoothing
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: 64x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
- Split Attention
File Size: 42562639
Tasks:
- Image Classification
Training Time: ''
ID: resnest14d
LR: 0.1
Layers: 14
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnest.py#L148
Config: ''
In Collection: ResNeSt
- Name: resnest101e
Metadata:
FLOPs: 17423183648
Epochs: 270
Batch Size: 4096
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- DropBlock
- Label Smoothing
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: 64x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
- Split Attention
File Size: 193782911
Tasks:
- Image Classification
Training Time: ''
ID: resnest101e
LR: 0.1
Layers: 101
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '256'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnest.py#L182
Config: ''
In Collection: ResNeSt
- Name: resnest269e
Metadata:
FLOPs: 100830307104
Epochs: 270
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- DropBlock
- Label Smoothing
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: 64x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
- Split Attention
File Size: 445402691
Tasks:
- Image Classification
Training Time: ''
ID: resnest269e
LR: 0.1
Layers: 269
Dropout: 0.2
Crop Pct: '0.928'
Momentum: 0.9
Image Size: '416'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnest.py#L206
Config: ''
In Collection: ResNeSt
- Name: resnest26d
Metadata:
FLOPs: 4678918720
Epochs: 270
Batch Size: 8192
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- DropBlock
- Label Smoothing
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: 64x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
- Split Attention
File Size: 68470242
Tasks:
- Image Classification
Training Time: ''
ID: resnest26d
LR: 0.1
Layers: 26
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnest.py#L159
Config: ''
In Collection: ResNeSt
- Name: resnest50d
Metadata:
FLOPs: 6937106336
Epochs: 270
Batch Size: 8192
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- DropBlock
- Label Smoothing
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: 64x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
- Split Attention
File Size: 110273258
Tasks:
- Image Classification
Training Time: ''
ID: resnest50d
LR: 0.1
Layers: 50
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnest.py#L170
Config: ''
In Collection: ResNeSt
- Name: resnest50d_1s4x24d
Metadata:
FLOPs: 5686764544
Epochs: 270
Batch Size: 8192
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- DropBlock
- Label Smoothing
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: 64x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
- Split Attention
File Size: 103045531
Tasks:
- Image Classification
ID: resnest50d_1s4x24d
LR: 0.1
Layers: 50
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnest.py#L229
In Collection: ResNeSt
Collections:
- Name: ResNeSt
Paper:
title: 'ResNeSt: Split-Attention Networks'
url: https://paperswithcode.com/paper/resnest-split-attention-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,208 @@
# Summary
**ResNet-D** is a modification of the [ResNet](https://paperswithcode.com/method/resnet) architecture that utilises an [average pooling](https://paperswithcode.com/method/average-pooling) tweak for downsampling. The motivation is that in the unmodified ResNet, the [1×1 convolution](https://paperswithcode.com/method/1x1-convolution) in the downsampling block ignores 3/4 of the input feature map, so the block is modified so that no information is ignored.
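
The 3/4 figure can be checked with a short counting exercise. The sketch below is an illustration only, with a hypothetical helper `touched_fraction`; it assumes the downsampling 1×1 convolution uses a stride of 2, and that the ResNet-D tweak places a 2×2 average pool with stride 2 in front of a stride-1 1×1 convolution, as described in the Bag of Tricks paper.

```python
def touched_fraction(size, kernel, stride):
    """Fraction of positions in a size x size map read by a kernel applied with a stride."""
    touched = set()
    for top in range(0, size - kernel + 1, stride):
        for left in range(0, size - kernel + 1, stride):
            for dy in range(kernel):
                for dx in range(kernel):
                    touched.add((top + dy, left + dx))
    return len(touched) / (size * size)


# A 1x1 convolution with stride 2 reads one position out of every 2x2 patch.
print(touched_fraction(size=8, kernel=1, stride=2))
# prints: 0.25

# A 2x2 average pool with stride 2 reads every position before the 1x1 convolution runs.
print(touched_fraction(size=8, kernel=2, stride=2))
# prints: 1.0
```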
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{he2018bag,
title={Bag of Tricks for Image Classification with Convolutional Neural Networks},
author={Tong He and Zhi Zhang and Hang Zhang and Zhongyue Zhang and Junyuan Xie and Mu Li},
year={2018},
eprint={1812.01187},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: resnet50d
Metadata:
FLOPs: 5591002624
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 102567109
Tasks:
- Image Classification
ID: resnet50d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L699
In Collection: ResNet-D
- Name: resnet26d
Metadata:
FLOPs: 3335276032
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 64209122
Tasks:
- Image Classification
ID: resnet26d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L683
In Collection: ResNet-D
- Name: resnet18d
Metadata:
FLOPs: 2645205760
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 46893231
Tasks:
- Image Classification
ID: resnet18d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L649
In Collection: ResNet-D
- Name: resnet34d
Metadata:
FLOPs: 5026601728
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 87369807
Tasks:
- Image Classification
ID: resnet34d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L666
In Collection: ResNet-D
- Name: resnet200d
Metadata:
FLOPs: 26034378752
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 259662933
Tasks:
- Image Classification
ID: resnet200d
Crop Pct: '0.94'
Image Size: '256'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L749
In Collection: ResNet-D
- Name: resnet101d
Metadata:
FLOPs: 13805639680
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 178791263
Tasks:
- Image Classification
ID: resnet101d
Crop Pct: '0.94'
Image Size: '256'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L716
In Collection: ResNet-D
- Name: resnet152d
Metadata:
FLOPs: 20155275264
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 241596837
Tasks:
- Image Classification
ID: resnet152d
Crop Pct: '0.94'
Image Size: '256'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L724
In Collection: ResNet-D
Collections:
- Name: ResNet-D
Paper:
title: Bag of Tricks for Image Classification with Convolutional Neural Networks
url: https://paperswithcode.com/paper/bag-of-tricks-for-image-classification-with
type: model-index
Type: model-index
-->

@ -0,0 +1,307 @@
# Summary
**Residual Networks**, or **ResNets**, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Instead of hoping that each stack of a few layers directly fits a desired underlying mapping, residual nets let these layers fit a residual mapping. They stack [residual blocks](https://paperswithcode.com/method/residual-block) on top of each other to form a network: e.g. a ResNet-50 has fifty layers using these blocks.
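
The idea of fitting a residual mapping can be sketched in a few lines of plain Python. This is a toy illustration, not the timm implementation: the input is a list of floats, and `zero_residual` and `small_residual` are hypothetical stand-ins for the stacked layers $F$. The block returns $F(x) + x$, so when the layers output zero the block falls back to the identity.

```python
def residual_block(x, residual_fn):
    """Return F(x) + x, where residual_fn plays the role of the stacked layers F."""
    return [f + xi for f, xi in zip(residual_fn(x), x)]


def zero_residual(x):
    # Hypothetical stacked layers whose output has been driven to zero.
    return [0.0 for _ in x]


def small_residual(x):
    # Hypothetical stacked layers that learn a correction on top of the input.
    return [0.5 * xi for xi in x]


x = [1.0, -2.0, 3.0]
print(residual_block(x, zero_residual))
# prints: [1.0, -2.0, 3.0]
print(residual_block(x, small_residual))
# prints: [1.5, -3.0, 4.5]
```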
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/HeZRS15,
author = {Kaiming He and
Xiangyu Zhang and
Shaoqing Ren and
Jian Sun},
title = {Deep Residual Learning for Image Recognition},
journal = {CoRR},
volume = {abs/1512.03385},
year = {2015},
url = {http://arxiv.org/abs/1512.03385},
archivePrefix = {arXiv},
eprint = {1512.03385},
timestamp = {Wed, 17 Apr 2019 17:23:45 +0200},
biburl = {https://dblp.org/rec/journals/corr/HeZRS15.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: resnet26
Metadata:
FLOPs: 3026804736
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 64129972
Tasks:
- Image Classification
ID: resnet26
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L675
In Collection: ResNet
- Name: tv_resnet152
Metadata:
FLOPs: 14857660416
Epochs: 90
Batch Size: 32
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 241530880
Tasks:
- Image Classification
ID: tv_resnet152
LR: 0.1
Crop Pct: '0.875'
LR Gamma: 0.1
Momentum: 0.9
Image Size: '224'
LR Step Size: 30
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L769
In Collection: ResNet
- Name: resnet18
Metadata:
FLOPs: 2337073152
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 46827520
Tasks:
- Image Classification
ID: resnet18
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L641
In Collection: ResNet
- Name: resnet50
Metadata:
FLOPs: 5282531328
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 102488165
Tasks:
- Image Classification
ID: resnet50
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L691
In Collection: ResNet
- Name: resnet34
Metadata:
FLOPs: 4718469120
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 87290831
Tasks:
- Image Classification
ID: resnet34
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L658
In Collection: ResNet
- Name: resnetblur50
Metadata:
FLOPs: 6621606912
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Blur Pooling
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 102488165
Tasks:
- Image Classification
ID: resnetblur50
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L1160
In Collection: ResNet
- Name: tv_resnet34
Metadata:
FLOPs: 4718469120
Epochs: 90
Batch Size: 32
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 87306240
Tasks:
- Image Classification
ID: tv_resnet34
LR: 0.1
Crop Pct: '0.875'
LR Gamma: 0.1
Momentum: 0.9
Image Size: '224'
LR Step Size: 30
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L745
In Collection: ResNet
- Name: tv_resnet101
Metadata:
FLOPs: 10068547584
Epochs: 90
Batch Size: 32
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 178728960
Tasks:
- Image Classification
ID: tv_resnet101
LR: 0.1
Crop Pct: '0.875'
LR Gamma: 0.1
Momentum: 0.9
Image Size: '224'
LR Step Size: 30
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L761
In Collection: ResNet
- Name: tv_resnet50
Metadata:
FLOPs: 5282531328
Epochs: 90
Batch Size: 32
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 102502400
Tasks:
- Image Classification
ID: tv_resnet50
LR: 0.1
Crop Pct: '0.875'
LR Gamma: 0.1
Momentum: 0.9
Image Size: '224'
LR Step Size: 30
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L753
In Collection: ResNet
Collections:
- Name: ResNet
Paper:
title: Deep Residual Learning for Image Recognition
url: https://paperswithcode.com/paper/deep-residual-learning-for-image-recognition
type: model-index
Type: model-index
-->

@ -0,0 +1,152 @@
# Summary
A **ResNeXt** repeats a [building block](https://paperswithcode.com/method/resnext-block) that aggregates a set of transformations with the same topology. Compared to a [ResNet](https://paperswithcode.com/method/resnet), it exposes a new dimension, *cardinality* (the size of the set of transformations) $C$, as an essential factor in addition to the dimensions of depth and width.
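
To see how cardinality trades off against width, the weights of a single block can be counted by hand. The sketch below uses the block sizes from the example in the ResNeXt paper (a 256-channel input, a ResNet bottleneck of width 64, and a ResNeXt block with $C = 32$ paths of width 4); the helper `bottleneck_params` is hypothetical, and biases and batch-norm parameters are ignored.

```python
def bottleneck_params(in_channels, width, cardinality=1):
    """Weights in a 1x1 -> 3x3 -> 1x1 bottleneck with `cardinality` parallel paths."""
    per_path = in_channels * width + 3 * 3 * width * width + width * in_channels
    return cardinality * per_path


resnet_block = bottleneck_params(in_channels=256, width=64)                   # one wide path
resnext_block = bottleneck_params(in_channels=256, width=4, cardinality=32)  # 32 narrow paths
print(resnet_block, resnext_block)
# prints: 69632 70144
```

Both blocks come to roughly 70k weights, so cardinality can be raised without changing the parameter budget. The `32x4d` suffix in model names such as `resnext50_32x4d` corresponds to this setting of 32 paths with a base width of 4.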
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/XieGDTH16,
author = {Saining Xie and
Ross B. Girshick and
Piotr Doll{\'{a}}r and
Zhuowen Tu and
Kaiming He},
title = {Aggregated Residual Transformations for Deep Neural Networks},
journal = {CoRR},
volume = {abs/1611.05431},
year = {2016},
url = {http://arxiv.org/abs/1611.05431},
archivePrefix = {arXiv},
eprint = {1611.05431},
timestamp = {Mon, 13 Aug 2018 16:45:58 +0200},
biburl = {https://dblp.org/rec/journals/corr/XieGDTH16.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: resnext101_32x8d
Metadata:
FLOPs: 21180417024
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 356082095
Tasks:
- Image Classification
ID: resnext101_32x8d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnet.py#L877
In Collection: ResNeXt
- Name: resnext50_32x4d
Metadata:
FLOPs: 5472648192
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 100435887
Tasks:
- Image Classification
ID: resnext50_32x4d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnet.py#L851
In Collection: ResNeXt
- Name: tv_resnext50_32x4d
Metadata:
FLOPs: 5472648192
Epochs: 90
Batch Size: 32
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 100441675
Tasks:
- Image Classification
ID: tv_resnext50_32x4d
LR: 0.1
Crop Pct: '0.875'
LR Gamma: 0.1
Momentum: 0.9
Image Size: '224'
LR Step Size: 30
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L842
In Collection: ResNeXt
- Name: resnext50d_32x4d
Metadata:
FLOPs: 5781119488
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 100515304
Tasks:
- Image Classification
ID: resnext50d_32x4d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnet.py#L869
In Collection: ResNeXt
Collections:
- Name: ResNeXt
Paper:
title: Aggregated Residual Transformations for Deep Neural Networks
url: https://paperswithcode.com/paper/aggregated-residual-transformations-for-deep
type: model-index
Type: model-index
-->

@ -0,0 +1,174 @@
# Summary
**Rank Expansion Networks** (ReXNets) follow a set of new design principles for bottlenecks in image classification models. The authors refine each layer by (1) expanding the input channel size of the convolution layer and (2) replacing the [ReLU6s](https://www.paperswithcode.com/method/relu6).
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{han2020rexnet,
title={ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network},
author={Dongyoon Han and Sangdoo Yun and Byeongho Heo and YoungJoon Yoo},
year={2020},
eprint={2007.00992},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: rexnet_100
Metadata:
FLOPs: 509989377
Epochs: 400
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- Linear Warmup With Cosine Annealing
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- Dropout
- ReLU6
- Residual Connection
File Size: 19417552
Tasks:
- Image Classification
Training Time: ''
ID: rexnet_100
LR: 0.5
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bicubic
Label Smoothing: 0.1
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/rexnet.py#L212
Config: ''
In Collection: RexNet
- Name: rexnet_130
Metadata:
FLOPs: 848364461
Epochs: 400
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- Linear Warmup With Cosine Annealing
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- Dropout
- ReLU6
- Residual Connection
File Size: 30508197
Tasks:
- Image Classification
Training Time: ''
ID: rexnet_130
LR: 0.5
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bicubic
Label Smoothing: 0.1
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/rexnet.py#L218
Config: ''
In Collection: RexNet
- Name: rexnet_150
Metadata:
FLOPs: 1122374469
Epochs: 400
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- Linear Warmup With Cosine Annealing
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- Dropout
- ReLU6
- Residual Connection
File Size: 39227315
Tasks:
- Image Classification
Training Time: ''
ID: rexnet_150
LR: 0.5
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bicubic
Label Smoothing: 0.1
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/rexnet.py#L224
Config: ''
In Collection: RexNet
- Name: rexnet_200
Metadata:
FLOPs: 1960224938
Epochs: 400
Batch Size: 512
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- Linear Warmup With Cosine Annealing
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- Dropout
- ReLU6
- Residual Connection
File Size: 65862221
Tasks:
- Image Classification
Training Time: ''
ID: rexnet_200
LR: 0.5
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bicubic
Label Smoothing: 0.1
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/rexnet.py#L230
Config: ''
In Collection: RexNet
Collections:
- Name: RexNet
Paper:
title: 'ReXNet: Diminishing Representational Bottleneck on Convolutional Neural
Network'
url: https://paperswithcode.com/paper/rexnet-diminishing-representational
type: model-index
Type: model-index
-->

@ -0,0 +1,107 @@
# Summary
**SE ResNet** is a variant of a [ResNet](https://www.paperswithcode.com/method/resnet) that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration.
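To make "channel-wise feature recalibration" concrete, here is a minimal pure-Python sketch of the squeeze, excitation and scale steps. It is an illustration only, not timm's implementation: the function name `se_recalibrate`, the weight layout, and the omission of bias terms are assumptions made for brevity.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def se_recalibrate(feature_maps, w_reduce, w_expand):
    """Hypothetical helper illustrating a squeeze-and-excitation step.

    feature_maps: list of C channels, each a flat list of activations.
    w_reduce:     weights of the reduction layer, shape [C_reduced][C].
    w_expand:     weights of the expansion layer, shape [C][C_reduced].
    Biases are left out to keep the sketch short.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_maps]
    # Excitation: linear -> ReLU -> linear -> sigmoid gives one gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_expand]
    # Scale: every activation in a channel is multiplied by that channel's gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, feature_maps)]
    return scaled, gates

# Two channels with two spatial positions each, reduced to one hidden unit.
features = [[1.0, 3.0], [2.0, 2.0]]
scaled, gates = se_recalibrate(features, w_reduce=[[1.0, 0.0]], w_expand=[[1.0], [-1.0]])
```

Because the gates depend on the input through the pooled descriptors, the per-channel scaling changes from image to image, which is what "dynamic" refers to in the summary.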
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{hu2019squeezeandexcitation,
title={Squeeze-and-Excitation Networks},
author={Jie Hu and Li Shen and Samuel Albanie and Gang Sun and Enhua Wu},
year={2019},
eprint={1709.01507},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: seresnet152d
Metadata:
FLOPs: 20161904304
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 268144497
Tasks:
- Image Classification
ID: seresnet152d
LR: 0.6
Layers: 152
Dropout: 0.2
Crop Pct: '0.94'
Momentum: 0.9
Image Size: '256'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1206
In Collection: SE ResNet
- Name: seresnet50
Metadata:
FLOPs: 5285062320
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 112621903
Tasks:
- Image Classification
ID: seresnet50
LR: 0.6
Layers: 50
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1180
In Collection: SE ResNet
Collections:
- Name: SE ResNet
Paper:
title: Squeeze-and-Excitation Networks
url: https://paperswithcode.com/paper/squeeze-and-excitation-networks
type: model-index
Type: model-index
-->
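
As a rough illustration of how the `Crop Pct` and `Image Size` metadata fields relate at evaluation time: a common convention (and, to our understanding, the one timm's default evaluation transform follows) is to resize the image to `floor(image_size / crop_pct)` and then center-crop to `image_size`. The helper name `eval_resize_size` is hypothetical and is not part of timm.

```python
import math

def eval_resize_size(image_size, crop_pct):
    """Size to resize to before center-cropping to image_size (assumed convention)."""
    return int(math.floor(image_size / crop_pct))

# Values taken from the metadata above.
seresnet50_resize = eval_resize_size(224, 0.875)   # seresnet50: Image Size 224, Crop Pct 0.875
seresnet152d_resize = eval_resize_size(256, 0.94)  # seresnet152d: Image Size 256, Crop Pct 0.94
```

In practice `resolve_data_config` and `create_transform` from `timm.data` derive these values from the model's default configuration, so the arithmetic rarely needs to be done by hand.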

@ -0,0 +1,113 @@
# Summary
**SelecSLS** uses novel selective long- and short-range skip connections to improve information flow, allowing for a drastically faster network without compromising accuracy.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{Mehta_2020,
title={XNect},
volume={39},
ISSN={1557-7368},
url={http://dx.doi.org/10.1145/3386569.3392410},
DOI={10.1145/3386569.3392410},
number={4},
journal={ACM Transactions on Graphics},
publisher={Association for Computing Machinery (ACM)},
author={Mehta, Dushyant and Sotnychenko, Oleksandr and Mueller, Franziska and Xu, Weipeng and Elgharib, Mohamed and Fua, Pascal and Seidel, Hans-Peter and Rhodin, Helge and Pons-Moll, Gerard and Theobalt, Christian},
year={2020},
month={Jul}
}
```
<!--
Models:
- Name: selecsls42b
Metadata:
FLOPs: 3824022528
Training Data:
- ImageNet
Training Techniques:
- Cosine Annealing
- Random Erasing
Architecture:
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Global Average Pooling
- ReLU
- SelecSLS Block
File Size: 129948954
Tasks:
- Image Classification
ID: selecsls42b
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/selecsls.py#L335
In Collection: SelecSLS
- Name: selecsls60
Metadata:
FLOPs: 4610472600
Training Data:
- ImageNet
Training Techniques:
- Cosine Annealing
- Random Erasing
Architecture:
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Global Average Pooling
- ReLU
- SelecSLS Block
File Size: 122839714
Tasks:
- Image Classification
ID: selecsls60
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/selecsls.py#L342
In Collection: SelecSLS
- Name: selecsls60b
Metadata:
FLOPs: 4657653144
Training Data:
- ImageNet
Training Techniques:
- Cosine Annealing
- Random Erasing
Architecture:
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Global Average Pooling
- ReLU
- SelecSLS Block
File Size: 131252898
Tasks:
- Image Classification
ID: selecsls60b
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/selecsls.py#L349
In Collection: SelecSLS
Collections:
- Name: SelecSLS
Paper:
title: 'XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera'
url: https://paperswithcode.com/paper/xnect-real-time-multi-person-3d-human-pose
type: model-index
Type: model-index
-->

@ -0,0 +1,144 @@
# Summary
**SE ResNeXt** is a variant of a [ResNeXt](https://www.paperswithcode.com/method/resneXt) that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{hu2019squeezeandexcitation,
title={Squeeze-and-Excitation Networks},
author={Jie Hu and Li Shen and Samuel Albanie and Gang Sun and Enhua Wu},
year={2019},
eprint={1709.01507},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: seresnext26d_32x4d
Metadata:
FLOPs: 3507053024
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 67425193
Tasks:
- Image Classification
ID: seresnext26d_32x4d
LR: 0.6
Layers: 26
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1234
In Collection: SEResNeXt
- Name: seresnext26t_32x4d
Metadata:
FLOPs: 3466436448
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 67414838
Tasks:
- Image Classification
ID: seresnext26t_32x4d
LR: 0.6
Layers: 26
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1246
In Collection: SEResNeXt
- Name: seresnext50_32x4d
Metadata:
FLOPs: 5475179184
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 110569859
Tasks:
- Image Classification
ID: seresnext50_32x4d
LR: 0.6
Layers: 50
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1267
In Collection: SEResNeXt
Collections:
- Name: SEResNeXt
Paper:
title: Squeeze-and-Excitation Networks
url: https://paperswithcode.com/paper/squeeze-and-excitation-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,97 @@
# Summary
**SK ResNet** is a variant of a [ResNet](https://www.paperswithcode.com/method/resnet) that employs a [Selective Kernel](https://paperswithcode.com/method/selective-kernel) unit. In general, all the large kernel convolutions in the original bottleneck blocks in ResNet are replaced by the proposed [SK convolutions](https://paperswithcode.com/method/selective-kernel-convolution), enabling the network to choose appropriate receptive field sizes in an adaptive manner.
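The adaptive choice of receptive field can be sketched as a softmax over parallel branches. The pure-Python example below is a simplified illustration under stated assumptions: the name `sk_select` is hypothetical, each branch stands in for a convolution with a different kernel size, and the reduction layer that the paper places before the per-branch projections is omitted.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def sk_select(branches, branch_weights):
    """Hypothetical helper illustrating selective-kernel fusion.

    branches:       B branch outputs, each a list of C channels (flat lists).
    branch_weights: B matrices of shape [C][C] mapping the pooled descriptor
                    to per-channel logits for that branch.
    """
    num_branches = len(branches)
    num_channels = len(branches[0])
    # Fuse: sum the branches element-wise, then average-pool each channel.
    descriptor = []
    for c in range(num_channels):
        fused = [sum(vals) for vals in zip(*(b[c] for b in branches))]
        descriptor.append(sum(fused) / len(fused))
    # Select: per-branch logits for every channel, softmax across branches.
    logits = [[sum(w * d for w, d in zip(row, descriptor)) for row in mat]
              for mat in branch_weights]
    out, attention = [], []
    for c in range(num_channels):
        a = softmax([logits[b][c] for b in range(num_branches)])
        attention.append(a)
        positions = len(branches[0][c])
        out.append([sum(a[b] * branches[b][c][i] for b in range(num_branches))
                    for i in range(positions)])
    return out, attention

# Two branches, one channel, two spatial positions; zero weights give equal attention.
small_kernel = [[1.0, 1.0]]
large_kernel = [[3.0, 3.0]]
out, attention = sk_select([small_kernel, large_kernel], [[[0.0]], [[0.0]]])
```

With non-zero weights the attention shifts towards one branch per channel, so different channels can favour different kernel sizes for the same input.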
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{li2019selective,
title={Selective Kernel Networks},
author={Xiang Li and Wenhai Wang and Xiaolin Hu and Jian Yang},
year={2019},
eprint={1903.06586},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: skresnet18
Metadata:
FLOPs: 2333467136
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x GPUs
Architecture:
- Convolution
- Dense Connections
- Global Average Pooling
- Max Pooling
- Residual Connection
- Selective Kernel
- Softmax
File Size: 47923238
Tasks:
- Image Classification
ID: skresnet18
LR: 0.1
Layers: 18
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 4.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/sknet.py#L148
In Collection: SKResNet
- Name: skresnet34
Metadata:
FLOPs: 4711849952
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x GPUs
Architecture:
- Convolution
- Dense Connections
- Global Average Pooling
- Max Pooling
- Residual Connection
- Selective Kernel
- Softmax
File Size: 89299314
Tasks:
- Image Classification
ID: skresnet34
LR: 0.1
Layers: 34
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 4.0e-05
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/sknet.py#L165
In Collection: SKResNet
Collections:
- Name: SKResNet
Paper:
title: Selective Kernel Networks
url: https://paperswithcode.com/paper/selective-kernel-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,63 @@
# Summary
**SK ResNeXt** is a variant of a [ResNeXt](https://www.paperswithcode.com/method/resnext) that employs a [Selective Kernel](https://paperswithcode.com/method/selective-kernel) unit. In general, all the large kernel convolutions in the original bottleneck blocks in ResNeXt are replaced by the proposed [SK convolutions](https://paperswithcode.com/method/selective-kernel-convolution), enabling the network to choose appropriate receptive field sizes in an adaptive manner.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{li2019selective,
title={Selective Kernel Networks},
author={Xiang Li and Wenhai Wang and Xiaolin Hu and Jian Yang},
year={2019},
eprint={1903.06586},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: skresnext50_32x4d
Metadata:
FLOPs: 5739845824
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Resources: 8x GPUs
Architecture:
- Convolution
- Dense Connections
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- Residual Connection
- Selective Kernel
- Softmax
File Size: 110340975
Tasks:
- Image Classification
ID: skresnext50_32x4d
LR: 0.1
Layers: 50
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/sknet.py#L210
In Collection: SKResNeXt
Collections:
- Name: SKResNeXt
Paper:
title: Selective Kernel Networks
url: https://paperswithcode.com/paper/selective-kernel-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,55 @@
# Summary
**Single-Path NAS** is a novel differentiable NAS method for designing hardware-efficient ConvNets in less than 4 hours.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{stamoulis2019singlepath,
title={Single-Path NAS: Designing Hardware-Efficient ConvNets in less than 4 Hours},
author={Dimitrios Stamoulis and Ruizhou Ding and Di Wang and Dimitrios Lymberopoulos and Bodhi Priyantha and Jie Liu and Diana Marculescu},
year={2019},
eprint={1904.02877},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
<!--
Models:
- Name: spnasnet_100
Metadata:
FLOPs: 442385600
Training Data:
- ImageNet
Architecture:
- Average Pooling
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- ReLU
File Size: 17902337
Tasks:
- Image Classification
ID: spnasnet_100
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L995
In Collection: SPNASNet
Collections:
- Name: SPNASNet
Paper:
title: 'Single-Path NAS: Designing Hardware-Efficient ConvNets in less than 4
Hours'
url: https://paperswithcode.com/paper/single-path-nas-designing-hardware-efficient
type: model-index
Type: model-index
-->

@ -0,0 +1,116 @@
# Summary
**Residual Networks**, or **ResNets**, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Instead of hoping each few stacked layers directly fit a desired underlying mapping, residual nets let these layers fit a residual mapping. They stack [residual blocks](https://paperswithcode.com/method/residual-block) on top of each other to form a network: e.g. a ResNet-50 has fifty layers using these blocks.
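The idea of fitting a residual rather than the full mapping can be shown in a few lines. This is a minimal pure-Python sketch, not the timm code: `residual_block` and the toy `residual_fn` callables are hypothetical stand-ins for the stacked convolutional layers.

```python
def residual_block(x, residual_fn):
    """Return x + F(x), where residual_fn plays the role of the stacked layers F."""
    fx = residual_fn(x)
    return [xi + fi for xi, fi in zip(x, fx)]

def zero_residual(x):
    # If the layers learn F(x) = 0, the block reduces to the identity mapping.
    return [0.0 for _ in x]

def small_residual(x):
    # A toy residual: a small correction added on top of the input.
    return [0.1 * xi for xi in x]

x = [1.0, 2.0]
identity_out = residual_block(x, zero_residual)

# Stacking blocks: each block refines the output of the previous one.
stacked_out = x
for _ in range(3):
    stacked_out = residual_block(stacked_out, small_residual)
```

The identity case is the one that motivates the design: a block that has nothing useful to add only needs to drive its residual towards zero.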
The models in this collection utilise semi-supervised learning to improve performance. The approach brings important gains to standard architectures for image, video and fine-grained classification.
Please note the CC-BY-NC 4.0 license on these weights: non-commercial use only.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/abs-1905-00546,
author = {I. Zeki Yalniz and
Herv{\'{e}} J{\'{e}}gou and
Kan Chen and
Manohar Paluri and
Dhruv Mahajan},
title = {Billion-scale semi-supervised learning for image classification},
journal = {CoRR},
volume = {abs/1905.00546},
year = {2019},
url = {http://arxiv.org/abs/1905.00546},
archivePrefix = {arXiv},
eprint = {1905.00546},
timestamp = {Mon, 28 Sep 2020 08:19:37 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-1905-00546.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: ssl_resnet50
Metadata:
FLOPs: 5282531328
Epochs: 30
Batch Size: 1536
Training Data:
- ImageNet
- YFCC-100M
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 64x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 102480594
Tasks:
- Image Classification
ID: ssl_resnet50
LR: 0.0015
Layers: 50
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L904
In Collection: SSL ResNet
- Name: ssl_resnet18
Metadata:
FLOPs: 2337073152
Epochs: 30
Batch Size: 1536
Training Data:
- ImageNet
- YFCC-100M
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 64x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 46811375
Tasks:
- Image Classification
ID: ssl_resnet18
LR: 0.0015
Layers: 18
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L894
In Collection: SSL ResNet
Collections:
- Name: SSL ResNet
Paper:
title: Billion-scale semi-supervised learning for image classification
url: https://paperswithcode.com/paper/billion-scale-semi-supervised-learning-for
type: model-index
Type: model-index
-->

@ -0,0 +1,186 @@
# Summary
A **ResNeXt** repeats a [building block](https://paperswithcode.com/method/resnext-block) that aggregates a set of transformations with the same topology. Compared to a [ResNet](https://paperswithcode.com/method/resnet), it exposes a new dimension, *cardinality* (the size of the set of transformations) $C$, as an essential factor in addition to the dimensions of depth and width.
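One way to see the effect of cardinality is to count the weights of the grouped 3x3 convolution inside a block. The sketch below assumes the usual reading of the `32x4d` suffix (cardinality 32, 4 channels per group, so a width of 128 in the first stage); `conv_weight_count` is a hypothetical helper, and bias terms are ignored.

```python
def conv_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Number of weights in a 2D convolution with square kernels and no bias."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees in_channels / groups input channels.
    return (in_channels // groups) * out_channels * kernel_size * kernel_size

cardinality = 32        # C, the number of parallel transformations
width_per_group = 4     # the "4d" in 32x4d (assumed interpretation)
width = cardinality * width_per_group

grouped = conv_weight_count(width, width, 3, groups=cardinality)
dense = conv_weight_count(width, width, 3)
```

At the same width, the grouped convolution uses `cardinality` times fewer weights than the dense one, which is what lets a ResNeXt spend its budget on a wider block at similar cost.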
The models in this collection utilise semi-supervised learning to improve performance. The approach brings important gains to standard architectures for image, video and fine-grained classification.
Please note the CC-BY-NC 4.0 license on these weights: non-commercial use only.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/abs-1905-00546,
author = {I. Zeki Yalniz and
Herv{\'{e}} J{\'{e}}gou and
Kan Chen and
Manohar Paluri and
Dhruv Mahajan},
title = {Billion-scale semi-supervised learning for image classification},
journal = {CoRR},
volume = {abs/1905.00546},
year = {2019},
url = {http://arxiv.org/abs/1905.00546},
archivePrefix = {arXiv},
eprint = {1905.00546},
timestamp = {Mon, 28 Sep 2020 08:19:37 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-1905-00546.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: ssl_resnext101_32x16d
Metadata:
FLOPs: 46623691776
Epochs: 30
Batch Size: 1536
Training Data:
- ImageNet
- YFCC-100M
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 64x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 777518664
Tasks:
- Image Classification
ID: ssl_resnext101_32x16d
LR: 0.0015
Layers: 101
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L944
In Collection: SSL ResNext
- Name: ssl_resnext50_32x4d
Metadata:
FLOPs: 5472648192
Epochs: 30
Batch Size: 1536
Training Data:
- ImageNet
- YFCC-100M
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 64x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 100428550
Tasks:
- Image Classification
ID: ssl_resnext50_32x4d
LR: 0.0015
Layers: 50
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L914
In Collection: SSL ResNext
- Name: ssl_resnext101_32x4d
Metadata:
FLOPs: 10298145792
Epochs: 30
Batch Size: 1536
Training Data:
- ImageNet
- YFCC-100M
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 64x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 177341913
Tasks:
- Image Classification
ID: ssl_resnext101_32x4d
LR: 0.0015
Layers: 101
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L924
In Collection: SSL ResNext
- Name: ssl_resnext101_32x8d
Metadata:
FLOPs: 21180417024
Epochs: 30
Batch Size: 1536
Training Data:
- ImageNet
- YFCC-100M
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 64x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 356056638
Tasks:
- Image Classification
ID: ssl_resnext101_32x8d
LR: 0.0015
Layers: 101
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L934
In Collection: SSL ResNext
Collections:
- Name: SSL ResNext
Paper:
title: Billion-scale semi-supervised learning for image classification
url: https://paperswithcode.com/paper/billion-scale-semi-supervised-learning-for
type: model-index
Type: model-index
-->

@ -0,0 +1,116 @@
# Summary
**Residual Networks**, or **ResNets**, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Instead of hoping each few stacked layers directly fit a desired underlying mapping, residual nets let these layers fit a residual mapping. They stack [residual blocks](https://paperswithcode.com/method/residual-block) on top of each other to form a network: e.g. a ResNet-50 has fifty layers using these blocks.
The models in this collection utilise semi-weakly supervised learning to improve performance. The approach brings important gains to standard architectures for image, video and fine-grained classification.
Please note the CC-BY-NC 4.0 license on these weights: non-commercial use only.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/abs-1905-00546,
author = {I. Zeki Yalniz and
Herv{\'{e}} J{\'{e}}gou and
Kan Chen and
Manohar Paluri and
Dhruv Mahajan},
title = {Billion-scale semi-supervised learning for image classification},
journal = {CoRR},
volume = {abs/1905.00546},
year = {2019},
url = {http://arxiv.org/abs/1905.00546},
archivePrefix = {arXiv},
eprint = {1905.00546},
timestamp = {Mon, 28 Sep 2020 08:19:37 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-1905-00546.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: swsl_resnet18
Metadata:
FLOPs: 2337073152
Epochs: 30
Batch Size: 1536
Training Data:
- IG-1B-Targeted
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 64x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 46811375
Tasks:
- Image Classification
ID: swsl_resnet18
LR: 0.0015
Layers: 18
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L954
In Collection: SWSL ResNet
- Name: swsl_resnet50
Metadata:
FLOPs: 5282531328
Epochs: 30
Batch Size: 1536
Training Data:
- IG-1B-Targeted
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 64x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 102480594
Tasks:
- Image Classification
ID: swsl_resnet50
LR: 0.0015
Layers: 50
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L965
In Collection: SWSL ResNet
Collections:
- Name: SWSL ResNet
Paper:
title: Billion-scale semi-supervised learning for image classification
url: https://paperswithcode.com/paper/billion-scale-semi-supervised-learning-for
type: model-index
Type: model-index
-->

@ -0,0 +1,186 @@
# Summary
A **ResNeXt** repeats a [building block](https://paperswithcode.com/method/resnext-block) that aggregates a set of transformations with the same topology. Compared to a [ResNet](https://paperswithcode.com/method/resnet), it exposes a new dimension, *cardinality* (the size of the set of transformations) $C$, as an essential factor in addition to the dimensions of depth and width.
The models in this collection utilise semi-weakly supervised learning to improve performance. The approach brings important gains to standard architectures for image, video and fine-grained classification.
Please note the CC-BY-NC 4.0 license on these weights: non-commercial use only.
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/abs-1905-00546,
author = {I. Zeki Yalniz and
Herv{\'{e}} J{\'{e}}gou and
Kan Chen and
Manohar Paluri and
Dhruv Mahajan},
title = {Billion-scale semi-supervised learning for image classification},
journal = {CoRR},
volume = {abs/1905.00546},
year = {2019},
url = {http://arxiv.org/abs/1905.00546},
archivePrefix = {arXiv},
eprint = {1905.00546},
timestamp = {Mon, 28 Sep 2020 08:19:37 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-1905-00546.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: swsl_resnext101_32x4d
Metadata:
FLOPs: 10298145792
Epochs: 30
Batch Size: 1536
Training Data:
- IG-1B-Targeted
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 64x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 177341913
Tasks:
- Image Classification
ID: swsl_resnext101_32x4d
LR: 0.0015
Layers: 101
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L987
In Collection: SWSL ResNext
- Name: swsl_resnext50_32x4d
Metadata:
FLOPs: 5472648192
Epochs: 30
Batch Size: 1536
Training Data:
- IG-1B-Targeted
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 64x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 100428550
Tasks:
- Image Classification
ID: swsl_resnext50_32x4d
LR: 0.0015
Layers: 50
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L976
In Collection: SWSL ResNext
- Name: swsl_resnext101_32x16d
Metadata:
FLOPs: 46623691776
Epochs: 30
Batch Size: 1536
Training Data:
- IG-1B-Targeted
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 64x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 777518664
Tasks:
- Image Classification
ID: swsl_resnext101_32x16d
LR: 0.0015
Layers: 101
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L1009
In Collection: SWSL ResNext
- Name: swsl_resnext101_32x8d
Metadata:
FLOPs: 21180417024
Epochs: 30
Batch Size: 1536
Training Data:
- IG-1B-Targeted
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 64x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 356056638
Tasks:
- Image Classification
ID: swsl_resnext101_32x8d
LR: 0.0015
Layers: 101
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/resnet.py#L998
In Collection: SWSL ResNext
Collections:
- Name: SWSL ResNext
Paper:
title: Billion-scale semi-supervised learning for image classification
url: https://paperswithcode.com/paper/billion-scale-semi-supervised-learning-for
type: model-index
Type: model-index
-->

@ -0,0 +1,164 @@
# Summary
**EfficientNet** is a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth/width/resolution using a *compound coefficient*. Unlike conventional practice, which scales these factors arbitrarily, the EfficientNet scaling method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. For example, if we want to use $2^N$ times more computational resources, then we can simply increase the network depth by $\alpha ^ N$, width by $\beta ^ N$, and image size by $\gamma ^ N$, where $\alpha, \beta, \gamma$ are constant coefficients determined by a small grid search on the original small model. EfficientNet uses a compound coefficient $\phi$ to uniformly scale network width, depth, and resolution in a principled way.
The compound scaling method is justified by the intuition that if the input image is bigger, then the network needs more layers to increase the receptive field and more channels to capture more fine-grained patterns on the bigger image.
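The scaling rule above can be sketched in a few lines. The coefficient values below (1.2, 1.1, 1.15) are the ones reported in the EfficientNet paper rather than anything stated on this page, and the function names are hypothetical, so treat this as an illustration under those assumptions.

```python
# Assumed grid-search results from the EfficientNet paper:
# depth, width and resolution coefficients.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Multipliers applied to the baseline network for a compound coefficient phi."""
    return {
        "depth": ALPHA ** phi,
        "width": BETA ** phi,
        "resolution": GAMMA ** phi,
    }


def flops_multiplier(phi):
    """Approximate FLOPs growth: linear in depth, quadratic in width and resolution."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


print(compound_scale(1))
print(round(flops_multiplier(1), 2))  # 1.92, i.e. close to doubling per unit of phi
```

Because the product of `ALPHA`, `BETA` squared and `GAMMA` squared is close to 2, each unit increase of `phi` roughly doubles the compute, which matches the $2^N$ budget described in the text.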
The base EfficientNet-B0 network is based on the inverted bottleneck residual blocks of [MobileNetV2](https://paperswithcode.com/method/mobilenetv2), in addition to squeeze-and-excitation blocks.
This collection of models amends EfficientNet by adding [CondConv](https://paperswithcode.com/method/condconv) convolutions.
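As a rough illustration of what a CondConv layer does, the sketch below mixes several expert kernels into one per-example kernel, with routing weights computed by a sigmoid over globally pooled features. It uses plain Python lists instead of tensors, the helper names are hypothetical, and the reading of the `_4e` / `_8e` suffixes as "4 experts" / "8 experts" is an assumption.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def routing_weights(pooled, routing_matrix):
    """One sigmoid gate per expert, computed from globally pooled features.

    pooled: list of C floats; routing_matrix: one row of C floats per expert.
    """
    return [sigmoid(sum(p * r for p, r in zip(pooled, row))) for row in routing_matrix]


def mix_experts(expert_kernels, weights):
    """Weighted sum of flattened expert kernels, giving one kernel for this example."""
    size = len(expert_kernels[0])
    return [sum(w * k[i] for w, k in zip(weights, expert_kernels)) for i in range(size)]


pooled = [0.5, -1.0]                          # pooled input features (2 channels)
routing = [[0.0, 0.0], [2.0, 1.0]]            # 2 experts
experts = [[1.0, 0.0, 0.0], [0.0, 2.0, 4.0]]  # flattened toy kernels

gates = routing_weights(pooled, routing)
kernel = mix_experts(experts, gates)
print(gates)   # [0.5, 0.5]
print(kernel)  # [0.5, 1.0, 2.0]
```

The mixed kernel is then applied as an ordinary convolution, so the extra cost per example is the routing and the weighted sum rather than one convolution per expert.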
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/abs-1904-04971,
author = {Brandon Yang and
Gabriel Bender and
Quoc V. Le and
Jiquan Ngiam},
title = {Soft Conditional Computation},
journal = {CoRR},
volume = {abs/1904.04971},
year = {2019},
url = {http://arxiv.org/abs/1904.04971},
archivePrefix = {arXiv},
eprint = {1904.04971},
timestamp = {Thu, 25 Apr 2019 13:55:01 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-1904-04971.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: tf_efficientnet_cc_b1_8e
Metadata:
FLOPs: 370427824
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- CondConv
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 159206198
Tasks:
- Image Classification
ID: tf_efficientnet_cc_b1_8e
LR: 0.256
Crop Pct: '0.882'
Momentum: 0.9
Image Size: '240'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1584
In Collection: TF EfficientNet CondConv
- Name: tf_efficientnet_cc_b0_4e
Metadata:
FLOPs: 224153788
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- CondConv
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 53490940
Tasks:
- Image Classification
ID: tf_efficientnet_cc_b0_4e
LR: 0.256
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1561
In Collection: TF EfficientNet CondConv
- Name: tf_efficientnet_cc_b0_8e
Metadata:
FLOPs: 224158524
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- CondConv
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 96287616
Tasks:
- Image Classification
ID: tf_efficientnet_cc_b0_8e
LR: 0.256
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1572
In Collection: TF EfficientNet CondConv
Collections:
- Name: TF EfficientNet CondConv
Paper:
title: 'CondConv: Conditionally Parameterized Convolutions for Efficient Inference'
url: https://paperswithcode.com/paper/soft-conditional-computation
type: model-index
Type: model-index
-->

@ -0,0 +1,154 @@
# Summary
**EfficientNet** is a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth/width/resolution using a *compound coefficient*. Unlike conventional practice, which scales these factors arbitrarily, the EfficientNet scaling method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. For example, if we want to use $2^N$ times more computational resources, then we can simply increase the network depth by $\alpha ^ N$, width by $\beta ^ N$, and image size by $\gamma ^ N$, where $\alpha, \beta, \gamma$ are constant coefficients determined by a small grid search on the original small model. EfficientNet uses a compound coefficient $\phi$ to uniformly scale network width, depth, and resolution in a principled way.
The compound scaling method is justified by the intuition that if the input image is bigger, then the network needs more layers to increase the receptive field and more channels to capture more fine-grained patterns on the bigger image.
The base EfficientNet-B0 network is based on the inverted bottleneck residual blocks of [MobileNetV2](https://paperswithcode.com/method/mobilenetv2).
EfficientNet-Lite makes EfficientNet more suitable for mobile devices by introducing [ReLU6](https://paperswithcode.com/method/relu6) activation functions and removing [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation).
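For reference, ReLU6 is simply a ReLU clipped at 6. The scalar function below is a minimal sketch of that definition; in practice a framework's built-in activation would be used instead.

```python
def relu6(x):
    """ReLU clipped to the range [0, 6]."""
    return min(max(0.0, x), 6.0)


print([relu6(v) for v in (-2.0, 3.5, 9.0)])  # [0.0, 3.5, 6.0]
```

The bounded output range is one reason this activation is commonly preferred when a model is going to be quantized for mobile deployment.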
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{tan2020efficientnet,
title={EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks},
author={Mingxing Tan and Quoc V. Le},
year={2020},
eprint={1905.11946},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
<!--
Models:
- Name: tf_efficientnet_lite3
Metadata:
FLOPs: 2011534304
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- RELU6
File Size: 33161413
Tasks:
- Image Classification
ID: tf_efficientnet_lite3
Crop Pct: '0.904'
Image Size: '300'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1629
In Collection: TF EfficientNet Lite
- Name: tf_efficientnet_lite4
Metadata:
FLOPs: 5164802912
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- RELU6
File Size: 52558819
Tasks:
- Image Classification
ID: tf_efficientnet_lite4
Crop Pct: '0.92'
Image Size: '380'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1640
In Collection: TF EfficientNet Lite
- Name: tf_efficientnet_lite2
Metadata:
FLOPs: 1068494432
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- RELU6
File Size: 24658687
Tasks:
- Image Classification
ID: tf_efficientnet_lite2
Crop Pct: '0.89'
Image Size: '260'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1618
In Collection: TF EfficientNet Lite
- Name: tf_efficientnet_lite1
Metadata:
FLOPs: 773639520
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- RELU6
File Size: 21939331
Tasks:
- Image Classification
ID: tf_efficientnet_lite1
Crop Pct: '0.882'
Image Size: '240'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1607
In Collection: TF EfficientNet Lite
- Name: tf_efficientnet_lite0
Metadata:
FLOPs: 488052032
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- RELU6
File Size: 18820223
Tasks:
- Image Classification
ID: tf_efficientnet_lite0
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1596
In Collection: TF EfficientNet Lite
Collections:
- Name: TF EfficientNet Lite
Paper:
title: 'EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks'
url: https://paperswithcode.com/paper/efficientnet-rethinking-model-scaling-for
type: model-index
Type: model-index
-->

@ -0,0 +1,503 @@
# Summary
**EfficientNet** is a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth/width/resolution using a *compound coefficient*. Unlike conventional practice, which scales these factors arbitrarily, the EfficientNet scaling method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. For example, if we want to use $2^N$ times more computational resources, then we can simply increase the network depth by $\alpha ^ N$, width by $\beta ^ N$, and image size by $\gamma ^ N$, where $\alpha, \beta, \gamma$ are constant coefficients determined by a small grid search on the original small model. EfficientNet uses a compound coefficient $\phi$ to uniformly scale network width, depth, and resolution in a principled way.
The compound scaling method is justified by the intuition that if the input image is bigger, then the network needs more layers to increase the receptive field and more channels to capture more fine-grained patterns on the bigger image.
The base EfficientNet-B0 network is based on the inverted bottleneck residual blocks of [MobileNetV2](https://paperswithcode.com/method/mobilenetv2), in addition to [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block).
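A squeeze-and-excitation block can be sketched as three steps: squeeze each channel to its mean, pass the result through a small two-layer gate, and rescale the channels. The version below works on nested lists, omits biases, and uses ReLU in the gate as in the original squeeze-and-excitation formulation; those choices and the function name `squeeze_excite` are assumptions for illustration, not the timm implementation.

```python
import math


def squeeze_excite(feature_map, w_reduce, w_expand):
    """Toy squeeze-and-excitation over a list of channels.

    feature_map: list of C channels, each a flat list of spatial values.
    w_reduce: R rows of C weights; w_expand: C rows of R weights.
    """
    # Squeeze: global average pooling per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_map]
    # Excite: reduce -> ReLU -> expand -> sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[g * v for v in ch] for g, ch in zip(gates, feature_map)]


fmap = [[1.0, 3.0], [2.0, 2.0]]  # 2 channels, 2 spatial positions each
out = squeeze_excite(fmap, w_reduce=[[1.0, -1.0]], w_expand=[[0.0], [0.0]])
print(out)  # [[0.5, 1.5], [1.0, 1.0]]
```

With zero expansion weights every gate is 0.5, so each channel is simply halved; learned weights would instead emphasise some channels over others.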
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{tan2020efficientnet,
title={EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks},
author={Mingxing Tan and Quoc V. Le},
year={2020},
eprint={1905.11946},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
<!--
Models:
- Name: tf_efficientnet_b1
Metadata:
FLOPs: 883633200
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 31512534
Tasks:
- Image Classification
ID: tf_efficientnet_b1
LR: 0.256
Crop Pct: '0.882'
Momentum: 0.9
Image Size: '240'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1251
In Collection: TF EfficientNet
- Name: tf_efficientnet_b4
Metadata:
FLOPs: 5749638672
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Training Resources: TPUv3 Cloud TPU
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 77989689
Tasks:
- Image Classification
Training Time: ''
ID: tf_efficientnet_b4
LR: 0.256
Crop Pct: '0.922'
Momentum: 0.9
Image Size: '380'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1281
Config: ''
In Collection: TF EfficientNet
- Name: tf_efficientnet_b2
Metadata:
FLOPs: 1234321170
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 36797929
Tasks:
- Image Classification
ID: tf_efficientnet_b2
LR: 0.256
Crop Pct: '0.89'
Momentum: 0.9
Image Size: '260'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1261
In Collection: TF EfficientNet
- Name: tf_efficientnet_b3
Metadata:
FLOPs: 2275247568
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 49381362
Tasks:
- Image Classification
ID: tf_efficientnet_b3
LR: 0.256
Crop Pct: '0.904'
Momentum: 0.9
Image Size: '300'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1271
In Collection: TF EfficientNet
- Name: tf_efficientnet_b0
Metadata:
FLOPs: 488688572
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Training Resources: TPUv3 Cloud TPU
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 21383997
Tasks:
- Image Classification
Training Time: ''
ID: tf_efficientnet_b0
LR: 0.256
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1241
Config: ''
In Collection: TF EfficientNet
- Name: tf_efficientnet_b5
Metadata:
FLOPs: 13176501888
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 122403150
Tasks:
- Image Classification
ID: tf_efficientnet_b5
LR: 0.256
Crop Pct: '0.934'
Momentum: 0.9
Image Size: '456'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1291
In Collection: TF EfficientNet
- Name: tf_efficientnet_b6
Metadata:
FLOPs: 24180518488
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 173232007
Tasks:
- Image Classification
ID: tf_efficientnet_b6
LR: 0.256
Crop Pct: '0.942'
Momentum: 0.9
Image Size: '528'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1301
In Collection: TF EfficientNet
- Name: tf_efficientnet_b7
Metadata:
FLOPs: 48205304880
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 266850607
Tasks:
- Image Classification
ID: tf_efficientnet_b7
LR: 0.256
Crop Pct: '0.949'
Momentum: 0.9
Image Size: '600'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1312
In Collection: TF EfficientNet
- Name: tf_efficientnet_b8
Metadata:
FLOPs: 80962956270
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 351379853
Tasks:
- Image Classification
ID: tf_efficientnet_b8
LR: 0.256
Crop Pct: '0.954'
Momentum: 0.9
Image Size: '672'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1323
In Collection: TF EfficientNet
- Name: tf_efficientnet_el
Metadata:
FLOPs: 9356616096
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 42800271
Tasks:
- Image Classification
ID: tf_efficientnet_el
Crop Pct: '0.904'
Image Size: '300'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1551
In Collection: TF EfficientNet
- Name: tf_efficientnet_em
Metadata:
FLOPs: 3636607040
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 27933644
Tasks:
- Image Classification
ID: tf_efficientnet_em
Crop Pct: '0.882'
Image Size: '240'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1541
In Collection: TF EfficientNet
- Name: tf_efficientnet_es
Metadata:
FLOPs: 2057577472
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 22008479
Tasks:
- Image Classification
ID: tf_efficientnet_es
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1531
In Collection: TF EfficientNet
- Name: tf_efficientnet_l2_ns_475
Metadata:
FLOPs: 217795669644
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- AutoAugment
- FixRes
- Label Smoothing
- Noisy Student
- RMSProp
- RandAugment
- Weight Decay
Training Resources: TPUv3 Cloud TPU
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 1925950424
Tasks:
- Image Classification
Training Time: ''
ID: tf_efficientnet_l2_ns_475
LR: 0.128
Dropout: 0.5
Crop Pct: '0.936'
Momentum: 0.9
Image Size: '475'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Stochastic Depth Survival: 0.8
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1509
Config: ''
In Collection: TF EfficientNet
Collections:
- Name: TF EfficientNet
Paper:
title: 'EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks'
url: https://paperswithcode.com/paper/efficientnet-rethinking-model-scaling-for
type: model-index
Type: model-index
-->

@ -0,0 +1,78 @@
# Summary
**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements, including [Label Smoothing](https://paperswithcode.com/method/label-smoothing), factorized 7 x 7 convolutions, and an [auxiliary classifier](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with batch normalization for layers in the side head). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
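Label smoothing replaces the one-hot training target with a mixture of the one-hot vector and a uniform distribution. The sketch below assumes a smoothing factor of 0.1, which is the value used in the Inception v3 paper; the function name `smooth_labels` is hypothetical.

```python
def smooth_labels(target, num_classes, epsilon=0.1):
    """Mix a one-hot target with a uniform distribution over all classes."""
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) * (1.0 if k == target else 0.0) + uniform
        for k in range(num_classes)
    ]


dist = smooth_labels(target=2, num_classes=4, epsilon=0.1)
print([round(p, 3) for p in dist])  # [0.025, 0.025, 0.925, 0.025]
```

The smoothed targets still sum to 1, but the true class no longer has probability exactly 1, which discourages the network from producing over-confident logits.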
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/SzegedyVISW15,
author = {Christian Szegedy and
Vincent Vanhoucke and
Sergey Ioffe and
Jonathon Shlens and
Zbigniew Wojna},
title = {Rethinking the Inception Architecture for Computer Vision},
journal = {CoRR},
volume = {abs/1512.00567},
year = {2015},
url = {http://arxiv.org/abs/1512.00567},
archivePrefix = {arXiv},
eprint = {1512.00567},
timestamp = {Mon, 13 Aug 2018 16:49:07 +0200},
biburl = {https://dblp.org/rec/journals/corr/SzegedyVISW15.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: tf_inception_v3
Metadata:
FLOPs: 7352418880
Training Data:
- ImageNet
Training Techniques:
- Gradient Clipping
- Label Smoothing
- RMSProp
- Weight Decay
Training Resources: 50x NVIDIA Kepler GPUs
Architecture:
- 1x1 Convolution
- Auxiliary Classifier
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inception-v3 Module
- Max Pooling
- ReLU
- Softmax
File Size: 95549439
Tasks:
- Image Classification
ID: tf_inception_v3
LR: 0.045
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/inception_v3.py#L449
In Collection: TF Inception v3
Collections:
- Name: TF Inception v3
Paper:
title: Rethinking the Inception Architecture for Computer Vision
url: https://paperswithcode.com/paper/rethinking-the-inception-architecture-for
type: model-index
Type: model-index
-->

@ -0,0 +1,108 @@
# Summary
**MixNet** is a type of convolutional neural network discovered via AutoML that utilises [MixConvs](https://paperswithcode.com/method/mixconv) instead of regular [depthwise convolutions](https://paperswithcode.com/method/depthwise-convolution).
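The core idea of MixConv is to partition the channels into groups and give each group a depthwise convolution with a different kernel size. The sketch below only plans that partition; the equal-split rule with the remainder assigned to the first group is an assumption modelled on common implementations, and both helper names are hypothetical.

```python
def split_channels(num_channels, num_groups):
    """Split channels as evenly as possible; the first group absorbs the remainder."""
    split = [num_channels // num_groups] * num_groups
    split[0] += num_channels - sum(split)
    return split


def mixconv_plan(num_channels, kernel_sizes):
    """Pair each kernel size with the number of channels it would process."""
    groups = split_channels(num_channels, len(kernel_sizes))
    return list(zip(kernel_sizes, groups))


print(mixconv_plan(32, [3, 5, 7]))  # [(3, 12), (5, 10), (7, 10)]
```

Each pair in the plan would correspond to one depthwise convolution, and the outputs of all groups are concatenated back along the channel axis.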
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{tan2019mixconv,
title={MixConv: Mixed Depthwise Convolutional Kernels},
author={Mingxing Tan and Quoc V. Le},
year={2019},
eprint={1907.09595},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: tf_mixnet_l
Metadata:
FLOPs: 688674516
Training Data:
- ImageNet
Training Techniques:
- MNAS
Architecture:
- Batch Normalization
- Dense Connections
- Dropout
- Global Average Pooling
- Grouped Convolution
- MixConv
- Squeeze-and-Excitation Block
- Swish
File Size: 29620756
Tasks:
- Image Classification
ID: tf_mixnet_l
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1720
In Collection: TF MixNet
- Name: tf_mixnet_m
Metadata:
FLOPs: 416633502
Training Data:
- ImageNet
Training Techniques:
- MNAS
Architecture:
- Batch Normalization
- Dense Connections
- Dropout
- Global Average Pooling
- Grouped Convolution
- MixConv
- Squeeze-and-Excitation Block
- Swish
File Size: 20310871
Tasks:
- Image Classification
ID: tf_mixnet_m
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1709
In Collection: TF MixNet
- Name: tf_mixnet_s
Metadata:
FLOPs: 302587678
Training Data:
- ImageNet
Training Techniques:
- MNAS
Architecture:
- Batch Normalization
- Dense Connections
- Dropout
- Global Average Pooling
- Grouped Convolution
- MixConv
- Squeeze-and-Excitation Block
- Swish
File Size: 16738218
Tasks:
- Image Classification
ID: tf_mixnet_s
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1698
In Collection: TF MixNet
Collections:
- Name: TF MixNet
Paper:
title: 'MixConv: Mixed Depthwise Convolutional Kernels'
url: https://paperswithcode.com/paper/mixnet-mixed-depthwise-convolutional-kernels
type: model-index
Type: model-index
-->

@ -0,0 +1,271 @@
# Summary
**MobileNetV3** is a convolutional neural network that is designed for mobile phone CPUs. The network design includes the use of a [hard swish activation](https://paperswithcode.com/method/hard-swish) and [squeeze-and-excitation](https://paperswithcode.com/method/squeeze-and-excitation-block) modules in the [MBConv blocks](https://paperswithcode.com/method/inverted-residual-block).
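Hard swish approximates the swish activation `x * sigmoid(x)` with a piecewise-linear gate built from ReLU6, which avoids evaluating an exponential. The scalar functions below are a minimal sketch of the two definitions for comparison, not the framework implementations.

```python
import math


def relu6(x):
    """ReLU clipped to the range [0, 6]."""
    return min(max(0.0, x), 6.0)


def hard_swish(x):
    """Piecewise-linear approximation of swish: x * relu6(x + 3) / 6."""
    return x * relu6(x + 3.0) / 6.0


def swish(x):
    """Reference activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


print(hard_swish(3.0))  # 3.0
print(hard_swish(6.0))  # 6.0
# Near zero the two curves differ only slightly.
print(round(hard_swish(1.0), 3), round(swish(1.0), 3))  # 0.667 0.731
```

For inputs of 3 or more the gate saturates at 1 and the function is the identity, while for inputs of -3 or less the output is zero.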
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/abs-1905-02244,
author = {Andrew Howard and
Mark Sandler and
Grace Chu and
Liang{-}Chieh Chen and
Bo Chen and
Mingxing Tan and
Weijun Wang and
Yukun Zhu and
Ruoming Pang and
Vijay Vasudevan and
Quoc V. Le and
Hartwig Adam},
title = {Searching for MobileNetV3},
journal = {CoRR},
volume = {abs/1905.02244},
year = {2019},
url = {http://arxiv.org/abs/1905.02244},
archivePrefix = {arXiv},
eprint = {1905.02244},
timestamp = {Tue, 12 Jan 2021 15:30:06 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-1905-02244.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: tf_mobilenetv3_large_075
Metadata:
FLOPs: 194323712
Batch Size: 4096
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 4x4 TPU Pod
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Depthwise Separable Convolution
- Dropout
- Global Average Pooling
- Hard Swish
- Inverted Residual Block
- ReLU
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 16097377
Tasks:
- Image Classification
ID: tf_mobilenetv3_large_075
LR: 0.1
Dropout: 0.8
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/mobilenetv3.py#L394
In Collection: TF MobileNet V3
- Name: tf_mobilenetv3_large_100
Metadata:
FLOPs: 274535288
Batch Size: 4096
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 4x4 TPU Pod
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Depthwise Separable Convolution
- Dropout
- Global Average Pooling
- Hard Swish
- Inverted Residual Block
- ReLU
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 22076649
Tasks:
- Image Classification
ID: tf_mobilenetv3_large_100
LR: 0.1
Dropout: 0.8
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/mobilenetv3.py#L403
In Collection: TF MobileNet V3
- Name: tf_mobilenetv3_large_minimal_100
Metadata:
FLOPs: 267216928
Batch Size: 4096
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 4x4 TPU Pod
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Depthwise Separable Convolution
- Dropout
- Global Average Pooling
- Hard Swish
- Inverted Residual Block
- ReLU
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 15836368
Tasks:
- Image Classification
ID: tf_mobilenetv3_large_minimal_100
LR: 0.1
Dropout: 0.8
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/mobilenetv3.py#L412
In Collection: TF MobileNet V3
- Name: tf_mobilenetv3_small_075
Metadata:
FLOPs: 48457664
Batch Size: 4096
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 16x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Depthwise Separable Convolution
- Dropout
- Global Average Pooling
- Hard Swish
- Inverted Residual Block
- ReLU
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 8242701
Tasks:
- Image Classification
ID: tf_mobilenetv3_small_075
LR: 0.045
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 4.0e-05
Interpolation: bilinear
RMSProp Decay: 0.9
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/mobilenetv3.py#L421
In Collection: TF MobileNet V3
- Name: tf_mobilenetv3_small_100
Metadata:
FLOPs: 65450600
Batch Size: 4096
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 16x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Depthwise Separable Convolution
- Dropout
- Global Average Pooling
- Hard Swish
- Inverted Residual Block
- ReLU
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 10256398
Tasks:
- Image Classification
ID: tf_mobilenetv3_small_100
LR: 0.045
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 4.0e-05
Interpolation: bilinear
RMSProp Decay: 0.9
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/mobilenetv3.py#L430
In Collection: TF MobileNet V3
- Name: tf_mobilenetv3_small_minimal_100
Metadata:
FLOPs: 60827936
Batch Size: 4096
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 16x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Dense Connections
- Depthwise Separable Convolution
- Dropout
- Global Average Pooling
- Hard Swish
- Inverted Residual Block
- ReLU
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 8258083
Tasks:
- Image Classification
ID: tf_mobilenetv3_small_minimal_100
LR: 0.045
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 4.0e-05
Interpolation: bilinear
RMSProp Decay: 0.9
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/mobilenetv3.py#L439
In Collection: TF MobileNet V3
Collections:
- Name: TF MobileNet V3
Paper:
title: Searching for MobileNetV3
url: https://paperswithcode.com/paper/searching-for-mobilenetv3
type: model-index
Type: model-index
-->

@ -0,0 +1,255 @@
# Summary
A **TResNet** is a variant of a [ResNet](https://paperswithcode.com/method/resnet) that aims to boost accuracy while maintaining GPU training and inference efficiency. It contains several design tricks, including a SpaceToDepth stem, [Anti-Alias downsampling](https://paperswithcode.com/method/anti-alias-downsampling), In-Place Activated BatchNorm, block selection and [squeeze-and-excitation layers](https://paperswithcode.com/method/squeeze-and-excitation-block).
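The SpaceToDepth stem mentioned above trades spatial resolution for channels without discarding any pixels. The following is a minimal pure-Python sketch of that rearrangement on a single-channel image; the function name `space_to_depth` and the channel ordering are assumptions made for illustration, not the TResNet implementation.

```python
def space_to_depth(image, block):
    """Rearrange an H x W grid into block*block channels of size (H/block) x (W/block).

    `image` is a nested list (one channel). The channel order used here
    (row offset first, then column offset) is an illustrative assumption.
    """
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image sides must be divisible by the block size")
    channels = []
    for dy in range(block):          # row offset inside each block
        for dx in range(block):      # column offset inside each block
            channels.append(
                [[image[y][x] for x in range(dx, width, block)]
                 for y in range(dy, height, block)]
            )
    return channels


# A 4x4 image with block size 2 becomes 4 channels of size 2x2.
image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
stem_input = space_to_depth(image, 2)
print(len(stem_input), len(stem_input[0]), len(stem_input[0][0]))
```

Every input value appears exactly once in the output, so the operation is lossless; only the layout changes.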
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{ridnik2020tresnet,
title={TResNet: High Performance GPU-Dedicated Architecture},
author={Tal Ridnik and Hussam Lawen and Asaf Noy and Emanuel Ben Baruch and Gilad Sharir and Itamar Friedman},
year={2020},
eprint={2003.13630},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: tresnet_l
Metadata:
FLOPs: 10873416792
Epochs: 300
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Cutout
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Anti-Alias Downsampling
- Convolution
- Global Average Pooling
- InPlace-ABN
- Leaky ReLU
- ReLU
- Residual Connection
- Squeeze-and-Excitation Block
File Size: 224440219
Tasks:
- Image Classification
Training Time: ''
ID: tresnet_l
LR: 0.01
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/tresnet.py#L267
Config: ''
In Collection: TResNet
- Name: tresnet_l_448
Metadata:
FLOPs: 43488238584
Epochs: 300
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Cutout
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Anti-Alias Downsampling
- Convolution
- Global Average Pooling
- InPlace-ABN
- Leaky ReLU
- ReLU
- Residual Connection
- Squeeze-and-Excitation Block
File Size: 224440219
Tasks:
- Image Classification
Training Time: ''
ID: tresnet_l_448
LR: 0.01
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '448'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/tresnet.py#L285
Config: ''
In Collection: TResNet
- Name: tresnet_m
Metadata:
FLOPs: 5733048064
Epochs: 300
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Cutout
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Anti-Alias Downsampling
- Convolution
- Global Average Pooling
- InPlace-ABN
- Leaky ReLU
- ReLU
- Residual Connection
- Squeeze-and-Excitation Block
File Size: 125861314
Tasks:
- Image Classification
Training Time: < 24 hours
ID: tresnet_m
LR: 0.01
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/tresnet.py#L261
Config: ''
In Collection: TResNet
- Name: tresnet_m_448
Metadata:
FLOPs: 22929743104
Epochs: 300
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Cutout
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Anti-Alias Downsampling
- Convolution
- Global Average Pooling
- InPlace-ABN
- Leaky ReLU
- ReLU
- Residual Connection
- Squeeze-and-Excitation Block
File Size: 125861314
Tasks:
- Image Classification
Training Time: ''
ID: tresnet_m_448
LR: 0.01
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '448'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/tresnet.py#L279
Config: ''
In Collection: TResNet
- Name: tresnet_xl
Metadata:
FLOPs: 15162534034
Epochs: 300
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Cutout
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Anti-Alias Downsampling
- Convolution
- Global Average Pooling
- InPlace-ABN
- Leaky ReLU
- ReLU
- Residual Connection
- Squeeze-and-Excitation Block
File Size: 314378965
Tasks:
- Image Classification
Training Time: ''
ID: tresnet_xl
LR: 0.01
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/tresnet.py#L273
Config: ''
In Collection: TResNet
- Name: tresnet_xl_448
Metadata:
FLOPs: 60641712730
Epochs: 300
Training Data:
- ImageNet
Training Techniques:
- AutoAugment
- Cutout
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA V100 GPUs
Architecture:
- 1x1 Convolution
- Anti-Alias Downsampling
- Convolution
- Global Average Pooling
- InPlace-ABN
- Leaky ReLU
- ReLU
- Residual Connection
- Squeeze-and-Excitation Block
File Size: 224440219
Tasks:
- Image Classification
Training Time: ''
ID: tresnet_xl_448
LR: 0.01
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '448'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/tresnet.py#L291
Config: ''
In Collection: TResNet
Collections:
- Name: TResNet
Paper:
title: 'TResNet: High Performance GPU-Dedicated Architecture'
url: https://paperswithcode.com/paper/tresnet-high-performance-gpu-dedicated
type: model-index
Type: model-index
-->

@ -0,0 +1,278 @@
# Summary
The **Vision Transformer** is a model for image classification that employs a Transformer-like architecture over patches of the image. This includes the use of [Multi-Head Attention](https://paperswithcode.com/method/multi-head-attention), [Scaled Dot-Product Attention](https://paperswithcode.com/method/scaled) and other architectural features seen in the [Transformer](https://paperswithcode.com/method/transformer) architecture traditionally used for NLP.
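To make "a Transformer-like architecture over patches of the image" concrete, the sketch below computes how many tokens the Transformer sees for a given image and patch size. The helper name `vit_sequence_length` is hypothetical, and the extra class token is an assumption based on the standard ViT design.

```python
def vit_sequence_length(image_size: int, patch_size: int, class_token: bool = True) -> int:
    """Number of tokens for a square image split into non-overlapping square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    # One token per patch, plus an optional learned class token (assumed here).
    return patches_per_side ** 2 + (1 if class_token else 0)


# Image and patch sizes taken from the model names listed on this page.
for name, image_size, patch_size in [
    ("vit_base_patch16_224", 224, 16),
    ("vit_base_patch16_384", 384, 16),
    ("vit_base_patch32_384", 384, 32),
]:
    print(name, vit_sequence_length(image_size, patch_size))
```

Because self-attention cost grows with the square of the sequence length, moving from 224 to 384 pixels at the same patch size is considerably more expensive than the pixel count alone suggests.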
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{dosovitskiy2020image,
title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
author={Alexey Dosovitskiy and Lucas Beyer and Alexander Kolesnikov and Dirk Weissenborn and Xiaohua Zhai and Thomas Unterthiner and Mostafa Dehghani and Matthias Minderer and Georg Heigold and Sylvain Gelly and Jakob Uszkoreit and Neil Houlsby},
year={2020},
eprint={2010.11929},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: vit_large_patch16_384
Metadata:
FLOPs: 174702764032
Batch Size: 512
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Cosine Annealing
- Gradient Clipping
- SGD with Momentum
Training Resources: TPUv3
Architecture:
- Attention Dropout
- Convolution
- Dense Connections
- Dropout
- GELU
- Layer Normalization
- Multi-Head Attention
- Scaled Dot-Product Attention
- Tanh Activation
File Size: 1218907013
Tasks:
- Image Classification
Training Time: ''
ID: vit_large_patch16_384
Crop Pct: '1.0'
Momentum: 0.9
Image Size: '384'
Weight Decay: 0.0
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/5f9aff395c224492e9e44248b15f44b5cc095d9c/timm/models/vision_transformer.py#L561
Config: ''
In Collection: Vision Transformer
- Name: vit_base_patch16_224
Metadata:
FLOPs: 67394605056
Epochs: 90
Batch Size: 4096
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Cosine Annealing
- Gradient Clipping
- SGD with Momentum
Training Resources: TPUv3
Architecture:
- Attention Dropout
- Convolution
- Dense Connections
- Dropout
- GELU
- Layer Normalization
- Multi-Head Attention
- Scaled Dot-Product Attention
- Tanh Activation
File Size: 346292833
Tasks:
- Image Classification
Training Time: ''
ID: vit_base_patch16_224
LR: 0.0008
Dropout: 0.0
Crop Pct: '0.9'
Image Size: '224'
Warmup Steps: 10000
Weight Decay: 0.03
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/5f9aff395c224492e9e44248b15f44b5cc095d9c/timm/models/vision_transformer.py#L503
Config: ''
In Collection: Vision Transformer
- Name: vit_base_patch16_384
Metadata:
FLOPs: 49348245504
Batch Size: 512
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Cosine Annealing
- Gradient Clipping
- SGD with Momentum
Training Resources: TPUv3
Architecture:
- Attention Dropout
- Convolution
- Dense Connections
- Dropout
- GELU
- Layer Normalization
- Multi-Head Attention
- Scaled Dot-Product Attention
- Tanh Activation
File Size: 347460194
Tasks:
- Image Classification
Training Time: ''
ID: vit_base_patch16_384
Crop Pct: '1.0'
Momentum: 0.9
Image Size: '384'
Weight Decay: 0.0
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/5f9aff395c224492e9e44248b15f44b5cc095d9c/timm/models/vision_transformer.py#L522
Config: ''
In Collection: Vision Transformer
- Name: vit_large_patch16_224
Metadata:
FLOPs: 119294746624
Batch Size: 512
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Cosine Annealing
- Gradient Clipping
- SGD with Momentum
Training Resources: TPUv3
Architecture:
- Attention Dropout
- Convolution
- Dense Connections
- Dropout
- GELU
- Layer Normalization
- Multi-Head Attention
- Scaled Dot-Product Attention
- Tanh Activation
File Size: 1217350532
Tasks:
- Image Classification
Training Time: ''
ID: vit_large_patch16_224
Crop Pct: '0.9'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/5f9aff395c224492e9e44248b15f44b5cc095d9c/timm/models/vision_transformer.py#L542
Config: ''
In Collection: Vision Transformer
- Name: vit_base_patch32_384
Metadata:
FLOPs: 12656142336
Batch Size: 512
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Cosine Annealing
- Gradient Clipping
- SGD with Momentum
Training Resources: TPUv3
Architecture:
- Attention Dropout
- Convolution
- Dense Connections
- Dropout
- GELU
- Layer Normalization
- Multi-Head Attention
- Scaled Dot-Product Attention
- Tanh Activation
File Size: 353210979
Tasks:
- Image Classification
Training Time: ''
ID: vit_base_patch32_384
Crop Pct: '1.0'
Momentum: 0.9
Image Size: '384'
Weight Decay: 0.0
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/5f9aff395c224492e9e44248b15f44b5cc095d9c/timm/models/vision_transformer.py#L532
Config: ''
In Collection: Vision Transformer
- Name: vit_base_resnet50_384
Metadata:
FLOPs: 49461491712
Batch Size: 512
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Cosine Annealing
- Gradient Clipping
- SGD with Momentum
Training Resources: TPUv3
Architecture:
- Attention Dropout
- Convolution
- Dense Connections
- Dropout
- GELU
- Layer Normalization
- Multi-Head Attention
- Scaled Dot-Product Attention
- Tanh Activation
File Size: 395854632
Tasks:
- Image Classification
Training Time: ''
ID: vit_base_resnet50_384
Crop Pct: '1.0'
Momentum: 0.9
Image Size: '384'
Weight Decay: 0.0
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/5f9aff395c224492e9e44248b15f44b5cc095d9c/timm/models/vision_transformer.py#L653
Config: ''
In Collection: Vision Transformer
- Name: vit_small_patch16_224
Metadata:
FLOPs: 28236450816
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Cosine Annealing
- Gradient Clipping
- SGD with Momentum
Training Resources: TPUv3
Architecture:
- Attention Dropout
- Convolution
- Dense Connections
- Dropout
- GELU
- Layer Normalization
- Multi-Head Attention
- Scaled Dot-Product Attention
- Tanh Activation
File Size: 195031454
Tasks:
- Image Classification
Training Time: ''
ID: vit_small_patch16_224
Crop Pct: '0.9'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/5f9aff395c224492e9e44248b15f44b5cc095d9c/timm/models/vision_transformer.py#L490
Config: ''
In Collection: Vision Transformer
Collections:
- Name: Vision Transformer
Paper:
title: 'An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale'
url: https://paperswithcode.com/paper/an-image-is-worth-16x16-words-transformers-1
type: model-index
Type: model-index
-->

@ -0,0 +1,87 @@
# Summary
**Wide Residual Networks** are a variant on [ResNets](https://paperswithcode.com/method/resnet) where we decrease depth and increase the width of residual networks. This is achieved through the use of [wide residual blocks](https://paperswithcode.com/method/wide-residual-block).
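As a rough illustration of what "increase the width" means for the `_2` variants, the sketch below compares the inner channel count of each bottleneck stage with and without widening. The formula follows the common torchvision-style convention of scaling by `base_width / 64`; treat the helper name `bottleneck_width` and the stage sizes as assumptions rather than a statement of the exact timm code.

```python
def bottleneck_width(planes: int, base_width: int = 64, groups: int = 1) -> int:
    """Inner (3x3 convolution) channel count of a bottleneck block."""
    return int(planes * (base_width / 64.0)) * groups


# Per-stage base planes of a ResNet-50-style network (assumed standard values).
stage_planes = [64, 128, 256, 512]

standard = [bottleneck_width(p) for p in stage_planes]
wide = [bottleneck_width(p, base_width=128) for p in stage_planes]  # widening factor 2

print("standard:", standard)
print("wide x2: ", wide)
```

The depth (number of blocks per stage) stays the same in this sketch; only the inner channel count of each block doubles.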
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/ZagoruykoK16,
author = {Sergey Zagoruyko and
Nikos Komodakis},
title = {Wide Residual Networks},
journal = {CoRR},
volume = {abs/1605.07146},
year = {2016},
url = {http://arxiv.org/abs/1605.07146},
archivePrefix = {arXiv},
eprint = {1605.07146},
timestamp = {Mon, 13 Aug 2018 16:46:42 +0200},
biburl = {https://dblp.org/rec/journals/corr/ZagoruykoK16.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: wide_resnet101_2
Metadata:
FLOPs: 29304929280
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
- Wide Residual Block
File Size: 254695146
Tasks:
- Image Classification
ID: wide_resnet101_2
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/5f9aff395c224492e9e44248b15f44b5cc095d9c/timm/models/resnet.py#L802
In Collection: Wide ResNet
- Name: wide_resnet50_2
Metadata:
FLOPs: 14688058368
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
- Wide Residual Block
File Size: 275853271
Tasks:
- Image Classification
ID: wide_resnet50_2
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/5f9aff395c224492e9e44248b15f44b5cc095d9c/timm/models/resnet.py#L790
In Collection: Wide ResNet
Collections:
- Name: Wide ResNet
Paper:
title: Wide Residual Networks
url: https://paperswithcode.com/paper/wide-residual-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,130 @@
# Summary
**Xception** is a convolutional neural network architecture that relies solely on [depthwise separable convolution layers](https://paperswithcode.com/method/depthwise-separable-convolution).
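The appeal of depthwise separable convolutions is easiest to see by counting weights. The sketch below compares a standard convolution with a depthwise convolution followed by a 1x1 pointwise convolution; the function names are hypothetical and biases are ignored for simplicity.

```python
def conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard kernel x kernel convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise (per-channel) conv plus a 1x1 pointwise conv (no bias)."""
    depthwise = kernel * kernel * c_in   # one kernel x kernel filter per input channel
    pointwise = c_in * c_out             # 1x1 convolution that mixes channels
    return depthwise + pointwise


standard = conv_params(3, 128, 256)
separable = separable_conv_params(3, 128, 256)
print(standard, separable, round(standard / separable, 2))
```

For a 3x3 layer mapping 128 to 256 channels, the separable form needs roughly an order of magnitude fewer weights.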
{% include 'code_snippets.md' %}
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{chollet2017xception,
title={Xception: Deep Learning with Depthwise Separable Convolutions},
author={François Chollet},
year={2017},
eprint={1610.02357},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: xception
Metadata:
FLOPs: 10600506792
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Depthwise Separable Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
File Size: 91675053
Tasks:
- Image Classification
ID: xception
Crop Pct: '0.897'
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/xception.py#L229
In Collection: Xception
- Name: xception41
Metadata:
FLOPs: 11681983232
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Depthwise Separable Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
File Size: 108422028
Tasks:
- Image Classification
ID: xception41
Crop Pct: '0.903'
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/xception_aligned.py#L181
In Collection: Xception
- Name: xception65
Metadata:
FLOPs: 17585702144
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Depthwise Separable Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
File Size: 160536780
Tasks:
- Image Classification
ID: xception65
Crop Pct: '0.903'
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/xception_aligned.py#L200
In Collection: Xception
- Name: xception71
Metadata:
FLOPs: 22817346560
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Depthwise Separable Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
File Size: 170295556
Tasks:
- Image Classification
ID: xception71
Crop Pct: '0.903'
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/xception_aligned.py#L219
In Collection: Xception
Collections:
- Name: Xception
Paper:
title: 'Xception: Deep Learning with Depthwise Separable Convolutions'
url: https://paperswithcode.com/paper/xception-deep-learning-with-depthwise
type: model-index
Type: model-index
-->

@ -0,0 +1,60 @@
import argparse
from pathlib import Path

from jinja2 import Environment, FileSystemLoader

import modelindex


def generate_readmes(templates_path: Path, dest_path: Path):
    """Add the code snippet template to the readmes"""
    readme_templates_path = templates_path / "models"
    code_template_path = templates_path / "code_snippets.md"
    env = Environment(
        loader=FileSystemLoader([readme_templates_path, readme_templates_path.parent]),
    )
    for readme in readme_templates_path.iterdir():
        if readme.suffix == ".md":
            template = env.get_template(readme.name)
            # get the first model_name for this model family
            mi = modelindex.load(str(readme))
            model_name = mi.models[0].name
            full_content = template.render(model_name=model_name)
            # generate full_readme
            with open(dest_path / readme.name, "w") as f:
                f.write(full_content)


def main():
    parser = argparse.ArgumentParser(description="Model index generation config")
    parser.add_argument(
        "-t",
        "--templates",
        default=Path(__file__).parent / ".templates",
        type=str,
        help="Location of the markdown templates",
    )
    parser.add_argument(
        "-d",
        "--dest",
        default=Path(__file__).parent / "models",
        type=str,
        help="Destination folder that contains the generated model-index files.",
    )
    args = parser.parse_args()
    templates_path = Path(args.templates)
    dest_readmes_path = Path(args.dest)
    generate_readmes(
        templates_path,
        dest_readmes_path,
    )


if __name__ == "__main__":
    main()

@ -0,0 +1,14 @@
Import:
- ./models/*.md
Library:
  Name: PyTorch Image Models
  Headline: PyTorch image models, scripts, pretrained weights
  Website: https://rwightman.github.io/pytorch-image-models/
  Repository: https://github.com/rwightman/pytorch-image-models
  Docs: https://rwightman.github.io/pytorch-image-models/
  README: "# PyTorch Image Models\r\n\r\nPyTorch Image Models (TIMM) is a library\
    \ for state-of-the-art image classification. With this library you can:\r\n\r\n\
    - Choose from 300+ pre-trained state-of-the-art image classification models.\r\
    \n- Train models afresh on research datasets such as ImageNet using provided scripts.\r\
    \n- Finetune pre-trained models on your own datasets, including the latest cutting\
    \ edge models."

@ -0,0 +1,150 @@
# Summary
**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements including using [Label Smoothing](https://paperswithcode.com/method/label-smoothing), Factorized 7 x 7 convolutions, and the use of an [auxiliary classifier](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
This particular model was trained for the study of adversarial examples (adversarial training).
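Label smoothing, mentioned in the summary above, replaces the one-hot training target with a slightly softened distribution. A minimal sketch, assuming the usual formulation in which a fraction `epsilon` of the probability mass is spread uniformly over all classes; the function name `smooth_labels` is hypothetical.

```python
def smooth_labels(target: int, num_classes: int, epsilon: float = 0.1) -> list:
    """Return a smoothed target distribution for one example.

    Every class receives epsilon / num_classes; the true class additionally
    keeps the remaining (1 - epsilon) of the probability mass.
    """
    off_value = epsilon / num_classes
    on_value = 1.0 - epsilon + off_value
    return [on_value if i == target else off_value for i in range(num_classes)]


# Tiny 4-class example: the true class is index 2.
print(smooth_labels(target=2, num_classes=4, epsilon=0.1))
```

The smoothed vector still sums to 1, so it can be used directly as the target of a cross-entropy loss.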
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('adv_inception_v3', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
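The transform built above depends on the `Image Size` and `Crop Pct` values listed in the model metadata on this page. The sketch below shows how those two numbers are commonly combined at evaluation time: resize so the shorter side is `floor(image_size / crop_pct)`, then center-crop to `image_size`. This is an assumption about the convention rather than a guarantee for every timm version, and the helper name `eval_resize_size` is hypothetical.

```python
import math


def eval_resize_size(image_size: int, crop_pct: float) -> int:
    """Shorter-side resize target used before a center crop of `image_size`."""
    return int(math.floor(image_size / crop_pct))


# Values from this page's metadata (Image Size: '299', Crop Pct: '0.875'),
# plus the common 224 / 0.875 setting for comparison.
for image_size, crop_pct in [(299, 0.875), (224, 0.875)]:
    print(image_size, crop_pct, eval_resize_size(image_size, crop_pct))
```

Inspecting `config` from `resolve_data_config` is the reliable way to check the values actually used for a given model.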
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `adv_inception_v3`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/), just change the name of the model you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('adv_inception_v3', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/abs-1804-00097,
author = {Alexey Kurakin and
Ian J. Goodfellow and
Samy Bengio and
Yinpeng Dong and
Fangzhou Liao and
Ming Liang and
Tianyu Pang and
Jun Zhu and
Xiaolin Hu and
Cihang Xie and
Jianyu Wang and
Zhishuai Zhang and
Zhou Ren and
Alan L. Yuille and
Sangxia Huang and
Yao Zhao and
Yuzhe Zhao and
Zhonglin Han and
Junjiajia Long and
Yerkebulan Berdibekov and
Takuya Akiba and
Seiya Tokui and
Motoki Abe},
title = {Adversarial Attacks and Defences Competition},
journal = {CoRR},
volume = {abs/1804.00097},
year = {2018},
url = {http://arxiv.org/abs/1804.00097},
archivePrefix = {arXiv},
eprint = {1804.00097},
timestamp = {Thu, 31 Oct 2019 16:31:22 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-1804-00097.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: adv_inception_v3
Metadata:
FLOPs: 7352418880
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Auxiliary Classifier
- Average Pooling
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inception-v3 Module
- Max Pooling
- ReLU
- Softmax
File Size: 95549439
Tasks:
- Image Classification
ID: adv_inception_v3
Crop Pct: '0.875'
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/inception_v3.py#L456
In Collection: Adversarial Inception v3
Collections:
- Name: Adversarial Inception v3
Paper:
title: Adversarial Attacks and Defences Competition
url: https://paperswithcode.com/paper/adversarial-attacks-and-defences-competition
type: model-index
Type: model-index
-->

@ -0,0 +1,445 @@
# Summary
**AdvProp** is an adversarial training scheme that treats adversarial examples as additional training examples in order to prevent overfitting. Key to the method is the use of a separate auxiliary batch norm for adversarial examples, since they follow a different underlying distribution from normal examples.
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('tf_efficientnet_b1_ap', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `tf_efficientnet_b1_ap`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/), just change the name of the model you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('tf_efficientnet_b1_ap', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{xie2020adversarial,
title={Adversarial Examples Improve Image Recognition},
author={Cihang Xie and Mingxing Tan and Boqing Gong and Jiang Wang and Alan Yuille and Quoc V. Le},
year={2020},
eprint={1911.09665},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: tf_efficientnet_b1_ap
Metadata:
FLOPs: 883633200
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 31515350
Tasks:
- Image Classification
ID: tf_efficientnet_b1_ap
LR: 0.256
Crop Pct: '0.882'
Momentum: 0.9
Image Size: '240'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1344
In Collection: AdvProp
- Name: tf_efficientnet_b2_ap
Metadata:
FLOPs: 1234321170
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 36800745
Tasks:
- Image Classification
ID: tf_efficientnet_b2_ap
LR: 0.256
Crop Pct: '0.89'
Momentum: 0.9
Image Size: '260'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1354
In Collection: AdvProp
- Name: tf_efficientnet_b3_ap
Metadata:
FLOPs: 2275247568
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 49384538
Tasks:
- Image Classification
ID: tf_efficientnet_b3_ap
LR: 0.256
Crop Pct: '0.904'
Momentum: 0.9
Image Size: '300'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1364
In Collection: AdvProp
- Name: tf_efficientnet_b4_ap
Metadata:
FLOPs: 5749638672
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 77993585
Tasks:
- Image Classification
ID: tf_efficientnet_b4_ap
LR: 0.256
Crop Pct: '0.922'
Momentum: 0.9
Image Size: '380'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1374
In Collection: AdvProp
- Name: tf_efficientnet_b5_ap
Metadata:
FLOPs: 13176501888
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 122403150
Tasks:
- Image Classification
ID: tf_efficientnet_b5_ap
LR: 0.256
Crop Pct: '0.934'
Momentum: 0.9
Image Size: '456'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1384
In Collection: AdvProp
- Name: tf_efficientnet_b6_ap
Metadata:
FLOPs: 24180518488
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 173237466
Tasks:
- Image Classification
ID: tf_efficientnet_b6_ap
LR: 0.256
Crop Pct: '0.942'
Momentum: 0.9
Image Size: '528'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1394
In Collection: AdvProp
- Name: tf_efficientnet_b7_ap
Metadata:
FLOPs: 48205304880
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 266850607
Tasks:
- Image Classification
ID: tf_efficientnet_b7_ap
LR: 0.256
Crop Pct: '0.949'
Momentum: 0.9
Image Size: '600'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1405
In Collection: AdvProp
- Name: tf_efficientnet_b8_ap
Metadata:
FLOPs: 80962956270
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 351412563
Tasks:
- Image Classification
ID: tf_efficientnet_b8_ap
LR: 0.128
Crop Pct: '0.954'
Momentum: 0.9
Image Size: '672'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1416
In Collection: AdvProp
- Name: tf_efficientnet_b0_ap
Metadata:
FLOPs: 488688572
Epochs: 350
Batch Size: 2048
Training Data:
- ImageNet
Training Techniques:
- AdvProp
- AutoAugment
- Label Smoothing
- RMSProp
- Stochastic Depth
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 21385973
Tasks:
- Image Classification
ID: tf_efficientnet_b0_ap
LR: 0.256
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 1.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Label Smoothing: 0.1
BatchNorm Momentum: 0.99
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1334
In Collection: AdvProp
Collections:
- Name: AdvProp
Paper:
title: Adversarial Examples Improve Image Recognition
url: https://paperswithcode.com/paper/adversarial-examples-improve-image
type: model-index
Type: model-index
-->

@ -0,0 +1,316 @@
# Summary
**Big Transfer (BiT)** is a type of pretraining recipe that pre-trains on a large supervised source dataset, and fine-tunes the weights on the target task. Models are trained on the JFT-300M dataset. The finetuned models contained in this collection are finetuned on ImageNet.
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('resnetv2_152x4_bitm', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
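To make the two steps above (softmax, then `torch.topk`) more concrete, here is a dependency-free sketch of what they compute. The logits are made-up numbers for four classes, and the helper names `softmax` and `top_k` are illustrative only; they are not part of timm or PyTorch.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probabilities, k):
    """Return (index, probability) pairs for the k largest entries."""
    order = sorted(range(len(probabilities)),
                   key=lambda i: probabilities[i], reverse=True)
    return [(i, probabilities[i]) for i in order[:k]]

logits = [2.0, 0.5, -1.0, 3.0]  # hypothetical scores for 4 classes
probs = softmax(logits)
best = top_k(probs, 2)

for index, prob in best:
    print(index, prob)
```

The indices play the role of `top5_catid` and are what you would use to look up names in `categories`.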
Replace the model name with the variant you want to use, e.g. `resnetv2_152x4_bitm`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('resnetv2_152x4_bitm', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{kolesnikov2020big,
title={Big Transfer (BiT): General Visual Representation Learning},
author={Alexander Kolesnikov and Lucas Beyer and Xiaohua Zhai and Joan Puigcerver and Jessica Yung and Sylvain Gelly and Neil Houlsby},
year={2020},
eprint={1912.11370},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: resnetv2_152x4_bitm
Metadata:
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: Cloud TPUv3-512
Architecture:
- 1x1 Convolution
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Group Normalization
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Weight Standardization
File Size: 3746270104
Tasks:
- Image Classification
Training Time: ''
ID: resnetv2_152x4_bitm
Crop Pct: '1.0'
Image Size: '480'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnetv2.py#L465
Config: ''
In Collection: Big Transfer
- Name: resnetv2_152x2_bitm
Metadata:
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Group Normalization
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Weight Standardization
File Size: 945476668
Tasks:
- Image Classification
Training Time: ''
ID: resnetv2_152x2_bitm
Crop Pct: '1.0'
Image Size: '480'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnetv2.py#L458
Config: ''
In Collection: Big Transfer
- Name: resnetv2_50x1_bitm
Metadata:
Epochs: 90
Batch Size: 4096
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: Cloud TPUv3-512
Architecture:
- 1x1 Convolution
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Group Normalization
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Weight Standardization
File Size: 102242668
Tasks:
- Image Classification
Training Time: ''
ID: resnetv2_50x1_bitm
LR: 0.03
Layers: 50
Crop Pct: '1.0'
Momentum: 0.9
Image Size: '480'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnetv2.py#L430
Config: ''
In Collection: Big Transfer
- Name: resnetv2_101x3_bitm
Metadata:
Epochs: 90
Batch Size: 4096
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: Cloud TPUv3-512
Architecture:
- 1x1 Convolution
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Group Normalization
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Weight Standardization
File Size: 1551830100
Tasks:
- Image Classification
Training Time: ''
ID: resnetv2_101x3_bitm
LR: 0.03
Layers: 101
Crop Pct: '1.0'
Momentum: 0.9
Image Size: '480'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnetv2.py#L451
Config: ''
In Collection: Big Transfer
- Name: resnetv2_50x3_bitm
Metadata:
Epochs: 90
Batch Size: 4096
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: Cloud TPUv3-512
Architecture:
- 1x1 Convolution
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Group Normalization
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Weight Standardization
File Size: 869321580
Tasks:
- Image Classification
Training Time: ''
ID: resnetv2_50x3_bitm
LR: 0.03
Layers: 50
Crop Pct: '1.0'
Momentum: 0.9
Image Size: '480'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnetv2.py#L437
Config: ''
In Collection: Big Transfer
- Name: resnetv2_101x1_bitm
Metadata:
Epochs: 90
Batch Size: 4096
Training Data:
- ImageNet
- JFT-300M
Training Techniques:
- Mixup
- SGD with Momentum
- Weight Decay
Training Resources: Cloud TPUv3-512
Architecture:
- 1x1 Convolution
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Group Normalization
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Weight Standardization
File Size: 178256468
Tasks:
- Image Classification
Training Time: ''
ID: resnetv2_101x1_bitm
LR: 0.03
Layers: 101
Crop Pct: '1.0'
Momentum: 0.9
Image Size: '480'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/b9843f954b0457af2db4f9dea41a8538f51f5d78/timm/models/resnetv2.py#L444
Config: ''
In Collection: Big Transfer
Collections:
- Name: Big Transfer
Paper:
title: 'Big Transfer (BiT): General Visual Representation Learning'
url: https://paperswithcode.com/paper/large-scale-learning-of-general-visual
type: model-index
Type: model-index
-->

@ -0,0 +1,135 @@
# Summary
**CSPDarknet53** is a convolutional neural network and backbone for object detection that uses [DarkNet-53](https://paperswithcode.com/method/darknet-53). It employs a CSPNet strategy to partition the feature map of the base layer into two parts and then merges them through a cross-stage hierarchy. The use of a split and merge strategy allows for more gradient flow through the network.
This CNN is used as the backbone for [YOLOv4](https://paperswithcode.com/method/yolov4).
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('cspdarknet53', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
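The `config` returned by `resolve_data_config` includes the input size and crop percentage that also appear as `Image Size` and `Crop Pct` in the model metadata. The sketch below shows how those two numbers might translate into a resize followed by a center crop. It assumes the shorter side is resized to `floor(image_size / crop_pct)`; treat that formula and the helper name `eval_resize_and_crop` as assumptions and check `create_transform` for the exact behaviour.

```python
import math

def eval_resize_and_crop(image_size, crop_pct):
    """Return (resize_shorter_side, center_crop_size) for evaluation.

    Assumption: the shorter side is resized to floor(image_size / crop_pct)
    and a square of image_size pixels is then center-cropped.
    """
    resize_to = int(math.floor(image_size / crop_pct))
    return resize_to, image_size

# Values taken from the cspdarknet53 entry in this model index.
resize_to, crop = eval_resize_and_crop(image_size=256, crop_pct=0.887)
print(resize_to, crop)
```

A crop percentage of 1.0 would mean the image is resized straight to the target size with nothing cropped away.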
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `cspdarknet53`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('cspdarknet53', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{bochkovskiy2020yolov4,
title={YOLOv4: Optimal Speed and Accuracy of Object Detection},
author={Alexey Bochkovskiy and Chien-Yao Wang and Hong-Yuan Mark Liao},
year={2020},
eprint={2004.10934},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: cspdarknet53
Metadata:
FLOPs: 8545018880
Batch Size: 128
Training Data:
- ImageNet
Training Techniques:
- CutMix
- Label Smoothing
- Mosaic
- Polynomial Learning Rate Decay
- SGD with Momentum
- Self-Adversarial Training
- Weight Decay
Training Resources: 1x NVIDIA RTX 2070 GPU
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Mish
- Residual Connection
- Softmax
File Size: 110775135
Tasks:
- Image Classification
ID: cspdarknet53
LR: 0.1
Layers: 53
Crop Pct: '0.887'
Momentum: 0.9
Image Size: '256'
Warmup Steps: 1000
Weight Decay: 0.0005
Interpolation: bilinear
Training Steps: 8000000
FPS (GPU RTX 2070): 66
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/cspnet.py#L441
In Collection: CSP DarkNet
Collections:
- Name: CSP DarkNet
Paper:
title: 'YOLOv4: Optimal Speed and Accuracy of Object Detection'
url: https://paperswithcode.com/paper/yolov4-optimal-speed-and-accuracy-of-object
type: model-index
Type: model-index
-->

@ -0,0 +1,133 @@
# Summary
**CSPResNet** is a convolutional neural network where we apply the Cross Stage Partial Network (CSPNet) approach to [ResNet](https://paperswithcode.com/method/resnet). The CSPNet partitions the feature map of the base layer into two parts and then merges them through a cross-stage hierarchy. The use of a split and merge strategy allows for more gradient flow through the network.
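The split and merge idea can be sketched without any framework. In the toy example below, each list element stands for one channel, `toy_block` is a hypothetical stand-in for the residual blocks of a real stage, and the 50/50 split is an assumption. The sketch also leaves out the convolutional transition layers used in the actual architecture.

```python
def csp_stage(channels, block, split_ratio=0.5):
    """Split channels in two, transform one part, then concatenate both."""
    cut = int(len(channels) * split_ratio)
    bypass, active = channels[:cut], channels[cut:]
    # Only the second part goes through the block; the first part skips it.
    return bypass + block(active)

def toy_block(part):
    # Hypothetical stand-in for the residual blocks of a real stage.
    return [2.0 * value for value in part]

features = [1.0, 2.0, 3.0, 4.0]  # one scalar per channel, for illustration
merged = csp_stage(features, toy_block)
print(merged)
```

Because half of the channels bypass the block, they reach the merge unchanged, which is the path the summary credits with extra gradient flow.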
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('cspresnet50', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `cspresnet50`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('cspresnet50', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{wang2019cspnet,
title={CSPNet: A New Backbone that can Enhance Learning Capability of CNN},
author={Chien-Yao Wang and Hong-Yuan Mark Liao and I-Hau Yeh and Yueh-Hua Wu and Ping-Yang Chen and Jun-Wei Hsieh},
year={2019},
eprint={1911.11929},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: cspresnet50
Metadata:
FLOPs: 5924992000
Batch Size: 128
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- Polynomial Learning Rate Decay
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 86679303
Tasks:
- Image Classification
Training Time: ''
ID: cspresnet50
LR: 0.1
Layers: 50
Crop Pct: '0.887'
Momentum: 0.9
Image Size: '256'
Weight Decay: 0.005
Interpolation: bilinear
Training Steps: 8000000
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/cspnet.py#L415
Config: ''
In Collection: CSP ResNet
Collections:
- Name: CSP ResNet
Paper:
title: 'CSPNet: A New Backbone that can Enhance Learning Capability of CNN'
url: https://paperswithcode.com/paper/cspnet-a-new-backbone-that-can-enhance
type: model-index
Type: model-index
-->

@ -0,0 +1,133 @@
# Summary
**CSPResNeXt** is a convolutional neural network where we apply the Cross Stage Partial Network (CSPNet) approach to [ResNeXt](https://paperswithcode.com/method/resnext). The CSPNet partitions the feature map of the base layer into two parts and then merges them through a cross-stage hierarchy. The use of a split and merge strategy allows for more gradient flow through the network.
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('cspresnext50', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `cspresnext50`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('cspresnext50', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{wang2019cspnet,
title={CSPNet: A New Backbone that can Enhance Learning Capability of CNN},
author={Chien-Yao Wang and Hong-Yuan Mark Liao and I-Hau Yeh and Yueh-Hua Wu and Ping-Yang Chen and Jun-Wei Hsieh},
year={2019},
eprint={1911.11929},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: cspresnext50
Metadata:
FLOPs: 3962945536
Batch Size: 128
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- Polynomial Learning Rate Decay
- SGD with Momentum
- Weight Decay
Training Resources: 1x GPU
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 82562887
Tasks:
- Image Classification
Training Time: ''
ID: cspresnext50
LR: 0.1
Layers: 50
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.005
Interpolation: bilinear
Training Steps: 8000000
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/cspnet.py#L430
Config: ''
In Collection: CSP ResNeXt
Collections:
- Name: CSP ResNeXt
Paper:
title: 'CSPNet: A New Backbone that can Enhance Learning Capability of CNN'
url: https://paperswithcode.com/paper/cspnet-a-new-backbone-that-can-enhance
type: model-index
Type: model-index
-->

@ -0,0 +1,322 @@
# Summary
**DenseNet** is a type of convolutional neural network that utilises dense connections between layers, through [Dense Blocks](http://www.paperswithcode.com/method/dense-block), where we connect *all layers* (with matching feature-map sizes) directly with each other. To preserve the feed-forward nature, each layer obtains additional inputs from all preceding layers and passes on its own feature-maps to all subsequent layers.
The **DenseNet Blur** variant in this collection by Ross Wightman employs [Blur Pooling](http://www.paperswithcode.com/method/blur-pooling).
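Because each layer receives the concatenation of all earlier feature maps in its block, the number of input channels grows linearly with depth. The sketch below works through that arithmetic. It assumes every layer adds a fixed number of channels (the growth rate described in the DenseNet paper); the helper names and the numbers are illustrative and are not the configuration of any model in this collection.

```python
def dense_block_inputs(in_channels, num_layers, growth_rate):
    """Input channel count seen by each layer in one dense block."""
    return [in_channels + i * growth_rate for i in range(num_layers)]

def dense_block_output(in_channels, num_layers, growth_rate):
    """Channels after concatenating the block input with every layer's output."""
    return in_channels + num_layers * growth_rate

# Hypothetical small block: 64 input channels, 4 layers, growth rate 32.
per_layer = dense_block_inputs(64, 4, 32)
total = dense_block_output(64, 4, 32)
print(per_layer, total)
```

Each layer itself only produces `growth_rate` new channels, so the layers stay narrow even though their inputs keep widening.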
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('densenetblur121d', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `densenetblur121d`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('densenetblur121d', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/HuangLW16a,
author = {Gao Huang and
Zhuang Liu and
Kilian Q. Weinberger},
title = {Densely Connected Convolutional Networks},
journal = {CoRR},
volume = {abs/1608.06993},
year = {2016},
url = {http://arxiv.org/abs/1608.06993},
archivePrefix = {arXiv},
eprint = {1608.06993},
timestamp = {Mon, 10 Sep 2018 15:49:32 +0200},
biburl = {https://dblp.org/rec/journals/corr/HuangLW16a.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
```BibTeX
@misc{rw2019timm,
author = {Ross Wightman},
title = {PyTorch Image Models},
year = {2019},
publisher = {GitHub},
journal = {GitHub repository},
doi = {10.5281/zenodo.4414861},
howpublished = {\url{https://github.com/rwightman/pytorch-image-models}}
}
```
<!--
Models:
- Name: densenetblur121d
Metadata:
FLOPs: 3947812864
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Blur Pooling
- Convolution
- Dense Block
- Dense Connections
- Dropout
- Max Pooling
- ReLU
- Softmax
File Size: 32456500
Tasks:
- Image Classification
ID: densenetblur121d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/densenet.py#L305
In Collection: DenseNet
- Name: tv_densenet121
Metadata:
FLOPs: 3641843200
Epochs: 90
Batch Size: 32
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Block
- Dense Connections
- Dropout
- Max Pooling
- ReLU
- Softmax
File Size: 32342954
Tasks:
- Image Classification
ID: tv_densenet121
LR: 0.1
Crop Pct: '0.875'
LR Gamma: 0.1
Momentum: 0.9
Image Size: '224'
LR Step Size: 30
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/densenet.py#L379
In Collection: DenseNet
- Name: densenet121
Metadata:
FLOPs: 3641843200
Epochs: 90
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Kaiming Initialization
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Block
- Dense Connections
- Dropout
- Max Pooling
- ReLU
- Softmax
File Size: 32376726
Tasks:
- Image Classification
Training Time: ''
ID: densenet121
LR: 0.1
Layers: 121
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/densenet.py#L295
Config: ''
In Collection: DenseNet
- Name: densenet201
Metadata:
FLOPs: 5514321024
Epochs: 90
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Kaiming Initialization
- Nesterov Accelerated Gradient
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Block
- Dense Connections
- Dropout
- Max Pooling
- ReLU
- Softmax
File Size: 81131730
Tasks:
- Image Classification
ID: densenet201
LR: 0.1
Layers: 201
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/densenet.py#L337
In Collection: DenseNet
- Name: densenet169
Metadata:
FLOPs: 4316945792
Epochs: 90
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Kaiming Initialization
- Nesterov Accelerated Gradient
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Block
- Dense Connections
- Dropout
- Max Pooling
- ReLU
- Softmax
File Size: 57365526
Tasks:
- Image Classification
ID: densenet169
LR: 0.1
Layers: 169
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/densenet.py#L327
In Collection: DenseNet
- Name: densenet161
Metadata:
FLOPs: 9931959264
Epochs: 90
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Kaiming Initialization
- Nesterov Accelerated Gradient
- Weight Decay
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Block
- Dense Connections
- Dropout
- Max Pooling
- ReLU
- Softmax
File Size: 115730790
Tasks:
- Image Classification
ID: densenet161
LR: 0.1
Layers: 161
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/densenet.py#L347
In Collection: DenseNet
Collections:
- Name: DenseNet
Paper:
title: Densely Connected Convolutional Networks
url: https://paperswithcode.com/paper/densely-connected-convolutional-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,543 @@
# Summary
Extending “shallow” skip connections, **Deep Layer Aggregation (DLA)** incorporates more depth and sharing. The authors introduce two structures for deep layer aggregation: iterative deep aggregation (IDA) and hierarchical deep aggregation (HDA). These structures are expressed through an architectural framework, independent of the choice of backbone, for compatibility with current and future networks.
IDA focuses on fusing resolutions and scales while HDA focuses on merging features from all modules and channels. IDA follows the base hierarchy to refine resolution and aggregate scale stage-by-stage. HDA assembles its own hierarchy of tree-structured connections that cross and merge stages to aggregate different levels of representation.
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('dla60', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
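To make the preprocessing step more concrete, the sketch below works through the resize and crop sizes implied by the `Image Size` and `Crop Pct` values in this page's metadata. It assumes the usual evaluation recipe (resize the shorter side to `floor(image_size / crop_pct)`, then center-crop to `image_size`); the exact steps used by `create_transform` may differ between timm versions.

```python
import math

# Values taken from the model metadata on this page.
image_size = 224
crop_pct = 0.875

# Assumed evaluation recipe: resize the shorter side to
# floor(image_size / crop_pct), then center-crop to image_size.
scale_size = int(math.floor(image_size / crop_pct))

# Pixels trimmed from each side of the shorter dimension by the center crop.
border = (scale_size - image_size) // 2

print(scale_size, border)  # 256 16
```

So with these settings the center crop keeps the middle 87.5% of the resized shorter side.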
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `dla60`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
# reset_classifier() modifies the model in place and returns None, so pass num_classes instead
model = timm.create_model('dla60', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{yu2019deep,
title={Deep Layer Aggregation},
author={Fisher Yu and Dequan Wang and Evan Shelhamer and Trevor Darrell},
year={2019},
eprint={1707.06484},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: dla60
Metadata:
FLOPs: 4256251880
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 89560235
Tasks:
- Image Classification
Training Time: ''
ID: dla60
LR: 0.1
Layers: 60
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L394
Config: ''
In Collection: DLA
- Name: dla46_c
Metadata:
FLOPs: 583277288
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 5307963
Tasks:
- Image Classification
Training Time: ''
ID: dla46_c
LR: 0.1
Layers: 46
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L369
Config: ''
In Collection: DLA
- Name: dla102x2
Metadata:
FLOPs: 9343847400
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 167645295
Tasks:
- Image Classification
Training Time: ''
ID: dla102x2
LR: 0.1
Layers: 102
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L426
Config: ''
In Collection: DLA
- Name: dla102
Metadata:
FLOPs: 7192952808
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 135290579
Tasks:
- Image Classification
Training Time: ''
ID: dla102
LR: 0.1
Layers: 102
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L410
Config: ''
In Collection: DLA
- Name: dla102x
Metadata:
FLOPs: 5886821352
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 107552695
Tasks:
- Image Classification
Training Time: ''
ID: dla102x
LR: 0.1
Layers: 102
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L418
Config: ''
In Collection: DLA
- Name: dla169
Metadata:
FLOPs: 11598004200
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 216547113
Tasks:
- Image Classification
Training Time: ''
ID: dla169
LR: 0.1
Layers: 169
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L434
Config: ''
In Collection: DLA
- Name: dla46x_c
Metadata:
FLOPs: 544052200
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 4387641
Tasks:
- Image Classification
Training Time: ''
ID: dla46x_c
LR: 0.1
Layers: 46
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L378
Config: ''
In Collection: DLA
- Name: dla60_res2net
Metadata:
FLOPs: 4147578504
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 84886593
Tasks:
- Image Classification
Training Time: ''
ID: dla60_res2net
Layers: 60
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L346
Config: ''
In Collection: DLA
- Name: dla60_res2next
Metadata:
FLOPs: 3485335272
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 69639245
Tasks:
- Image Classification
Training Time: ''
ID: dla60_res2next
Layers: 60
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L354
Config: ''
In Collection: DLA
- Name: dla34
Metadata:
FLOPs: 3070105576
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 63228658
Tasks:
- Image Classification
Training Time: ''
ID: dla34
LR: 0.1
Layers: 32
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L362
Config: ''
In Collection: DLA
- Name: dla60x
Metadata:
FLOPs: 3544204264
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 70883139
Tasks:
- Image Classification
Training Time: ''
ID: dla60x
LR: 0.1
Layers: 60
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L402
Config: ''
In Collection: DLA
- Name: dla60x_c
Metadata:
FLOPs: 593325032
Epochs: 120
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- DLA Bottleneck Residual Block
- DLA Residual Block
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 5454396
Tasks:
- Image Classification
Training Time: ''
ID: dla60x_c
LR: 0.1
Layers: 60
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dla.py#L386
Config: ''
In Collection: DLA
Collections:
- Name: DLA
Paper:
title: Deep Layer Aggregation
url: https://paperswithcode.com/paper/deep-layer-aggregation
type: model-index
Type: model-index
-->

@ -0,0 +1,270 @@
# Summary
A **Dual Path Network (DPN)** is a convolutional neural network which presents a new topology of connection paths internally. The intuition is that [ResNets](https://paperswithcode.com/method/resnet) enable feature re-usage while DenseNet enables new feature exploration, and both are important for learning good representations. To enjoy the benefits from both path topologies, Dual Path Networks share common features while maintaining the flexibility to explore new features through dual path architectures.
The principal building block is a [DPN Block](https://paperswithcode.com/method/dpn-block).
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('dpn68', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
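For readers unfamiliar with the softmax step, the following plain-Python sketch shows what it does to a vector of raw model outputs (logits). The three logit values are made up for illustration; a real ImageNet model returns 1000 of them.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # made-up scores for three classes
probs = softmax(logits)

# Softmax preserves the ranking of the logits, so the highest score
# still corresponds to the most probable class.
ranking = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
print([round(p, 3) for p in probs], ranking)
```

Because the ranking is unchanged, taking the top-k classes from the probabilities gives the same classes as taking them from the logits; softmax only makes the scores interpretable as probabilities.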
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `dpn68`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
# reset_classifier() modifies the model in place and returns None, so pass num_classes instead
model = timm.create_model('dpn68', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{chen2017dual,
title={Dual Path Networks},
author={Yunpeng Chen and Jianan Li and Huaxin Xiao and Xiaojie Jin and Shuicheng Yan and Jiashi Feng},
year={2017},
eprint={1707.01629},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: dpn68
Metadata:
FLOPs: 2990567880
Batch Size: 1280
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 40x K80 GPUs
Architecture:
- Batch Normalization
- Convolution
- DPN Block
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
File Size: 50761994
Tasks:
- Image Classification
ID: dpn68
LR: 0.316
Layers: 68
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dpn.py#L270
In Collection: DPN
- Name: dpn68b
Metadata:
FLOPs: 2990567880
Batch Size: 1280
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 40x K80 GPUs
Architecture:
- Batch Normalization
- Convolution
- DPN Block
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
File Size: 50781025
Tasks:
- Image Classification
ID: dpn68b
LR: 0.316
Layers: 68
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dpn.py#L278
In Collection: DPN
- Name: dpn92
Metadata:
FLOPs: 8357659624
Batch Size: 1280
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 40x K80 GPUs
Architecture:
- Batch Normalization
- Convolution
- DPN Block
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
File Size: 151248422
Tasks:
- Image Classification
ID: dpn92
LR: 0.316
Layers: 92
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dpn.py#L286
In Collection: DPN
- Name: dpn131
Metadata:
FLOPs: 20586274792
Batch Size: 960
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 40x K80 GPUs
Architecture:
- Batch Normalization
- Convolution
- DPN Block
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
File Size: 318016207
Tasks:
- Image Classification
ID: dpn131
LR: 0.316
Layers: 131
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dpn.py#L302
In Collection: DPN
- Name: dpn107
Metadata:
FLOPs: 23524280296
Batch Size: 1280
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 40x K80 GPUs
Architecture:
- Batch Normalization
- Convolution
- DPN Block
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
File Size: 348612331
Tasks:
- Image Classification
ID: dpn107
LR: 0.316
Layers: 107
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dpn.py#L310
In Collection: DPN
- Name: dpn98
Metadata:
FLOPs: 15003675112
Batch Size: 1280
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 40x K80 GPUs
Architecture:
- Batch Normalization
- Convolution
- DPN Block
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
File Size: 247021307
Tasks:
- Image Classification
ID: dpn98
LR: 0.4
Layers: 98
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/dpn.py#L294
In Collection: DPN
Collections:
- Name: DPN
Paper:
title: Dual Path Networks
url: https://paperswithcode.com/paper/dual-path-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,261 @@
# Summary
An **ECA ResNet** is a variant on a [ResNet](https://paperswithcode.com/method/resnet) that utilises an [Efficient Channel Attention module](https://paperswithcode.com/method/efficient-channel-attention). Efficient Channel Attention is an architectural unit based on [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) that reduces model complexity without dimensionality reduction.
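The summary above does not spell out the mechanism, so here is a rough plain-Python sketch of an Efficient Channel Attention unit as described in the ECA-Net paper: global average pooling per channel, a small 1D convolution across neighbouring channels (with no channel reduction), a sigmoid gate, and a per-channel rescale. The function name `eca`, the nested-list layout, and the kernel weights are assumptions made for illustration; this is not timm's implementation.

```python
import math

def eca(feature_map, kernel):
    """feature_map: list of C channels, each an H x W nested list.
    kernel: odd-length list of 1D convolution weights (hypothetical values)."""
    # Squeeze: one average value per channel (global average pooling).
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]
    # Local cross-channel interaction: 1D convolution with zero padding.
    pad = len(kernel) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [sum(w * padded[c + j] for j, w in enumerate(kernel)) for c in range(len(pooled))]
    # Sigmoid gate in (0, 1) for each channel.
    gates = [1.0 / (1.0 + math.exp(-m)) for m in mixed]
    # Rescale every channel by its gate.
    scaled = [[[g * v for v in row] for row in ch] for ch, g in zip(feature_map, gates)]
    return scaled, gates

# Four 2x2 channels, channel c filled with the value c + 1.
x = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(4)]
out, gates = eca(x, kernel=[0.25, 0.5, 0.25])
print([round(g, 3) for g in gates])
```

The only learnable parameters in this unit are the few kernel weights, which is where the reduction in complexity relative to a squeeze-and-excitation block comes from.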
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('ecaresnet101d', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `ecaresnet101d`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
# reset_classifier() modifies the model in place and returns None, so pass num_classes instead
model = timm.create_model('ecaresnet101d', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{wang2020ecanet,
title={ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks},
author={Qilong Wang and Banggu Wu and Pengfei Zhu and Peihua Li and Wangmeng Zuo and Qinghua Hu},
year={2020},
eprint={1910.03151},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: ecaresnet101d
Metadata:
FLOPs: 10377193728
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 4x RTX 2080Ti GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Efficient Channel Attention
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 178815067
Tasks:
- Image Classification
ID: ecaresnet101d
LR: 0.1
Layers: 101
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1087
In Collection: ECAResNet
- Name: ecaresnet101d_pruned
Metadata:
FLOPs: 4463972081
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: ''
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Efficient Channel Attention
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 99852736
Tasks:
- Image Classification
Training Time: ''
ID: ecaresnet101d_pruned
Layers: 101
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1097
Config: ''
In Collection: ECAResNet
- Name: ecaresnet50d_pruned
Metadata:
FLOPs: 3250730657
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Efficient Channel Attention
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 79990436
Tasks:
- Image Classification
ID: ecaresnet50d_pruned
Layers: 50
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1055
In Collection: ECAResNet
- Name: ecaresnet50d
Metadata:
FLOPs: 5591090432
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 4x RTX 2080Ti GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Efficient Channel Attention
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 102579290
Tasks:
- Image Classification
ID: ecaresnet50d
LR: 0.1
Layers: 50
Crop Pct: '0.875'
Image Size: '224'
Weight Decay: 0.0001
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1045
In Collection: ECAResNet
- Name: ecaresnetlight
Metadata:
FLOPs: 5276118784
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Efficient Channel Attention
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 120956612
Tasks:
- Image Classification
ID: ecaresnetlight
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/resnet.py#L1077
In Collection: ECAResNet
Collections:
- Name: ECAResNet
Paper:
title: 'ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks'
url: https://paperswithcode.com/paper/eca-net-efficient-channel-attention-for-deep
type: model-index
Type: model-index
-->

@ -0,0 +1,184 @@
# Summary
**EfficientNet** is a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth/width/resolution using a *compound coefficient*. Unlike conventional practice, which scales these factors arbitrarily, the EfficientNet scaling method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. For example, if we want to use $2^N$ times more computational resources, then we can simply increase the network depth by $\alpha ^ N$, width by $\beta ^ N$, and image size by $\gamma ^ N$, where $\alpha, \beta, \gamma$ are constant coefficients determined by a small grid search on the original small model. EfficientNet uses a compound coefficient $\phi$ to uniformly scale network width, depth, and resolution in a principled way.
The compound scaling method is justified by the intuition that if the input image is bigger, then the network needs more layers to increase the receptive field and more channels to capture more fine-grained patterns on the bigger image.
The base EfficientNet-B0 network is based on the inverted bottleneck residual blocks of [MobileNetV2](https://paperswithcode.com/method/mobilenetv2), in addition to [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block).
This collection consists of pruned EfficientNet models.
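The scaling rule in this summary can be illustrated numerically. The sketch below assumes the coefficients reported in the EfficientNet paper ($\alpha = 1.2$, $\beta = 1.1$, $\gamma = 1.15$), which are not stated on this page; the function names `compound_scale` and `flops_factor` are hypothetical and are not part of timm.

```python
# Coefficients from the EfficientNet paper's grid search (an assumption here;
# this page does not state them).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_factor(phi):
    """Approximate FLOPs growth: linear in depth, quadratic in width and resolution."""
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2


for phi in range(4):
    depth, width, resolution = compound_scale(phi)
    print(f"phi={phi}: depth x{depth:.2f}, width x{width:.2f}, "
          f"resolution x{resolution:.2f}, FLOPs x{flops_factor(phi):.2f}")
```

With these assumed coefficients, each unit increase of `phi` multiplies the estimated FLOPs by about 1.92, which is the "roughly double the resources" step the summary describes.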
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('efficientnet_b1_pruned', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line, like:
# Samoyed 0.6425196528434753
# Pomeranian 0.04062102362513542
# keeshond 0.03186424449086189
# white wolf 0.01739676296710968
# Eskimo dog 0.011717947199940681
```
Replace the model name with the variant you want to use, e.g. `efficientnet_b1_pruned`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
# Passing num_classes creates the model with a new classifier head of that size
model = timm.create_model('efficientnet_b1_pruned', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{tan2020efficientnet,
title={EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks},
author={Mingxing Tan and Quoc V. Le},
year={2020},
eprint={1905.11946},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
```BibTeX
@misc{rw2019timm,
author = {Ross Wightman},
title = {PyTorch Image Models},
year = {2019},
publisher = {GitHub},
journal = {GitHub repository},
doi = {10.5281/zenodo.4414861},
howpublished = {\url{https://github.com/rwightman/pytorch-image-models}}
}
```
<!--
Models:
- Name: efficientnet_b1_pruned
Metadata:
FLOPs: 489653114
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 25595162
Tasks:
- Image Classification
ID: efficientnet_b1_pruned
Crop Pct: '0.882'
Image Size: '240'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1208
In Collection: EfficientNet Pruned
- Name: efficientnet_b3_pruned
Metadata:
FLOPs: 1239590641
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 39770812
Tasks:
- Image Classification
ID: efficientnet_b3_pruned
Crop Pct: '0.904'
Image Size: '300'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1230
In Collection: EfficientNet Pruned
- Name: efficientnet_b2_pruned
Metadata:
FLOPs: 878133915
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 33555005
Tasks:
- Image Classification
ID: efficientnet_b2_pruned
Crop Pct: '0.89'
Image Size: '260'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1219
In Collection: EfficientNet Pruned
Collections:
- Name: EfficientNet Pruned
Paper:
title: Knapsack Pruning with Inner Distillation
url: https://paperswithcode.com/paper/knapsack-pruning-with-inner-distillation
type: model-index
Type: model-index
-->

@ -0,0 +1,315 @@
# Summary
**EfficientNet** is a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth/width/resolution using a *compound coefficient*. Unlike conventional practice, which scales these factors arbitrarily, the EfficientNet scaling method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. For example, if we want to use $2^N$ times more computational resources, then we can simply increase the network depth by $\alpha ^ N$, width by $\beta ^ N$, and image size by $\gamma ^ N$, where $\alpha, \beta, \gamma$ are constant coefficients determined by a small grid search on the original small model. EfficientNet uses a compound coefficient $\phi$ to uniformly scale network width, depth, and resolution in a principled way.
The compound scaling method is justified by the intuition that if the input image is bigger, then the network needs more layers to increase the receptive field and more channels to capture more fine-grained patterns on the bigger image.
The base EfficientNet-B0 network is based on the inverted bottleneck residual blocks of [MobileNetV2](https://paperswithcode.com/method/mobilenetv2), in addition to [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block).
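For reference, the compound scaling rule summarised above is usually written as a constrained problem. The form below follows the EfficientNet paper rather than anything stated on this page, with $d$, $w$, and $r$ denoting the depth, width, and resolution multipliers:

$$
d = \alpha^{\phi}, \quad w = \beta^{\phi}, \quad r = \gamma^{\phi}, \qquad \text{s.t.} \quad \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2, \quad \alpha \ge 1,\ \beta \ge 1,\ \gamma \ge 1
$$

Since the FLOPs of a convolutional network grow roughly linearly with depth and quadratically with width and resolution, this constraint means the total cost scales by about $(\alpha \beta^{2} \gamma^{2})^{\phi} \approx 2^{\phi}$, which matches the $2^N$ budget mentioned in the summary.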
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('efficientnet_b2a', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line, like:
# Samoyed 0.6425196528434753
# Pomeranian 0.04062102362513542
# keeshond 0.03186424449086189
# white wolf 0.01739676296710968
# Eskimo dog 0.011717947199940681
```
Replace the model name with the variant you want to use, e.g. `efficientnet_b2a`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
# Passing num_classes creates the model with a new classifier head of that size
model = timm.create_model('efficientnet_b2a', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{tan2020efficientnet,
title={EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks},
author={Mingxing Tan and Quoc V. Le},
year={2020},
eprint={1905.11946},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
<!--
Models:
- Name: efficientnet_b2a
Metadata:
FLOPs: 1452041554
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 49369973
Tasks:
- Image Classification
ID: efficientnet_b2a
Crop Pct: '1.0'
Image Size: '288'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1029
In Collection: EfficientNet
- Name: efficientnet_b3a
Metadata:
FLOPs: 2600628304
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 49369973
Tasks:
- Image Classification
ID: efficientnet_b3a
Crop Pct: '1.0'
Image Size: '320'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1047
In Collection: EfficientNet
- Name: efficientnet_em
Metadata:
FLOPs: 3935516480
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 27927309
Tasks:
- Image Classification
ID: efficientnet_em
Crop Pct: '0.882'
Image Size: '240'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1118
In Collection: EfficientNet
- Name: efficientnet_lite0
Metadata:
FLOPs: 510605024
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 18820005
Tasks:
- Image Classification
ID: efficientnet_lite0
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1163
In Collection: EfficientNet
- Name: efficientnet_es
Metadata:
FLOPs: 2317181824
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 22003339
Tasks:
- Image Classification
ID: efficientnet_es
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1110
In Collection: EfficientNet
- Name: efficientnet_b3
Metadata:
FLOPs: 2327905920
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 49369973
Tasks:
- Image Classification
ID: efficientnet_b3
Crop Pct: '0.904'
Image Size: '300'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1038
In Collection: EfficientNet
- Name: efficientnet_b0
Metadata:
FLOPs: 511241564
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 21376743
Tasks:
- Image Classification
ID: efficientnet_b0
Layers: 18
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1002
In Collection: EfficientNet
- Name: efficientnet_b1
Metadata:
FLOPs: 909691920
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 31502706
Tasks:
- Image Classification
ID: efficientnet_b1
Crop Pct: '0.875'
Image Size: '240'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1011
In Collection: EfficientNet
- Name: efficientnet_b2
Metadata:
FLOPs: 1265324514
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inverted Residual Block
- Squeeze-and-Excitation Block
- Swish
File Size: 36788104
Tasks:
- Image Classification
ID: efficientnet_b2
Crop Pct: '0.875'
Image Size: '260'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/a7f95818e44b281137503bcf4b3e3e94d8ffa52f/timm/models/efficientnet.py#L1020
In Collection: EfficientNet
Collections:
- Name: EfficientNet
Paper:
title: 'EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks'
url: https://paperswithcode.com/paper/efficientnet-rethinking-model-scaling-for
type: model-index
Type: model-index
-->

@ -0,0 +1,150 @@
# Summary
**Inception-ResNet-v2** is a convolutional neural architecture that builds on the Inception family of architectures but incorporates [residual connections](https://paperswithcode.com/method/residual-connection) (replacing the filter concatenation stage of the Inception architecture).
This particular model was trained for the study of adversarial examples (adversarial training).
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('ens_adv_inception_resnet_v2', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
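The `config` resolved above carries the model's input size, interpolation, and crop percentage (for this model, image size 299 and crop percentage 0.897 according to the model metadata on this page). As a rough sketch of how those two numbers relate, and assuming timm's evaluation transform resizes the shorter image side to `floor(image_size / crop_pct)` before taking a centre crop, the intermediate resize target can be computed as follows. The helper name `eval_resize_size` is hypothetical and is not a timm function.

```python
import math


def eval_resize_size(image_size, crop_pct):
    """Size the shorter image side is resized to before the centre crop.

    Assumes the resize target is floor(image_size / crop_pct).
    """
    return int(math.floor(image_size / crop_pct))


# Values taken from the model metadata on this page.
image_size, crop_pct = 299, 0.897
resize_to = eval_resize_size(image_size, crop_pct)
print(f"resize to {resize_to}, then centre-crop to {image_size}")
```

Under this assumption, a crop percentage below 1.0 means the network sees only the central portion of the resized image, which is why models on this page list both values.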
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line, like:
# Samoyed 0.6425196528434753
# Pomeranian 0.04062102362513542
# keeshond 0.03186424449086189
# white wolf 0.01739676296710968
# Eskimo dog 0.011717947199940681
```
Replace the model name with the variant you want to use, e.g. `ens_adv_inception_resnet_v2`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
# Passing num_classes creates the model with a new classifier head of that size
model = timm.create_model('ens_adv_inception_resnet_v2', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/abs-1804-00097,
author = {Alexey Kurakin and
Ian J. Goodfellow and
Samy Bengio and
Yinpeng Dong and
Fangzhou Liao and
Ming Liang and
Tianyu Pang and
Jun Zhu and
Xiaolin Hu and
Cihang Xie and
Jianyu Wang and
Zhishuai Zhang and
Zhou Ren and
Alan L. Yuille and
Sangxia Huang and
Yao Zhao and
Yuzhe Zhao and
Zhonglin Han and
Junjiajia Long and
Yerkebulan Berdibekov and
Takuya Akiba and
Seiya Tokui and
Motoki Abe},
title = {Adversarial Attacks and Defences Competition},
journal = {CoRR},
volume = {abs/1804.00097},
year = {2018},
url = {http://arxiv.org/abs/1804.00097},
archivePrefix = {arXiv},
eprint = {1804.00097},
timestamp = {Thu, 31 Oct 2019 16:31:22 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-1804-00097.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: ens_adv_inception_resnet_v2
Metadata:
FLOPs: 16959133120
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Auxiliary Classifier
- Average Pooling
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inception-v3 Module
- Max Pooling
- ReLU
- Softmax
File Size: 223774238
Tasks:
- Image Classification
ID: ens_adv_inception_resnet_v2
Crop Pct: '0.897'
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/inception_resnet_v2.py#L351
In Collection: Ensemble Adversarial
Collections:
- Name: Ensemble Adversarial
Paper:
title: Adversarial Attacks and Defences Competition
url: https://paperswithcode.com/paper/adversarial-attacks-and-defences-competition
type: model-index
Type: model-index
-->

@ -0,0 +1,138 @@
# Summary
**VoVNet** is a convolutional neural network that seeks to make [DenseNet](https://paperswithcode.com/method/densenet) more efficient by concatenating all features only once, in the last feature map, which keeps the input size constant and makes it possible to enlarge the number of output channels.
Read about [one-shot aggregation here](https://paperswithcode.com/method/one-shot-aggregation).
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('ese_vovnet39b', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line, like:
# Samoyed 0.6425196528434753
# Pomeranian 0.04062102362513542
# keeshond 0.03186424449086189
# white wolf 0.01739676296710968
# Eskimo dog 0.011717947199940681
```
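To make the two tensor operations above concrete, here is a dependency-free sketch of what `softmax` and `topk` compute, using a hypothetical four-class score list rather than real model output. It is an illustration only and does not reflect how PyTorch implements them internally.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def topk(values, k):
    """Return (top values, their indices), largest first."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in order], order


# Hypothetical logits and class names, for illustration only.
logits = [2.0, 0.5, 4.0, 1.0]
categories = ["cat", "dog", "Samoyed", "fox"]

probabilities = softmax(logits)
top_prob, top_idx = topk(probabilities, 2)
for p, i in zip(top_prob, top_idx):
    print(categories[i], round(p, 4))
```

The indices returned by `topk` play the same role as `top5_catid` in the snippet above: they are positions in the class list, which is why they can be used to look up category names.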
Replace the model name with the variant you want to use, e.g. `ese_vovnet39b`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
# Passing num_classes creates the model with a new classifier head of that size
model = timm.create_model('ese_vovnet39b', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{lee2019energy,
title={An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection},
author={Youngwan Lee and Joong-won Hwang and Sangrok Lee and Yuseok Bae and Jongyoul Park},
year={2019},
eprint={1904.09730},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: ese_vovnet39b
Metadata:
FLOPs: 9089259008
Training Data:
- ImageNet
Architecture:
- Batch Normalization
- Convolution
- Max Pooling
- One-Shot Aggregation
- ReLU
File Size: 98397138
Tasks:
- Image Classification
ID: ese_vovnet39b
Layers: 39
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/vovnet.py#L371
In Collection: ESE VovNet
- Name: ese_vovnet19b_dw
Metadata:
FLOPs: 1711959904
Training Data:
- ImageNet
Architecture:
- Batch Normalization
- Convolution
- Max Pooling
- One-Shot Aggregation
- ReLU
File Size: 26243175
Tasks:
- Image Classification
ID: ese_vovnet19b_dw
Layers: 19
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/vovnet.py#L361
In Collection: ESE VovNet
Collections:
- Name: ESE VovNet
Paper:
title: 'CenterMask : Real-Time Anchor-Free Instance Segmentation'
url: https://paperswithcode.com/paper/centermask-real-time-anchor-free-instance-1
type: model-index
Type: model-index
-->

@ -0,0 +1,130 @@
# Summary
**FBNet** is a family of convolutional neural architectures discovered through [DNAS](https://paperswithcode.com/method/dnas) neural architecture search. It uses a basic image model block inspired by [MobileNetv2](https://paperswithcode.com/method/mobilenetv2) that relies on depthwise convolutions and an inverted residual structure (see components).
The principal building block is the [FBNet Block](https://paperswithcode.com/method/fbnet-block).
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('fbnetc_100', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line, like:
# Samoyed 0.6425196528434753
# Pomeranian 0.04062102362513542
# keeshond 0.03186424449086189
# white wolf 0.01739676296710968
# Eskimo dog 0.011717947199940681
```
Replace the model name with the variant you want to use, e.g. `fbnetc_100`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
# Passing num_classes creates the model with a new classifier head of that size
model = timm.create_model('fbnetc_100', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
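As a starting point for such a loop, here is a minimal sketch in plain PyTorch. To keep it runnable without downloading weights or a dataset, it uses a small stand-in model and random tensors, both of which are assumptions for illustration; in practice you would substitute the timm model and your own `DataLoader`. The optimizer settings are placeholders, not tuned values.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

NUM_FINETUNE_CLASSES = 3  # hypothetical number of target classes

# Stand-in model so the sketch runs without downloading weights (an assumption).
# In practice, replace it with something like:
#   timm.create_model('fbnetc_100', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, NUM_FINETUNE_CLASSES))

# Random tensors standing in for a real labelled image dataset.
torch.manual_seed(0)
images = torch.randn(16, 3, 8, 8)
labels = torch.randint(0, NUM_FINETUNE_CLASSES, (16,))
loader = DataLoader(TensorDataset(images, labels), batch_size=4, shuffle=True)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

epoch_losses = []
model.train()
for epoch in range(3):
    running = 0.0
    for batch_images, batch_labels in loader:
        optimizer.zero_grad()                     # clear gradients from the last step
        loss = criterion(model(batch_images), batch_labels)
        loss.backward()                           # backpropagate
        optimizer.step()                          # update the weights
        running += loss.item()
    epoch_losses.append(running / len(loader))    # mean loss over the epoch
print(epoch_losses)
```

For real images, the batches should be preprocessed with the same transform returned by `create_transform` for the chosen model, so that the inputs match what the pretrained weights expect.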
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{wu2019fbnet,
title={FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search},
author={Bichen Wu and Xiaoliang Dai and Peizhao Zhang and Yanghan Wang and Fei Sun and Yiming Wu and Yuandong Tian and Peter Vajda and Yangqing Jia and Kurt Keutzer},
year={2019},
eprint={1812.03443},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: fbnetc_100
Metadata:
FLOPs: 508940064
Epochs: 360
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- SGD with Momentum
- Weight Decay
Training Resources: 8x GPUs
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Dropout
- FBNet Block
- Global Average Pooling
- Softmax
File Size: 22525094
Tasks:
- Image Classification
ID: fbnetc_100
LR: 0.1
Layers: 22
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.0005
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L985
In Collection: FBNet
Collections:
- Name: FBNet
Paper:
title: 'FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural
Architecture Search'
url: https://paperswithcode.com/paper/fbnet-hardware-aware-efficient-convnet-design
type: model-index
Type: model-index
-->

@ -0,0 +1,132 @@
# Summary
**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements, including [Label Smoothing](https://paperswithcode.com/method/label-smoothing), factorized 7 x 7 convolutions, and an [auxiliary classifier](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
The weights from this model were ported from Gluon.
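As a small illustration of the label smoothing mentioned above, the sketch below builds a smoothed target distribution for one example by mixing the one-hot label with a uniform distribution over the classes. The function name `smooth_labels`, the five-class setup, and the smoothing value of 0.1 are illustrative assumptions, not settings taken from this page.

```python
def smooth_labels(true_class, num_classes, epsilon=0.1):
    """Return a smoothed target distribution for one example.

    Every class receives epsilon / num_classes, and the true class
    additionally receives the remaining 1 - epsilon of the mass.
    """
    uniform = epsilon / num_classes
    target = [uniform] * num_classes
    target[true_class] += 1.0 - epsilon
    return target


# Hypothetical example: 5 classes, true class at index 2.
target = smooth_labels(true_class=2, num_classes=5)
print(target)
```

The result is still a valid probability distribution, but the true class no longer has a target of exactly 1, which discourages the network from becoming over-confident.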
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('gluon_inception_v3', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line, like:
# Samoyed 0.6425196528434753
# Pomeranian 0.04062102362513542
# keeshond 0.03186424449086189
# white wolf 0.01739676296710968
# Eskimo dog 0.011717947199940681
```
Replace the model name with the variant you want to use, e.g. `gluon_inception_v3`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
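# reset_classifier() modifies the model in place and returns None,
# so pass num_classes to create_model instead of chaining the call
model = timm.create_model('gluon_inception_v3', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)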
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/SzegedyVISW15,
author = {Christian Szegedy and
Vincent Vanhoucke and
Sergey Ioffe and
Jonathon Shlens and
Zbigniew Wojna},
title = {Rethinking the Inception Architecture for Computer Vision},
journal = {CoRR},
volume = {abs/1512.00567},
year = {2015},
url = {http://arxiv.org/abs/1512.00567},
archivePrefix = {arXiv},
eprint = {1512.00567},
timestamp = {Mon, 13 Aug 2018 16:49:07 +0200},
biburl = {https://dblp.org/rec/journals/corr/SzegedyVISW15.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: gluon_inception_v3
Metadata:
FLOPs: 7352418880
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Auxiliary Classifier
- Average Pooling
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inception-v3 Module
- Max Pooling
- ReLU
- Softmax
File Size: 95567055
Tasks:
- Image Classification
ID: gluon_inception_v3
Crop Pct: '0.875'
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/inception_v3.py#L464
In Collection: Gloun Inception v3
Collections:
- Name: Gloun Inception v3
Paper:
title: Rethinking the Inception Architecture for Computer Vision
url: https://paperswithcode.com/paper/rethinking-the-inception-architecture-for
type: model-index
Type: model-index
-->

@ -0,0 +1,454 @@
# Summary
**Residual Networks**, or **ResNets**, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Instead of hoping that every few stacked layers directly fit a desired underlying mapping, residual nets let these layers fit a residual mapping. They stack [residual blocks](https://paperswithcode.com/method/residual-block) on top of each other to form a network: e.g. a ResNet-50 has fifty layers using these blocks.
The weights from this model were ported from Gluon.
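The idea of fitting a residual mapping can be sketched in a few lines of plain Python: the block outputs `F(x) + x`, so the stacked layers only need to learn the correction `F(x)`. The functions below are hypothetical stand-ins for real layers and are not part of timm.

```python
def residual_block(x, residual_fn):
    """Return residual_fn(x) + x, element by element (plain-Python sketch)."""
    fx = residual_fn(x)
    return [f + xi for f, xi in zip(fx, x)]


def zero_residual(x):
    # A residual branch that outputs zeros: the block reduces to the identity.
    return [0.0 for _ in x]


def small_correction(x):
    # Hypothetical stand-in for the stacked layers F(x).
    return [0.1 * xi for xi in x]


x = [1.0, -2.0, 3.0]
identity_out = residual_block(x, zero_residual)      # same values as x
corrected_out = residual_block(x, small_correction)  # x plus a small correction
print(identity_out, corrected_out)
```

When the residual branch outputs zeros, the block passes its input through unchanged, which is why an identity mapping is easy for such a block to represent.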
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('gluon_resnet101_v1b', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print the top-5 categories for the image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line, for example
# (exact values depend on the model):
# Samoyed 0.6425196528434753
# Pomeranian 0.04062102362513542
# keeshond 0.03186424449086189
# white wolf 0.01739676296710968
# Eskimo dog 0.011717947199940681
```
Replace the model name with the variant you want to use, e.g. `gluon_resnet101_v1b`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
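# reset_classifier() modifies the model in place and returns None,
# so pass num_classes to create_model instead of chaining the call
model = timm.create_model('gluon_resnet101_v1b', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)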
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/HeZRS15,
author = {Kaiming He and
Xiangyu Zhang and
Shaoqing Ren and
Jian Sun},
title = {Deep Residual Learning for Image Recognition},
journal = {CoRR},
volume = {abs/1512.03385},
year = {2015},
url = {http://arxiv.org/abs/1512.03385},
archivePrefix = {arXiv},
eprint = {1512.03385},
timestamp = {Wed, 17 Apr 2019 17:23:45 +0200},
biburl = {https://dblp.org/rec/journals/corr/HeZRS15.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: gluon_resnet101_v1b
Metadata:
FLOPs: 10068547584
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 178723172
Tasks:
- Image Classification
ID: gluon_resnet101_v1b
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L89
In Collection: Gloun ResNet
- Name: gluon_resnet101_v1s
Metadata:
FLOPs: 11805511680
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 179221777
Tasks:
- Image Classification
ID: gluon_resnet101_v1s
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L166
In Collection: Gloun ResNet
- Name: gluon_resnet101_v1c
Metadata:
FLOPs: 10376567296
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 178802575
Tasks:
- Image Classification
ID: gluon_resnet101_v1c
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L113
In Collection: Gloun ResNet
- Name: gluon_resnet152_v1c
Metadata:
FLOPs: 15165680128
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 241613404
Tasks:
- Image Classification
ID: gluon_resnet152_v1c
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L121
In Collection: Gloun ResNet
- Name: gluon_resnet152_v1b
Metadata:
FLOPs: 14857660416
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 241534001
Tasks:
- Image Classification
ID: gluon_resnet152_v1b
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L97
In Collection: Gloun ResNet
- Name: gluon_resnet101_v1d
Metadata:
FLOPs: 10377018880
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 178802755
Tasks:
- Image Classification
ID: gluon_resnet101_v1d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L138
In Collection: Gloun ResNet
- Name: gluon_resnet152_v1d
Metadata:
FLOPs: 15166131712
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 241613584
Tasks:
- Image Classification
ID: gluon_resnet152_v1d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L147
In Collection: Gloun ResNet
- Name: gluon_resnet152_v1s
Metadata:
FLOPs: 16594624512
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 242032606
Tasks:
- Image Classification
ID: gluon_resnet152_v1s
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L175
In Collection: Gloun ResNet
- Name: gluon_resnet50_v1b
Metadata:
FLOPs: 5282531328
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 102493763
Tasks:
- Image Classification
ID: gluon_resnet50_v1b
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L81
In Collection: Gloun ResNet
- Name: gluon_resnet18_v1b
Metadata:
FLOPs: 2337073152
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 46816736
Tasks:
- Image Classification
ID: gluon_resnet18_v1b
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L65
In Collection: Gloun ResNet
- Name: gluon_resnet34_v1b
Metadata:
FLOPs: 4718469120
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 87295112
Tasks:
- Image Classification
ID: gluon_resnet34_v1b
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L73
In Collection: Gloun ResNet
- Name: gluon_resnet50_v1c
Metadata:
FLOPs: 5590551040
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 102573166
Tasks:
- Image Classification
ID: gluon_resnet50_v1c
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L105
In Collection: Gloun ResNet
- Name: gluon_resnet50_v1d
Metadata:
FLOPs: 5591002624
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 102573346
Tasks:
- Image Classification
ID: gluon_resnet50_v1d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L129
In Collection: Gloun ResNet
- Name: gluon_resnet50_v1s
Metadata:
FLOPs: 7019495424
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
File Size: 102992368
Tasks:
- Image Classification
ID: gluon_resnet50_v1s
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L156
In Collection: Gloun ResNet
Collections:
- Name: Gloun ResNet
Paper:
title: Deep Residual Learning for Image Recognition
url: https://paperswithcode.com/paper/deep-residual-learning-for-image-recognition
type: model-index
Type: model-index
-->

@ -0,0 +1,180 @@
# Summary
A **ResNeXt** repeats a [building block](https://paperswithcode.com/method/resnext-block) that aggregates a set of transformations with the same topology. Compared to a [ResNet](https://paperswithcode.com/method/resnet), it exposes a new dimension, *cardinality* (the size of the set of transformations) $C$, as an essential factor in addition to the dimensions of depth and width.
The weights from this model were ported from Gluon.
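To make cardinality more concrete, the sketch below counts convolution weights in a single bottleneck block with and without grouped convolutions. In a name such as `gluon_resnext50_32x4d`, `32` is the cardinality and `4d` the bottleneck width per group. The channel widths used here are assumed from the first stage described in the ResNeXt paper, and biases and batch-norm parameters are ignored; the helper `bottleneck_params` is hypothetical.

```python
def bottleneck_params(in_ch, mid_ch, groups=1, kernel=3):
    """Count conv weights in a 1x1 -> kxk (grouped) -> 1x1 bottleneck.

    Biases and batch-norm parameters are ignored (illustrative helper).
    """
    reduce = in_ch * mid_ch                                  # 1x1 reduce
    spatial = kernel * kernel * (mid_ch // groups) * mid_ch  # kxk, grouped
    expand = mid_ch * in_ch                                  # 1x1 expand
    return reduce + spatial + expand


# Channel widths assumed from the first stage described in the ResNeXt paper.
resnet_block = bottleneck_params(in_ch=256, mid_ch=64, groups=1)
resnext_block = bottleneck_params(in_ch=256, mid_ch=128, groups=32)  # 32x4d
print(resnet_block, resnext_block)
```

Both blocks land at roughly 70k weights, which illustrates how cardinality can be increased while keeping the parameter budget close to that of the corresponding ResNet block.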
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('gluon_resnext50_32x4d', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print the top-5 categories for the image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line, for example
# (exact values depend on the model):
# Samoyed 0.6425196528434753
# Pomeranian 0.04062102362513542
# keeshond 0.03186424449086189
# white wolf 0.01739676296710968
# Eskimo dog 0.011717947199940681
```
Replace the model name with the variant you want to use, e.g. `gluon_resnext50_32x4d`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
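# reset_classifier() modifies the model in place and returns None,
# so pass num_classes to create_model instead of chaining the call
model = timm.create_model('gluon_resnext50_32x4d', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)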
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/XieGDTH16,
author = {Saining Xie and
Ross B. Girshick and
Piotr Doll{\'{a}}r and
Zhuowen Tu and
Kaiming He},
title = {Aggregated Residual Transformations for Deep Neural Networks},
journal = {CoRR},
volume = {abs/1611.05431},
year = {2016},
url = {http://arxiv.org/abs/1611.05431},
archivePrefix = {arXiv},
eprint = {1611.05431},
timestamp = {Mon, 13 Aug 2018 16:45:58 +0200},
biburl = {https://dblp.org/rec/journals/corr/XieGDTH16.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: gluon_resnext50_32x4d
Metadata:
FLOPs: 5472648192
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 100441719
Tasks:
- Image Classification
ID: gluon_resnext50_32x4d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L185
In Collection: Gloun ResNeXt
- Name: gluon_resnext101_32x4d
Metadata:
FLOPs: 10298145792
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 177367414
Tasks:
- Image Classification
ID: gluon_resnext101_32x4d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L193
In Collection: Gloun ResNeXt
- Name: gluon_resnext101_64x4d
Metadata:
FLOPs: 19954172928
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 334737852
Tasks:
- Image Classification
ID: gluon_resnext101_64x4d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L201
In Collection: Gloun ResNeXt
Collections:
- Name: Gloun ResNeXt
Paper:
title: Aggregated Residual Transformations for Deep Neural Networks
url: https://paperswithcode.com/paper/aggregated-residual-transformations-for-deep
type: model-index
Type: model-index
-->

@ -0,0 +1,117 @@
# Summary
A **SENet** is a convolutional neural network architecture that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration.
The weights from this model were ported from Gluon.
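Channel-wise feature recalibration can be sketched in plain Python as three steps: squeeze (global average pooling per channel), excitation (a small bottleneck ending in a sigmoid), and scale (multiply each channel by its gate). The weights below are made-up values for illustration, and `se_block` is a hypothetical helper rather than the timm implementation.

```python
import math


def se_block(feature_maps, w_reduce, w_expand):
    """Apply squeeze-and-excitation to a list of channels.

    feature_maps: list of C channels, each a flat list of activations.
    w_reduce:     C_reduced x C weight matrix (hypothetical values).
    w_expand:     C x C_reduced weight matrix (hypothetical values).
    """
    # Squeeze: global average pooling gives one number per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_maps]
    # Excitation: bottleneck with ReLU, then expansion with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_expand]
    # Scale: each channel is multiplied by its gate, a value in (0, 1).
    scaled = [[g * v for v in ch] for g, ch in zip(gates, feature_maps)]
    return scaled, gates


features = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]  # 2 channels, 4 positions
w_reduce = [[0.5, -0.25]]                                 # 2 -> 1 channels
w_expand = [[1.0], [-1.0]]                                # 1 -> 2 channels
scaled, gates = se_block(features, w_reduce, w_expand)
print(gates)
```

Because the gates depend on the input's own channel statistics, the recalibration is dynamic: different images emphasise different channels.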
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('gluon_senet154', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print the top-5 categories for the image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line, for example
# (exact values depend on the model):
# Samoyed 0.6425196528434753
# Pomeranian 0.04062102362513542
# keeshond 0.03186424449086189
# white wolf 0.01739676296710968
# Eskimo dog 0.011717947199940681
```
Replace the model name with the variant you want to use, e.g. `gluon_senet154`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
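# reset_classifier() modifies the model in place and returns None,
# so pass num_classes to create_model instead of chaining the call
model = timm.create_model('gluon_senet154', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)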
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{hu2019squeezeandexcitation,
title={Squeeze-and-Excitation Networks},
author={Jie Hu and Li Shen and Samuel Albanie and Gang Sun and Enhua Wu},
year={2019},
eprint={1709.01507},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: gluon_senet154
Metadata:
FLOPs: 26681705136
Training Data:
- ImageNet
Architecture:
- Convolution
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
- Squeeze-and-Excitation Block
File Size: 461546622
Tasks:
- Image Classification
ID: gluon_senet154
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L239
In Collection: Gloun SENet
Collections:
- Name: Gloun SENet
Paper:
title: Squeeze-and-Excitation Networks
url: https://paperswithcode.com/paper/squeeze-and-excitation-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,174 @@
# Summary
**SE ResNeXt** is a variant of a [ResNeXt](https://www.paperswithcode.com/method/resnext) that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration.
The weights from this model were ported from Gluon.
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('gluon_seresnext50_32x4d', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print the top-5 categories for the image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line, for example
# (exact values depend on the model):
# Samoyed 0.6425196528434753
# Pomeranian 0.04062102362513542
# keeshond 0.03186424449086189
# white wolf 0.01739676296710968
# Eskimo dog 0.011717947199940681
```
Replace the model name with the variant you want to use, e.g. `gluon_seresnext50_32x4d`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
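# reset_classifier() modifies the model in place and returns None,
# so pass num_classes to create_model instead of chaining the call
model = timm.create_model('gluon_seresnext50_32x4d', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)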
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{hu2019squeezeandexcitation,
title={Squeeze-and-Excitation Networks},
author={Jie Hu and Li Shen and Samuel Albanie and Gang Sun and Enhua Wu},
year={2019},
eprint={1709.01507},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: gluon_seresnext50_32x4d
Metadata:
FLOPs: 5475179184
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 110578827
Tasks:
- Image Classification
ID: gluon_seresnext50_32x4d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L209
In Collection: Gloun SEResNeXt
- Name: gluon_seresnext101_32x4d
Metadata:
FLOPs: 10302923504
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 196505510
Tasks:
- Image Classification
ID: gluon_seresnext101_32x4d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L219
In Collection: Gloun SEResNeXt
- Name: gluon_seresnext101_64x4d
Metadata:
FLOPs: 19958950640
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 353875948
Tasks:
- Image Classification
ID: gluon_seresnext101_64x4d
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_resnet.py#L229
In Collection: Gloun SEResNeXt
Collections:
- Name: Gloun SEResNeXt
Paper:
title: Squeeze-and-Excitation Networks
url: https://paperswithcode.com/paper/squeeze-and-excitation-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,118 @@
# Summary
**Xception** is a convolutional neural network architecture that relies solely on [depthwise separable convolution](https://paperswithcode.com/method/depthwise-separable-convolution) layers. The weights from this model were ported from Gluon.
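A depthwise separable convolution splits a standard convolution into a per-channel spatial filter followed by a 1x1 channel-mixing convolution. The sketch below compares weight counts for the two; the channel sizes are illustrative assumptions and the helper names are hypothetical. Biases are ignored.

```python
def standard_conv_params(in_ch, out_ch, kernel=3):
    # Every output channel has a kxk filter over every input channel.
    return kernel * kernel * in_ch * out_ch


def separable_conv_params(in_ch, out_ch, kernel=3):
    depthwise = kernel * kernel * in_ch  # one kxk filter per input channel
    pointwise = in_ch * out_ch           # 1x1 convolution mixes channels
    return depthwise + pointwise


in_ch, out_ch = 128, 256  # illustrative sizes, not taken from the model
standard = standard_conv_params(in_ch, out_ch)
separable = separable_conv_params(in_ch, out_ch)
print(standard, separable, round(standard / separable, 1))
```

For 3x3 kernels the separable form needs close to nine times fewer weights once the number of output channels is large.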
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('gluon_xception65', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print the top-5 categories for the image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line, for example
# (exact values depend on the model):
# Samoyed 0.6425196528434753
# Pomeranian 0.04062102362513542
# keeshond 0.03186424449086189
# white wolf 0.01739676296710968
# Eskimo dog 0.011717947199940681
```
Replace the model name with the variant you want to use, e.g. `gluon_xception65`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
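# reset_classifier() modifies the model in place and returns None,
# so pass num_classes to create_model instead of chaining the call
model = timm.create_model('gluon_xception65', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)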
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{chollet2017xception,
title={Xception: Deep Learning with Depthwise Separable Convolutions},
author={François Chollet},
year={2017},
eprint={1610.02357},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: gluon_xception65
Metadata:
FLOPs: 17594889728
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Convolution
- Dense Connections
- Depthwise Separable Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Connection
- Softmax
File Size: 160551306
Tasks:
- Image Classification
ID: gluon_xception65
Crop Pct: '0.903'
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/gluon_xception.py#L241
In Collection: Gluon Xception
Collections:
- Name: Gluon Xception
Paper:
title: 'Xception: Deep Learning with Depthwise Separable Convolutions'
url: https://paperswithcode.com/paper/xception-deep-learning-with-depthwise
type: model-index
Type: model-index
-->

@ -0,0 +1,364 @@
# Summary
**HRNet**, or **High-Resolution Net**, is a general purpose convolutional neural network for tasks like semantic segmentation, object detection and image classification. It maintains high-resolution representations throughout the network. It starts from a high-resolution convolution stream, gradually adds high-to-low resolution convolution streams one by one, and connects the multi-resolution streams in parallel. The resulting network consists of several stages ($4$ in the paper), and the $n$th stage contains $n$ streams corresponding to $n$ resolutions. The authors perform repeated multi-resolution fusions by exchanging information across the parallel streams.
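The stage and stream layout described above can be tabulated in a few lines of Python. This is a simplified sketch, not the timm implementation. It assumes that the first stream runs at 1/4 of the input resolution and that each additional stream halves the resolution and doubles the channel width. The base width of 18 is only an example value, and `hrnet_streams` is a hypothetical helper.

```python
def hrnet_streams(input_size, base_width, num_stages=4):
    """List (resolution, width) for every stream in every stage.

    Assumptions (a simplification): the first stream runs at 1/4 of the
    input resolution, and each additional stream halves the resolution
    and doubles the channel width.
    """
    stages = []
    for n in range(1, num_stages + 1):
        # Stage n keeps all earlier streams and adds one lower-resolution stream.
        streams = [(input_size // (4 * 2 ** i), base_width * 2 ** i) for i in range(n)]
        stages.append(streams)
    return stages

stages = hrnet_streams(224, 18)
for n, streams in enumerate(stages, start=1):
    print(f"stage {n}: {streams}")
```

Under these assumptions the last stage holds four parallel streams, from 56x56 at width 18 down to 7x7 at width 144.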
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('hrnet_w18_small', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
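The resolved `config` carries preprocessing settings such as the input size, interpolation mode and crop percentage listed in the model summaries. As a rough sketch of how a crop percentage relates to the resize step, the snippet below assumes the common evaluation convention of resizing the shorter side to `image_size / crop_pct` (rounded down) and then center-cropping to `image_size`. `resize_size` is a hypothetical helper, not a timm function.

```python
import math

def resize_size(image_size, crop_pct):
    """Shorter-side resize target before a center crop to image_size (assumed convention)."""
    return int(math.floor(image_size / crop_pct))

print(resize_size(224, 0.875))  # 256
print(resize_size(299, 0.875))  # 341
```

In other words, a crop percentage of 0.875 at 224 pixels corresponds to the familiar "resize to 256, center-crop to 224" recipe.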
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `hrnet_w18_small`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('hrnet_w18_small', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{sun2019highresolution,
title={High-Resolution Representations for Labeling Pixels and Regions},
author={Ke Sun and Yang Zhao and Borui Jiang and Tianheng Cheng and Bin Xiao and Dong Liu and Yadong Mu and Xinggang Wang and Wenyu Liu and Jingdong Wang},
year={2019},
eprint={1904.04514},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: hrnet_w18_small
Metadata:
FLOPs: 2071651488
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 52934302
Tasks:
- Image Classification
Training Time: ''
ID: hrnet_w18_small
Layers: 18
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L790
Config: ''
In Collection: HRNet
- Name: hrnet_w18_small_v2
Metadata:
FLOPs: 3360023160
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 62682879
Tasks:
- Image Classification
Training Time: ''
ID: hrnet_w18_small_v2
Layers: 18
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L795
Config: ''
In Collection: HRNet
- Name: hrnet_w32
Metadata:
FLOPs: 11524528320
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 165547812
Tasks:
- Image Classification
Training Time: 60 hours
ID: hrnet_w32
Layers: 32
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L810
Config: ''
In Collection: HRNet
- Name: hrnet_w40
Metadata:
FLOPs: 16381182192
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 230899236
Tasks:
- Image Classification
Training Time: ''
ID: hrnet_w40
Layers: 40
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L815
Config: ''
In Collection: HRNet
- Name: hrnet_w44
Metadata:
FLOPs: 19202520264
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 268957432
Tasks:
- Image Classification
Training Time: ''
ID: hrnet_w44
Layers: 44
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L820
Config: ''
In Collection: HRNet
- Name: hrnet_w48
Metadata:
FLOPs: 22285865760
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 310603710
Tasks:
- Image Classification
Training Time: 80 hours
ID: hrnet_w48
Layers: 48
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L825
Config: ''
In Collection: HRNet
- Name: hrnet_w18
Metadata:
FLOPs: 5547205500
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 85718883
Tasks:
- Image Classification
Training Time: ''
ID: hrnet_w18
Layers: 18
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L800
Config: ''
In Collection: HRNet
- Name: hrnet_w64
Metadata:
FLOPs: 37239321984
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 513071818
Tasks:
- Image Classification
Training Time: ''
ID: hrnet_w64
Layers: 64
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L830
Config: ''
In Collection: HRNet
- Name: hrnet_w30
Metadata:
FLOPs: 10474119492
Epochs: 100
Batch Size: 256
Training Data:
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 4x NVIDIA V100 GPUs
Architecture:
- Batch Normalization
- Convolution
- ReLU
- Residual Connection
File Size: 151452218
Tasks:
- Image Classification
Training Time: ''
ID: hrnet_w30
Layers: 30
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/hrnet.py#L805
Config: ''
In Collection: HRNet
Collections:
- Name: HRNet
Paper:
title: Deep High-Resolution Representation Learning for Visual Recognition
url: https://paperswithcode.com/paper/190807919
type: model-index
Type: model-index
-->

@ -0,0 +1,239 @@
# Summary
A **ResNeXt** repeats a [building block](https://paperswithcode.com/method/resnext-block) that aggregates a set of transformations with the same topology. Compared to a [ResNet](https://paperswithcode.com/method/resnet), it exposes a new dimension, *cardinality* (the size of the set of transformations) $C$, as an essential factor in addition to the dimensions of depth and width.
This model was trained on billions of Instagram images, using thousands of distinct hashtags as labels, and exhibits excellent transfer learning performance.
Please note the CC-BY-NC 4.0 license on these weights: non-commercial use only.
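To make cardinality concrete, the weight count of a grouped convolution can be computed directly: splitting a convolution into $C$ groups divides its weights by $C$, which lets a ResNeXt block use a wider 3x3 layer at a similar cost. The sketch below compares a ResNet-style bottleneck with a 32x4d ResNeXt-style bottleneck, using the channel sizes from the ResNeXt paper's template. `conv_params` and `bottleneck_params` are hypothetical helpers, and biases and normalization layers are ignored.

```python
def conv_params(in_ch, out_ch, kernel, groups=1):
    """Number of weights in a 2D convolution (no bias)."""
    assert in_ch % groups == 0 and out_ch % groups == 0
    return (in_ch // groups) * out_ch * kernel * kernel

def bottleneck_params(in_ch, mid_ch, groups=1):
    """Weights in a 1x1 -> 3x3 (optionally grouped) -> 1x1 bottleneck."""
    return (
        conv_params(in_ch, mid_ch, 1)
        + conv_params(mid_ch, mid_ch, 3, groups)
        + conv_params(mid_ch, in_ch, 1)
    )

resnet_block = bottleneck_params(256, 64)               # one group, width 64
resnext_block = bottleneck_params(256, 128, groups=32)  # cardinality 32, width 4 per group

print(resnet_block)   # 69632
print(resnext_block)  # 70144
```

Both blocks land at roughly 70k weights, even though the grouped block's 3x3 layer is twice as wide.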
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('ig_resnext101_32x32d', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `ig_resnext101_32x32d`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('ig_resnext101_32x32d', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{mahajan2018exploring,
title={Exploring the Limits of Weakly Supervised Pretraining},
author={Dhruv Mahajan and Ross Girshick and Vignesh Ramanathan and Kaiming He and Manohar Paluri and Yixuan Li and Ashwin Bharambe and Laurens van der Maaten},
year={2018},
eprint={1805.00932},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: ig_resnext101_32x32d
Metadata:
FLOPs: 112225170432
Epochs: 100
Batch Size: 8064
Training Data:
- IG-3.5B-17k
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 336x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 1876573776
Tasks:
- Image Classification
ID: ig_resnext101_32x32d
Layers: 101
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Minibatch Size: 8064
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L885
In Collection: IG ResNeXt
- Name: ig_resnext101_32x16d
Metadata:
FLOPs: 46623691776
Epochs: 100
Batch Size: 8064
Training Data:
- IG-3.5B-17k
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 336x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 777518664
Tasks:
- Image Classification
ID: ig_resnext101_32x16d
Layers: 101
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L874
In Collection: IG ResNeXt
- Name: ig_resnext101_32x48d
Metadata:
FLOPs: 197446554624
Epochs: 100
Batch Size: 8064
Training Data:
- IG-3.5B-17k
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 336x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 3317136976
Tasks:
- Image Classification
ID: ig_resnext101_32x48d
Layers: 101
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L896
In Collection: IG ResNeXt
- Name: ig_resnext101_32x8d
Metadata:
FLOPs: 21180417024
Epochs: 100
Batch Size: 8064
Training Data:
- IG-3.5B-17k
- ImageNet
Training Techniques:
- Nesterov Accelerated Gradient
- Weight Decay
Training Resources: 336x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
File Size: 356056638
Tasks:
- Image Classification
ID: ig_resnext101_32x8d
Layers: 101
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 0.001
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/resnet.py#L863
In Collection: IG ResNeXt
Collections:
- Name: IG ResNeXt
Paper:
title: Exploring the Limits of Weakly Supervised Pretraining
url: https://paperswithcode.com/paper/exploring-the-limits-of-weakly-supervised
type: model-index
Type: model-index
-->

@ -0,0 +1,126 @@
# Summary
**Inception-ResNet-v2** is a convolutional neural architecture that builds on the Inception family of architectures but incorporates [residual connections](https://paperswithcode.com/method/residual-connection) (replacing the filter concatenation stage of the Inception architecture).
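As a minimal sketch of what a residual connection does here, the block's branch output is brought back to the shape of the block input and added to it element-wise. The scale factor is an assumption in this example: the Inception-v4 paper reports that scaling residuals by roughly 0.1 to 0.3 before the addition helped stabilise training. `residual_add` is a hypothetical helper that operates on flat lists rather than real feature maps.

```python
def residual_add(x, branch_out, scale=0.2):
    """Add a scaled branch output back onto the block input, element-wise."""
    if len(x) != len(branch_out):
        raise ValueError("residual addition needs matching shapes")
    return [xi + scale * bi for xi, bi in zip(x, branch_out)]

# Made-up activations, for illustration only.
block_input = [1.0, 2.0, 3.0]
branch_output = [0.5, -1.0, 2.0]

result = residual_add(block_input, branch_output, scale=0.2)
print([round(v, 3) for v in result])  # [1.1, 1.8, 3.4]
```

Because the input passes through unchanged, the block only has to learn a correction to it, which is the usual motivation for residual connections.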
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('inception_resnet_v2', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `inception_resnet_v2`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('inception_resnet_v2', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{szegedy2016inceptionv4,
title={Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning},
author={Christian Szegedy and Sergey Ioffe and Vincent Vanhoucke and Alex Alemi},
year={2016},
eprint={1602.07261},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: inception_resnet_v2
Metadata:
FLOPs: 16959133120
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- RMSProp
- Weight Decay
Training Resources: 20x NVIDIA Kepler GPUs
Architecture:
- Average Pooling
- Dropout
- Inception-ResNet-v2 Reduction-B
- Inception-ResNet-v2-A
- Inception-ResNet-v2-B
- Inception-ResNet-v2-C
- Reduction-A
- Softmax
File Size: 223774238
Tasks:
- Image Classification
ID: inception_resnet_v2
LR: 0.045
Dropout: 0.2
Crop Pct: '0.897'
Momentum: 0.9
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/inception_resnet_v2.py#L343
In Collection: Inception ResNet v2
Collections:
- Name: Inception ResNet v2
Paper:
title: Inception-v4, Inception-ResNet and the Impact of Residual Connections on
Learning
url: https://paperswithcode.com/paper/inception-v4-inception-resnet-and-the-impact
type: model-index
Type: model-index
-->

@ -0,0 +1,139 @@
# Summary
**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements, including [Label Smoothing](https://paperswithcode.com/method/label-smoothing), factorized 7 x 7 convolutions, and an [auxiliary classifier](https://paperswithcode.com/method/auxiliary-classifier) that propagates label information lower down the network (along with batch normalization for layers in the side head). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
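Label smoothing is easy to illustrate in isolation: instead of a one-hot target, the true class gets probability $1 - \epsilon$ plus its share of a uniform distribution, and every other class gets $\epsilon / K$. The sketch below follows that formula with $\epsilon = 0.1$. `smooth_labels` is a hypothetical helper, and the 5-class setup is made up for readability.

```python
def smooth_labels(target_index, num_classes, epsilon=0.1):
    """Return a smoothed one-hot distribution over num_classes."""
    off_value = epsilon / num_classes
    dist = [off_value] * num_classes
    dist[target_index] += 1.0 - epsilon  # true class keeps most of the mass
    return dist

dist = smooth_labels(target_index=2, num_classes=5, epsilon=0.1)
print([round(p, 3) for p in dist])  # [0.02, 0.02, 0.92, 0.02, 0.02]
```

Training against this softened target discourages the network from pushing the correct-class logit arbitrarily far above the others.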
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('inception_v3', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `inception_v3`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('inception_v3', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/SzegedyVISW15,
author = {Christian Szegedy and
Vincent Vanhoucke and
Sergey Ioffe and
Jonathon Shlens and
Zbigniew Wojna},
title = {Rethinking the Inception Architecture for Computer Vision},
journal = {CoRR},
volume = {abs/1512.00567},
year = {2015},
url = {http://arxiv.org/abs/1512.00567},
archivePrefix = {arXiv},
eprint = {1512.00567},
timestamp = {Mon, 13 Aug 2018 16:49:07 +0200},
biburl = {https://dblp.org/rec/journals/corr/SzegedyVISW15.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: inception_v3
Metadata:
FLOPs: 7352418880
Training Data:
- ImageNet
Training Techniques:
- Gradient Clipping
- Label Smoothing
- RMSProp
- Weight Decay
Training Resources: 50x NVIDIA Kepler GPUs
Architecture:
- 1x1 Convolution
- Auxiliary Classifier
- Average Pooling
- Batch Normalization
- Convolution
- Dense Connections
- Dropout
- Inception-v3 Module
- Max Pooling
- ReLU
- Softmax
File Size: 108857766
Tasks:
- Image Classification
ID: inception_v3
LR: 0.045
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/inception_v3.py#L442
In Collection: Inception v3
Collections:
- Name: Inception v3
Paper:
title: Rethinking the Inception Architecture for Computer Vision
url: https://paperswithcode.com/paper/rethinking-the-inception-architecture-for
type: model-index
Type: model-index
-->

@ -0,0 +1,125 @@
# Summary
**Inception-v4** is a convolutional neural network architecture that builds on previous iterations of the Inception family by simplifying the architecture and using more inception modules than [Inception-v3](https://paperswithcode.com/method/inception-v3).
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('inception_v4', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `inception_v4`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('inception_v4', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{szegedy2016inceptionv4,
title={Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning},
author={Christian Szegedy and Sergey Ioffe and Vincent Vanhoucke and Alex Alemi},
year={2016},
eprint={1602.07261},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: inception_v4
Metadata:
FLOPs: 15806527936
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- RMSProp
- Weight Decay
Training Resources: 20x NVIDIA Kepler GPUs
Architecture:
- Average Pooling
- Dropout
- Inception-A
- Inception-B
- Inception-C
- Reduction-A
- Reduction-B
- Softmax
File Size: 171082495
Tasks:
- Image Classification
ID: inception_v4
LR: 0.045
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '299'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/inception_v4.py#L313
In Collection: Inception v4
Collections:
- Name: Inception v4
Paper:
title: Inception-v4, Inception-ResNet and the Impact of Residual Connections on
Learning
url: https://paperswithcode.com/paper/inception-v4-inception-resnet-and-the-impact
type: model-index
Type: model-index
-->

@ -0,0 +1,279 @@
# Summary
**SE ResNet** is a variant of a [ResNet](https://www.paperswithcode.com/method/resnet) that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration.
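The channel-wise recalibration can be sketched without any deep learning library. The example below squeezes each channel to a single number by global average pooling, passes the result through two small fully connected layers (ReLU, then sigmoid) to get one gate per channel, and scales each channel by its gate. The feature map and weights are made up for illustration, biases are omitted, and `squeeze_excite` is a hypothetical helper rather than timm's implementation.

```python
import math

def squeeze_excite(feature_map, w1, w2):
    """Apply squeeze-and-excitation to a [C][H][W] nested list.

    w1 has shape [R][C] (reduction) and w2 has shape [C][R] (expansion).
    """
    # Squeeze: global average pool each channel to one number.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]
    # Excitation: FC -> ReLU -> FC -> sigmoid gives one gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in w2]
    # Recalibrate: scale every value in a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch] for ch, g in zip(feature_map, gates)]
    return scaled, gates

# Two 2x2 channels with made-up weights (reduction to one hidden unit).
fmap = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[0.5, 0.5]]
w2 = [[1.0], [-1.0]]

scaled, gates = squeeze_excite(fmap, w1, w2)
print([round(g, 3) for g in gates])  # [0.818, 0.182]
```

The gates depend on the input itself, which is what makes the recalibration dynamic: a different image produces different per-channel weights.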
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('legacy_seresnet101', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the top-5 predictions class names:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
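If it helps to see what the softmax and `torch.topk` steps above compute, the following pure-Python sketch reproduces both on a toy list of logits. The helper names `softmax` and `top_k` and the toy values are made up for illustration; in practice you would keep using the PyTorch calls.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, k):
    """Return (index, probability) pairs for the k largest probabilities."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(i, probs[i]) for i in order]

toy_logits = [2.0, 0.5, -1.0, 3.0]  # hypothetical scores for four classes
toy_probs = softmax(toy_logits)
print(top_k(toy_probs, 2))
```

The indices returned by `top_k` play the role of `top5_catid`: they are positions into the list of class names.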
Replace the model name with the variant you want to use, e.g. `legacy_seresnet101`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('legacy_seresnet101', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{hu2019squeezeandexcitation,
title={Squeeze-and-Excitation Networks},
author={Jie Hu and Li Shen and Samuel Albanie and Gang Sun and Enhua Wu},
year={2019},
eprint={1709.01507},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: legacy_seresnet101
Metadata:
FLOPs: 9762614000
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 197822624
Tasks:
- Image Classification
ID: legacy_seresnet101
LR: 0.6
Layers: 101
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L426
In Collection: Legacy SE ResNet
- Name: legacy_seresnet152
Metadata:
FLOPs: 14553578160
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 268033864
Tasks:
- Image Classification
ID: legacy_seresnet152
LR: 0.6
Layers: 152
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L433
In Collection: Legacy SE ResNet
- Name: legacy_seresnet18
Metadata:
FLOPs: 2328876024
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 47175663
Tasks:
- Image Classification
ID: legacy_seresnet18
LR: 0.6
Layers: 18
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L405
In Collection: Legacy SE ResNet
- Name: legacy_seresnet34
Metadata:
FLOPs: 4706201004
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 87958697
Tasks:
- Image Classification
ID: legacy_seresnet34
LR: 0.6
Layers: 34
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L412
In Collection: Legacy SE ResNet
- Name: legacy_seresnet50
Metadata:
FLOPs: 4974351024
Epochs: 100
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Bottleneck Residual Block
- Convolution
- Global Average Pooling
- Max Pooling
- ReLU
- Residual Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 112611220
Tasks:
- Image Classification
ID: legacy_seresnet50
LR: 0.6
Layers: 50
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bilinear
Minibatch Size: 1024
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L419
In Collection: Legacy SE ResNet
Collections:
- Name: Legacy SE ResNet
Paper:
title: Squeeze-and-Excitation Networks
url: https://paperswithcode.com/paper/squeeze-and-excitation-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,205 @@
# Summary
**SE ResNeXt** is a variant of a [ResNeXt](https://www.paperswithcode.com/method/resnext) that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration.
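The `32x4d` suffix in the model names on this page follows the usual ResNeXt naming, read here as 32 groups (cardinality) with a base width of 4 channels per group; that reading is an assumption. The sketch below counts the weights of a 3x3 convolution with and without grouping to show why grouped convolutions are cheap. The helper `conv_weight_count` is hypothetical and ignores biases.

```python
def conv_weight_count(in_ch, out_ch, kernel, groups=1):
    """Number of weights in a kernel x kernel convolution (no bias)."""
    if in_ch % groups or out_ch % groups:
        raise ValueError("channels must be divisible by groups")
    # Each output channel only sees in_ch // groups input channels.
    return kernel * kernel * (in_ch // groups) * out_ch

cardinality, group_width = 32, 4
channels = cardinality * group_width  # 128 channels in the grouped 3x3 conv

dense = conv_weight_count(channels, channels, 3)
grouped = conv_weight_count(channels, channels, 3, groups=cardinality)
print(dense, grouped, dense // grouped)
```

With the same number of input and output channels, the grouped layer uses a factor of `cardinality` fewer weights than the dense one.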
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('legacy_seresnext101_32x4d', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the class names of the top-5 predictions:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line; the top-5 for this image look like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `legacy_seresnext101_32x4d`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('legacy_seresnext101_32x4d', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{hu2019squeezeandexcitation,
title={Squeeze-and-Excitation Networks},
author={Jie Hu and Li Shen and Samuel Albanie and Gang Sun and Enhua Wu},
year={2019},
eprint={1709.01507},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: legacy_seresnext101_32x4d
Metadata:
FLOPs: 10287698672
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 196466866
Tasks:
- Image Classification
ID: legacy_seresnext101_32x4d
LR: 0.6
Layers: 101
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L462
In Collection: Legacy SE ResNeXt
- Name: legacy_seresnext26_32x4d
Metadata:
FLOPs: 3187342304
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 67346327
Tasks:
- Image Classification
ID: legacy_seresnext26_32x4d
LR: 0.6
Layers: 26
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L448
In Collection: Legacy SE ResNeXt
- Name: legacy_seresnext50_32x4d
Metadata:
FLOPs: 5459954352
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Global Average Pooling
- Grouped Convolution
- Max Pooling
- ReLU
- ResNeXt Block
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 110559176
Tasks:
- Image Classification
ID: legacy_seresnext50_32x4d
LR: 0.6
Layers: 50
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L455
In Collection: Legacy SE ResNeXt
Collections:
- Name: Legacy SE ResNeXt
Paper:
title: Squeeze-and-Excitation Networks
url: https://paperswithcode.com/paper/squeeze-and-excitation-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,128 @@
# Summary
A **SENet** is a convolutional neural network architecture that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration.
The weights from this model were ported from Gluon.
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('legacy_senet154', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
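The transform built above is driven by values such as the `Image Size` (224) and `Crop Pct` (0.875) listed in the model metadata on this page. As a rough sketch, assuming the common evaluation convention of resizing the image and then taking a center crop, the two numbers relate as follows. The helpers `eval_resize_size` and `center_crop_box` are hypothetical names, not timm functions.

```python
import math

def eval_resize_size(image_size, crop_pct):
    """Size to resize to before center-cropping `image_size` pixels."""
    return int(math.floor(image_size / crop_pct))

def center_crop_box(width, height, size):
    """(left, top, right, bottom) of a centered square crop."""
    left = (width - size) // 2
    top = (height - size) // 2
    return (left, top, left + size, top + size)

resize_to = eval_resize_size(224, 0.875)
print(resize_to)
print(center_crop_box(resize_to, resize_to, 224))
```

In other words, a crop percentage of 0.875 means the final 224-pixel crop covers 87.5% of the resized image's side length.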
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the class names of the top-5 predictions:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line; the top-5 for this image look like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `legacy_senet154`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('legacy_senet154', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{hu2019squeezeandexcitation,
title={Squeeze-and-Excitation Networks},
author={Jie Hu and Li Shen and Samuel Albanie and Gang Sun and Enhua Wu},
year={2019},
eprint={1709.01507},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: legacy_senet154
Metadata:
FLOPs: 26659556016
Epochs: 100
Batch Size: 1024
Training Data:
- ImageNet
Training Techniques:
- Label Smoothing
- SGD with Momentum
- Weight Decay
Training Resources: 8x NVIDIA Titan X GPUs
Architecture:
- Convolution
- Dense Connections
- Global Average Pooling
- Max Pooling
- Softmax
- Squeeze-and-Excitation Block
File Size: 461488402
Tasks:
- Image Classification
ID: legacy_senet154
LR: 0.6
Layers: 154
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bilinear
Code: https://github.com/rwightman/pytorch-image-models/blob/d8e69206be253892b2956341fea09fdebfaae4e3/timm/models/senet.py#L440
In Collection: Legacy SENet
Collections:
- Name: Legacy SENet
Paper:
title: Squeeze-and-Excitation Networks
url: https://paperswithcode.com/paper/squeeze-and-excitation-networks
type: model-index
Type: model-index
-->

@ -0,0 +1,194 @@
# Summary
**MixNet** is a type of convolutional neural network discovered via AutoML that utilises [MixConvs](https://paperswithcode.com/method/mixconv) instead of regular [depthwise convolutions](https://paperswithcode.com/method/depthwise-convolution).
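A MixConv layer differs from a regular depthwise convolution by partitioning the channels into groups and giving each group its own kernel size. The sketch below shows one way to do the split and compares weight counts. It assumes an even split with the remainder assigned to the first group, which is one common convention rather than something stated on this page, and the function names are hypothetical.

```python
def split_channels(total_channels, num_groups):
    """Split channels evenly; any remainder goes to the first group (assumed convention)."""
    split = [total_channels // num_groups] * num_groups
    split[0] += total_channels - sum(split)
    return split

def mixconv_weight_count(total_channels, kernel_sizes):
    """Weights of a depthwise layer whose channel groups use different kernel sizes."""
    groups = split_channels(total_channels, len(kernel_sizes))
    # A depthwise convolution has one k x k filter per channel.
    return sum(c * k * k for c, k in zip(groups, kernel_sizes))

print(split_channels(32, 3))
print(mixconv_weight_count(32, [3, 5, 7]))  # mixed 3x3, 5x5 and 7x7 kernels
print(mixconv_weight_count(32, [7]))        # plain 7x7 depthwise convolution
```

Mixing kernel sizes lets part of the layer see a large receptive field while costing fewer weights than applying the largest kernel to every channel.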
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('mixnet_xl', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the class names of the top-5 predictions:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line; the top-5 for this image look like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `mixnet_xl`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('mixnet_xl', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{tan2019mixconv,
title={MixConv: Mixed Depthwise Convolutional Kernels},
author={Mingxing Tan and Quoc V. Le},
year={2019},
eprint={1907.09595},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: mixnet_xl
Metadata:
FLOPs: 1195880424
Training Data:
- ImageNet
Training Techniques:
- MNAS
Architecture:
- Batch Normalization
- Dense Connections
- Dropout
- Global Average Pooling
- Grouped Convolution
- MixConv
- Squeeze-and-Excitation Block
- Swish
File Size: 48001170
Tasks:
- Image Classification
ID: mixnet_xl
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1678
In Collection: MixNet
- Name: mixnet_m
Metadata:
FLOPs: 454543374
Training Data:
- ImageNet
Training Techniques:
- MNAS
Architecture:
- Batch Normalization
- Dense Connections
- Dropout
- Global Average Pooling
- Grouped Convolution
- MixConv
- Squeeze-and-Excitation Block
- Swish
File Size: 20298347
Tasks:
- Image Classification
ID: mixnet_m
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1660
In Collection: MixNet
- Name: mixnet_s
Metadata:
FLOPs: 321264910
Training Data:
- ImageNet
Training Techniques:
- MNAS
Architecture:
- Batch Normalization
- Dense Connections
- Dropout
- Global Average Pooling
- Grouped Convolution
- MixConv
- Squeeze-and-Excitation Block
- Swish
File Size: 16727982
Tasks:
- Image Classification
ID: mixnet_s
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1651
In Collection: MixNet
- Name: mixnet_l
Metadata:
FLOPs: 738671316
Training Data:
- ImageNet
Training Techniques:
- MNAS
Architecture:
- Batch Normalization
- Dense Connections
- Dropout
- Global Average Pooling
- Grouped Convolution
- MixConv
- Squeeze-and-Excitation Block
- Swish
File Size: 29608232
Tasks:
- Image Classification
ID: mixnet_l
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L1669
In Collection: MixNet
Collections:
- Name: MixNet
Paper:
title: 'MixConv: Mixed Depthwise Convolutional Kernels'
url: https://paperswithcode.com/paper/mixnet-mixed-depthwise-convolutional-kernels
type: model-index
Type: model-index
-->

@ -0,0 +1,155 @@
# Summary
**MnasNet** is a type of convolutional neural network optimized for mobile devices that is discovered through mobile neural architecture search, which explicitly incorporates model latency into the main objective so that the search can identify a model that achieves a good trade-off between accuracy and latency. The main building block is an [inverted residual block](https://paperswithcode.com/method/inverted-residual-block) (from [MobileNetV2](https://paperswithcode.com/method/mobilenetv2)).
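One way to picture how latency enters the search objective is a weighted-product reward, where accuracy is multiplied by a penalty that grows as the measured latency exceeds a target. The sketch below assumes that form; the exponent `w = -0.07` and all numbers in the usage lines are illustrative assumptions, not values taken from this page.

```python
def mnas_reward(accuracy, latency_ms, target_ms, w=-0.07):
    """Accuracy scaled by a soft latency penalty (assumed weighted-product form)."""
    return accuracy * (latency_ms / target_ms) ** w

# Hypothetical candidates with equal accuracy but different latencies.
on_target = mnas_reward(0.75, latency_ms=80.0, target_ms=80.0)
too_slow = mnas_reward(0.75, latency_ms=160.0, target_ms=80.0)
faster = mnas_reward(0.75, latency_ms=40.0, target_ms=80.0)
print(on_target, too_slow, faster)
```

Because the penalty is smooth rather than a hard cutoff, the search can still consider a slightly slower model if its accuracy gain is large enough to compensate.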
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('semnasnet_100', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the class names of the top-5 predictions:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line; the top-5 for this image look like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `semnasnet_100`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('semnasnet_100', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@misc{tan2019mnasnet,
title={MnasNet: Platform-Aware Neural Architecture Search for Mobile},
author={Mingxing Tan and Bo Chen and Ruoming Pang and Vijay Vasudevan and Mark Sandler and Andrew Howard and Quoc V. Le},
year={2019},
eprint={1807.11626},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
<!--
Models:
- Name: semnasnet_100
Metadata:
FLOPs: 414570766
Training Data:
- ImageNet
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- Global Average Pooling
- Inverted Residual Block
- Max Pooling
- ReLU
- Residual Connection
- Softmax
- Squeeze-and-Excitation Block
File Size: 15731489
Tasks:
- Image Classification
ID: semnasnet_100
Crop Pct: '0.875'
Image Size: '224'
Interpolation: bicubic
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L928
In Collection: MNASNet
- Name: mnasnet_100
Metadata:
FLOPs: 416415488
Batch Size: 4000
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- Global Average Pooling
- Inverted Residual Block
- Max Pooling
- ReLU
- Residual Connection
- Softmax
File Size: 17731774
Tasks:
- Image Classification
ID: mnasnet_100
Layers: 100
Dropout: 0.2
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Interpolation: bicubic
RMSProp Decay: 0.9
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L894
In Collection: MNASNet
Collections:
- Name: MNASNet
Paper:
title: 'MnasNet: Platform-Aware Neural Architecture Search for Mobile'
url: https://paperswithcode.com/paper/mnasnet-platform-aware-neural-architecture
type: model-index
Type: model-index
-->

@ -0,0 +1,240 @@
# Summary
**MobileNetV2** is a convolutional neural network architecture that seeks to perform well on mobile devices. It is based on an [inverted residual structure](https://paperswithcode.com/method/inverted-residual-block) where the residual connections are between the bottleneck layers. The intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity. As a whole, the architecture of MobileNetV2 contains the initial fully convolutional layer with 32 filters, followed by 19 residual bottleneck layers.
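The inverted residual structure can be sketched as three steps: a 1x1 convolution that expands the channels, a depthwise convolution that filters them, and a 1x1 linear projection back down to a narrow bottleneck. The following pure-Python sketch counts the weights of one such block. The expansion factor of 6 and the helper names are assumptions for illustration; batch-norm parameters and biases are ignored.

```python
def inverted_residual_weight_count(in_ch, out_ch, expansion=6, kernel=3):
    """Weights in one inverted residual block (no batch norm, no bias)."""
    hidden = in_ch * expansion
    expand = in_ch * hidden               # 1x1 pointwise expansion
    depthwise = hidden * kernel * kernel  # one k x k filter per hidden channel
    project = hidden * out_ch             # 1x1 linear projection to the bottleneck
    return expand + depthwise + project

def uses_skip_connection(in_ch, out_ch, stride):
    """The residual connection applies only when input and output shapes match."""
    return stride == 1 and in_ch == out_ch

print(inverted_residual_weight_count(24, 24))
print(uses_skip_connection(24, 24, stride=1), uses_skip_connection(24, 32, stride=2))
```

Note that the skip connection joins the narrow bottleneck tensors, not the wide expanded ones, which is the sense in which the residual is "inverted" relative to a classic bottleneck block.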
## How do I use this model on an image?
To load a pretrained model:
```python
import timm
model = timm.create_model('mobilenetv2_100', pretrained=True)
model.eval()
```
To load and preprocess the image:
```python
import urllib.request
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0) # transform and add batch dimension
```
To get the model predictions:
```python
import torch
with torch.no_grad():
    out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# prints: torch.Size([1000])
```
To get the class names of the top-5 predictions:
```python
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
# prints one class name and probability per line; the top-5 for this image look like:
# [('Samoyed', 0.6425196528434753), ('Pomeranian', 0.04062102362513542), ('keeshond', 0.03186424449086189), ('white wolf', 0.01739676296710968), ('Eskimo dog', 0.011717947199940681)]
```
Replace the model name with the variant you want to use, e.g. `mobilenetv2_100`. You can find the IDs in the model summaries at the top of this page.
To extract image features with this model, follow the [timm feature extraction examples](https://rwightman.github.io/pytorch-image-models/feature_extraction/) and change the model name to the one you want to use.
## How do I finetune this model?
You can finetune any of the pre-trained models just by changing the classifier (the last layer).
```python
model = timm.create_model('mobilenetv2_100', pretrained=True, num_classes=NUM_FINETUNE_CLASSES)
```
To finetune on your own dataset, you have to write a training loop or adapt [timm's training
script](https://github.com/rwightman/pytorch-image-models/blob/master/train.py) to use your dataset.
## How do I train this model?
You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-image-models/scripts/) for training a new model afresh.
## Citation
```BibTeX
@article{DBLP:journals/corr/abs-1801-04381,
author = {Mark Sandler and
Andrew G. Howard and
Menglong Zhu and
Andrey Zhmoginov and
Liang{-}Chieh Chen},
title = {Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification,
Detection and Segmentation},
journal = {CoRR},
volume = {abs/1801.04381},
year = {2018},
url = {http://arxiv.org/abs/1801.04381},
archivePrefix = {arXiv},
eprint = {1801.04381},
timestamp = {Tue, 12 Jan 2021 15:30:06 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-1801-04381.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
<!--
Models:
- Name: mobilenetv2_100
Metadata:
FLOPs: 401920448
Batch Size: 1536
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 16x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- Inverted Residual Block
- Max Pooling
- ReLU6
- Residual Connection
- Softmax
File Size: 14202571
Tasks:
- Image Classification
ID: mobilenetv2_100
LR: 0.045
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 4.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L955
In Collection: MobileNet V2
- Name: mobilenetv2_110d
Metadata:
FLOPs: 573958832
Batch Size: 1536
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 16x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- Inverted Residual Block
- Max Pooling
- ReLU6
- Residual Connection
- Softmax
File Size: 18316431
Tasks:
- Image Classification
ID: mobilenetv2_110d
LR: 0.045
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 4.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L969
In Collection: MobileNet V2
- Name: mobilenetv2_120d
Metadata:
FLOPs: 888510048
Batch Size: 1536
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 16x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- Inverted Residual Block
- Max Pooling
- ReLU6
- Residual Connection
- Softmax
File Size: 23651121
Tasks:
- Image Classification
ID: mobilenetv2_120d
LR: 0.045
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 4.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L977
In Collection: MobileNet V2
- Name: mobilenetv2_140
Metadata:
FLOPs: 770196784
Batch Size: 1536
Training Data:
- ImageNet
Training Techniques:
- RMSProp
- Weight Decay
Training Resources: 16x GPUs
Architecture:
- 1x1 Convolution
- Batch Normalization
- Convolution
- Depthwise Separable Convolution
- Dropout
- Inverted Residual Block
- Max Pooling
- ReLU6
- Residual Connection
- Softmax
File Size: 24673555
Tasks:
- Image Classification
ID: mobilenetv2_140
LR: 0.045
Crop Pct: '0.875'
Momentum: 0.9
Image Size: '224'
Weight Decay: 4.0e-05
Interpolation: bicubic
RMSProp Decay: 0.9
Code: https://github.com/rwightman/pytorch-image-models/blob/9a25fdf3ad0414b4d66da443fe60ae0aa14edc84/timm/models/efficientnet.py#L962
In Collection: MobileNet V2
Collections:
- Name: MobileNet V2
Paper:
title: 'MobileNetV2: Inverted Residuals and Linear Bottlenecks'
url: https://paperswithcode.com/paper/mobilenetv2-inverted-residuals-and-linear
type: model-index
Type: model-index
-->

Some files were not shown because too many files have changed in this diff.
