PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more

Ross Wightman 4663fc2132 Add support for tflite mnasnet pretrained weights and included spnasnet pretrained weights of my own. * tensorflow 'SAME' padding support added to GenMobileNet models for tflite pretrained weights * folded batch norm support (made batch norm optional and enable conv bias) for tflite pretrained weights * add url for spnasnet1_00 weights that I recently trained * fix SE reduction size for semnasnet models		6 years ago
data	Add distributed sampler that maintains order of original dataset (for validation)	6 years ago
loss	Add smooth loss	6 years ago
models	Add support for tflite mnasnet pretrained weights and included spnasnet pretrained weights of my own.	6 years ago
optim	Exclude batchnorm and bias params from weight_decay by default	6 years ago
scheduler	Update a few comment, add some references	6 years ago
README.md	Update README.md	6 years ago
clean_checkpoint.py	Add checkpoint clean script, add link to pretrained resnext50 weights	6 years ago
distributed_train.sh	Fix distributed train script	6 years ago
inference.py	Add per model crop pct, interpolation defaults, tie it all together	6 years ago
train.py	Exclude batchnorm and bias params from weight_decay by default	6 years ago
utils.py	Update a few comment, add some references	6 years ago
validate.py	Update a few comment, add some references	6 years ago

README.md

PyTorch Image Models, etc

Introduction

For each competition, personal, or freelance project involving images + Convolutional Neural Networks, I build on top of an evolving collection of code and models. This repo contains a (somewhat) cleaned-up and pared-down iteration of that code. Hopefully it'll be of use to others.

The work of many others is present here. I've tried to make sure all source material is acknowledged:

Training/validation scripts evolved from early versions of the PyTorch ImageNet Examples
CUDA-specific performance enhancements have been pulled from NVIDIA's APEX Examples
Models are from a wide variety of sources
LR scheduler ideas from AllenNLP and FAIRseq
Random Erasing from Zhun Zhong

Models

I've included a few of my favourite models, but this is not an exhaustive collection. You can't do better than Cadene's collection in that regard. Most models do have pretrained weights from their respective sources or original authors.

ResNet/ResNeXt (from torchvision with ResNeXt mods by myself)
- ResNet-18, ResNet-34, ResNet-50, ResNet-101, ResNet-152, ResNeXt50 (32x4d), ResNeXt101 (32x4d and 64x4d)
DenseNet (from torchvision)
- DenseNet-121, DenseNet-169, DenseNet-201, DenseNet-161
Squeeze-and-Excitation ResNet/ResNeXt (from Cadene with some pretrained weight additions by myself)
- SENet-154, SE-ResNet-18, SE-ResNet-34, SE-ResNet-50, SE-ResNet-101, SE-ResNet-152, SE-ResNeXt-26 (32x4d), SE-ResNeXt50 (32x4d), SE-ResNeXt101 (32x4d)
Inception-ResNet-V2 and Inception-V4 (from Cadene)
Xception (from Cadene)
PNasNet (from Cadene)
DPN (from me, weights hosted by Cadene)
- DPN-68, DPN-68b, DPN-92, DPN-98, DPN-131, DPN-107
My generic MobileNet (GenMobileNet) - A generic model that implements many of the mobile-optimized, architecture-search-derived models that share similar building blocks (DepthwiseSeparable, InvertedResidual, etc.)
- MNASNet B1, A1 (Squeeze-Excite), and Small
- MobileNet-V1
- MobileNet-V2
- ChamNet (details hard to find, currently an educated guess)
- FBNet-C (TODO A/B variants)
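
As a rough illustration of why the DepthwiseSeparable blocks named above suit mobile-oriented models, the sketch below compares weight counts for a standard convolution and a depthwise-separable one. It is plain arithmetic rather than code from this repo: the function names are hypothetical, and bias and batch norm parameters are ignored.

```python
def standard_conv_params(in_ch: int, out_ch: int, kernel: int) -> int:
    """Weights in a standard kernel x kernel convolution (no bias)."""
    return kernel * kernel * in_ch * out_ch


def depthwise_separable_params(in_ch: int, out_ch: int, kernel: int) -> int:
    """Weights in a depthwise kernel x kernel conv followed by a 1x1 pointwise conv."""
    depthwise = kernel * kernel * in_ch  # one spatial filter per input channel
    pointwise = in_ch * out_ch           # 1x1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    std = standard_conv_params(32, 64, 3)
    sep = depthwise_separable_params(32, 64, 3)
    print(f"standard: {std}, depthwise-separable: {sep}, ratio: {std / sep:.1f}x")
```

The saving grows with the number of output channels, since the spatial filtering cost no longer multiplies with it.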

Features

Several (less common) features that I often use in my projects are included. Many of these additions are the reason I maintain my own set of models instead of installing others' via pip:

All models have a common default configuration interface and API for
- accessing/changing the classifier - get_classifier and reset_classifier
- doing a forward pass on just the features - forward_features
- these make it easy to write consistent network wrappers that work with any of the models
All models have a consistent pretrained weight loader that adapts the last linear layer if necessary, and adapts from 3-channel to 1-channel input if desired
The train script works in several process/GPU modes:
- NVIDIA DDP w/ a single GPU per process, multiple processes with APEX present (AMP mixed-precision optional)
- PyTorch DistributedDataParallel w/ multi-gpu, single process (AMP disabled as it crashes when enabled)
- PyTorch w/ single GPU single process (AMP optional)
A dynamic global pool implementation that allows selecting from average pooling, max pooling, average + max, or concat([average, max]) at model creation. All global pooling is adaptive average by default and compatible with pretrained weights.
A 'Test Time Pool' wrapper that can wrap any of the included models and usually provides improved performance when doing inference with input images larger than the training size. The idea is adapted from the original DPN implementation, which I ported (https://github.com/cypw/DPNs)
Training schedules and techniques that provide competitive results (Cosine LR, Random Erasing, Smoothed Softmax, etc)
An inference script that dumps output to CSV is provided as an example
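
Two of the training techniques listed above, Cosine LR and Smoothed Softmax, can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the repo's scheduler or loss code: the function names are hypothetical, the schedule has no warmup or restarts, and the smoothed loss mixes the usual one-hot cross entropy with the cross entropy against a uniform target.

```python
import math
from typing import Sequence


def cosine_lr(epoch: int, total_epochs: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Cosine-annealed learning rate for one epoch (no warmup, no restarts)."""
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def smoothed_cross_entropy(logits: Sequence[float], target: int, smoothing: float = 0.1) -> float:
    """Cross entropy of one sample against a label-smoothed target distribution."""
    # Numerically stable log-softmax: subtract the max logit before exponentiating.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    log_probs = [x - log_sum for x in logits]
    nll = -log_probs[target]                    # loss against the one-hot target
    uniform = -sum(log_probs) / len(log_probs)  # loss against a uniform target
    return (1.0 - smoothing) * nll + smoothing * uniform


if __name__ == "__main__":
    for epoch in (0, 25, 50, 75, 100):
        print(epoch, cosine_lr(epoch, total_epochs=100, base_lr=0.1))
    print(smoothed_cross_entropy([2.0, 0.5, -1.0], target=0))
```

With `smoothing=0.0` the loss reduces to ordinary cross entropy; larger values penalize over-confident predictions more heavily.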

Custom Weights

I've leveraged the training scripts in this repository to train a few of the models with missing weights to good levels of performance. These numbers are all for 224x224 training and validation image sizing with the usual 87.5% validation crop.

Model	Prec@1 (Err)	Prec@5 (Err)	Param #	Image Scaling
ResNeXt-50 (32x4d)	78.512 (21.488)	94.042 (5.958)	25M	bicubic
SE-ResNeXt-26 (32x4d)	77.104 (22.896)	93.316 (6.684)	16.8M	bicubic
SE-ResNet-34	74.808 (25.192)	92.124 (7.876)	22M	bilinear
SE-ResNet-18	71.742 (28.258)	90.334 (9.666)	11.8M	bicubic
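
To make the numbers above concrete, the sketch below works through the arithmetic behind an 87.5% validation crop and the error columns. It assumes the common pipeline of resizing the shorter side and then taking a center crop; the helper names are hypothetical and are not taken from this repo's scripts.

```python
import math
from typing import Tuple


def resize_target(img_size: int, crop_pct: float) -> int:
    """Shorter-side resize so that a center crop of img_size covers crop_pct of it."""
    return int(math.floor(img_size / crop_pct))


def center_crop_box(width: int, height: int, crop_size: int) -> Tuple[int, int, int, int]:
    """(left, top, right, bottom) of a centered square crop."""
    left = (width - crop_size) // 2
    top = (height - crop_size) // 2
    return left, top, left + crop_size, top + crop_size


def error_from_precision(prec: float) -> float:
    """Error rate in percent, as shown in parentheses in the table."""
    return 100.0 - prec


if __name__ == "__main__":
    side = resize_target(224, 0.875)
    print("resize shorter side to:", side)
    print("crop box:", center_crop_box(side, side, 224))
    print("ResNeXt-50 top-1 error:", round(error_from_precision(78.512), 3))
```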

TODO

A number of additions are planned for various projects, including:

Find optimal training hyperparameters and create/port pretrained weights for the generic MobileNet variants
Benchmark model performance (speed + accuracy) across all models (make runnable as a script)
More training experiments
Make the folder/file layout compatible with usage as a module
Add usage examples to comments, along with good hyperparameters for training
Comments, cleanup and the usual things that get pushed back