diff --git a/README.md b/README.md
index 8b4874b3..911fd88f 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,59 @@
 # PyTorch Image Models, etc
+
+## Setup
+
+Follow the steps below (adapted from the "Usage" section) to set up the environment.
+All development and testing has been done in Conda Python 3 environments on Linux x86-64 systems, specifically Python 3.6.x and 3.7.x (3.8.x also works; I have tried it).
+
+Steps to set up the conda environment (if running on CPU only, there is no need to install cudatoolkit=10.1):
+```
+conda create -n torch-env
+conda activate torch-env
+conda install -c pytorch pytorch torchvision cudatoolkit=10.1
+conda install pyyaml
+```
+
+After the conda environment is set up, install the `timm` package:
+```
+pip install timm
+```
+
+To check that the `timm` package was installed correctly, open a Python shell and run the following:
+```
+>>> import timm
+>>> m = timm.create_model('mobilenetv3_small_100', pretrained=True)
+>>> m.eval()
+```
+
+## Run training with the ResNeXt-50 model
+
+There are a few examples of training different models in the "Training Hyperparameters" section. The example below runs ResNeXt-50 with CUDA/GPU.
+
+### ResNeXt-50 32x4d w/ RandAugment - 79.762 top-1, 94.60 top-5
+These params will also work well for SE-ResNeXt-50 and SK-ResNeXt-50, and likely the 101 variants. I used them for the SK-ResNeXt-50 32x4d that I trained with 2 GPUs, using a slightly higher LR per effective batch size (lr=0.18, b=192 per GPU). The command line below is tuned for 8-GPU training.
+
+`./distributed_train.sh 8 /imagenet --model resnext50_32x4d --lr 0.6 --warmup-epochs 5 --epochs 240 --weight-decay 1e-4 --sched cosine --reprob 0.4 --recount 3 --remode pixel --aa rand-m7-mstd0.5-inc1 -b 192 -j 6 --amp --dist-bn reduce`
+
+Here is a breakdown of the arguments from the example above (detailed explanations of all available arguments are in the argparser in train.py):
+
+- `./distributed_train.sh 8 /imagenet`: launch distributed training on 8 GPUs with the ImageNet data at /imagenet
+- `--model resnext50_32x4d`: model architecture to train
+- `--lr 0.6`: base learning rate (tuned for the 8-GPU effective batch size)
+- `--warmup-epochs 5`: number of learning rate warmup epochs
+- `--epochs 240`: total number of training epochs
+- `--weight-decay 1e-4`: weight decay
+- `--sched cosine`: learning rate schedule
+- `--reprob 0.4`, `--recount 3`, `--remode pixel`: random erasing probability, count, and mode
+- `--aa rand-m7-mstd0.5-inc1`: RandAugment policy string
+- `-b 192`: batch size per GPU
+- `-j 6`: number of data loading workers per process
+- `--amp`: use mixed precision (AMP) training
+- `--dist-bn reduce`: average BatchNorm statistics across the distributed training processes
+
+Some other arguments that may be helpful:
+
+`--log-interval`
+
+How many batches between logging training status; the default is 50.
+
+`--drop`
+
+Dropout rate; the default is 0.
+
+`--opt`
+
+Optimizer; the default is SGD ("sgd"), but other choices include Adam ("adam"), RMSProp ("rmsprop"), etc. The available optimizers are listed in the timm/optim/optim_factory.py file.
+
+----------------------------------------------------------------------------------------
 
 ## What's New
 
 ### June 11, 2020
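+
+As an optional extra check beyond the import test in the Setup section, the pretrained model can be run on a random input tensor. This is just a rough sketch; the 224x224 input size and the 1000-class output are assumptions based on the model's default ImageNet configuration:
+```
+import torch
+import timm
+
+# recreate the model from the setup check
+model = timm.create_model('mobilenetv3_small_100', pretrained=True)
+model.eval()
+
+# run a dummy ImageNet-sized batch of 1 through the model
+with torch.no_grad():
+    out = model(torch.randn(1, 3, 224, 224))
+
+print(out.shape)  # expected: torch.Size([1, 1000])
+```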
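+
+For a quick smoke test of the ResNeXt-50 recipe without 8 GPUs, note that distributed_train.sh is a small wrapper that launches train.py, so train.py can also be run directly on a single GPU. The command below is only a sketch, not a tuned recipe: the batch size is reduced and the learning rate is a guess, scaled down with the effective batch size (0.6 x 64 / (8 x 192) = 0.025) following the LR-per-effective-batch-size note in the ResNeXt-50 section:
+```
+python train.py /imagenet --model resnext50_32x4d --lr 0.025 -b 64 --epochs 240 --warmup-epochs 5 --weight-decay 1e-4 --sched cosine --reprob 0.4 --recount 3 --remode pixel --aa rand-m7-mstd0.5-inc1 -j 6
+```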