diff --git a/docs/changes.md b/docs/changes.md
index 2e63fe23..0b5c3752 100644
--- a/docs/changes.md
+++ b/docs/changes.md
@@ -1,7 +1,8 @@
 # Recent Changes
 
-### Aug 1, 2020
+### Aug 5, 2020
 Universal feature extraction, new models, new weights, new test sets.
+
 * All models support the `features_only=True` argument for `create_model` call to return a network that extracts features from the deepest layer at each stride.
 * New models
   * CSPResNet, CSPResNeXt, CSPDarkNet, DarkNet
@@ -14,6 +15,8 @@ Universal feature extraction, new models, new weights, new test sets.
   * DPN68b - 79.2 top-1
   * EfficientNet-Lite0 (non-TF ver) - 75.5 (submitted by @hal-314)
 * Add 'real' labels for ImageNet and ImageNet-Renditions test set, see [`results/README.md`](results/README.md)
+* Train script and loader/transform tweaks to pass through more augmentation arguments
+* README and documentation overhaul. See initial (WIP) documentation at https://rwightman.github.io/pytorch-image-models/
 
 ### June 11, 2020
 Bunch of changes:
diff --git a/docs/models.md b/docs/models.md
index a0268138..000d1c18 100644
--- a/docs/models.md
+++ b/docs/models.md
@@ -3,6 +3,7 @@
 The model architectures included come from a wide variety of sources. Sources, including papers, original impl ("reference code") that I rewrote / adapted, and PyTorch impl that I leveraged directly ("code") are listed below.
 
 Most included models have pretrained weights. The weights are either:
+
 1. from their original sources
 2. ported by myself from their original impl in a different framework (e.g. Tensorflow models)
 3. trained from scratch using the included training script
@@ -55,16 +56,17 @@ The validation results for the pretrained weights can be found [here](results.md
 * Reference code: https://github.com/tensorflow/models/tree/master/research/slim/nets/nasnet
 
 ## EfficientNet [[efficientnet.py](https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/efficientnet.py)]
-* Papers
-    * EfficientNet NoisyStudent (B0-B7, L2) - https://arxiv.org/abs/1911.04252
-    * EfficientNet AdvProp (B0-B8) - https://arxiv.org/abs/1911.09665
-    * EfficientNet (B0-B7) - https://arxiv.org/abs/1905.11946
-    * EfficientNet-EdgeTPU (S, M, L) - https://ai.googleblog.com/2019/08/efficientnet-edgetpu-creating.html
-    * MixNet - https://arxiv.org/abs/1907.09595
-    * MNASNet B1, A1 (Squeeze-Excite), and Small - https://arxiv.org/abs/1807.11626
-    * MobileNet-V2 - https://arxiv.org/abs/1801.04381
-    * FBNet-C - https://arxiv.org/abs/1812.03443
-    * Single-Path NAS - https://arxiv.org/abs/1904.02877
+
+* Papers:
+  * EfficientNet NoisyStudent (B0-B7, L2) - https://arxiv.org/abs/1911.04252
+  * EfficientNet AdvProp (B0-B8) - https://arxiv.org/abs/1911.09665
+  * EfficientNet (B0-B7) - https://arxiv.org/abs/1905.11946
+  * EfficientNet-EdgeTPU (S, M, L) - https://ai.googleblog.com/2019/08/efficientnet-edgetpu-creating.html
+  * MixNet - https://arxiv.org/abs/1907.09595
+  * MNASNet B1, A1 (Squeeze-Excite), and Small - https://arxiv.org/abs/1807.11626
+  * MobileNet-V2 - https://arxiv.org/abs/1801.04381
+  * FBNet-C - https://arxiv.org/abs/1812.03443
+  * Single-Path NAS - https://arxiv.org/abs/1904.02877
 * My PyTorch code: https://github.com/rwightman/gen-efficientnet-pytorch
 * Reference code: https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet
 
@@ -77,27 +79,28 @@ The validation results for the pretrained weights can be found [here](results.md
 * Reference code: https://github.com/facebookresearch/pycls/blob/master/pycls/models/regnet.py
 
 ## ResNet, ResNeXt [[resnet.py](https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/resnet.py)]
+
 * ResNet (V1B)
-    * Paper: `Deep Residual Learning for Image Recognition` - https://arxiv.org/abs/1512.03385
-    * Code: https://github.com/pytorch/vision/tree/master/torchvision/models
+  * Paper: `Deep Residual Learning for Image Recognition` - https://arxiv.org/abs/1512.03385
+  * Code: https://github.com/pytorch/vision/tree/master/torchvision/models
 * ResNeXt
-    * Paper: `Aggregated Residual Transformations for Deep Neural Networks` - https://arxiv.org/abs/1611.05431
-    * Code: https://github.com/pytorch/vision/tree/master/torchvision/models
+  * Paper: `Aggregated Residual Transformations for Deep Neural Networks` - https://arxiv.org/abs/1611.05431
+  * Code: https://github.com/pytorch/vision/tree/master/torchvision/models
 * 'Bag of Tricks' / Gluon C, D, E, S ResNet variants
-    * Paper: `Bag of Tricks for Image Classification with CNNs` - https://arxiv.org/abs/1812.01187
-    * Code: https://github.com/dmlc/gluon-cv/blob/master/gluoncv/model_zoo/resnetv1b.py
+  * Paper: `Bag of Tricks for Image Classification with CNNs` - https://arxiv.org/abs/1812.01187
+  * Code: https://github.com/dmlc/gluon-cv/blob/master/gluoncv/model_zoo/resnetv1b.py
 * Instagram pretrained / ImageNet tuned ResNeXt101
-    * Paper: `Exploring the Limits of Weakly Supervised Pretraining` - https://arxiv.org/abs/1805.00932
-    * Weights: https://pytorch.org/hub/facebookresearch_WSL-Images_resnext (NOTE: CC BY-NC 4.0 License, NOT commercial friendly)
+  * Paper: `Exploring the Limits of Weakly Supervised Pretraining` - https://arxiv.org/abs/1805.00932
+  * Weights: https://pytorch.org/hub/facebookresearch_WSL-Images_resnext (NOTE: CC BY-NC 4.0 License, NOT commercial friendly)
 * Semi-supervised (SSL) / Semi-weakly Supervised (SWSL) ResNet and ResNeXts
-    * Paper: `Billion-scale semi-supervised learning for image classification` - https://arxiv.org/abs/1905.00546
-    * Weights: https://github.com/facebookresearch/semi-supervised-ImageNet1K-models (NOTE: CC BY-NC 4.0 License, NOT commercial friendly)
+  * Paper: `Billion-scale semi-supervised learning for image classification` - https://arxiv.org/abs/1905.00546
+  * Weights: https://github.com/facebookresearch/semi-supervised-ImageNet1K-models (NOTE: CC BY-NC 4.0 License, NOT commercial friendly)
 * Squeeze-and-Excitation Networks
-    * Paper: `Squeeze-and-Excitation Networks` - https://arxiv.org/abs/1709.01507
-    * Code: Added to ResNet base, this is current version going forward, old `senet.py` is being deprecated
+  * Paper: `Squeeze-and-Excitation Networks` - https://arxiv.org/abs/1709.01507
+  * Code: Added to ResNet base; this is the current version going forward, and the old `senet.py` is being deprecated
 * ECAResNet (ECA-Net)
-    * Paper: `ECA-Net: Efficient Channel Attention for Deep CNN` - https://arxiv.org/abs/1910.03151v4
-    * Code: Added to ResNet base, ECA module contributed by @VRandme, reference https://github.com/BangguWu/ECANet
+  * Paper: `ECA-Net: Efficient Channel Attention for Deep CNN` - https://arxiv.org/abs/1910.03151v4
+  * Code: Added to ResNet base; ECA module contributed by @VRandme, reference: https://github.com/BangguWu/ECANet
 
 ## Res2Net [[res2net.py](https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/res2net.py)]
 * Paper: `Res2Net: A New Multi-scale Backbone Architecture` - https://arxiv.org/abs/1904.01169
@@ -121,6 +124,7 @@ The validation results for the pretrained weights can be found [here](results.md
 
 ## Squeeze-and-Excitation Networks [[senet.py](https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/senet.py)]
 NOTE: I am deprecating this version of the networks, the new ones are part of `resnet.py`
+
 * Paper: `Squeeze-and-Excitation Networks` - https://arxiv.org/abs/1709.01507
 * Code: https://github.com/Cadene/pretrained-models.pytorch 
 
diff --git a/mkdocs.yml b/mkdocs.yml
index f6473342..86a9b679 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -39,3 +39,4 @@ markdown_extensions:
   - pymdownx.tasklist:
       custom_checkbox: true
   - pymdownx.tilde
+  - mdx_truly_sane_lists
diff --git a/requirements-docs.txt b/requirements-docs.txt
index fbac998b..40d3e9bd 100644
--- a/requirements-docs.txt
+++ b/requirements-docs.txt
@@ -1,2 +1,3 @@
 mkdocs==1.1.2
-mkdocs-material==5.4.0
\ No newline at end of file
+mkdocs-material==5.4.0
+mdx_truly_sane_lists
\ No newline at end of file