diff --git a/modelindex/.templates/code_snippets.md b/docs/models/.templates/code_snippets.md similarity index 100% rename from modelindex/.templates/code_snippets.md rename to docs/models/.templates/code_snippets.md diff --git a/modelindex/generate_readmes.py b/docs/models/.templates/generate_readmes.py similarity index 100% rename from modelindex/generate_readmes.py rename to docs/models/.templates/generate_readmes.py diff --git a/modelindex/.templates/models/adversarial-inception-v3.md b/docs/models/.templates/models/adversarial-inception-v3.md similarity index 95% rename from modelindex/.templates/models/adversarial-inception-v3.md rename to docs/models/.templates/models/adversarial-inception-v3.md index c53e180a..56004ca2 100644 --- a/modelindex/.templates/models/adversarial-inception-v3.md +++ b/docs/models/.templates/models/adversarial-inception-v3.md @@ -1,9 +1,11 @@ -# Summary +# Adversarial Inception v3 **Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements including using [Label Smoothing](https://paperswithcode.com/method/label-smoothing), Factorized 7 x 7 convolutions, and the use of an [auxiliary classifer](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module). This particular model was trained for study of adversarial examples (adversarial training). +The weights from this model were ported from [Tensorflow/Models](https://github.com/tensorflow/models). + {% include 'code_snippets.md' %} ## How do I train this model? diff --git a/modelindex/.templates/models/advprop.md b/docs/models/.templates/models/advprop.md similarity index 99% rename from modelindex/.templates/models/advprop.md rename to docs/models/.templates/models/advprop.md index e1deaaa6..f2b6ad60 100644 --- a/modelindex/.templates/models/advprop.md +++ b/docs/models/.templates/models/advprop.md @@ -1,4 +1,4 @@ -# Summary +# AdvProp **AdvProp** is an adversarial training scheme which treats adversarial examples as additional examples, to prevent overfitting. Key to the method is the usage of a separate auxiliary batch norm for adversarial examples, as they have different underlying distributions to normal examples. diff --git a/modelindex/.templates/models/big-transfer.md b/docs/models/.templates/models/big-transfer.md similarity index 99% rename from modelindex/.templates/models/big-transfer.md rename to docs/models/.templates/models/big-transfer.md index 2cdfd1cb..6ef5c118 100644 --- a/modelindex/.templates/models/big-transfer.md +++ b/docs/models/.templates/models/big-transfer.md @@ -1,4 +1,4 @@ -# Summary +# Big Transfer (BiT) **Big Transfer (BiT)** is a type of pretraining recipe that pre-trains on a large supervised source dataset, and fine-tunes the weights on the target task. Models are trained on the JFT-300M dataset. The finetuned models contained in this collection are finetuned on ImageNet. 
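Every model page above pulls in the shared `code_snippets.md` template for usage instructions. As a minimal sketch of what that boils down to, the following loads one of the pretrained models documented here through `timm` and runs a forward pass; the model name `adv_inception_v3` is taken from the Adversarial Inception v3 page above and is assumed to be available in the installed `timm` version.

```python
import torch
import timm

# Create a pretrained model by name; any identifier returned by timm.list_models() works.
model = timm.create_model("adv_inception_v3", pretrained=True)
model.eval()

# Dummy batch at Inception v3's default 299x299 input size.
dummy = torch.randn(1, 3, 299, 299)
with torch.no_grad():
    logits = model(dummy)
print(logits.shape)  # torch.Size([1, 1000]) for ImageNet-1k weights
```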
diff --git a/modelindex/.templates/models/csp-darknet.md b/docs/models/.templates/models/csp-darknet.md similarity index 99% rename from modelindex/.templates/models/csp-darknet.md rename to docs/models/.templates/models/csp-darknet.md index 37b65e50..8aec168e 100644 --- a/modelindex/.templates/models/csp-darknet.md +++ b/docs/models/.templates/models/csp-darknet.md @@ -1,4 +1,4 @@ -# Summary +# CSP DarkNet **CSPDarknet53** is a convolutional neural network and backbone for object detection that uses [DarkNet-53](https://paperswithcode.com/method/darknet-53). It employs a CSPNet strategy to partition the feature map of the base layer into two parts and then merges them through a cross-stage hierarchy. The use of a split and merge strategy allows for more gradient flow through the network. diff --git a/modelindex/.templates/models/csp-resnet.md b/docs/models/.templates/models/csp-resnet.md similarity index 99% rename from modelindex/.templates/models/csp-resnet.md rename to docs/models/.templates/models/csp-resnet.md index 30ea74fa..57e53122 100644 --- a/modelindex/.templates/models/csp-resnet.md +++ b/docs/models/.templates/models/csp-resnet.md @@ -1,4 +1,4 @@ -# Summary +# CSP ResNet **CSPResNet** is a convolutional neural network where we apply the Cross Stage Partial Network (CSPNet) approach to [ResNet](https://paperswithcode.com/method/resnet). The CSPNet partitions the feature map of the base layer into two parts and then merges them through a cross-stage hierarchy. The use of a split and merge strategy allows for more gradient flow through the network. diff --git a/modelindex/.templates/models/csp-resnext.md b/docs/models/.templates/models/csp-resnext.md similarity index 99% rename from modelindex/.templates/models/csp-resnext.md rename to docs/models/.templates/models/csp-resnext.md index 64e0b0ad..745109b6 100644 --- a/modelindex/.templates/models/csp-resnext.md +++ b/docs/models/.templates/models/csp-resnext.md @@ -1,4 +1,4 @@ -# Summary +# CSP ResNeXt **CSPResNeXt** is a convolutional neural network where we apply the Cross Stage Partial Network (CSPNet) approach to [ResNeXt](https://paperswithcode.com/method/resnext). The CSPNet partitions the feature map of the base layer into two parts and then merges them through a cross-stage hierarchy. The use of a split and merge strategy allows for more gradient flow through the network. diff --git a/modelindex/.templates/models/densenet.md b/docs/models/.templates/models/densenet.md similarity index 99% rename from modelindex/.templates/models/densenet.md rename to docs/models/.templates/models/densenet.md index a816aa78..2fda1c45 100644 --- a/modelindex/.templates/models/densenet.md +++ b/docs/models/.templates/models/densenet.md @@ -1,4 +1,4 @@ -# Summary +# DenseNet **DenseNet** is a type of convolutional neural network that utilises dense connections between layers, through [Dense Blocks](http://www.paperswithcode.com/method/dense-block), where we connect *all layers* (with matching feature-map sizes) directly with each other. To preserve the feed-forward nature, each layer obtains additional inputs from all preceding layers and passes on its own feature-maps to all subsequent layers. 
diff --git a/modelindex/.templates/models/dla.md b/docs/models/.templates/models/dla.md similarity index 99% rename from modelindex/.templates/models/dla.md rename to docs/models/.templates/models/dla.md index 22b2a241..ec8e48c8 100644 --- a/modelindex/.templates/models/dla.md +++ b/docs/models/.templates/models/dla.md @@ -1,4 +1,4 @@ -# Summary +# Deep Layer Aggregation Extending “shallow” skip connections, **Dense Layer Aggregation (DLA)** incorporates more depth and sharing. The authors introduce two structures for deep layer aggregation (DLA): iterative deep aggregation (IDA) and hierarchical deep aggregation (HDA). These structures are expressed through an architectural framework, independent of the choice of backbone, for compatibility with current and future networks. diff --git a/modelindex/.templates/models/dpn.md b/docs/models/.templates/models/dpn.md similarity index 99% rename from modelindex/.templates/models/dpn.md rename to docs/models/.templates/models/dpn.md index 133fb9f5..21863085 100644 --- a/modelindex/.templates/models/dpn.md +++ b/docs/models/.templates/models/dpn.md @@ -1,4 +1,4 @@ -# Summary +# Dual Path Network (DPN) A **Dual Path Network (DPN)** is a convolutional neural network which presents a new topology of connection paths internally. The intuition is that [ResNets](https://paperswithcode.com/method/resnet) enables feature re-usage while DenseNet enables new feature exploration, and both are important for learning good representations. To enjoy the benefits from both path topologies, Dual Path Networks share common features while maintaining the flexibility to explore new features through dual path architectures. diff --git a/modelindex/.templates/models/ecaresnet.md b/docs/models/.templates/models/ecaresnet.md similarity index 99% rename from modelindex/.templates/models/ecaresnet.md rename to docs/models/.templates/models/ecaresnet.md index 14da4730..0917720c 100644 --- a/modelindex/.templates/models/ecaresnet.md +++ b/docs/models/.templates/models/ecaresnet.md @@ -1,4 +1,4 @@ -# Summary +# ECA ResNet An **ECA ResNet** is a variant on a [ResNet](https://paperswithcode.com/method/resnet) that utilises an [Efficient Channel Attention module](https://paperswithcode.com/method/efficient-channel-attention). Efficient Channel Attention is an architectural unit based on [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) that reduces model complexity without dimensionality reduction. diff --git a/modelindex/.templates/models/efficientnet-pruned.md b/docs/models/.templates/models/efficientnet-pruned.md similarity index 92% rename from modelindex/.templates/models/efficientnet-pruned.md rename to docs/models/.templates/models/efficientnet-pruned.md index 7a0c39c7..e194e457 100644 --- a/modelindex/.templates/models/efficientnet-pruned.md +++ b/docs/models/.templates/models/efficientnet-pruned.md @@ -1,4 +1,4 @@ -# Summary +# EfficientNet (Knapsack Pruned) **EfficientNet** is a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth/width/resolution using a *compound coefficient*. Unlike conventional practice that arbitrary scales these factors, the EfficientNet scaling method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. 
For example, if we want to use $2^N$ times more computational resources, then we can simply increase the network depth by $\alpha ^ N$, width by $\beta ^ N$, and image size by $\gamma ^ N$, where $\alpha, \beta, \gamma$ are constant coefficients determined by a small grid search on the original small model. EfficientNet uses a compound coefficient $\phi$ to uniformly scales network width, depth, and resolution in a principled way. @@ -28,14 +28,13 @@ You can follow the [timm recipe scripts](https://rwightman.github.io/pytorch-ima ``` ``` -@misc{rw2019timm, - author = {Ross Wightman}, - title = {PyTorch Image Models}, - year = {2019}, - publisher = {GitHub}, - journal = {GitHub repository}, - doi = {10.5281/zenodo.4414861}, - howpublished = {\url{https://github.com/rwightman/pytorch-image-models}} +@misc{aflalo2020knapsack, + title={Knapsack Pruning with Inner Distillation}, + author={Yonathan Aflalo and Asaf Noy and Ming Lin and Itamar Friedman and Lihi Zelnik}, + year={2020}, + eprint={2002.08258}, + archivePrefix={arXiv}, + primaryClass={cs.LG} } ``` diff --git a/modelindex/.templates/models/efficientnet.md b/docs/models/.templates/models/efficientnet.md similarity index 99% rename from modelindex/.templates/models/efficientnet.md rename to docs/models/.templates/models/efficientnet.md index e7bc1452..dce4e78d 100644 --- a/modelindex/.templates/models/efficientnet.md +++ b/docs/models/.templates/models/efficientnet.md @@ -1,4 +1,4 @@ -# Summary +# EfficientNet **EfficientNet** is a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth/width/resolution using a *compound coefficient*. Unlike conventional practice that arbitrary scales these factors, the EfficientNet scaling method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. For example, if we want to use $2^N$ times more computational resources, then we can simply increase the network depth by $\alpha ^ N$, width by $\beta ^ N$, and image size by $\gamma ^ N$, where $\alpha, \beta, \gamma$ are constant coefficients determined by a small grid search on the original small model. EfficientNet uses a compound coefficient $\phi$ to uniformly scales network width, depth, and resolution in a principled way. diff --git a/modelindex/.templates/models/ensemble-adversarial.md b/docs/models/.templates/models/ensemble-adversarial.md similarity index 94% rename from modelindex/.templates/models/ensemble-adversarial.md rename to docs/models/.templates/models/ensemble-adversarial.md index ed6e209e..996d81eb 100644 --- a/modelindex/.templates/models/ensemble-adversarial.md +++ b/docs/models/.templates/models/ensemble-adversarial.md @@ -1,9 +1,11 @@ -# Summary +# Ensemble Adversarial Inception ResNet v2 **Inception-ResNet-v2** is a convolutional neural architecture that builds on the Inception family of architectures but incorporates [residual connections](https://paperswithcode.com/method/residual-connection) (replacing the filter concatenation stage of the Inception architecture). This particular model was trained for study of adversarial examples (adversarial training). +The weights from this model were ported from [Tensorflow/Models](https://github.com/tensorflow/models). + {% include 'code_snippets.md' %} ## How do I train this model?
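The compound-scaling rule described in the EfficientNet pages above reduces to a few lines of arithmetic. A sketch, using the $\alpha, \beta, \gamma$ values reported in the EfficientNet paper and assuming B0's 224 px base resolution:

```python
# Compound scaling: depth, width and resolution all grow from one coefficient phi.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # grid-searched constants; alpha * beta**2 * gamma**2 ~= 2

def compound_scale(phi: int, base_resolution: int = 224):
    depth_mult = ALPHA ** phi   # multiply the number of layers
    width_mult = BETA ** phi    # multiply the number of channels
    resolution = round(base_resolution * GAMMA ** phi)  # enlarge the input image
    return depth_mult, width_mult, resolution

for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution ~{r}px")
```

Because $\alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2$, each unit increase of $\phi$ roughly doubles the required FLOPs.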
diff --git a/modelindex/.templates/models/ese-vovnet.md b/docs/models/.templates/models/ese-vovnet.md similarity index 99% rename from modelindex/.templates/models/ese-vovnet.md rename to docs/models/.templates/models/ese-vovnet.md index da5ea322..3b532be5 100644 --- a/modelindex/.templates/models/ese-vovnet.md +++ b/docs/models/.templates/models/ese-vovnet.md @@ -1,4 +1,4 @@ -# Summary +# ESE VoVNet **VoVNet** is a convolutional neural network that seeks to make [DenseNet](https://paperswithcode.com/method/densenet) more efficient by concatenating all features only once in the last feature map, which makes input size constant and enables enlarging new output channel. diff --git a/modelindex/.templates/models/fbnet.md b/docs/models/.templates/models/fbnet.md similarity index 99% rename from modelindex/.templates/models/fbnet.md rename to docs/models/.templates/models/fbnet.md index eec83b49..2302a152 100644 --- a/modelindex/.templates/models/fbnet.md +++ b/docs/models/.templates/models/fbnet.md @@ -1,4 +1,4 @@ -# Summary +# FBNet **FBNet** is a type of convolutional neural architectures discovered through [DNAS](https://paperswithcode.com/method/dnas) neural architecture search. It utilises a basic type of image model block inspired by [MobileNetv2](https://paperswithcode.com/method/mobilenetv2) that utilises depthwise convolutions and an inverted residual structure (see components). diff --git a/modelindex/.templates/models/gloun-inception-v3.md b/docs/models/.templates/models/gloun-inception-v3.md similarity index 95% rename from modelindex/.templates/models/gloun-inception-v3.md rename to docs/models/.templates/models/gloun-inception-v3.md index 5e227bde..10bc2e32 100644 --- a/modelindex/.templates/models/gloun-inception-v3.md +++ b/docs/models/.templates/models/gloun-inception-v3.md @@ -1,8 +1,8 @@ -# Summary +# Gluon Inception v3 **Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements including using [Label Smoothing](https://paperswithcode.com/method/label-smoothing), Factorized 7 x 7 convolutions, and the use of an [auxiliary classifer](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module). -The weights from this model were ported from Gluon. +The weights from this model were ported from [Gluon](https://cv.gluon.ai/model_zoo/classification.html). {% include 'code_snippets.md' %} diff --git a/modelindex/.templates/models/gloun-resnet.md b/docs/models/.templates/models/gloun-resnet.md similarity index 98% rename from modelindex/.templates/models/gloun-resnet.md rename to docs/models/.templates/models/gloun-resnet.md index cc72cfba..47dd2512 100644 --- a/modelindex/.templates/models/gloun-resnet.md +++ b/docs/models/.templates/models/gloun-resnet.md @@ -1,8 +1,8 @@ -# Summary +# Gluon ResNet **Residual Networks**, or **ResNets**, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Instead of hoping each few stacked layers directly fit a desired underlying mapping, residual nets let these layers fit a residual mapping. They stack [residual blocks](https://paperswithcode.com/method/residual-block) ontop of each other to form network: e.g. a ResNet-50 has fifty layers using these blocks. -The weights from this model were ported from Gluon.
+The weights from this model were ported from [Gluon](https://cv.gluon.ai/model_zoo/classification.html). {% include 'code_snippets.md' %} diff --git a/modelindex/.templates/models/gloun-resnext.md b/docs/models/.templates/models/gloun-resnext.md similarity index 96% rename from modelindex/.templates/models/gloun-resnext.md rename to docs/models/.templates/models/gloun-resnext.md index abdd8ad7..97151d8a 100644 --- a/modelindex/.templates/models/gloun-resnext.md +++ b/docs/models/.templates/models/gloun-resnext.md @@ -1,8 +1,8 @@ -# Summary +# Gluon ResNeXt A **ResNeXt** repeats a [building block](https://paperswithcode.com/method/resnext-block) that aggregates a set of transformations with the same topology. Compared to a [ResNet](https://paperswithcode.com/method/resnet), it exposes a new dimension, *cardinality* (the size of the set of transformations) $C$, as an essential factor in addition to the dimensions of depth and width. -The weights from this model were ported from Gluon. +The weights from this model were ported from [Gluon](https://cv.gluon.ai/model_zoo/classification.html). {% include 'code_snippets.md' %} diff --git a/modelindex/.templates/models/gloun-senet.md b/docs/models/.templates/models/gloun-senet.md similarity index 93% rename from modelindex/.templates/models/gloun-senet.md rename to docs/models/.templates/models/gloun-senet.md index 4f953ce7..60f7477f 100644 --- a/modelindex/.templates/models/gloun-senet.md +++ b/docs/models/.templates/models/gloun-senet.md @@ -2,7 +2,7 @@ A **SENet** is a convolutional neural network architecture that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration. -The weights from this model were ported from Gluon. +The weights from this model were ported from [Gluon](https://cv.gluon.ai/model_zoo/classification.html). {% include 'code_snippets.md' %} diff --git a/modelindex/.templates/models/gloun-seresnext.md b/docs/models/.templates/models/gloun-seresnext.md similarity index 96% rename from modelindex/.templates/models/gloun-seresnext.md rename to docs/models/.templates/models/gloun-seresnext.md index 88ae8657..b2052b01 100644 --- a/modelindex/.templates/models/gloun-seresnext.md +++ b/docs/models/.templates/models/gloun-seresnext.md @@ -2,7 +2,7 @@ **SE ResNeXt** is a variant of a [ResNext](https://www.paperswithcode.com/method/resnext) that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration. -The weights from this model were ported from Gluon. +The weights from this model were ported from [Gluon](https://cv.gluon.ai/model_zoo/classification.html). {% include 'code_snippets.md' %} diff --git a/modelindex/.templates/models/gloun-xception.md b/docs/models/.templates/models/gloun-xception.md similarity index 89% rename from modelindex/.templates/models/gloun-xception.md rename to docs/models/.templates/models/gloun-xception.md index 0523b6fa..fc5ed82e 100644 --- a/modelindex/.templates/models/gloun-xception.md +++ b/docs/models/.templates/models/gloun-xception.md @@ -1,6 +1,8 @@ # Summary -**Xception** is a convolutional neural network architecture that relies solely on [depthwise separable convolution](https://paperswithcode.com/method/depthwise-separable-convolution) layers. The weights from this model were ported from Gluon. 
+**Xception** is a convolutional neural network architecture that relies solely on [depthwise separable convolution](https://paperswithcode.com/method/depthwise-separable-convolution) layers. + +The weights from this model were ported from [Gluon](https://cv.gluon.ai/model_zoo/classification.html). {% include 'code_snippets.md' %} diff --git a/modelindex/.templates/models/hrnet.md b/docs/models/.templates/models/hrnet.md similarity index 99% rename from modelindex/.templates/models/hrnet.md rename to docs/models/.templates/models/hrnet.md index 2f9e8597..0a9f08f9 100644 --- a/modelindex/.templates/models/hrnet.md +++ b/docs/models/.templates/models/hrnet.md @@ -1,4 +1,4 @@ -# Summary +# HRNet **HRNet**, or **High-Resolution Net**, is a general purpose convolutional neural network for tasks like semantic segmentation, object detection and image classification. It is able to maintain high resolution representations through the whole process. We start from a high-resolution convolution stream, gradually add high-to-low resolution convolution streams one by one, and connect the multi-resolution streams in parallel. The resulting network consists of several ($4$ in the paper) stages and the $n$th stage contains $n$ streams corresponding to $n$ resolutions. The authors conduct repeated multi-resolution fusions by exchanging the information across the parallel streams over and over. diff --git a/modelindex/.templates/models/ig-resnext.md b/docs/models/.templates/models/ig-resnext.md similarity index 99% rename from modelindex/.templates/models/ig-resnext.md rename to docs/models/.templates/models/ig-resnext.md index 69f84c4f..5cc91410 100644 --- a/modelindex/.templates/models/ig-resnext.md +++ b/docs/models/.templates/models/ig-resnext.md @@ -1,4 +1,4 @@ -# Summary +# Instagram ResNeXt WSL A **ResNeXt** repeats a [building block](https://paperswithcode.com/method/resnext-block) that aggregates a set of transformations with the same topology. Compared to a [ResNet](https://paperswithcode.com/method/resnet), it exposes a new dimension, *cardinality* (the size of the set of transformations) $C$, as an essential factor in addition to the dimensions of depth and width. diff --git a/modelindex/.templates/models/inception-resnet-v2.md b/docs/models/.templates/models/inception-resnet-v2.md similarity index 98% rename from modelindex/.templates/models/inception-resnet-v2.md rename to docs/models/.templates/models/inception-resnet-v2.md index ad11f16f..1173c736 100644 --- a/modelindex/.templates/models/inception-resnet-v2.md +++ b/docs/models/.templates/models/inception-resnet-v2.md @@ -1,4 +1,4 @@ -# Summary +# Inception Resnet v2 **Inception-ResNet-v2** is a convolutional neural architecture that builds on the Inception family of architectures but incorporates [residual connections](https://paperswithcode.com/method/residual-connection) (replacing the filter concatenation stage of the Inception architecture). 
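The Inception-ResNet-v2 description above hinges on residual connections replacing filter concatenation. A schematic sketch of that idea (not the actual Inception-ResNet block), where a block's transformation is added to its input:

```python
import torch
import torch.nn as nn

class ResidualWrapper(nn.Module):
    """Adds the wrapped branch's output to its input (identity shortcut)."""
    def __init__(self, branch: nn.Module):
        super().__init__()
        self.branch = branch

    def forward(self, x):
        return x + self.branch(x)  # shapes must match for the addition

branch = nn.Sequential(
    nn.Conv2d(64, 64, 3, padding=1, bias=False), nn.BatchNorm2d(64), nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, 3, padding=1, bias=False), nn.BatchNorm2d(64),
)
print(ResidualWrapper(branch)(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```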
diff --git a/modelindex/.templates/models/inception-v3.md b/docs/models/.templates/models/inception-v3.md similarity index 99% rename from modelindex/.templates/models/inception-v3.md rename to docs/models/.templates/models/inception-v3.md index 876008f2..9002b223 100644 --- a/modelindex/.templates/models/inception-v3.md +++ b/docs/models/.templates/models/inception-v3.md @@ -1,4 +1,4 @@ -# Summary +# Inception v3 **Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements including using [Label Smoothing](https://paperswithcode.com/method/label-smoothing), Factorized 7 x 7 convolutions, and the use of an [auxiliary classifer](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module). diff --git a/modelindex/.templates/models/inception-v4.md b/docs/models/.templates/models/inception-v4.md similarity index 99% rename from modelindex/.templates/models/inception-v4.md rename to docs/models/.templates/models/inception-v4.md index a70337b2..e3fe7826 100644 --- a/modelindex/.templates/models/inception-v4.md +++ b/docs/models/.templates/models/inception-v4.md @@ -1,4 +1,4 @@ -# Summary +# Inception v4 **Inception-v4** is a convolutional neural network architecture that builds on previous iterations of the Inception family by simplifying the architecture and using more inception modules than [Inception-v3](https://paperswithcode.com/method/inception-v3). {% include 'code_snippets.md' %} diff --git a/modelindex/.templates/models/legacy-se-resnet.md b/docs/models/.templates/models/legacy-se-resnet.md similarity index 99% rename from modelindex/.templates/models/legacy-se-resnet.md rename to docs/models/.templates/models/legacy-se-resnet.md index d7b78967..20730bec 100644 --- a/modelindex/.templates/models/legacy-se-resnet.md +++ b/docs/models/.templates/models/legacy-se-resnet.md @@ -1,4 +1,4 @@ -# Summary +# (Legacy) SE ResNet **SE ResNet** is a variant of a [ResNet](https://www.paperswithcode.com/method/resnet) that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration. diff --git a/modelindex/.templates/models/legacy-se-resnext.md b/docs/models/.templates/models/legacy-se-resnext.md similarity index 99% rename from modelindex/.templates/models/legacy-se-resnext.md rename to docs/models/.templates/models/legacy-se-resnext.md index 298567b5..acc91ce3 100644 --- a/modelindex/.templates/models/legacy-se-resnext.md +++ b/docs/models/.templates/models/legacy-se-resnext.md @@ -1,4 +1,4 @@ -# Summary +# (Legacy) SE ResNeXt **SE ResNeXt** is a variant of a [ResNeXt](https://www.paperswithcode.com/method/resnext) that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration. 
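The squeeze-and-excitation block that the SE ResNet / SE ResNeXt pages above refer to is small enough to sketch directly: global-average-pool the feature map ("squeeze"), pass the result through a two-layer bottleneck ("excitation"), and rescale each channel. This is an illustrative version, not timm's exact module.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        scale = self.fc(x.mean(dim=(2, 3)))   # squeeze: (B, C) channel descriptor
        return x * scale.view(b, c, 1, 1)     # excite: channel-wise recalibration

print(SEBlock(256)(torch.randn(2, 256, 14, 14)).shape)  # torch.Size([2, 256, 14, 14])
```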
diff --git a/modelindex/.templates/models/legacy-senet.md b/docs/models/.templates/models/legacy-senet.md similarity index 99% rename from modelindex/.templates/models/legacy-senet.md rename to docs/models/.templates/models/legacy-senet.md index c3b8db00..fd54d41a 100644 --- a/modelindex/.templates/models/legacy-senet.md +++ b/docs/models/.templates/models/legacy-senet.md @@ -1,4 +1,4 @@ -# Summary +# (Legacy) SENet A **SENet** is a convolutional neural network architecture that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration. diff --git a/modelindex/.templates/models/mixnet.md b/docs/models/.templates/models/mixnet.md similarity index 99% rename from modelindex/.templates/models/mixnet.md rename to docs/models/.templates/models/mixnet.md index 48b9d096..3f986623 100644 --- a/modelindex/.templates/models/mixnet.md +++ b/docs/models/.templates/models/mixnet.md @@ -1,4 +1,4 @@ -# Summary +# MixNet **MixNet** is a type of convolutional neural network discovered via AutoML that utilises [MixConvs](https://paperswithcode.com/method/mixconv) instead of regular [depthwise convolutions](https://paperswithcode.com/method/depthwise-convolution). diff --git a/modelindex/.templates/models/mnasnet.md b/docs/models/.templates/models/mnasnet.md similarity index 99% rename from modelindex/.templates/models/mnasnet.md rename to docs/models/.templates/models/mnasnet.md index 86a39eec..3c6cd5a0 100644 --- a/modelindex/.templates/models/mnasnet.md +++ b/docs/models/.templates/models/mnasnet.md @@ -1,4 +1,4 @@ -# Summary +# MnasNet **MnasNet** is a type of convolutional neural network optimized for mobile devices that is discovered through mobile neural architecture search, which explicitly incorporates model latency into the main objective so that the search can identify a model that achieves a good trade-off between accuracy and latency. The main building block is an [inverted residual block](https://paperswithcode.com/method/inverted-residual-block) (from [MobileNetV2](https://paperswithcode.com/method/mobilenetv2)). diff --git a/modelindex/.templates/models/mobilenet-v2.md b/docs/models/.templates/models/mobilenet-v2.md similarity index 99% rename from modelindex/.templates/models/mobilenet-v2.md rename to docs/models/.templates/models/mobilenet-v2.md index d7f7e2e6..a8940a47 100644 --- a/modelindex/.templates/models/mobilenet-v2.md +++ b/docs/models/.templates/models/mobilenet-v2.md @@ -1,4 +1,4 @@ -# Summary +# MobileNet v2 **MobileNetV2** is a convolutional neural network architecture that seeks to perform well on mobile devices. It is based on an [inverted residual structure](https://paperswithcode.com/method/inverted-residual-block) where the residual connections are between the bottleneck layers. The intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity. As a whole, the architecture of MobileNetV2 contains the initial fully convolution layer with 32 filters, followed by 19 residual bottleneck layers. 
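The inverted residual block referenced in the MnasNet and MobileNetV2 descriptions above expands with a 1×1 convolution, filters with a depthwise 3×3, and projects back with a linear 1×1; a residual connection is used when the stride is 1 and the input and output widths match. A simplified sketch:

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1, expand: int = 6):
        super().__init__()
        hidden = in_ch * expand
        self.use_residual = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, hidden, 1, bias=False), nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            # Depthwise convolution: one filter per expanded channel.
            nn.Conv2d(hidden, hidden, 3, stride=stride, padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            # Linear bottleneck: no activation after the projection.
            nn.Conv2d(hidden, out_ch, 1, bias=False), nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_residual else out

print(InvertedResidual(32, 32)(torch.randn(1, 32, 56, 56)).shape)  # torch.Size([1, 32, 56, 56])
```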
diff --git a/modelindex/.templates/models/mobilenet-v3.md b/docs/models/.templates/models/mobilenet-v3.md similarity index 99% rename from modelindex/.templates/models/mobilenet-v3.md rename to docs/models/.templates/models/mobilenet-v3.md index c863c579..1fbfed51 100644 --- a/modelindex/.templates/models/mobilenet-v3.md +++ b/docs/models/.templates/models/mobilenet-v3.md @@ -1,4 +1,4 @@ -# Summary +# MobileNet v3 **MobileNetV3** is a convolutional neural network that is designed for mobile phone CPUs. The network design includes the use of a [hard swish activation](https://paperswithcode.com/method/hard-swish) and [squeeze-and-excitation](https://paperswithcode.com/method/squeeze-and-excitation-block) modules in the [MBConv blocks](https://paperswithcode.com/method/inverted-residual-block). diff --git a/modelindex/.templates/models/nasnet.md b/docs/models/.templates/models/nasnet.md similarity index 99% rename from modelindex/.templates/models/nasnet.md rename to docs/models/.templates/models/nasnet.md index 241bbea9..c8dccc88 100644 --- a/modelindex/.templates/models/nasnet.md +++ b/docs/models/.templates/models/nasnet.md @@ -1,4 +1,4 @@ -# Summary +# NASNet **NASNet** is a type of convolutional neural network discovered through neural architecture search. The building blocks consist of normal and reduction cells. diff --git a/modelindex/.templates/models/noisy-student.md b/docs/models/.templates/models/noisy-student.md similarity index 99% rename from modelindex/.templates/models/noisy-student.md rename to docs/models/.templates/models/noisy-student.md index 594f9597..39c0017c 100644 --- a/modelindex/.templates/models/noisy-student.md +++ b/docs/models/.templates/models/noisy-student.md @@ -1,4 +1,4 @@ -# Summary +# Noisy Student (EfficientNet) **Noisy Student Training** is a semi-supervised learning approach. It extends the idea of self-training and distillation with the use of equal-or-larger student models and noise added to the student during learning. It has three main steps: diff --git a/modelindex/.templates/models/pnasnet.md b/docs/models/.templates/models/pnasnet.md similarity index 99% rename from modelindex/.templates/models/pnasnet.md rename to docs/models/.templates/models/pnasnet.md index 03080a9c..1d59c510 100644 --- a/modelindex/.templates/models/pnasnet.md +++ b/docs/models/.templates/models/pnasnet.md @@ -1,4 +1,4 @@ -# Summary +# PNASNet **Progressive Neural Architecture Search**, or **PNAS**, is a method for learning the structure of convolutional neural networks (CNNs). It uses a sequential model-based optimization (SMBO) strategy, where we search the space of cell structures, starting with simple (shallow) models and progressing to complex ones, pruning out unpromising structures as we go. diff --git a/modelindex/.templates/models/regnetx.md b/docs/models/.templates/models/regnetx.md similarity index 99% rename from modelindex/.templates/models/regnetx.md rename to docs/models/.templates/models/regnetx.md index 43936573..f3628a93 100644 --- a/modelindex/.templates/models/regnetx.md +++ b/docs/models/.templates/models/regnetx.md @@ -1,4 +1,4 @@ -# Summary +# RegNetX **RegNetX** is a convolutional network design space with simple, regular models with parameters: depth $d$, initial width $w\_{0} > 0$, and slope $w\_{a} > 0$, and generates a different block width $u\_{j}$ for each block $j < d$. 
The key restriction for the RegNet types of model is that there is a linear parameterisation of block widths (the design space only contains models with this linear structure): diff --git a/modelindex/.templates/models/regnety.md b/docs/models/.templates/models/regnety.md similarity index 99% rename from modelindex/.templates/models/regnety.md rename to docs/models/.templates/models/regnety.md index a14d54d3..a80f4fd7 100644 --- a/modelindex/.templates/models/regnety.md +++ b/docs/models/.templates/models/regnety.md @@ -1,4 +1,4 @@ -# Summary +# RegNetY **RegNetY** is a convolutional network design space with simple, regular models with parameters: depth $d$, initial width $w\_{0} > 0$, and slope $w\_{a} > 0$, and generates a different block width $u\_{j}$ for each block $j < d$. The key restriction for the RegNet types of model is that there is a linear parameterisation of block widths (the design space only contains models with this linear structure): diff --git a/modelindex/.templates/models/res2net.md b/docs/models/.templates/models/res2net.md similarity index 99% rename from modelindex/.templates/models/res2net.md rename to docs/models/.templates/models/res2net.md index 69cf0b5f..02946288 100644 --- a/modelindex/.templates/models/res2net.md +++ b/docs/models/.templates/models/res2net.md @@ -1,4 +1,4 @@ -# Summary +# Res2Net **Res2Net** is an image model that employs a variation on bottleneck residual blocks, [Res2Net Blocks](https://paperswithcode.com/method/res2net-block). The motivation is to be able to represent features at multiple scales. This is achieved through a novel building block for CNNs that constructs hierarchical residual-like connections within one single residual block. This represents multi-scale features at a granular level and increases the range of receptive fields for each network layer. diff --git a/modelindex/.templates/models/res2next.md b/docs/models/.templates/models/res2next.md similarity index 77% rename from modelindex/.templates/models/res2next.md rename to docs/models/.templates/models/res2next.md index 4ce242de..d71e8aee 100644 --- a/modelindex/.templates/models/res2next.md +++ b/docs/models/.templates/models/res2next.md @@ -1,6 +1,6 @@ -# Summary +# Res2NeXt -**Res2Net** is an image model that employs a variation on [ResNeXt](https://paperswithcode.com/method/resnext) bottleneck residual blocks. The motivation is to be able to represent features at multiple scales. This is achieved through a novel building block for CNNs that constructs hierarchical residual-like connections within one single residual block. This represents multi-scale features at a granular level and increases the range of receptive fields for each network layer. +**Res2NeXt** is an image model that employs a variation on [ResNeXt](https://paperswithcode.com/method/resnext) bottleneck residual blocks. The motivation is to be able to represent features at multiple scales. This is achieved through a novel building block for CNNs that constructs hierarchical residual-like connections within one single residual block. This represents multi-scale features at a granular level and increases the range of receptive fields for each network layer. 
{% include 'code_snippets.md' %} diff --git a/modelindex/.templates/models/resnest.md b/docs/models/.templates/models/resnest.md similarity index 99% rename from modelindex/.templates/models/resnest.md rename to docs/models/.templates/models/resnest.md index 9288f0fd..cbb92ba0 100644 --- a/modelindex/.templates/models/resnest.md +++ b/docs/models/.templates/models/resnest.md @@ -1,6 +1,6 @@ -# Summary +# ResNeSt -A **ResNest** is a variant on a [ResNet](https://paperswithcode.com/method/resnet), which instead stacks [Split-Attention blocks](https://paperswithcode.com/method/split-attention). The cardinal group representations are then concatenated along the channel dimension: $V = \text{Concat}${$V^{1},V^{2},\cdots{V}^{K}$}. As in standard residual blocks, the final output $Y$ of otheur Split-Attention block is produced using a shortcut connection: $Y=V+X$, if the input and output feature-map share the same shape. For blocks with a stride, an appropriate transformation $\mathcal{T}$ is applied to the shortcut connection to align the output shapes: $Y=V+\mathcal{T}(X)$. For example, $\mathcal{T}$ can be strided convolution or combined convolution-with-pooling. +A **ResNeSt** is a variant on a [ResNet](https://paperswithcode.com/method/resnet), which instead stacks [Split-Attention blocks](https://paperswithcode.com/method/split-attention). The cardinal group representations are then concatenated along the channel dimension: $V = \text{Concat}${$V^{1},V^{2},\cdots{V}^{K}$}. As in standard residual blocks, the final output $Y$ of the Split-Attention block is produced using a shortcut connection: $Y=V+X$, if the input and output feature-map share the same shape. For blocks with a stride, an appropriate transformation $\mathcal{T}$ is applied to the shortcut connection to align the output shapes: $Y=V+\mathcal{T}(X)$. For example, $\mathcal{T}$ can be strided convolution or combined convolution-with-pooling. {% include 'code_snippets.md' %} diff --git a/modelindex/.templates/models/resnet-d.md b/docs/models/.templates/models/resnet-d.md similarity index 99% rename from modelindex/.templates/models/resnet-d.md rename to docs/models/.templates/models/resnet-d.md index 04b55f62..211b05b0 100644 --- a/modelindex/.templates/models/resnet-d.md +++ b/docs/models/.templates/models/resnet-d.md @@ -1,4 +1,4 @@ -# Summary +# ResNet-D **ResNet-D** is a modification on the [ResNet](https://paperswithcode.com/method/resnet) architecture that utilises an [average pooling](https://paperswithcode.com/method/average-pooling) tweak for downsampling. The motivation is that in the unmodified ResNet, the [1×1 convolution](https://paperswithcode.com/method/1x1-convolution) for the downsampling block ignores 3/4 of input feature maps, so this is modified so no information will be ignored diff --git a/modelindex/.templates/models/resnet.md b/docs/models/.templates/models/resnet.md similarity index 99% rename from modelindex/.templates/models/resnet.md rename to docs/models/.templates/models/resnet.md index effadcec..3e43ae52 100644 --- a/modelindex/.templates/models/resnet.md +++ b/docs/models/.templates/models/resnet.md @@ -1,4 +1,4 @@ -# Summary +# ResNet **Residual Networks**, or **ResNets**, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Instead of hoping each few stacked layers directly fit a desired underlying mapping, residual nets let these layers fit a residual mapping.
They stack [residual blocks](https://paperswithcode.com/method/residual-block) ontop of each other to form network: e.g. a ResNet-50 has fifty layers using these blocks. diff --git a/modelindex/.templates/models/resnext.md b/docs/models/.templates/models/resnext.md similarity index 99% rename from modelindex/.templates/models/resnext.md rename to docs/models/.templates/models/resnext.md index 8f8b349a..ec83d092 100644 --- a/modelindex/.templates/models/resnext.md +++ b/docs/models/.templates/models/resnext.md @@ -1,4 +1,4 @@ -# Summary +# ResNeXt A **ResNeXt** repeats a [building block](https://paperswithcode.com/method/resnext-block) that aggregates a set of transformations with the same topology. Compared to a [ResNet](https://paperswithcode.com/method/resnet), it exposes a new dimension, *cardinality* (the size of the set of transformations) $C$, as an essential factor in addition to the dimensions of depth and width. diff --git a/modelindex/.templates/models/rexnet.md b/docs/models/.templates/models/rexnet.md similarity index 99% rename from modelindex/.templates/models/rexnet.md rename to docs/models/.templates/models/rexnet.md index a96813ee..3c4a25b2 100644 --- a/modelindex/.templates/models/rexnet.md +++ b/docs/models/.templates/models/rexnet.md @@ -1,4 +1,4 @@ -# Summary +# RexNet **Rank Expansion Networks** (ReXNets) follow a set of new design principles for designing bottlenecks in image classification models. Authors refine each layer by 1) expanding the input channel size of the convolution layer and 2) replacing the [ReLU6s](https://www.paperswithcode.com/method/relu6). diff --git a/modelindex/.templates/models/se-resnet.md b/docs/models/.templates/models/se-resnet.md similarity index 99% rename from modelindex/.templates/models/se-resnet.md rename to docs/models/.templates/models/se-resnet.md index 3b382dbf..a03d31fe 100644 --- a/modelindex/.templates/models/se-resnet.md +++ b/docs/models/.templates/models/se-resnet.md @@ -1,4 +1,4 @@ -# Summary +# SE ResNet **SE ResNet** is a variant of a [ResNet](https://www.paperswithcode.com/method/resnet) that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration. diff --git a/modelindex/.templates/models/selecsls.md b/docs/models/.templates/models/selecsls.md similarity index 99% rename from modelindex/.templates/models/selecsls.md rename to docs/models/.templates/models/selecsls.md index f0770990..e297a99c 100644 --- a/modelindex/.templates/models/selecsls.md +++ b/docs/models/.templates/models/selecsls.md @@ -1,4 +1,4 @@ -# Summary +# SelecSLS **SelecSLS** uses novel selective long and short range skip connections to improve the information flow allowing for a drastically faster network without compromising accuracy. diff --git a/modelindex/.templates/models/seresnext.md b/docs/models/.templates/models/seresnext.md similarity index 99% rename from modelindex/.templates/models/seresnext.md rename to docs/models/.templates/models/seresnext.md index 9f0c8da0..82ff797c 100644 --- a/modelindex/.templates/models/seresnext.md +++ b/docs/models/.templates/models/seresnext.md @@ -1,4 +1,4 @@ -# Summary +# SE ResNeXt **SE ResNeXt** is a variant of a [ResNext](https://www.paperswithcode.com/method/resneXt) that employs [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block) to enable the network to perform dynamic channel-wise feature recalibration. 
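The cardinality dimension described in the ResNeXt-based pages above is usually realised as a grouped convolution: the $C$ parallel transformations of a block become $C$ groups in its 3×3 convolution. A sketch of the transformation branch only (shortcut and final activation omitted for brevity):

```python
import torch
import torch.nn as nn

def resnext_branch(in_ch: int, bottleneck_ch: int, cardinality: int = 32) -> nn.Sequential:
    """1x1 reduce -> grouped 3x3 (cardinality = groups) -> 1x1 expand."""
    return nn.Sequential(
        nn.Conv2d(in_ch, bottleneck_ch, 1, bias=False), nn.BatchNorm2d(bottleneck_ch), nn.ReLU(inplace=True),
        nn.Conv2d(bottleneck_ch, bottleneck_ch, 3, padding=1, groups=cardinality, bias=False),
        nn.BatchNorm2d(bottleneck_ch), nn.ReLU(inplace=True),
        nn.Conv2d(bottleneck_ch, in_ch, 1, bias=False), nn.BatchNorm2d(in_ch),
    )

branch = resnext_branch(in_ch=256, bottleneck_ch=128, cardinality=32)
print(branch(torch.randn(1, 256, 28, 28)).shape)  # torch.Size([1, 256, 28, 28])
```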
diff --git a/modelindex/.templates/models/skresnet.md b/docs/models/.templates/models/skresnet.md similarity index 99% rename from modelindex/.templates/models/skresnet.md rename to docs/models/.templates/models/skresnet.md index 9c8d6109..86f44688 100644 --- a/modelindex/.templates/models/skresnet.md +++ b/docs/models/.templates/models/skresnet.md @@ -1,4 +1,4 @@ -# Summary +# SK ResNet **SK ResNet** is a variant of a [ResNet](https://www.paperswithcode.com/method/resnet) that employs a [Selective Kernel](https://paperswithcode.com/method/selective-kernel) unit. In general, all the large kernel convolutions in the original bottleneck blocks in ResNet are replaced by the proposed [SK convolutions](https://paperswithcode.com/method/selective-kernel-convolution), enabling the network to choose appropriate receptive field sizes in an adaptive manner. diff --git a/modelindex/.templates/models/skresnext.md b/docs/models/.templates/models/skresnext.md similarity index 99% rename from modelindex/.templates/models/skresnext.md rename to docs/models/.templates/models/skresnext.md index 856623c1..c1b3dfc2 100644 --- a/modelindex/.templates/models/skresnext.md +++ b/docs/models/.templates/models/skresnext.md @@ -1,4 +1,4 @@ -# Summary +# SK ResNeXt **SK ResNeXt** is a variant of a [ResNeXt](https://www.paperswithcode.com/method/resnext) that employs a [Selective Kernel](https://paperswithcode.com/method/selective-kernel) unit. In general, all the large kernel convolutions in the original bottleneck blocks in ResNext are replaced by the proposed [SK convolutions](https://paperswithcode.com/method/selective-kernel-convolution), enabling the network to choose appropriate receptive field sizes in an adaptive manner. diff --git a/modelindex/.templates/models/spnasnet.md b/docs/models/.templates/models/spnasnet.md similarity index 99% rename from modelindex/.templates/models/spnasnet.md rename to docs/models/.templates/models/spnasnet.md index 55b8ad08..7fc1442f 100644 --- a/modelindex/.templates/models/spnasnet.md +++ b/docs/models/.templates/models/spnasnet.md @@ -1,4 +1,4 @@ -# Summary +# SPNASNet **Single-Path NAS** is a novel differentiable NAS method for designing hardware-efficient ConvNets in less than 4 hours. diff --git a/modelindex/.templates/models/ssl-resnet.md b/docs/models/.templates/models/ssl-resnet.md similarity index 99% rename from modelindex/.templates/models/ssl-resnet.md rename to docs/models/.templates/models/ssl-resnet.md index 185a167d..af9f0c66 100644 --- a/modelindex/.templates/models/ssl-resnet.md +++ b/docs/models/.templates/models/ssl-resnet.md @@ -1,4 +1,4 @@ -# Summary +# SSL ResNet **Residual Networks**, or **ResNets**, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Instead of hoping each few stacked layers directly fit a desired underlying mapping, residual nets let these layers fit a residual mapping. They stack [residual blocks](https://paperswithcode.com/method/residual-block) ontop of each other to form network: e.g. a ResNet-50 has fifty layers using these blocks. 
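The Selective Kernel unit mentioned in the SK ResNet / SK ResNeXt pages above lets the network weight branches with different receptive field sizes on a per-channel basis. A simplified two-branch sketch of that formulation (fuse by summation, squeeze with global pooling, then softmax over branches); the actual SK convolution differs in detail.

```python
import torch
import torch.nn as nn

class SelectiveKernelUnit(nn.Module):
    """Two branches (3x3 and dilated 3x3 ~ 5x5), combined by learned per-channel attention."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.branch3 = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        self.branch5 = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=2, dilation=2, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        hidden = max(channels // reduction, 8)
        self.reduce = nn.Sequential(nn.Linear(channels, hidden), nn.ReLU(inplace=True))
        self.select = nn.Linear(hidden, channels * 2)  # one logit per branch per channel

    def forward(self, x):
        u3, u5 = self.branch3(x), self.branch5(x)
        s = (u3 + u5).mean(dim=(2, 3))                 # fuse branches, then global average pool
        attn = self.select(self.reduce(s)).view(-1, 2, x.size(1)).softmax(dim=1)
        a3, a5 = attn[:, 0, :, None, None], attn[:, 1, :, None, None]
        return u3 * a3 + u5 * a5                       # channel-wise selection between branches

print(SelectiveKernelUnit(64)(torch.randn(1, 64, 28, 28)).shape)  # torch.Size([1, 64, 28, 28])
```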
diff --git a/modelindex/.templates/models/ssl-resnext.md b/docs/models/.templates/models/ssl-resnext.md similarity index 99% rename from modelindex/.templates/models/ssl-resnext.md rename to docs/models/.templates/models/ssl-resnext.md index f913e4a7..01b3a055 100644 --- a/modelindex/.templates/models/ssl-resnext.md +++ b/docs/models/.templates/models/ssl-resnext.md @@ -1,4 +1,4 @@ -# Summary +# SSL ResNeXT A **ResNeXt** repeats a [building block](https://paperswithcode.com/method/resnext-block) that aggregates a set of transformations with the same topology. Compared to a [ResNet](https://paperswithcode.com/method/resnet), it exposes a new dimension, *cardinality* (the size of the set of transformations) $C$, as an essential factor in addition to the dimensions of depth and width. diff --git a/modelindex/.templates/models/swsl-resnet.md b/docs/models/.templates/models/swsl-resnet.md similarity index 99% rename from modelindex/.templates/models/swsl-resnet.md rename to docs/models/.templates/models/swsl-resnet.md index f4dbb8e4..a44f188b 100644 --- a/modelindex/.templates/models/swsl-resnet.md +++ b/docs/models/.templates/models/swsl-resnet.md @@ -1,4 +1,4 @@ -# Summary +# SWSL ResNet **Residual Networks**, or **ResNets**, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Instead of hoping each few stacked layers directly fit a desired underlying mapping, residual nets let these layers fit a residual mapping. They stack [residual blocks](https://paperswithcode.com/method/residual-block) ontop of each other to form network: e.g. a ResNet-50 has fifty layers using these blocks. diff --git a/modelindex/.templates/models/swsl-resnext.md b/docs/models/.templates/models/swsl-resnext.md similarity index 99% rename from modelindex/.templates/models/swsl-resnext.md rename to docs/models/.templates/models/swsl-resnext.md index 68470949..61cc4f5a 100644 --- a/modelindex/.templates/models/swsl-resnext.md +++ b/docs/models/.templates/models/swsl-resnext.md @@ -1,4 +1,4 @@ -# Summary +# SWSL ResNeXt A **ResNeXt** repeats a [building block](https://paperswithcode.com/method/resnext-block) that aggregates a set of transformations with the same topology. Compared to a [ResNet](https://paperswithcode.com/method/resnet), it exposes a new dimension, *cardinality* (the size of the set of transformations) $C$, as an essential factor in addition to the dimensions of depth and width. diff --git a/modelindex/.templates/models/tf-efficientnet-condconv.md b/docs/models/.templates/models/tf-efficientnet-condconv.md similarity index 97% rename from modelindex/.templates/models/tf-efficientnet-condconv.md rename to docs/models/.templates/models/tf-efficientnet-condconv.md index 84b7f355..e5b2cdeb 100644 --- a/modelindex/.templates/models/tf-efficientnet-condconv.md +++ b/docs/models/.templates/models/tf-efficientnet-condconv.md @@ -1,4 +1,4 @@ -# Summary +# (Tensorflow) EfficientNet CondConv **EfficientNet** is a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth/width/resolution using a *compound coefficient*. Unlike conventional practice that arbitrary scales these factors, the EfficientNet scaling method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. 
For example, if we want to use $2^N$ times more computational resources, then we can simply increase the network depth by $\alpha ^ N$, width by $\beta ^ N$, and image size by $\gamma ^ N$, where $\alpha, \beta, \gamma$ are constant coefficients determined by a small grid search on the original small model. EfficientNet uses a compound coefficient $\phi$ to uniformly scales network width, depth, and resolution in a principled way. @@ -8,6 +8,8 @@ The base EfficientNet-B0 network is based on the inverted bottleneck residual bl This collection of models amends EfficientNet by adding [CondConv](https://paperswithcode.com/method/condconv) convolutions. +The weights from this model were ported from [Tensorflow/TPU](https://github.com/tensorflow/tpu). + {% include 'code_snippets.md' %} ## How do I train this model? diff --git a/modelindex/.templates/models/tf-efficientnet-lite.md b/docs/models/.templates/models/tf-efficientnet-lite.md similarity index 97% rename from modelindex/.templates/models/tf-efficientnet-lite.md rename to docs/models/.templates/models/tf-efficientnet-lite.md index e1aa2352..0f96a0aa 100644 --- a/modelindex/.templates/models/tf-efficientnet-lite.md +++ b/docs/models/.templates/models/tf-efficientnet-lite.md @@ -1,4 +1,4 @@ -# Summary +# (Tensorflow) EfficientNet Lite **EfficientNet** is a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth/width/resolution using a *compound coefficient*. Unlike conventional practice that arbitrary scales these factors, the EfficientNet scaling method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. For example, if we want to use $2^N$ times more computational resources, then we can simply increase the network depth by $\alpha ^ N$, width by $\beta ^ N$, and image size by $\gamma ^ N$, where $\alpha, \beta, \gamma$ are constant coefficients determined by a small grid search on the original small model. EfficientNet uses a compound coefficient $\phi$ to uniformly scales network width, depth, and resolution in a principled way. @@ -8,6 +8,8 @@ The base EfficientNet-B0 network is based on the inverted bottleneck residual bl EfficientNet-Lite makes EfficientNet more suitable for mobile devices by introducing [ReLU6](https://paperswithcode.com/method/relu6) activation functions and removing [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation). +The weights from this model were ported from [Tensorflow/TPU](https://github.com/tensorflow/tpu). + {% include 'code_snippets.md' %} ## How do I train this model? diff --git a/modelindex/.templates/models/tf-efficientnet.md b/docs/models/.templates/models/tf-efficientnet.md similarity index 99% rename from modelindex/.templates/models/tf-efficientnet.md rename to docs/models/.templates/models/tf-efficientnet.md index 44f6b32b..9437e764 100644 --- a/modelindex/.templates/models/tf-efficientnet.md +++ b/docs/models/.templates/models/tf-efficientnet.md @@ -1,4 +1,4 @@ -# Summary +# (Tensorflow) EfficientNet **EfficientNet** is a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth/width/resolution using a *compound coefficient*. Unlike conventional practice that arbitrary scales these factors, the EfficientNet scaling method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. 
For example, if we want to use $2^N$ times more computational resources, then we can simply increase the network depth by $\alpha ^ N$, width by $\beta ^ N$, and image size by $\gamma ^ N$, where $\alpha, \beta, \gamma$ are constant coefficients determined by a small grid search on the original small model. EfficientNet uses a compound coefficient $\phi$ to uniformly scales network width, depth, and resolution in a principled way. @@ -6,6 +6,8 @@ The compound scaling method is justified by the intuition that if the input imag The base EfficientNet-B0 network is based on the inverted bottleneck residual blocks of [MobileNetV2](https://paperswithcode.com/method/mobilenetv2), in addition to [squeeze-and-excitation blocks](https://paperswithcode.com/method/squeeze-and-excitation-block). +The weights from this model were ported from [Tensorflow/TPU](https://github.com/tensorflow/tpu). + {% include 'code_snippets.md' %} ## How do I train this model? diff --git a/modelindex/.templates/models/tf-inception-v3.md b/docs/models/.templates/models/tf-inception-v3.md similarity index 95% rename from modelindex/.templates/models/tf-inception-v3.md rename to docs/models/.templates/models/tf-inception-v3.md index 2b0d316d..590a6801 100644 --- a/modelindex/.templates/models/tf-inception-v3.md +++ b/docs/models/.templates/models/tf-inception-v3.md @@ -1,7 +1,9 @@ -# Summary +# (Tensorflow) Inception v3 **Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements including using [Label Smoothing](https://paperswithcode.com/method/label-smoothing), Factorized 7 x 7 convolutions, and the use of an [auxiliary classifer](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module). +The weights from this model were ported from [Tensorflow/Models](https://github.com/tensorflow/models). + {% include 'code_snippets.md' %} ## How do I train this model? diff --git a/modelindex/.templates/models/tf-mixnet.md b/docs/models/.templates/models/tf-mixnet.md similarity index 95% rename from modelindex/.templates/models/tf-mixnet.md rename to docs/models/.templates/models/tf-mixnet.md index 235d5ce7..fb556f7a 100644 --- a/modelindex/.templates/models/tf-mixnet.md +++ b/docs/models/.templates/models/tf-mixnet.md @@ -1,7 +1,9 @@ -# Summary +# (Tensorflow) MixNet **MixNet** is a type of convolutional neural network discovered via AutoML that utilises [MixConvs](https://paperswithcode.com/method/mixconv) instead of regular [depthwise convolutions](https://paperswithcode.com/method/depthwise-convolution). +The weights from this model were ported from [Tensorflow/TPU](https://github.com/tensorflow/tpu). + {% include 'code_snippets.md' %} ## How do I train this model? diff --git a/modelindex/.templates/models/tf-mobilenet-v3.md b/docs/models/.templates/models/tf-mobilenet-v3.md similarity index 98% rename from modelindex/.templates/models/tf-mobilenet-v3.md rename to docs/models/.templates/models/tf-mobilenet-v3.md index 03618ba0..95ec7c1a 100644 --- a/modelindex/.templates/models/tf-mobilenet-v3.md +++ b/docs/models/.templates/models/tf-mobilenet-v3.md @@ -1,7 +1,9 @@ -# Summary +# (Tensorflow) MobileNet v3 **MobileNetV3** is a convolutional neural network that is designed for mobile phone CPUs. 
The network design includes the use of a [hard swish activation](https://paperswithcode.com/method/hard-swish) and [squeeze-and-excitation](https://paperswithcode.com/method/squeeze-and-excitation-block) modules in the [MBConv blocks](https://paperswithcode.com/method/inverted-residual-block). +The weights from this model were ported from [Tensorflow/Models](https://github.com/tensorflow/models). + {% include 'code_snippets.md' %} ## How do I train this model? diff --git a/modelindex/.templates/models/tresnet.md b/docs/models/.templates/models/tresnet.md similarity index 99% rename from modelindex/.templates/models/tresnet.md rename to docs/models/.templates/models/tresnet.md index 6ccfbc54..3d18c801 100644 --- a/modelindex/.templates/models/tresnet.md +++ b/docs/models/.templates/models/tresnet.md @@ -1,4 +1,4 @@ -# Summary +# TResNet A **TResNet** is a variant on a [ResNet](https://paperswithcode.com/method/resnet) that aim to boost accuracy while maintaining GPU training and inference efficiency. They contain several design tricks including a SpaceToDepth stem, [Anti-Alias downsampling](https://paperswithcode.com/method/anti-alias-downsampling), In-Place Activated BatchNorm, Blocks selection and [squeeze-and-excitation layers](https://paperswithcode.com/method/squeeze-and-excitation-block). diff --git a/modelindex/.templates/models/vision-transformer.md b/docs/models/.templates/models/vision-transformer.md similarity index 99% rename from modelindex/.templates/models/vision-transformer.md rename to docs/models/.templates/models/vision-transformer.md index 09088882..ed4ed025 100644 --- a/modelindex/.templates/models/vision-transformer.md +++ b/docs/models/.templates/models/vision-transformer.md @@ -1,4 +1,4 @@ -# Summary +# Vision Transformer (ViT) The **Vision Transformer** is a model for image classification that employs a Transformer-like architecture over patches of the image. This includes the use of [Multi-Head Attention](https://paperswithcode.com/method/multi-head-attention), [Scaled Dot-Product Attention](https://paperswithcode.com/method/scaled) and other architectural features seen in the [Transformer](https://paperswithcode.com/method/transformer) architecture traditionally used for NLP. diff --git a/modelindex/.templates/models/wide-resnet.md b/docs/models/.templates/models/wide-resnet.md similarity index 99% rename from modelindex/.templates/models/wide-resnet.md rename to docs/models/.templates/models/wide-resnet.md index 96f2d890..30d6cf06 100644 --- a/modelindex/.templates/models/wide-resnet.md +++ b/docs/models/.templates/models/wide-resnet.md @@ -1,4 +1,4 @@ -# Summary +# Wide ResNet **Wide Residual Networks** are a variant on [ResNets](https://paperswithcode.com/method/resnet) where we decrease depth and increase the width of residual networks. This is achieved through the use of [wide residual blocks](https://paperswithcode.com/method/wide-residual-block). diff --git a/modelindex/.templates/models/xception.md b/docs/models/.templates/models/xception.md similarity index 96% rename from modelindex/.templates/models/xception.md rename to docs/models/.templates/models/xception.md index 7cc64ee7..851d2d24 100644 --- a/modelindex/.templates/models/xception.md +++ b/docs/models/.templates/models/xception.md @@ -1,7 +1,9 @@ -# Summary +# Xception **Xception** is a convolutional neural network architecture that relies solely on [depthwise separable convolution layers](https://paperswithcode.com/method/depthwise-separable-convolution). 
+The weights from this model were ported from [Tensorflow/Models](https://github.com/tensorflow/models). + {% include 'code_snippets.md' %} ## How do I train this model? diff --git a/modelindex/models/adversarial-inception-v3.md b/docs/models/adversarial-inception-v3.md similarity index 100% rename from modelindex/models/adversarial-inception-v3.md rename to docs/models/adversarial-inception-v3.md diff --git a/modelindex/models/advprop.md b/docs/models/advprop.md similarity index 100% rename from modelindex/models/advprop.md rename to docs/models/advprop.md diff --git a/modelindex/models/big-transfer.md b/docs/models/big-transfer.md similarity index 100% rename from modelindex/models/big-transfer.md rename to docs/models/big-transfer.md diff --git a/modelindex/models/csp-darknet.md b/docs/models/csp-darknet.md similarity index 100% rename from modelindex/models/csp-darknet.md rename to docs/models/csp-darknet.md diff --git a/modelindex/models/csp-resnet.md b/docs/models/csp-resnet.md similarity index 100% rename from modelindex/models/csp-resnet.md rename to docs/models/csp-resnet.md diff --git a/modelindex/models/csp-resnext.md b/docs/models/csp-resnext.md similarity index 100% rename from modelindex/models/csp-resnext.md rename to docs/models/csp-resnext.md diff --git a/modelindex/models/densenet.md b/docs/models/densenet.md similarity index 100% rename from modelindex/models/densenet.md rename to docs/models/densenet.md diff --git a/modelindex/models/dla.md b/docs/models/dla.md similarity index 100% rename from modelindex/models/dla.md rename to docs/models/dla.md diff --git a/modelindex/models/dpn.md b/docs/models/dpn.md similarity index 100% rename from modelindex/models/dpn.md rename to docs/models/dpn.md diff --git a/modelindex/models/ecaresnet.md b/docs/models/ecaresnet.md similarity index 100% rename from modelindex/models/ecaresnet.md rename to docs/models/ecaresnet.md diff --git a/modelindex/models/efficientnet-pruned.md b/docs/models/efficientnet-pruned.md similarity index 100% rename from modelindex/models/efficientnet-pruned.md rename to docs/models/efficientnet-pruned.md diff --git a/modelindex/models/efficientnet.md b/docs/models/efficientnet.md similarity index 100% rename from modelindex/models/efficientnet.md rename to docs/models/efficientnet.md diff --git a/modelindex/models/ensemble-adversarial.md b/docs/models/ensemble-adversarial.md similarity index 100% rename from modelindex/models/ensemble-adversarial.md rename to docs/models/ensemble-adversarial.md diff --git a/modelindex/models/ese-vovnet.md b/docs/models/ese-vovnet.md similarity index 100% rename from modelindex/models/ese-vovnet.md rename to docs/models/ese-vovnet.md diff --git a/modelindex/models/fbnet.md b/docs/models/fbnet.md similarity index 100% rename from modelindex/models/fbnet.md rename to docs/models/fbnet.md diff --git a/modelindex/models/gloun-inception-v3.md b/docs/models/gloun-inception-v3.md similarity index 100% rename from modelindex/models/gloun-inception-v3.md rename to docs/models/gloun-inception-v3.md diff --git a/modelindex/models/gloun-resnet.md b/docs/models/gloun-resnet.md similarity index 100% rename from modelindex/models/gloun-resnet.md rename to docs/models/gloun-resnet.md diff --git a/modelindex/models/gloun-resnext.md b/docs/models/gloun-resnext.md similarity index 100% rename from modelindex/models/gloun-resnext.md rename to docs/models/gloun-resnext.md diff --git a/modelindex/models/gloun-senet.md b/docs/models/gloun-senet.md similarity index 100% rename from 
modelindex/models/gloun-senet.md rename to docs/models/gloun-senet.md diff --git a/modelindex/models/gloun-seresnext.md b/docs/models/gloun-seresnext.md similarity index 100% rename from modelindex/models/gloun-seresnext.md rename to docs/models/gloun-seresnext.md diff --git a/modelindex/models/gloun-xception.md b/docs/models/gloun-xception.md similarity index 100% rename from modelindex/models/gloun-xception.md rename to docs/models/gloun-xception.md diff --git a/modelindex/models/hrnet.md b/docs/models/hrnet.md similarity index 100% rename from modelindex/models/hrnet.md rename to docs/models/hrnet.md diff --git a/modelindex/models/ig-resnext.md b/docs/models/ig-resnext.md similarity index 100% rename from modelindex/models/ig-resnext.md rename to docs/models/ig-resnext.md diff --git a/modelindex/models/inception-resnet-v2.md b/docs/models/inception-resnet-v2.md similarity index 100% rename from modelindex/models/inception-resnet-v2.md rename to docs/models/inception-resnet-v2.md diff --git a/modelindex/models/inception-v3.md b/docs/models/inception-v3.md similarity index 100% rename from modelindex/models/inception-v3.md rename to docs/models/inception-v3.md diff --git a/modelindex/models/inception-v4.md b/docs/models/inception-v4.md similarity index 100% rename from modelindex/models/inception-v4.md rename to docs/models/inception-v4.md diff --git a/modelindex/models/legacy-se-resnet.md b/docs/models/legacy-se-resnet.md similarity index 100% rename from modelindex/models/legacy-se-resnet.md rename to docs/models/legacy-se-resnet.md diff --git a/modelindex/models/legacy-se-resnext.md b/docs/models/legacy-se-resnext.md similarity index 100% rename from modelindex/models/legacy-se-resnext.md rename to docs/models/legacy-se-resnext.md diff --git a/modelindex/models/legacy-senet.md b/docs/models/legacy-senet.md similarity index 100% rename from modelindex/models/legacy-senet.md rename to docs/models/legacy-senet.md diff --git a/modelindex/models/mixnet.md b/docs/models/mixnet.md similarity index 100% rename from modelindex/models/mixnet.md rename to docs/models/mixnet.md diff --git a/modelindex/models/mnasnet.md b/docs/models/mnasnet.md similarity index 100% rename from modelindex/models/mnasnet.md rename to docs/models/mnasnet.md diff --git a/modelindex/models/mobilenet-v2.md b/docs/models/mobilenet-v2.md similarity index 100% rename from modelindex/models/mobilenet-v2.md rename to docs/models/mobilenet-v2.md diff --git a/modelindex/models/mobilenet-v3.md b/docs/models/mobilenet-v3.md similarity index 100% rename from modelindex/models/mobilenet-v3.md rename to docs/models/mobilenet-v3.md diff --git a/modelindex/models/nasnet.md b/docs/models/nasnet.md similarity index 100% rename from modelindex/models/nasnet.md rename to docs/models/nasnet.md diff --git a/modelindex/models/noisy-student.md b/docs/models/noisy-student.md similarity index 100% rename from modelindex/models/noisy-student.md rename to docs/models/noisy-student.md diff --git a/modelindex/models/pnasnet.md b/docs/models/pnasnet.md similarity index 100% rename from modelindex/models/pnasnet.md rename to docs/models/pnasnet.md diff --git a/modelindex/models/regnetx.md b/docs/models/regnetx.md similarity index 100% rename from modelindex/models/regnetx.md rename to docs/models/regnetx.md diff --git a/modelindex/models/regnety.md b/docs/models/regnety.md similarity index 100% rename from modelindex/models/regnety.md rename to docs/models/regnety.md diff --git a/modelindex/models/res2net.md b/docs/models/res2net.md similarity index 
100% rename from modelindex/models/res2net.md rename to docs/models/res2net.md diff --git a/modelindex/models/res2next.md b/docs/models/res2next.md similarity index 100% rename from modelindex/models/res2next.md rename to docs/models/res2next.md diff --git a/modelindex/models/resnest.md b/docs/models/resnest.md similarity index 100% rename from modelindex/models/resnest.md rename to docs/models/resnest.md diff --git a/modelindex/models/resnet-d.md b/docs/models/resnet-d.md similarity index 100% rename from modelindex/models/resnet-d.md rename to docs/models/resnet-d.md diff --git a/modelindex/models/resnet.md b/docs/models/resnet.md similarity index 100% rename from modelindex/models/resnet.md rename to docs/models/resnet.md diff --git a/modelindex/models/resnext.md b/docs/models/resnext.md similarity index 100% rename from modelindex/models/resnext.md rename to docs/models/resnext.md diff --git a/modelindex/models/rexnet.md b/docs/models/rexnet.md similarity index 100% rename from modelindex/models/rexnet.md rename to docs/models/rexnet.md diff --git a/modelindex/models/se-resnet.md b/docs/models/se-resnet.md similarity index 100% rename from modelindex/models/se-resnet.md rename to docs/models/se-resnet.md diff --git a/modelindex/models/selecsls.md b/docs/models/selecsls.md similarity index 100% rename from modelindex/models/selecsls.md rename to docs/models/selecsls.md diff --git a/modelindex/models/seresnext.md b/docs/models/seresnext.md similarity index 100% rename from modelindex/models/seresnext.md rename to docs/models/seresnext.md diff --git a/modelindex/models/skresnet.md b/docs/models/skresnet.md similarity index 100% rename from modelindex/models/skresnet.md rename to docs/models/skresnet.md diff --git a/modelindex/models/skresnext.md b/docs/models/skresnext.md similarity index 100% rename from modelindex/models/skresnext.md rename to docs/models/skresnext.md diff --git a/modelindex/models/spnasnet.md b/docs/models/spnasnet.md similarity index 100% rename from modelindex/models/spnasnet.md rename to docs/models/spnasnet.md diff --git a/modelindex/models/ssl-resnet.md b/docs/models/ssl-resnet.md similarity index 100% rename from modelindex/models/ssl-resnet.md rename to docs/models/ssl-resnet.md diff --git a/modelindex/models/ssl-resnext.md b/docs/models/ssl-resnext.md similarity index 100% rename from modelindex/models/ssl-resnext.md rename to docs/models/ssl-resnext.md diff --git a/modelindex/models/swsl-resnet.md b/docs/models/swsl-resnet.md similarity index 100% rename from modelindex/models/swsl-resnet.md rename to docs/models/swsl-resnet.md diff --git a/modelindex/models/swsl-resnext.md b/docs/models/swsl-resnext.md similarity index 100% rename from modelindex/models/swsl-resnext.md rename to docs/models/swsl-resnext.md diff --git a/modelindex/models/tf-efficientnet-condconv.md b/docs/models/tf-efficientnet-condconv.md similarity index 100% rename from modelindex/models/tf-efficientnet-condconv.md rename to docs/models/tf-efficientnet-condconv.md diff --git a/modelindex/models/tf-efficientnet-lite.md b/docs/models/tf-efficientnet-lite.md similarity index 100% rename from modelindex/models/tf-efficientnet-lite.md rename to docs/models/tf-efficientnet-lite.md diff --git a/modelindex/models/tf-efficientnet.md b/docs/models/tf-efficientnet.md similarity index 100% rename from modelindex/models/tf-efficientnet.md rename to docs/models/tf-efficientnet.md diff --git a/modelindex/models/tf-inception-v3.md b/docs/models/tf-inception-v3.md similarity index 100% rename from 
modelindex/models/tf-inception-v3.md rename to docs/models/tf-inception-v3.md diff --git a/modelindex/models/tf-mixnet.md b/docs/models/tf-mixnet.md similarity index 100% rename from modelindex/models/tf-mixnet.md rename to docs/models/tf-mixnet.md diff --git a/modelindex/models/tf-mobilenet-v3.md b/docs/models/tf-mobilenet-v3.md similarity index 100% rename from modelindex/models/tf-mobilenet-v3.md rename to docs/models/tf-mobilenet-v3.md diff --git a/modelindex/models/tresnet.md b/docs/models/tresnet.md similarity index 100% rename from modelindex/models/tresnet.md rename to docs/models/tresnet.md diff --git a/modelindex/models/vision-transformer.md b/docs/models/vision-transformer.md similarity index 100% rename from modelindex/models/vision-transformer.md rename to docs/models/vision-transformer.md diff --git a/modelindex/models/wide-resnet.md b/docs/models/wide-resnet.md similarity index 100% rename from modelindex/models/wide-resnet.md rename to docs/models/wide-resnet.md diff --git a/modelindex/models/xception.md b/docs/models/xception.md similarity index 100% rename from modelindex/models/xception.md rename to docs/models/xception.md diff --git a/mkdocs.yml b/mkdocs.yml index 86a9b679..5290dc8b 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -11,6 +11,7 @@ nav: - feature_extraction.md - changes.md - archived_changes.md + - ... | models/*.md theme: name: 'material' feature: @@ -40,3 +41,6 @@ markdown_extensions: custom_checkbox: true - pymdownx.tilde - mdx_truly_sane_lists +plugins: + - search + - awesome-pages diff --git a/modelindex/model-index.yml b/model-index.yml similarity index 97% rename from modelindex/model-index.yml rename to model-index.yml index 2cd87d93..38fb78d2 100644 --- a/modelindex/model-index.yml +++ b/model-index.yml @@ -1,5 +1,5 @@ Import: -- ./models/*.md +- ./docs/models/*.md Library: Name: PyTorch Image Models Headline: PyTorch image models, scripts, pretrained weights diff --git a/modelindex/requirements-modelindex.txt b/requirements-modelindex.txt similarity index 100% rename from modelindex/requirements-modelindex.txt rename to requirements-modelindex.txt
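For clarity, below is a minimal sketch of how the touched sections of mkdocs.yml read once this patch is applied; nav entries above feature_extraction.md and the theme/markdown_extensions keys are assumed unchanged and omitted here. The `... | models/*.md` entry is the awesome-pages plugin's glob syntax, which pulls every page under docs/models/ into the nav without listing each generated file by hand.

```yaml
nav:
  - feature_extraction.md
  - changes.md
  - archived_changes.md
  - ... | models/*.md  # awesome-pages glob: include all generated model pages

plugins:
  - search         # declaring a plugins list disables the MkDocs defaults, so search is re-added explicitly
  - awesome-pages  # provides the '...' glob entries used in nav above
```

With model-index.yml moved to the repository root, its `Import: - ./docs/models/*.md` path resolves to the same generated model pages that the nav glob picks up.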