From 5c5cadfe4c2d14a5f35a71ec73082469fbc03729 Mon Sep 17 00:00:00 2001
From: Ross Wightman
Date: Fri, 4 Jun 2021 14:44:07 -0700
Subject: [PATCH] Update README.md

---
 timm/bits/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/timm/bits/README.md b/timm/bits/README.md
index 941e471a..9ce2c63d 100644
--- a/timm/bits/README.md
+++ b/timm/bits/README.md
@@ -44,7 +44,7 @@ This setup assumes you've SSH'd into your TPU-VM after setting it up (https://cl
 
 The TPU-VM instances I've been using have a usable version of PyTorch XLA 1.8.1 installed in the python3 environment, we will be using that.
 
-I've found that leveraging TFDS w/ datasets in TFRecord format, streamed from Google Storage buckets is the most practical / cost-effective solution. I've written a PyTorch IterabeDataset wrapper around TFDS so we will install Tensorflow datasets and use that. Note that traditionaly PyTorch datasets on local disks do work both on TPU-VM, GPU cloud instances, or you local machine. Setting up persistent disks wasn't the easiest thing to do on TPU-VM for awhile so TFDS was my default.
+I've found that leveraging TFDS w/ datasets in TFRecord format, streamed from Google Storage buckets, is the most practical / cost-effective solution. I've written a PyTorch IterableDataset wrapper around TFDS, so we will install Tensorflow Datasets and use that. Traditional PyTorch datasets on local disks do work w/ bits on all of TPU-VM, GPU cloud instances, and your local machine. Setting up persistent disks wasn't the easiest thing to do on TPU-VMs, so TFDS was my default in that context.
 
 One thing to watch: be very careful that you don't use a GS-based dataset in a different continent from your TPU-VM instances. I burned through a few thousand USD leaving some wires crossed for 1 day. Otherwise, the cost of training w/ buckets in the same region is quite low.
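
For reference, a minimal sketch of the approach the patched paragraph describes: a PyTorch IterableDataset that streams samples out of a TFDS dataset, whether the data lives on local disk or in a Google Storage bucket. This is not the actual wrapper shipped in timm; the `TfdsWrapper` class name and the `gs://your-bucket/tfds` path are illustrative assumptions.

```python
# Sketch only: a PyTorch IterableDataset streaming a TFDS dataset.
# Not timm's actual wrapper; the names and paths here are assumptions.
import tensorflow_datasets as tfds
import torch
from torch.utils.data import DataLoader, IterableDataset


class TfdsWrapper(IterableDataset):  # hypothetical name, for illustration
    """Stream a TFDS dataset (local dir or gs:// bucket) as (image, label)."""

    def __init__(self, name, split, data_dir=None):
        # data_dir may be a local path or a bucket of already-prepared
        # TFRecords, e.g. 'gs://your-bucket/tfds' (assumed path).
        self.builder = tfds.builder(name, data_dir=data_dir)
        self.split = split

    def __iter__(self):
        # Build the tf.data pipeline lazily so each pass (and each DataLoader
        # worker) gets its own iterator; tfds.as_numpy yields numpy arrays.
        ds = self.builder.as_dataset(split=self.split, shuffle_files=True)
        for sample in tfds.as_numpy(ds):
            image = torch.from_numpy(sample['image'])  # HWC uint8 tensor
            yield image, int(sample['label'])


# Usage: an IterableDataset plugs straight into a standard DataLoader.
# loader = DataLoader(TfdsWrapper('imagenet2012', 'train',
#                                 data_dir='gs://your-bucket/tfds'),
#                     batch_size=256)
```

An IterableDataset fits here because TFRecord shards streamed from a bucket don't support the cheap random access a map-style dataset assumes; sharding across DataLoader workers and distributed replicas is left out of this sketch.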