Zealousideal-Copy463
Zealousideal-Copy463 OP t1_j5ylls6 wrote
Reply to comment by FuB4R32 in Best cloud to train models with 100-200 GB of data? by Zealousideal-Copy463
Thanks a lot, gonna try it!
Zealousideal-Copy463 OP t1_j5tk24z wrote
Reply to comment by FuB4R32 in Best cloud to train models with 100-200 GB of data? by Zealousideal-Copy463
Ohh, I didn't know that about GCP. So you can point a VM to a bucket and it just "reads" the data? You don't have to "upload" the data into the VM?
As I said in a previous comment, my problem with AWS (S3 and SageMaker) is that the data is in a different network, and even though it's still an AWS network, you have to move the data around, and that takes a while when it's 200 GB.
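To make the "point a VM to a bucket" idea concrete, here is a minimal sketch. It assumes the bucket has already been mounted as a local directory (for example with Cloud Storage FUSE), so ordinary file reads work and no separate copy step is needed. The function names and the mount path in the usage note are hypothetical, and the bytes still travel over the network on each read.

```python
import os
from typing import Iterator, Tuple


def iter_file_chunks(path: str, chunk_size: int = 8 * 1024 * 1024) -> Iterator[bytes]:
    """Yield a file's contents in fixed-size chunks instead of loading it whole."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


def iter_dataset(root: str) -> Iterator[Tuple[str, int]]:
    """Walk a directory tree and yield (path, bytes_read) for every file found."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Stream the file chunk by chunk; only one chunk is held in memory.
            bytes_read = sum(len(chunk) for chunk in iter_file_chunks(path))
            yield path, bytes_read
```

With a bucket mounted at a hypothetical path such as /mnt/my-bucket, `for path, n in iter_dataset("/mnt/my-bucket"): print(path, n)` would stream every object through the VM without an explicit download step. In a real pipeline the chunks would go to a decoder or data loader instead of just being counted.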
Zealousideal-Copy463 OP t1_j5tjusi wrote
Reply to comment by v2thegreat in Best cloud to train models with 100-200 GB of data? by Zealousideal-Copy463
Thanks for your comment! I have tried using EC2 and keeping the data in EBS, but I'm not sure if it's the best solution. What is your workflow there?
I'm playing around mostly with NLP and image models. Right now I'm trying to process videos, around 200 GB, for a retrieval problem. What I do is: extract frames, get feature vectors from pretrained ResNet and ResNeXt models (this takes a lot of time), and then train a Siamese network on all of those vectors. As I said, I have tried S3 and SageMaker, but I have to move the data into SageMaker notebooks and I waste a lot of time there. I also tried processing everything on EC2, but setting the whole thing up took me a while (downloading data, installing libraries, writing shell scripts to process videos, etc.).
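For readers unfamiliar with the last step, here is a minimal sketch of one common objective for training a Siamese network on pairs of vectors: the contrastive loss. The comment does not say which loss is actually used, so this choice, the function names, and the default margin are all assumptions. In a real setup the two inputs would be the embeddings produced by the shared network, not the raw ResNet features.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(
    a: Sequence[float],
    b: Sequence[float],
    same: bool,
    margin: float = 1.0,  # assumed default; tune for the embedding scale
) -> float:
    """Contrastive loss for one pair of embeddings.

    Matching pairs are penalised by their squared distance, which pulls
    them together. Non-matching pairs are penalised only while they are
    closer than `margin`, which pushes them apart up to that distance.
    """
    d = euclidean_distance(a, b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2
```

A pair of frames from the same video would be passed with `same=True`, and a pair from different videos with `same=False`. Averaging this loss over a batch of pairs gives the value a framework such as PyTorch would minimise.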
Zealousideal-Copy463 OP t1_j5tj30h wrote
Reply to comment by incrediblediy in Best cloud to train models with 100-200 GB of data? by Zealousideal-Copy463
My first idea was a 3090, but I'm not based in the US, and getting a used GPU here is risky; it's easy to get scammed. A 4080 is around 2000$ here, a new 3090 is 1800$, and a 4090 is 2500$. So I thought that if I decide to get a desktop, I should "just" go for the 4090, because it's 500-700$ more but I'd get double the speed of a 3090 and 8 GB more VRAM than the 4080.
Zealousideal-Copy463 OP t1_j5tih5j wrote
Reply to comment by agentfuzzy999 in Best cloud to train models with 100-200 GB of data? by Zealousideal-Copy463
Sorry, I wrote it in a hurry and now I realize it came out wrong.
What I meant is that, in my experience, training models in the cloud comes with annoyances that are hard to avoid and make the whole experience horrible: moving data between buckets and VMs, uploading data, logging into a terminal via SSH, or using notebooks that crash from time to time (SageMaker is a bit buggy). So maybe I should "just buy a good GPU" (a 4090 is a "good" deal where I live) and stop messing around in the cloud.
Zealousideal-Copy463 OP t1_j61j6n2 wrote
Reply to comment by incrediblediy in Best cloud to train models with 100-200 GB of data? by Zealousideal-Copy463
I was checking Marketplace and couldn't find any used ones below 1500$. Also, I just discovered that a 3090 is 2.2k$ here now lol (that would be the cheapest option)... meanwhile at Best Buy it costs 1k$, so I was just thinking about traveling to the US with the other k lol