Zealousideal-Copy463

Zealousideal-Copy463 OP t1_j5tk24z wrote

Ohh, I didn't know that about GCP, so you can point a VM to a bucket and it just "reads" the data? you don't have to "upload" data into the VM?

As I said in a previous comment, my problem with AWS (S3 and Sagemaker), is that the data is in a different network, and even though is still an AWS network, you have to move data around and that takes a while (when it's 200 GB of data).

1

Zealousideal-Copy463 OP t1_j5tjusi wrote

Thanks for your comment! I have tried using ec2 and keeping data in EBS but not sure if it's the best solution, what is your workflow there?

I'm playing around mostly with NLP and image models. Right now I'm trying to process videos, like 200GB for a retrieval problem, what I do is: get frames, get feature vectors from pre trained resnet, and resnext (this takes a lot of time). And then I train a siamese network on all of those vectors. As I said I have tried with s3 and sagemaker, but I have to move data into sagemaker notebooks and I waste a lot of time there. Also tried to process stuff in ec2 but setting the whole thing took me a while (downloading data, installing libraries, creating scripts in the shell to process videos, etc).

1

Zealousideal-Copy463 OP t1_j5tj30h wrote

My first idea was a 3090, but I'm not based in the US, and getting a used GPU here is risky, it's easy to be scammed. A 4080 is around 2000$ here, 3090 new is 1800$, and a 4900 is 2500$. So I thought that if I decide to get a desktop, I should "just" go for the 4090 cause is 500-700$ more but I'd get double the speed than a 3090 and 8+ vram.

1

Zealousideal-Copy463 OP t1_j5tih5j wrote

Sorry, I wrote it in a hurry and now I realize it came out wrong.

What I meant is that in my experience dealing with: moving data between buckets/vms, uploading data, logging into a terminal via ssh or using notebooks that crash from time to time (Sagemaker is a bit buggy), or just training cloud models has some annoyances that are hard to avoid and make the whole experience horrible. So, maybe I should "just buy a good GPU" (4090 is a "good" deal where I live) and stop trying stuff around in the cloud.

1