--dany--
--dany-- t1_jedo4gy wrote
Reply to [D] Best deal with varying number of inputs each with variable size using and RNN? (for an NLP task) by danilo62
How about using the embeddings of the whole post? Then you just have to train a model to predict the trait from one post. A person's overall trait can be the average of the traits predicted from all of their posts. I don't see a point in using an RNN over posts.
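A minimal sketch of this idea, with random vectors standing in for real post embeddings and a fixed linear scorer standing in for whatever per-post trait model you would actually train:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: three people with a variable number of posts each; every post
# is already encoded as a fixed-size embedding (dim 8 here).
posts_by_person = [rng.normal(size=(n, 8)) for n in (3, 5, 2)]

# Hypothetical per-post trait model: a fixed linear scorer, standing in
# for a regressor/classifier trained on (post embedding, trait) pairs.
w = rng.normal(size=8)

def predict_trait(post_embedding):
    return post_embedding @ w

# A person's trait score is the mean of their per-post predictions,
# which sidesteps the varying number of posts entirely.
person_traits = [np.mean([predict_trait(p) for p in posts])
                 for posts in posts_by_person]
```

Because each post is scored independently and the scores are averaged, no sequence model is needed.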
--dany-- t1_je134t7 wrote
Reply to Noob question: is there a site where you can provide you equipment to help process data and gain credits for further use? by Fonsecafsa
The idea has been explored by vast.ai and RunPod. They pay real cash rather than credits, though.
--dany-- t1_j4zx6lf wrote
Very good write-up! Thanks for sharing your thoughts and observations. A couple of questions many other folks may have as well:
- How did you arrive at the figure that it's 500x smaller, i.e. 200 million parameters?
- Can you elaborate on how you got the 53-year estimate for training a 100T model?
--dany-- t1_iyn91x1 wrote
Reply to [D] PyTorch 2.0 Announcement by joshadel
The speed-up is only available for newer GPUs (Volta and Ampere) for now. Hopefully primTorch will make it easier to port to other accelerators in the long run. Also, the speed-up is less pronounced on consumer GPUs.
--dany-- t1_iy29wqs wrote
Reply to comment by somebodyenjoy in Best GPU for deep learning by somebodyenjoy
I'm saying 2x 3090s are not much better than a 4090. According to Lambda Labs benchmarks, a 4090 is about 1.3 to 1.9 times as fast as a 3090. If you're after speed, a single 4090 definitely makes more sense: it's only slightly slower than 2x 3090s but much more power-efficient and cheaper.
--dany-- t1_iy2149l wrote
Reply to comment by somebodyenjoy in Best GPU for deep learning by somebodyenjoy
Not by much, according to some benchmarks, so speed is not the deciding factor here. Your main concern is whether the model and training data can fit in your VRAM.
--dany-- t1_iy1zq6n wrote
Reply to Best GPU for deep learning by somebodyenjoy
The 3090 supports an NVLink bridge to connect two cards and pool their memory. Theoretically you'd get 2x the compute and 48GB of VRAM for the job. If VRAM size is important for your big model and you have a beefy PSU, that's the way to go. Otherwise just go with a 4090.
If you don't need to train models frequently, Colab or a paid GPU rental service might be easier on your wallet and power bill. For example, it's only about $2 per hour to rent 4x RTX A6000 from some rentals.
--dany-- t1_iuxbw6e wrote
Machine learning is just another way to approximate a function. Treat your 9-input 6-output neural network as a black-box target function to approximate, and gather enough examples from it as your training dataset to train a new neural network. According to the universal approximation theorem (https://en.wikipedia.org/wiki/Universal_approximation_theorem), if your new neural network is complex enough, it can get arbitrarily close to the black box.
Bonus point: if you know the architecture of the target black-box model, you will get a very close copy of it. But don't expect the weights to be exactly the same.
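A rough sketch of the idea in plain NumPy, where a hypothetical fixed 9-to-6 function plays the black box (in practice you would only be able to query it) and a small one-hidden-layer student is fit to its outputs by gradient descent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black box: 9 inputs -> 6 outputs. Stands in for the
# network being copied; we only use it to generate (input, output) pairs.
W_true = rng.normal(size=(9, 6))

def black_box(x):
    return np.tanh(x @ W_true)

# Build a training set by querying the black box.
X = rng.normal(size=(2000, 9))
Y = black_box(X)

# Student network: one hidden layer, trained with full-batch gradient
# descent on mean squared error against the black box's outputs.
H = 64
W1 = rng.normal(size=(9, H)) * 0.3
b1 = np.zeros(H)
W2 = rng.normal(size=(H, 6)) * 0.3
b2 = np.zeros(6)

lr = 0.05
losses = []
for step in range(500):
    h = np.tanh(X @ W1 + b1)            # forward pass
    pred = h @ W2 + b2
    err = pred - Y
    losses.append(np.mean(err ** 2))
    dpred = 2 * err / err.size          # backprop through MSE
    dW2 = h.T @ dpred
    db2 = dpred.sum(axis=0)
    dh = dpred @ W2.T * (1 - h ** 2)    # tanh derivative
    dW1 = X.T @ dh
    db1 = dh.sum(axis=0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

initial_loss, final_loss = losses[0], losses[-1]
```

The student's weights end up nothing like `W_true`, but its input-output behavior converges toward the black box's, which is exactly what the universal approximation argument promises.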
--dany-- t1_jefcm9m wrote
Reply to comment by danilo62 in [D] Best deal with varying number of inputs each with variable size using and RNN? (for an NLP task) by danilo62
The embedding contains all of that information, like sentiment or tf-idf. You just need to train a model to predict the trait from a post's embedding, then average over all of a person's posts. I didn't suggest using an RNN. Are you sure you were replying to my comment?