Deep-Station-1746

Deep-Station-1746 t1_je8u12c wrote

This is interesting - unlike LoRA, it also lets LLaMA accept images as inputs. And I believe it is orthogonal to LoRA, meaning the two could possibly be used together. I'm unsure about the training stability though. I know that LoRA training tolerates ridiculously high learning rates (1e-5 for the text encoder), especially for DreamBooth. Using LoRA on the frozen weights + the LLaMA adapter is an interesting thing to explore.
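For anyone unfamiliar with why LoRA is "orthogonal" to the frozen base weights: it learns a low-rank additive update on top of them. Here is a minimal from-scratch sketch of that idea in plain torch (the class name `LoRALinear` and the `r`/`alpha` values are illustrative, not from any particular library):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Sketch of LoRA: frozen base linear + trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # pretrained weights stay frozen
        # A gets a small random init, B starts at zero, so at init the
        # LoRA branch contributes nothing and the model is unchanged.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(64, 64))
x = torch.randn(2, 64)
# At init B == 0, so output equals the frozen base layer's output:
assert torch.allclose(layer(x), layer.base(x))
```

Because the update is purely additive, nothing in principle stops you from stacking it with a separate adapter module on the same frozen backbone - which is the combination being speculated about here.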

Edit: spelling

10

Deep-Station-1746 t1_jdhhbbg wrote

Nope. The ability to input something doesn't mean being able to use it reliably. For example, take this post - your eyes can take in all the info on the screen, but as a contribution this post is pretty worthless. And you are a lot smarter than GPT-4, I think.

Edit: spelling

−19

Deep-Station-1746 t1_jcamy6n wrote

Patenting dropout feels a lot like NFTs - it's useless. So why bother?

Edit:

What I don't understand is how anyone can prove that someone is multiplying matrices together in some particular way, as long as they don't admit to it themselves.

That's like someone patenting a thought. If you think about a particular patented pair of pants™, can you be sued for propagating a patented neural activity through your bio network? It's absurd.

−13

Deep-Station-1746 t1_j91egc2 wrote

Isn't this kind of high-quantity, low-quality trend inevitable once the base topic passes some threshold of popularity? Is there any reason to fight the inevitable, instead of forming more niche, less popular communities?

36

Deep-Station-1746 t1_j146uw3 wrote

The laziest option is fp16 quantization. It's as easy as model.half() on most torch-based models, and it halves the physical size of the model. You could also try knowledge distillation (read up on how DistilBERT was made, for example). You can also do things that are more architecture-specific: if you have a transformer, for example, you could use xformers' memory-efficient attention. The list goes on and on.

6