Additional-Escape498
Additional-Escape498 t1_j9wsis0 wrote
Reply to comment by gullydowny in ChatGPT on your PC? Meta unveils new AI model that can run on a single GPU by 10MinsForUsername
The mods don’t let you link to arxiv on a technology subreddit?
Additional-Escape498 t1_j9w2ix6 wrote
Reply to comment by FpRhGf in What are the big flaws with LLMs right now? by fangfried
LLM tokenization uses wordpieces (subword units), not whole words or single characters. This has been standard since the original “Attention Is All You Need” paper that introduced the transformer architecture in 2017. Vocabulary size is typically between 32k and 50k depending on the implementation; GPT-2 uses ~50k. The vocabulary includes each individual character (or byte) plus commonly used combinations of characters, so any string can be tokenized. Documentation: https://huggingface.co/docs/transformers/tokenizer_summary
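To make the idea concrete, here's a rough plain-Python sketch of the greedy longest-match splitting that wordpiece-style tokenizers do. The vocabulary below is a toy example (real vocabs have 32k–50k entries and are learned from data), and real code would just call a Hugging Face tokenizer:

```python
# Toy sketch of greedy longest-match wordpiece tokenization.
# The vocab here is made up for illustration; real vocabularies
# are learned and include every single character as a fallback.

def wordpiece_tokenize(word, vocab):
    """Split a word into the longest matching vocab pieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        # Continuation pieces are conventionally prefixed with "##".
        prefix = "##" if start > 0 else ""
        while end > start:
            candidate = prefix + word[start:end]
            if candidate in vocab:
                pieces.append(candidate)
                break
            end -= 1
        else:
            # Fall back to a single character; in a real vocab every
            # character is present, so tokenization never fails.
            pieces.append(prefix + word[start])
            end = start + 1
        start = end
    return pieces

toy_vocab = {"token", "##ization", "##ize", "un", "##believ", "##able"}
print(wordpiece_tokenize("tokenization", toy_vocab))  # ['token', '##ization']
```

This is why a word the model has never seen still maps to a sequence of known tokens rather than an out-of-vocabulary error.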
Additional-Escape498 t1_j9vqmlh wrote
Reply to comment by osedao in [D] Is validation set necessary for non-neural network models, too? by osedao
For a small dataset, still use cross-validation, but use k-fold cross-validation so you don’t have to divide the dataset into three parts: split it into just two (train and test), and let k-fold subdivide the training set for you. Sklearn has a built-in class that makes this simple. Since you have a small dataset and are using fairly simple models, I’d suggest setting k >= 10.
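As a rough sketch of what that looks like (in practice you'd use `sklearn.model_selection.KFold` rather than rolling your own — this just shows the index bookkeeping, without shuffling):

```python
# Plain-Python sketch of k-fold index splitting: the dataset is split
# into train/test once, and k folds are carved out of the training set.

def kfold_indices(n_samples, k):
    """Yield (train_idx, val_idx) index pairs for k-fold cross-validation."""
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

# With k = 10, each validation fold holds ~10% of the training data,
# and every training sample is used for validation exactly once.
folds = list(kfold_indices(25, 10))
```

Averaging the validation score across the k folds gives a much less noisy estimate than a single held-out split, which matters most exactly when the dataset is small.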
Additional-Escape498 t1_j9u0h32 wrote
Reply to comment by jsveiga in DeepMind created an AI system that writes computer programs at a competitive level by inaLilah
True. Just like there are some problems that are easier to code in C than in Python.
Additional-Escape498 t1_j9txq63 wrote
Reply to DeepMind created an AI system that writes computer programs at a competitive level by inaLilah
Programming might become writing functions by specifying them in natural language in a way that correctly states the inputs and desired outputs. Still requires algorithmic thinking, just at a higher level of abstraction. Like moving from assembly code to Python.
Additional-Escape498 t1_j9rq3h0 wrote
Reply to [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
EY tends to go straight to superintelligent AI robots making you their slave. I worry about problems that’ll happen a lot sooner than that. What happens when we have semi-autonomous infantry drones? How much more aggressive will US/Chinese foreign policy get when China can invade Taiwan with BigDog robots with machine guns attached? What about when ChatGPT is combined with Toolformer, can write to the internet instead of just reading from it, and starts doxxing you when it throws a temper tantrum? What about when rich people can use something like that to flood social media with bots that spew disinformation about a political candidate they don’t like?
But part of the lack of concern about AGI among ML researchers is that during the last AI winter we rebranded to “machine learning” because “AI” was such a dirty word. I remember as recently as 2015 at ICLR/ICML/NIPS you’d get side-eye for even bringing up AGI.
Additional-Escape498 t1_j9rnaop wrote
Reply to comment by schu4KSU in Microsoft brings Bing chatbot to phones after curbing quirks by marketrent
About the only thing that could get me to switch from DuckDuckGo is if my search engine randomly picked a fight with me
Additional-Escape498 t1_j9yorbo wrote
Reply to comment by FpRhGf in What are the big flaws with LLMs right now? by fangfried
You’re definitely right that it can’t do those things, but I don’t think it’s because of the tokenization. The wordpiece vocabulary does contain individual characters, so it is possible in principle for a model to do that with the tokenization it uses. The issue is that the things you’re asking for (like writing a story in Pig Latin) require reasoning, and LLMs are just mapping inputs to a manifold. LLMs can’t really do much reasoning or logic, and can’t do basic arithmetic. I wrote an article about the limitations of transformers if you’re interested: https://taboo.substack.com/p/geometric-intuition-for-why-chatgpt