Vegetable-Skill-9700 OP t1_jdpr474 wrote
Reply to comment by Jaffa6 in Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Thanks for sharing! It's a great read; I agree most of the current models are most likely under-trained.
Vegetable-Skill-9700 OP t1_jdpr15o wrote
Reply to comment by cameldrv in Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
I agree that a 175B model will always perform better than a 6B model on general tasks, so it's probably a great model for demos. But as you build a product on top of the model, one that is used in a certain way and serves a specific use case, wouldn't it make sense to use a smaller model and fine-tune it on the relevant dataset?
Vegetable-Skill-9700 OP t1_jdpqefg wrote
Reply to comment by FirstOrderCat in Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
But do we really need all that info in most practical use cases? Say I am using an LM to write Reddit posts; it probably only needs to learn the subjects I write about along with my style of writing. Shouldn't a model well trained on a highly refined dataset (one with high-quality examples of my posts) perform better than GPT-4?
Vegetable-Skill-9700 OP t1_jdl8onp wrote
Reply to comment by Sorry-Balance2049 in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Agreed! I don't expect it to be as good as GPT-4 on all tasks, but maybe fine-tuning for specific tasks can help it achieve similar performance on test samples related to that task. wdyt?
Vegetable-Skill-9700 OP t1_jdl8hh5 wrote
Reply to comment by Blacky372 in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Agreed, it won't generalize as well as GPT-4, but it could achieve similar performance for a specialized task (say answering technical questions around a certain topic or writing social media posts for a certain entity, etc.).
Vegetable-Skill-9700 OP t1_jdl680d wrote
Reply to comment by soggy_mattress in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
That's an interesting analogy!
Vegetable-Skill-9700 OP t1_jdl2fbp wrote
Reply to comment by wojapa in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
I think it's just supervised training. Similar to Alpaca, I guess.
Vegetable-Skill-9700 OP t1_j7omfxw wrote
Reply to comment by grigorij-dataplicity in Launching my first-ever open-source project and it might make your ChatGPT answers better by Vegetable-Skill-9700
Working on it :)
Vegetable-Skill-9700 OP t1_j6zpscd wrote
Reply to comment by grigorij-dataplicity in Launching my first-ever open-source project and it might make your ChatGPT answers better by Vegetable-Skill-9700
Firstly, by measuring data drift and analyzing user behavior, UpTrain identifies which prompts/questions were unseen by the model or the cases where the user was unsatisfied with the model output. It automatically collects those cases for the model to retrain upon.
Secondly, you can use the package to define a custom rule and filter out relevant data sets to retrain ChatGPT for your use case.
Say you want to use an LLM to write product descriptions for Nike shoes and have a database of Nike customer chats:
a) Rachel - I don't like these shoes. I want to return them. How do I do that?
b) Ross - These shoes are great! I love them. I wear them every day while practicing unagi.
c) Chandler - Are there any better shoes than Nike? 😂 😂
You probably want to filter out cases with positive sentiment or cases with lots of emojis. With UpTrain, you can easily define such rules as a Python function and collect those cases.
I am working on an example highlighting how all the above can be done. It should be done in a week. Stay tuned!
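In the meantime, here is a minimal sketch of what such a rule could look like as plain Python (a toy keyword-based sentiment check and a rough emoji regex, purely illustrative and not the actual UpTrain interface):

```python
import re

# Toy positive-sentiment word list and a rough emoji range; real rules
# would use a proper sentiment model and fuller Unicode coverage.
POSITIVE_WORDS = {"great", "love", "awesome", "perfect"}
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\U00002600-\U000027BF]")

def keep_for_retraining(chat: str, max_emojis: int = 1) -> bool:
    """Keep only chats that look like genuine product questions."""
    words = {w.strip(".,!?").lower() for w in chat.split()}
    if words & POSITIVE_WORDS:
        return False  # positive sentiment: not useful for retraining
    if len(EMOJI_PATTERN.findall(chat)) > max_emojis:
        return False  # emoji-heavy message
    return True

chats = [
    "I don't like these shoes. I want to return them. How do I do that?",
    "These shoes are great! I love them.",
    "Are there any better shoes than Nike? 😂 😂",
]
kept = [c for c in chats if keep_for_retraining(c)]  # only Rachel's chat survives
```

A rule like this, registered as a filter, is all it takes to turn a raw chat dump into a focused retraining set.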
Vegetable-Skill-9700 OP t1_j6zpmiy wrote
Reply to comment by uwu-dotcom in Launching my first-ever open-source project and it might make your ChatGPT answers better by Vegetable-Skill-9700
Hey, so this typically happens when there is a change in vocabulary. Just sharing my experience of facing this issue: we built a chatbot to answer product onboarding queries, and a new marketing campaign brought in a great influx of younger users. Their questions generally came with a lot of urban slang and emojis, which our NLP model wasn't equipped to handle, causing performance to deteriorate.
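A minimal sketch of catching this kind of vocabulary drift, assuming a simple whitespace tokenizer and a toy training vocabulary (a real system would use the model's actual tokenizer and a proper drift metric):

```python
def oov_rate(message: str, vocabulary: set) -> float:
    """Fraction of tokens the model never saw during training."""
    tokens = [t.strip(".,!?").lower() for t in message.split()]
    if not tokens:
        return 0.0
    return sum(t not in vocabulary for t in tokens) / len(tokens)

def flag_drifted(messages, vocabulary, threshold=0.5):
    """Collect messages dominated by unseen tokens for review/retraining."""
    return [m for m in messages if oov_rate(m, vocabulary) > threshold]

training_vocab = {"how", "do", "i", "reset", "my", "password", "account"}
incoming = [
    "How do I reset my password",  # familiar wording, low OOV rate
    "ngl this app is bussin fr",   # slang the model never saw, high OOV rate
]
drifted = flag_drifted(incoming, training_vocab)  # only the slang message
```

Tracking a rolling OOV rate like this would have surfaced the slang-heavy queries long before the accuracy numbers dipped.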
Vegetable-Skill-9700 t1_j6l1k7r wrote
Personally, I find collecting and understanding data really hard when it comes to speech. With images, I can visualise a lot of them at once; with speech, I have to listen to them one by one.
Vegetable-Skill-9700 OP t1_j6gygb7 wrote
Reply to comment by jabarifowle in [P] Launching my first ever open-source project and it might make your ChatGPT answers better by Vegetable-Skill-9700
Thanks!
Vegetable-Skill-9700 OP t1_j6e8txc wrote
Reply to comment by SupplyChainPhd in [P] Launching my first ever open-source project and it might make your ChatGPT answers better by Vegetable-Skill-9700
🙏
Vegetable-Skill-9700 OP t1_j6e8o99 wrote
Reply to comment by StoicBatman in [P] Launching my first ever open-source project and it might make your ChatGPT answers better by Vegetable-Skill-9700
Firstly, by measuring data drift and analyzing user behavior, UpTrain identifies which prompts/questions were unseen by the model or the cases where the user was unsatisfied with the model output. It automatically collects those cases for the model to retrain upon.
Secondly, you can use the package to define a custom rule and filter out relevant data sets to retrain ChatGPT for your use case.
Say you want to use an LLM to write product descriptions for Nike shoes and have a database of Nike customer chats:
a) Rachel - I don't like these shoes. I want to return them. How do I do that?
b) Ross - These shoes are great! I love them. I wear them every day while practicing unagi.
c) Chandler - Are there any better shoes than Nike? 😂 😂
You probably want to filter out cases with positive sentiment or cases with lots of emojis. With UpTrain, you can easily define such rules as a Python function and collect those cases.
I am working on an example highlighting how all the above can be done. It should be done in a week. Stay tuned!
Vegetable-Skill-9700 OP t1_j69wew8 wrote
Reply to comment by jobeta in [P] Launching my first ever open-source project and it might make your ChatGPT answers better by Vegetable-Skill-9700
Thanks!
Vegetable-Skill-9700 OP t1_j68t7rk wrote
Reply to comment by Acceptable-Cress-374 in [P] Launching my first ever open-source project and it might make your ChatGPT answers better by Vegetable-Skill-9700
Lol, I get the joke now, it's a good one! Thanks for bookmarking!
Vegetable-Skill-9700 OP t1_j68g80z wrote
Reply to comment by Acceptable-Cress-374 in [P] Launching my first ever open-source project and it might make your ChatGPT answers better by Vegetable-Skill-9700
So, you know how it's almost impossible to build 100% accurate and super-generalised ML models. On top of that, the performance of these models degrades over time. Furthermore, due to the black-box nature of ML models, identifying and fixing their problems is super hard.
UpTrain solves exactly these issues. It identifies cases where the model is going wrong, collects those problematic data points, and retrains the model on them to improve its accuracy!
You can check out the repo here: https://github.com/uptrain-ai/uptrain
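To give a feel for the idea, here's a toy sketch of that monitor-and-collect loop (hypothetical class and method names chosen for illustration, not the actual UpTrain API):

```python
from dataclasses import dataclass, field

@dataclass
class EdgeCaseCollector:
    """Log each prediction; hold on to low-confidence ones for retraining."""
    confidence_threshold: float = 0.6
    collected: list = field(default_factory=list)

    def log_prediction(self, inputs, prediction, confidence: float) -> None:
        # Low-confidence outputs are likely edge cases the model handles badly.
        if confidence < self.confidence_threshold:
            self.collected.append({"inputs": inputs, "prediction": prediction})

    def retraining_batch(self, min_size: int = 2) -> list:
        # Hand a batch to the fine-tuning job once enough cases accumulate.
        return self.collected if len(self.collected) >= min_size else []
```

The real tool adds drift detection and user-feedback signals on top, but the core loop is the same: monitor, collect the failures, retrain.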
Vegetable-Skill-9700 OP t1_j68ejbm wrote
Reply to [P] Launching my first ever open-source project and it might make your ChatGPT answers better by Vegetable-Skill-9700
We currently support LLMs, Vision models, Recommendation systems, etc., and are working to integrate it seamlessly with any of the major MLOps frameworks or cloud providers.
Vegetable-Skill-9700 OP t1_jdprcom wrote
Reply to comment by Readityesterday2 in Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Thanks for the encouraging comment! Do you have a current use case where you feel you could leverage UpTrain?