Vegetable-Skill-9700

Vegetable-Skill-9700 OP t1_jdpr15o wrote

I agree that a 175B model will always outperform a 6B model on general tasks, so it may be a great model for demos. But once you build a product on top of this model, one that is used in a particular way and serves a particular use case, wouldn't it make sense to use a smaller model fine-tuned on the relevant dataset?

1

Vegetable-Skill-9700 OP t1_jdpqefg wrote

But do we really need all that knowledge in most practical use cases? Say I am using an LM to write Reddit posts; it probably only needs to learn the subjects I write about along with my style of writing. Shouldn't a model trained on a highly refined dataset (one with high-quality examples of my posts) perform better than GPT-4?

1

Vegetable-Skill-9700 OP t1_j6zpscd wrote

Firstly, by measuring data drift and analyzing user behavior, UpTrain identifies which prompts/questions were unseen by the model, as well as the cases where the user was unsatisfied with the model's output. It automatically collects those cases for the model to retrain upon.

Secondly, you can use the package to define a custom rule and filter out relevant datasets to retrain ChatGPT for your use case.

Say you want to use an LLM to write product descriptions for Nike shoes and have a database of Nike customer chats:
a) Rachel - I don't like these shoes. I want to return them. How do I do that?
b) Ross - These shoes are great! I love them. I wear them every day while practicing unagi.
c) Chandler - Are there any better shoes than Nike? 👟 😍
You probably want to collect the cases with positive sentiment or the cases with lots of emojis. With UpTrain, you can easily define such rules as a Python function and collect those cases.
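For instance, such a selection rule could be sketched as a plain Python function like the one below. The keyword-based sentiment check and the function name are purely illustrative assumptions, not UpTrain's actual rule API:

```python
import unicodedata

# Naive keyword heuristic standing in for a real sentiment model.
POSITIVE_WORDS = {"great", "love", "awesome", "best"}

def is_emoji(ch: str) -> bool:
    # Emojis fall under the "Symbol, other" (So) Unicode category.
    return unicodedata.category(ch) == "So"

def select_for_retraining(chat: str) -> bool:
    # Keep chats that sound positive or contain several emojis.
    words = {w.strip(".,!?").lower() for w in chat.split()}
    has_positive = bool(words & POSITIVE_WORDS)
    emoji_count = sum(is_emoji(ch) for ch in chat)
    return has_positive or emoji_count >= 2

chats = [
    "I don't like these shoes. I want to return them. How do I do that?",
    "These shoes are great! I love them. I wear them every day.",
    "Are there any better shoes than Nike? 👟 😍",
]
selected = [c for c in chats if select_for_retraining(c)]
```

Here only Ross's and Chandler's chats would be collected; Rachel's return complaint is skipped.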

I am working on an example showing how all of the above can be done; it should be ready in a week. Stay tuned!

2

Vegetable-Skill-9700 OP t1_j6zpmiy wrote

Hey, so this typically happens when there is a change in vocabulary. Just sharing my experience with this issue: we built a chatbot to answer product-onboarding queries, and a new marketing campaign brought a large influx of younger users. Their questions were generally full of urban slang and emojis, which our NLP model wasn't equipped to handle, causing its performance to deteriorate.
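One simple proxy for this kind of vocabulary drift is the fraction of incoming tokens that fall outside the training vocabulary. The sketch below assumes a bare whitespace tokenizer and a vocabulary stored as a set; it is an illustration, not how UpTrain measures drift internally:

```python
def oov_rate(queries: list[str], vocab: set[str]) -> float:
    """Fraction of tokens not seen in the training vocabulary."""
    tokens = [t.lower() for q in queries for t in q.split()]
    if not tokens:
        return 0.0
    unseen = sum(t not in vocab for t in tokens)
    return unseen / len(tokens)

vocab = {"how", "do", "i", "reset", "my", "password"}
old_traffic = ["How do I reset my password"]
new_traffic = ["yo this app is bussin fr 😭"]
# A sudden jump in OOV rate between windows flags vocabulary drift
# worth investigating (e.g. after a marketing campaign).
```

In our case, an alert on a metric like this would have caught the slang-and-emoji shift before users started churning.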

2

Vegetable-Skill-9700 OP t1_j6e8o99 wrote

Firstly, by measuring data drift and analyzing user behavior, UpTrain identifies which prompts/questions were unseen by the model, as well as the cases where the user was unsatisfied with the model's output. It automatically collects those cases for the model to retrain upon.

Secondly, you can use the package to define a custom rule and filter out relevant datasets to retrain ChatGPT for your use case.

Say you want to use an LLM to write product descriptions for Nike shoes and have a database of Nike customer chats:
a) Rachel - I don't like these shoes. I want to return them. How do I do that?
b) Ross - These shoes are great! I love them. I wear them every day while practicing unagi.
c) Chandler - Are there any better shoes than Nike? 👟 😍
You probably want to collect the cases with positive sentiment or the cases with lots of emojis. With UpTrain, you can easily define such rules as a Python function and collect those cases.

I am working on an example showing how all of the above can be done; it should be ready in a week. Stay tuned!

3

Vegetable-Skill-9700 OP t1_j68g80z wrote

So, you know how it's almost impossible to build 100% accurate and fully generalized ML models. On top of that, the performance of these models degrades over time. Furthermore, because ML models are black boxes, identifying and fixing their problems is very hard.

UpTrain solves exactly these issues. It identifies cases where the model is going wrong, collects those problematic data points, and retrains the model on them to improve its accuracy!

You can check out the repo here: https://github.com/uptrain-ai/uptrain

14