A1-Delta

A1-Delta t1_jcrpd05 wrote

Interesting project! I’ve seen many suggest that the training data for transfer learning might actually be the biggest thing holding Alpaca back from a ChatGPT-like experience. In other words, although the OpenAI model allows for the creation of a lot of training data, that data might include a lot of low-quality pairs that in an ideal world wouldn’t be included. Do you have any plan to increase the quality of your dataset in addition to its size?
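
To make the dataset-quality point concrete: a crude first pass (purely a hypothetical sketch, not anything from the project itself) might just drop exact duplicates and trivially short responses from the instruction/response pairs before training. Real cleanup is far more involved, but even this catches some of the obvious junk:

```python
# Hypothetical sketch: a crude quality filter for self-instruct-style
# instruction/response pairs. The thresholds and logic here are
# illustrative assumptions, not anyone's actual pipeline.

def filter_pairs(pairs, min_response_words=3):
    """Drop exact duplicates and responses too short to be useful."""
    seen = set()
    kept = []
    for instruction, response in pairs:
        key = (instruction.strip().lower(), response.strip().lower())
        if key in seen:
            continue  # exact duplicate pair
        if len(response.split()) < min_response_words:
            continue  # trivially short response, likely low quality
        seen.add(key)
        kept.append((instruction, response))
    return kept

pairs = [
    ("What is 2+2?", "4"),  # dropped: response too short
    ("Explain photosynthesis.", "Plants convert light into chemical energy."),
    ("Explain photosynthesis.", "Plants convert light into chemical energy."),  # dropped: duplicate
]
print(filter_pairs(pairs))  # keeps only the one good, unique pair
```

In practice you’d layer on things like language detection, refusal/boilerplate detection, and semantic deduplication, but the shape of the pipeline is the same.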

I hear your concern about the LLaMA license. It might be bad advice, but personally I wouldn’t worry about it. This is a very popular model people are using for all sorts of things. The chance they’ll come after you seems small to me, and my understanding is that it’s sort of uncharted legal ground once you’ve done significant fine-tuning. That being said, I’m not a lawyer.

LLaMA is a very powerful model and I would hate for you to put all this effort into creating something that ends up being limited and not clearly better than Alpaca simply because of license fears. If I were you, though, I’d go with the 13B version. It’s still small enough to run on many high-end consumer GPUs after quantization while providing significantly better baseline performance than the 7B version.
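
Back-of-the-envelope memory math shows why 13B becomes consumer-GPU territory after quantization (rough numbers for weights only; activations and KV cache add a few more GB in practice):

```python
# Rough VRAM needed for model weights alone, assuming 13e9 parameters.
# Ignores activation and KV-cache overhead, which matters in practice.
params = 13e9

fp16_gb = params * 2 / 1e9    # 2 bytes/param -> 26 GB, over a 24 GB card
int8_gb = params * 1 / 1e9    # 8-bit quantized -> 13 GB
int4_gb = params * 0.5 / 1e9  # 4-bit quantized -> 6.5 GB, fits most mid-range GPUs

print(fp16_gb, int8_gb, int4_gb)  # 26.0 13.0 6.5
```

So in fp16 the 13B weights alone overflow even a 24 GB card, but at 8-bit or 4-bit they fit comfortably, which is exactly why quantized 13B is such a popular sweet spot.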

20

A1-Delta t1_j9572gt wrote

Tweaking a CNN without retraining makes it sound like you want a no-code option for your machine learning.

Totally agree that model interpretability is a challenge, but there is a whole subsection of our field working on that. The fundamental design of deep learning sort of precludes what you’re talking about - at least given our current understanding of model interpretation. At best, a model may be trained to give options on certain aspects based on its input (we see this all the time), but that doesn’t sound like what you want. It sounds like you want to be able to target specific and arbitrary components of an output and intuitively modify the weights of all nodes contributing to that part of the output - presumably in isolation.

I think your challenge might lie with a fundamental lack of understanding of how these models actually work. I don’t mean that as a dig - they’re complicated. I just want to help bring you to a place of understanding about why the field is how you’re experiencing it.

Not a huge fan of massive edits to original posts after people have started responding. Your newly added recommendations put an onerous responsibility on any open source authors who might make their work public as a hobby rather than a career.

16

A1-Delta t1_j955sgx wrote

I’m not sure I’m following you. Are you concerned that machine learning models are not easily customizable enough?

Is your trouble with the fundamental concept of transfer learning, that data selection and preparation is difficult, that convolutional neural networks are “black boxes”, or something else?

26

A1-Delta t1_j7n9dof wrote

Mate, I checked out your website and there is not nearly enough info before asking me to sign up. The video is great. It looks like a super cool tool, and I’d be interested in trying it out, but I don’t want to give you my contact info and go through an intake process before even knowing what I’ll get. As OP asked: is it free? Am I going to spend the time using your platform to make a presentation just to be hit with a surprise “subscribe now to download your presentation” at the end? These are the sorts of questions that keep me from even trying it.

1

A1-Delta t1_iwn3n9s wrote

Sorry for the confusion, I didn’t mean a legal precedent, I meant practice precedent. Specifically, I meant that the legality of these practices has not yet been determined. They are in a grey area. We’ll see if legal precedent is set by the lawsuit you referenced. It’s not at all obvious that current laws apply here.

1

A1-Delta t1_iwkzn5f wrote

I’m not sure there are any laws around what can and cannot be used as training data. It is a sort of grey area right now, and current precedent (think Copilot) is that you can use whatever you want without worrying about its source so long as your model is generating something new (not just selecting and presenting data you gave it).

6