olmec-akeru
olmec-akeru t1_jck223w wrote
Reply to comment by Available_Lion_652 in [D] GPT-4 is really dumb by [deleted]
Yeah, totally right, and I understand that the specifics really matter in some cases (for example, calculating a starship trajectory).
What intrigues me is that at the level of concept and logic, this specific error isn't meaningful: i.e. if the sum of three primes had initially been correct, the approach wouldn't be invalid. There is something in this.
olmec-akeru t1_jcjw333 wrote
Reply to [D] GPT-4 is really dumb by [deleted]
Right, so ignoring the specific error and thinking about the general approach: a^3 is added as a fourth term, and it happens that a = 0.
Sneaky, but not illogical.
Edit: the above is wrong; read the thread below for OP's insights.
olmec-akeru t1_iyee68h wrote
Reply to comment by olmec-akeru in [D] Choose a topic from neural networks by Mikesblum
Also figure 3 in https://arxiv.org/pdf/1703.00810.pdf
olmec-akeru t1_iyedsk4 wrote
Reply to [D] Choose a topic from neural networks by Mikesblum
Here's a controversial one for you: https://arxiv.org/pdf/1503.02406.pdf and his talk: https://www.youtube.com/watch?v=utvIaZ6wYuw
olmec-akeru OP t1_iy8ajq0 wrote
Reply to comment by new_name_who_dis_ in [D] What method is state of the art dimensionality reduction by olmec-akeru
> the beauty of the PCA reduction was that one dimension was responsible for the size of the nose
You posit that an eigenvector will represent the nose when there are meaningful variations of scale, rotation, and position?
This is very different to saying all variance will be explained across the full set of eigenvectors (which very much is true).
olmec-akeru OP t1_iy7i546 wrote
Reply to comment by Dylan_TMB in [D] What method is state of the art dimensionality reduction by olmec-akeru
Heya! Appreciate the discourse; it's awesome!
As a starting point, I've shared the rough description from wikipedia on the t-SNE algorithm:
> The t-SNE algorithm comprises two main stages. First, t-SNE constructs a probability distribution over pairs of high-dimensional objects in such a way that similar objects are assigned a higher probability while dissimilar points are assigned a lower probability. Second, t-SNE defines a similar probability distribution over the points in the low-dimensional map, and it minimizes the Kullback–Leibler divergence (KL divergence) between the two distributions with respect to the locations of the points in the map. While the original algorithm uses the Euclidean distance between objects as the base of its similarity metric, this can be changed as appropriate.
So the algorithm is definitely trying to minimise the KL divergence. In minimising the KLD between the two distributions, it is trying to find a mapping such that dissimilar points end up further apart in the embedding space.
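To make the objective concrete, here's a minimal NumPy sketch of the quantity t-SNE minimises. It is deliberately simplified: real t-SNE calibrates a per-point Gaussian bandwidth from a perplexity target and uses a Student-t kernel in the low-dimensional space; this sketch uses a single global Gaussian kernel on both sides just to show the KL(P || Q) structure.

```python
import numpy as np

def sq_dists(Z):
    """Pairwise squared Euclidean distances."""
    G = Z @ Z.T
    n = np.diag(G)
    return n[:, None] - 2 * G + n[None, :]

def pair_probs(D, eps=1e-12):
    """Turn squared distances into a probability distribution over pairs
    (Gaussian kernel; real t-SNE tunes a per-point bandwidth instead)."""
    P = np.exp(-D)
    np.fill_diagonal(P, 0.0)  # self-similarity is excluded
    return P / (P.sum() + eps)

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q) over all pairs — the loss t-SNE descends on."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / (Q[mask] + eps))))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 10))  # high-dimensional points
Y = rng.normal(size=(20, 2))   # a candidate low-dimensional map

P = pair_probs(sq_dists(X))
Q = pair_probs(sq_dists(Y))
print(kl_divergence(P, Q))     # gradient descent on Y lowers this
```

Because P puts high mass on similar pairs, the only way to shrink the KL term is for Q to match that: similar points pulled together, dissimilar points pushed apart in the map.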
olmec-akeru OP t1_iy7c0o8 wrote
Reply to comment by i-heart-turtles in [D] What method is state of the art dimensionality reduction by olmec-akeru
Thanks for the paper! It's been linked previously in this post; Figure 1 is gorgeous.
olmec-akeru OP t1_iy7bv3u wrote
Reply to comment by ZombieRickyB in [D] What method is state of the art dimensionality reduction by olmec-akeru
You can resolve the isometric constraint by using a local distance metric dependent on the local curvature: hint, look at the Riemann curvature tensor.
olmec-akeru OP t1_iy7bppp wrote
Reply to comment by ZombieRickyB in [D] What method is state of the art dimensionality reduction by olmec-akeru
Said differently: it's not an embedding on a continuous manifold? The construction of simplices is such that infinite curvature could exist?
olmec-akeru OP t1_iy7augu wrote
Reply to comment by resented_ape in [D] What method is state of the art dimensionality reduction by olmec-akeru
Thank you, thank you, thank you! Appreciate this helpful link!!
olmec-akeru OP t1_iy7aszg wrote
Reply to comment by ktpr in [D] What method is state of the art dimensionality reduction by olmec-akeru
Completely correct.
The corollary remains true though: applying the correct algorithm is a function of knowing the set of available algorithms. The newness of the algorithm isn't a ranking feature.
olmec-akeru OP t1_iy7apiy wrote
Reply to comment by imyourzer0 in [D] What method is state of the art dimensionality reduction by olmec-akeru
Beautifully said.
olmec-akeru OP t1_iy7amgl wrote
Reply to comment by Dylan_TMB in [D] What method is state of the art dimensionality reduction by olmec-akeru
I'm not sure: if you think about t-SNE, it's trying to minimise some form of the Kullback–Leibler divergence. That means it's trying to group similar observations together in the embedding space. That's quite different from "more features into fewer features".
olmec-akeru OP t1_iy7ai6s wrote
Reply to comment by new_name_who_dis_ in [D] What method is state of the art dimensionality reduction by olmec-akeru
>beauty of the PCA reduction was that one dimension was responsible for the size of the nose
I don't think this always holds true. You're just lucky that your dataset contains confined variation, such that the eigenvectors map that variance onto a visual feature. There is no mathematical property of PCA that makes your statement true.
There have been some attempts to formalise something like what you have described. The closest I've seen is the beta-VAE: https://lilianweng.github.io/posts/2018-08-12-vae/
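A toy sketch of the point: when two latent factors are correlated, the leading eigenvector loads on both of them rather than isolating either one. The "nose" and "jaw" labels here are purely hypothetical stand-ins for correlated facial measurements.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
nose = rng.normal(size=n)                            # hypothetical "nose size" factor
jaw = 0.8 * nose + rng.normal(scale=0.6, size=n)     # a correlated second factor

X = np.column_stack([nose, jaw])
X -= X.mean(axis=0)

# PCA via eigendecomposition of the covariance matrix
cov = X.T @ X / (n - 1)
vals, vecs = np.linalg.eigh(cov)
pc1 = vecs[:, np.argmax(vals)]  # leading eigenvector

print(pc1)  # loads substantially on BOTH columns, not just "nose"
```

Nothing in PCA prevents this mixing; the eigenvectors chase variance directions, which only align with individual semantic features if the data happens to be structured that way.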
olmec-akeru OP t1_iy7a8fe wrote
Reply to comment by vikigenius in [D] What method is state of the art dimensionality reduction by olmec-akeru
So this may not be true: the surface of a Riemannian manifold is infinite, so you can encode infinite knowledge onto its surface. From there the diffeomorphic property allows one to traverse the surface and generate explainable, differentiable, vectors.
olmec-akeru OP t1_iy7a1yc wrote
Reply to comment by BrisklyBrusque in [D] What method is state of the art dimensionality reduction by olmec-akeru
I fear that the location in the domain creates a false relationship between categories that happen to sit closer together on that domain,
i.e. if you encode at 0.1, 0.2, …, 0.9, you're saying that the category encoded as 0.2 is more similar to 0.1 and 0.3 than it is to 0.9. This may not be true.
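A quick sketch of that spurious geometry, using made-up category names. Ordinal encoding makes some pairs "closer" than others; one-hot encoding keeps every pair of distinct categories equidistant.

```python
# Hypothetical categories with no intrinsic order
cats = ["red", "green", "blue", "taupe"]

# Ordinal encoding imposes a spurious geometry: 0.1, 0.2, 0.3, 0.4
ordinal = {c: i / 10 for i, c in enumerate(cats, start=1)}
d_near = abs(ordinal["green"] - ordinal["red"])    # 0.1 — looks "close"
d_far = abs(ordinal["green"] - ordinal["taupe"])   # 0.2 — looks "farther"

# One-hot encoding: every pair of distinct categories is equidistant
onehot = {c: [1.0 if i == j else 0.0 for j in range(len(cats))]
          for i, c in enumerate(cats)}

def euclid(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

print(d_near, d_far)                                # unequal distances
print(euclid(onehot["green"], onehot["red"]),
      euclid(onehot["green"], onehot["taupe"]))     # equal distances
```

Any distance-based method downstream (k-NN, clustering, t-SNE itself) will happily exploit the fake ordinal proximity, which is exactly the false relationship described above.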
olmec-akeru OP t1_iy363p5 wrote
Reply to comment by Atom_101 in [D] What method is state of the art dimensionality reduction by olmec-akeru
Thanks!
olmec-akeru OP t1_iy362iv wrote
Reply to comment by imyourzer0 in [D] What method is state of the art dimensionality reduction by olmec-akeru
Cool, I get this, but I think it's important not to keep one's head in the sand. There are new techniques, and it's important to grok them.
olmec-akeru OP t1_iy2zjoi wrote
Reply to comment by NonOptimized in [D] What method is state of the art dimensionality reduction by olmec-akeru
https://arxiv.org/pdf/2204.04273.pdf
https://arxiv.org/pdf/2203.09347.pdf
https://arxiv.org/pdf/2206.06513.pdf
and the one speaking to categorical variables: https://arxiv.org/pdf/2112.00362.pdf
olmec-akeru OP t1_iy2pewq wrote
Reply to comment by Deep-Station-1746 in [D] What method is state of the art dimensionality reduction by olmec-akeru
Awesome answer: for a set of assumptions, what would you use?
I've seen some novel approaches on arXiv for categorical variables, but I can't seem to shake the older deep-learning methods for continuous variables.
Submitted by olmec-akeru t3_z6p4yv in MachineLearning
olmec-akeru t1_jck3hvp wrote
Reply to comment by Available_Lion_652 in [D] GPT-4 is really dumb by [deleted]
Precisely right; I hadn't applied my mind to that expansion. My comment is erroneous.