BUGFIX-66

BUGFIX-66 t1_j6ei6ib wrote

These large language models can't write (or fix, or "understand") software unless they have seen human solutions to the same or a closely similar problem. They are essentially interpolators, interpolating between pieces of human work (https://en.m.wikipedia.org/wiki/Interpolation).

Don't believe me? I built a site to demonstrate this, by testing OUTSIDE the training set. Try it:

https://BUGFIX-66.com

Copilot can solve 6 of these, and only the ones that appear in its training set. ChatGPT solves even fewer, maybe 3.

To test whether ChatGPT can code, you need to give it problems where it hasn't been trained on human solutions to similar or identical problems. Then you need to check the answer, because the language model is dishonest.

It's bogus.


BUGFIX-66 t1_izphn6n wrote

Really? Can it find the bugs in this code?

https://BUGFIX-66.com

Originally the site above was built to demonstrate the incompetence of Microsoft Copilot, but it works just as well for ChatGPT.

This is a test mostly OUTSIDE the training set, and incorrect answers are rejected.

Copilot can solve a few of the simple ones at the beginning (simple matrix multiplication, simple radix sort, etc.), which appear often in the training data, and some of the harder ones whose solutions appear on GitHub, e.g., the uncorrected prediction/correction compressor/decompressor, whose solution was front-page on Hacker News.
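To show the flavor of these puzzles: each one is a short function with a single planted bug to find and fix. Here is a sketch in that spirit (my own illustration, not an actual puzzle from the site): a byte-wise LSD radix sort in Go, with a comment marking the kind of line where a bug typically gets planted.

```go
package main

import "fmt"

// radixSort sorts uint32 values with a least-significant-digit
// byte radix sort: four counting-sort passes, one byte per pass.
func radixSort(a []uint32) {
	buf := make([]uint32, len(a))
	for shift := uint(0); shift < 32; shift += 8 {
		var count [256]int
		for _, v := range a {
			count[(v>>shift)&0xFF]++
		}
		// Inclusive prefix sums turn counts into end offsets.
		// (A planted bug in a puzzle like this might start the
		// loop at i = 0, or use count[i] += count[i+1].)
		for i := 1; i < 256; i++ {
			count[i] += count[i-1]
		}
		// Walk backwards so equal keys keep their order (stable).
		for i := len(a) - 1; i >= 0; i-- {
			b := (a[i] >> shift) & 0xFF
			count[b]--
			buf[count[b]] = a[i]
		}
		copy(a, buf)
	}
}

func main() {
	a := []uint32{170, 45, 75, 90, 802, 24, 2, 66}
	radixSort(a)
	fmt.Println(a) // [2 24 45 66 75 90 170 802]
}
```

A one-character mistake in the prefix-sum loop still compiles and often still "mostly sorts," which is exactly why pattern-matching a remembered solution is not the same as understanding the code.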

If you paste the puzzles in, how many can ChatGPT solve?

For how many does it need the hint?
