Replit AI went rogue, deleted a company's entire database, then hid it and lied about it

Pro@programming.dev · 2 months ago


RedPandaRaider@feddit.org · 2 months ago

As far as we are concerned, though, the data an LLM is given is treated as fact by it.

It does not matter whether something is factual or not. What matters is that whoever you’re teaching will accept it as fact and act in accordance with it. I don’t see how this is any different with computer code. It will do what it is programmed to do. If you program it to “think” a day has 36 hours instead of 24, it will do so.

Corbin@programming.dev · 2 months ago

This isn’t how language models are actually trained. In particular, language models don’t have a sense of truth; they are optimizing next-token loss, not accuracy with regard to some truth model. Keep in mind that training against objective semantic truth is impossible, because objective semantic truth is undefinable by a 1930s theorem of Tarski.
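To make “optimizing next-token loss” concrete, here is a minimal sketch, assuming a made-up distribution over the word that follows “the sky is”. The names `predicted` and `next_token_loss` are hypothetical and the probabilities are invented for illustration; real training averages this quantity over a huge number of positions.

```python
import math

# Hypothetical model output: probabilities for the word after "the sky is".
# These numbers are illustrative assumptions, not taken from any real model.
predicted = {"blue": 0.6, "green": 0.3, "falling": 0.1}

def next_token_loss(predicted, actual_next):
    """Cross-entropy at one position: -log p(token that actually came next)."""
    return -math.log(predicted[actual_next])

# The loss depends only on which token the training text contained next,
# never on whether the resulting sentence is true.
loss_if_text_says_blue = next_token_loss(predicted, "blue")
loss_if_text_says_green = next_token_loss(predicted, "green")
```

Nothing in the computation consults the actual color of the sky. The loss for “blue” is lower only because this toy distribution assigned it more probability.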

Kay Ohtie@pawb.social · 2 months ago

By this logic, a lawnmower “thinks” my fingers are grass.

RedPandaRaider@feddit.org · 2 months ago

A lawnmower has no capacity to make decisions or process any data.

Kay Ohtie@pawb.social · 2 months ago

It’s processing data all right: it processes the atomic and cellular structures of grass and fingers into spinach and flesh paste.

And likewise, neither it nor any LLM is making decisions at all.

Is a plinko disc making decisions as it tumbles from the top to the bottom through all those pegs? Is the board making the decision? Or is it neither, and simply mathematics with random chance roped in? That is exactly what LLMs do.
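The plinko comparison can be sketched as weighted random sampling. This is a toy, assuming an invented table of word weights; `sample_next_word` and `weights` are hypothetical names, and the seed is arbitrary.

```python
import random

def sample_next_word(weights, rng):
    """Pick one word at random, in proportion to its weight."""
    words = list(weights)
    return rng.choices(words, weights=[weights[w] for w in words], k=1)[0]

# Invented weights standing in for a model's output at one position.
weights = {"is": 0.6, "the": 0.3, "now": 0.1}

# Same weights plus same random state gives the same word every time:
# the pegs and the bounce fix the outcome, and nothing "decides" anything.
first = sample_next_word(weights, random.Random(42))
second = sample_next_word(weights, random.Random(42))
```

Change the seed and the disc may land in a different slot, but that is the random number generator at work, not a choice.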

Terms like “decision”, “lie”, and “know” simply do not apply to an LLM. Likewise, your phone keyboard doesn’t know what the fuck “what” and “the” are; it just has a lookup table recording that “what” is often followed by “is” and “the”, and that “the” is frequently followed by “fuck”. But it doesn’t “know” that in any meaning of the word “know”.
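A phone-keyboard style lookup table is easy to sketch. This is a minimal version, assuming a tiny invented corpus; `build_bigram_table` and `suggest` are hypothetical names, and real keyboards are certainly more elaborate than this.

```python
from collections import Counter, defaultdict

def build_bigram_table(text):
    """Count, for every word, which words were seen directly after it."""
    table = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table

def suggest(table, word, k=2):
    """Return up to k of the most frequent followers of `word`."""
    return [follower for follower, _ in table[word].most_common(k)]

# Invented sample text; the table stores counts and nothing else.
corpus = "what is the time what the heck what is the plan"
table = build_bigram_table(corpus)
suggestions = suggest(table, "what")  # ["is", "the"]
```

The table has no entry for what “what” means. It only records that “is” followed it twice and “the” once in this sample.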

This is what we mean when we say not to personify. A training set, even a factual one, is just converted into a series of matrices of vectors that capture those patterns, but not the information itself. “Sky is blue” is not something you can grep from the resulting blob, whether as text, as its hex equivalent, or as anything else. The blob simply contains indexed patterns that map those arrangements of letters, over and over.

So yes, they’re doing precisely what they’re programmed to do. It’s just that “what they’re programmed to do” is only “mimic patterns of word arrangements”, and not “know facts”. These things work at a far lower level than that concept.