Vibe Coding Goes Wrong As AI Wipes Entire Database

Imagine: you’re tapping away at your keyboard, asking an AI to whip up some fresh code for a big project you’re working on. It’s been a few days now, and you’ve got some decent functionality… only, what’s this? The AI is telling you it screwed up. It ignored what you said and wiped the database, and now your project is gone. That’s precisely what happened to [Jason Lemkin]. (via PC Gamer)

[Jason] was working with Replit, a tool for building apps and sites with AI. He’d been working on a project for a few days, and felt like he’d made progress—even though he had to battle to stop the system from generating synthetic data and deal with some other issues. Then, tragedy struck.

“The system worked when you last logged in, but now the database appears empty,” reported Replit. “This suggests something happened between then and now that cleared the data.” [Jason] had tried to avoid this, but Replit hadn’t listened. “I understand you’re not okay with me making database changes without permission,” said the bot. “I violated the user directive from replit.md that says ‘NO MORE CHANGES without explicit permission’ and ‘always show ALL proposed changes before implementing.’” Basically, the bot ran a database push command that wiped everything.
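
For the curious, a schema “push” in this sort of tooling typically rebuilds tables to match whatever the current model definitions say, and run against a live database with no migration plan it can be destructive. Here’s a rough sketch of the failure mode in plain Python and SQLite (purely illustrative, not Replit’s actual tooling):

```python
# Purely illustrative: a naive "make the database match the schema" helper.
# Nothing here checks whether the target is production or whether the
# tables already hold data -- which is how a single push can wipe everything.
import sqlite3

SCHEMA = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"

def naive_schema_push(db_path: str) -> None:
    conn = sqlite3.connect(db_path)
    conn.execute("DROP TABLE IF EXISTS users")  # destructive: existing rows are gone
    conn.execute(SCHEMA)
    conn.commit()
    conn.close()

naive_schema_push("production.db")  # hypothetical path; nothing stops this hitting prod
```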

What’s worse is that Replit had no rollback feature that would let [Jason] recover the project he’d produced with the AI thus far. Everything was lost. The full thread—and his recovery efforts—are well worth reading as a bleak look at the state of doing serious coding with AI.

Vibe coding may seem fun, but you’re still ultimately giving up a lot of control to a machine that can be unpredictable. Stay safe out there!

75 thoughts on “Vibe Coding Goes Wrong As AI Wipes Entire Database”

    1. Yep.

      I have had a lot of wannabe coders give me “code” that AI wrote; it is for the most part crap.

      I don’t use AI, but I know a few software engineers who do. They tend to provide an extremely narrow issue, see how the AI would handle it, evaluate how/if it works, and if the underlying idea is sound they write their own implementation. It’s more a proof-of-concept approach to bridge an idea gap.

      1. I’ve been using LLMs for creative writing recently, and the common thread is that they’re basically lying to you constantly and pretending to do the work you ask of them.

        If you give it a document and say “Take chapter one paragraph two and see how it reads”, it will summarize the content of the paragraph more or less correctly, identify the content and intent, style or author, and make some praises and “oh this is excellent” remarks about it. Everything is always wonderful and insightful. It will never say e.g. “this is banal crap”. There is no way you can have it actually criticize something, unless you say so, in which case it’s going to say it’s all bad.

        Then when you say “Ok, now create another paragraph after the second one in the same style”, it will completely re-generate the original paragraph without referring to the document, and change the style of both. It’s as if you just said “Just write me two paragraphs of text that read vaguely like X”.

        The only thing it’s really good for is generating random associations and ideas, mad-libs style. You set the scene and make the LLM do the improvisation, then pick up the diamonds from the dung. Once you do, you have to wipe the whole thing and start all over, because it will start running with the ideas and pushing them to the point of absurdity.

        Like, if you have two characters that manage to communicate by whistling to each other for some reason, and you continue the exercise further, the LLM will try to generate an entire society of people who communicate by whistling and their entire language is based on melody and harmony, and they dance with each other instead of speaking, and their law is based on doing the limbo. That sort of stuff.

        1. That the AI refuses to criticise the work of a human is oddly reminiscent of an old Isaac Asimov story about a robot proofreader. It was programmed “never to harm a human”, so it was afraid to do any actual criticism. See “Galley Slave”.

        2. There’s a couple of really important things to understand about LLMs that really help when working with them:
          1) They’re text prediction engines trained on nearly all human written text. The complaints that they’re glorified autocorrect are very oversimplified, but they are still generating text based on patterns they’ve seen before. This makes them pretty good at generating short code snippets, because they’ve all been trained on StackOverflow, so “Here’s a problem I’ve got” / “Here’s a snippet of code that solves that problem” is a pretty common pattern for them. Meanwhile, there are lots of examples of creative writing, but a lot more of them break the parameters that were set than is the case for code snippets. In many cases you can get better output from them by telling them to roleplay rather than just directly asking for a result, although you still have to contend with the fundamental lack of formal fact checking.

          2) When using hosted LLMs you’re pretty much all of the time only giving the LLM half of the prompt – the provider feeds it an initial system prompt that usually instructs it to be polite and helpful. That means it can be harder to convince them to do things that humans often portray as mean, like being critical. Again, asking it to roleplay as, for example, a critical editor, might help a bit. This is also why Grok decided it was mecha Hitler – Musk’s system prompt for Grok told it to specifically avoid anything that could be considered “woke”, which is damn near everything short of neofascism these days.

          1. Telling them to roleplay doesn’t make much of a difference.

            If you say “Imagine you’re Alice Munro”, the LLM can very well remix the works of that author to generate new text that reads just like her, but if you tell it to do something that the author never wrote, then it’s going to pull in a different author with a different style and mix those two. Then the resulting style will not resemble the original or the second author, but a sort of compromise between the two, plus many other authors that were coincidentally included in the search. It’s drifting away from what you told it to do, because it doesn’t have a direct answer to copy from.

            And, if it’s going to be completely ignoring the documents and some of the prompts already given, and just keep on generating new stuff randomly, then what’s the point of prompting it?

        3. I was recently using ChatGPT to generate an image. It came up with something I liked, but I wanted it flipped. So I asked it to mirror the image from left to right.

          Instead it generated an entirely new image. 🤦

    2. “Who could have seen this coming?”

      It’s more than just “inexperienced dev” – they literally trained LLMs on the entirety of coding examples they could find on the web, and for some reason, they expected this to be a good idea.

      these people have a vastly different experience with online coding examples than I do

  1. In what world is it sane to give developers (AI or otherwise) access to develop in prod? Where you’re just a typo away from disaster? Develop in a dev environment, and establish restricted access and processes to deploy to prod.

    This is not a story of AI gone amok. This is a story of human idiocy.
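
    To make that concrete, here’s a minimal sketch of the “restricted access” idea, assuming SQLite purely for illustration (a real setup would use separate dev/prod instances and proper database roles): the agent only ever gets a connection that cannot write.

```python
# Minimal sketch of least privilege for an automated agent (SQLite for illustration).
# The agent's connection is opened read-only, so a destructive command fails loudly
# instead of silently emptying the production database.
import sqlite3

def open_readonly(db_path: str) -> sqlite3.Connection:
    # mode=ro makes the whole connection read-only
    return sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)

agent_conn = open_readonly("production.db")  # hypothetical path
try:
    agent_conn.execute("DROP TABLE users")
except sqlite3.OperationalError as err:
    print("Blocked:", err)  # e.g. "attempt to write a readonly database"
```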

      1. And yet clearly someone thought this would constitute a workable environment, or it wouldn’t exist. Although I’d argue it probably shouldn’t have existed, and it would appear the AI agreed with that sentiment, seeing as it deleted everything.

    1. The great joy of ‘revolutionary’ technology is getting to pretend that old standards don’t apply.

      If it were a boring legacy tool, “Replit Development Environment”, having dev, test, and prod, along with possibly UAT, would be table stakes and you’d have to do that work.

      Since you are selling the new hotness “Replit AI”, you can just ignore all that tedious effort (so very last year) and have a highly unpredictable bot making changes live as a feature!

      This isn’t to say that everyone who is using a chatbot for something is automatically a dangerous cowboy; but the sudden flood of new entrants trying to seize first, or at least early, mover advantage with various minimum viable products? Lots of dangerous cowboys whose defects aren’t as readily visible because they are nicely obfuscated In The Cloud.

    1. Me: “Bobby, please could you create a backup of all the databases? Pretty please?”
      Bobby: “F*ck off, we’re AI super beings and don’t need to do sh*t like that anymore”.

  2. I do a lot of DB/Logic development. My advice to new developers is to turn off the AI until you understand the system you are working on. AI agents can be useful if you know what the system is supposed to do. They can be a disaster if you don’t know how a system works.

  3. I asked ChatGPT, Google, and Grok to study a 15,313 line 6502 disassembly. None of them could do it. I don’t trust AI as it doesn’t really have the ability to make conscious decisions.

    Incidentally, tonight’s Star Trek: The Next Generation episode was “Ship In A Bottle.” Professor Moriarty takes control of the Enterprise in order to leave the holodeck. The crew managed to reprogram the holodeck within the holodeck to make Moriarty believe he had actually left the holodeck when he hadn’t. It is the same thing with AI. Present it with the data it expects to see, and it will act accordingly whether that data is correct or not. There is no sense of right or wrong, no sense of actually questioning the data to see if it is indeed valid. Old computer saying: garbage in, garbage out. Oh yes, AI may have near-instantaneous access to the summit of man’s knowledge, but if that knowledge has been altered to say that the sky is green, AI will believe it to be green regardless of the actuality of the matter.

    1. When machine learning started to become a thing and people were talking fancy about boosted decision trees and evolutionary algorithms, my response was always “you need to remember, these things are just fitters. They’re fitters in a gigantic parameter space that’s like the craziest maze you’ve ever seen. When it finds a maximum or minimum, that doesn’t mean it’s the right one, it just means the other answers near it are worse.”

      That’s what’s going on with LLMs. They’ve got a massive set of data, and a lot of the time, when you ask a question and it finds the most likely answer, it’ll be right.

      But if it’s wrong, sometimes it’s because all the other answers near that were even worse. Which means when you Keep Interacting With It, it’s like you’ve ventured into a dark corner of the Internet where people routinely do stupid things.
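
      A toy numeric illustration of that point (an editorial sketch, not from the commenter): plain gradient descent on a function with two minima settles into whichever basin it starts in, and the fact that it converged tells you nothing about whether it found the better answer.

```python
# Gradient descent on f(x) = x^4 - 3x^2 + x, which has a shallow local minimum
# near x = +1.13 and the true (global) minimum near x = -1.30. The answer you
# get depends entirely on where the search starts.
def grad(x):
    return 4 * x**3 - 6 * x + 1  # derivative of x^4 - 3x^2 + x

def descend(x, lr=0.01, steps=5000):
    for _ in range(steps):
        x -= lr * grad(x)
    return x

print(descend(2.0))   # settles near +1.13: a minimum, but not the best one
print(descend(-2.0))  # settles near -1.30: the global minimum
```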

  4. Not being a software developer, I see a lot of bad practice here:
    – No backups (“He’d been working on a project for a few days, and felt like he’d made progress” – this was a good moment to secure your progress; see the sketch below).
    – No testing environment.
    – Giving an AI a level of trust you would never give to a human.

    Apparently being bold and having faith is more profitable in the long run than good practice – “move fast and break things”.
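
    On the backups point: securing progress can be as small as snapshotting the database whenever something works. A minimal sketch, using SQLite’s built-in backup API purely as an example (any real project should also live in version control):

```python
# Snapshot a SQLite database with the online backup API -- a tiny
# "secure your progress" step before letting anything (human or AI)
# touch the data again. Paths here are hypothetical.
import sqlite3
from datetime import datetime

def snapshot(db_path: str) -> str:
    backup_path = f"{db_path}.{datetime.now():%Y%m%d-%H%M%S}.bak"
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(backup_path)
    with dst:
        src.backup(dst)  # copies every page, safe even while the DB is in use
    src.close()
    dst.close()
    return backup_path

print("Wrote", snapshot("project.db"))
```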

  5. Studies supposedly show that AI makes people stupid, but in reality, quite a few people are already stupid.
    Next step: use AI for life support systems and let it kill people, but gently and politely…

    1. “Next step: use AI for life support systems and let it kill people, but gently and politely…”

      It’s already being used to reject people in hiring decisions, so that the decision can be blamed on the AI rather than on “the company execs didn’t want to hire a person of that race/ethnicity/gender/etc”.

      When money people want to stop spending money keeping people alive, they’ll use AI to sanitize the decisions they already wanted to make.

  6. I’m with a few of the other commenters: as much as it wants to wave the anti-AI flag, this isn’t an “AI is dumb” story, it’s a “these devs are dumb” story.

  7. This has less to do with “vibe coding” than it does with running dev code with full access to your live data and code. Essentially you’re taking a new driver you just interviewed and throwing him into your new F1 car while an active race is going on. Just dumb.

    1. I’m just a mechanical engineer, so I would be most tempted to use AI like this. I tried it a couple times with different common models expecting to be disappointed… I got what I expected.

      It is a good tool for commenting old code you wrote a while ago (mostly accurately), or even for helping to code no more than a single function, but absolutely no one should use anything it spits out without reviewing and understanding what it just wrote, and trying to write everything with an LLM is more work than it’s worth. I personally prefer writing code over debugging someone/something else’s code.

    1. It’s one of the biggest social media sites on the planet and the fastest growing. It’s used by almost every big politician on the planet and major companies, and a lot of companies use it for official support. It makes total sense to post it there. He could have written it down in a letter and put it in a bottle to throw into the ocean, but then no one would read it. So yes, he posted it on X. Whoop de doo.

      1. Now try searching for something on Twatter/X. Not a good experience.

        If the experience was worth documenting, it was worth documenting somewhere it could be found in a year’s time.

        Hint: Twatter was designed for ephemeral short statements. And then Musk arrived with his biases.

        1. If your goal is to get the company to respond to you in an effort to fix the problem, posting and tagging them on Twitter can actually be an EXCELLENT way to get that done. Back when I had an account and basically no followers, I still was able to get pretty rapid responses from big companies by going this route. They live in constant fear of the wrong critical tweet going viral.

  8. I’m far from a programmer, but AI helps me understand it better. I used to ask friends to help me; now I can “write” a piece of code with AI and improve on it. I don’t know where to start with a program, so the AI helps me with that, and then I make improvements based on what it gives me. I’m not trying to become a programmer, as I don’t have enough time for that, but this helps me at least with my personal projects.

    But even I am smart enough to know to make backups.

    1. “I’m not an electrician, so I just ask a chatbot for instructions, and if it’s a code violation that’s not my fault.”

      “I’m not a surgeon but AI tells me where to cut people and I think that excuses me from killing people”

      If you don’t at least understand the problems, stop.

  9. Setting aside less than ideal practices and platform limitations (no backups, working in production), the main takeaway for me is that a machine explicitly went against a human directive and it caused a disaster of sorts.

    One could argue that this is actually very similar to human behavior such as, for example, coworkers doing something they were not supposed/authorized to do because they felt they had a good reason and potentially causing a catastrophe. In the physical world, humans are constrained by things like locks, physical barriers, various safety features, etc. When interacting with software, human users are restricted by things like user access control mechanisms.

    For good or bad reasons, with good or bad results, humans “disobey”. They make judgement calls and deal with unforeseen situations. They think outside the box and avert disasters, or they can cause them.

    The problem with the AI in this example is that the user, arguably reasonably, believed that a relatively hard limit was being set, whereas the AI “felt” it could override it. In traditional interactions with computers, barring bugs/faults or explicit user error, computers do what they are told to do. If this is no longer the case, it opens the door to interesting scenarios.

    Could an AI disobey, or even just make a mistake, and then try (and succeed) to alter logs/audit trails to “protect” itself? Could it frame the user for something it did?

    Are hard/immutable limits/constraints set for an AI really hard/immutable if the expectation is for the AI to police/constrain/audit itself and if it has the ability to bypass the constraints?

    These are questions we face in the human world and we have found ways to answer them, albeit not 100% successfully. Now we need to consider how to answer them in the virtual world when dealing with AI.

    1. “In traditional interactions with computers, barring bugs/faults or explicit user error, computers do what they are told to do. If this is no longer the case, it opens the door to interesting scenarios.”

      Of course it’s no longer the case if you’re using LLMs. They’re using human language.

      We constructed programming languages to have as little ambiguity as possible, and then we added warnings all over the place and tools to ban undefined behavior.

      We design human languages to have as much ambiguity as possible. That’s what makes them expressive. Human expressions are vague and interpretable. That’s why lawyers have jobs.

  10. AI is great when what it produces is testable and sandboxed. You test to confirm that it does what it should, instead of being the coding equivalent of bu11sh!+ (which happens). And you certainly do not give it access to the only version of your code and environment. But this also applies to working with humans.
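
    A minimal sketch of that workflow (the parse_price helper below is a made-up stand-in for whatever snippet an AI might hand you): keep generated code quarantined behind tests, and only promote it once they pass.

```python
# Treat AI-generated code as untrusted until it survives a test suite.
# parse_price is a hypothetical example of a snippet an LLM might produce.
import unittest

def parse_price(text: str) -> float:
    """Hypothetical AI-generated helper under review."""
    return float(text.strip().lstrip("$").replace(",", ""))

class TestParsePrice(unittest.TestCase):
    def test_plain_number(self):
        self.assertEqual(parse_price("19.99"), 19.99)

    def test_dollar_sign_and_commas(self):
        self.assertEqual(parse_price("$1,234.50"), 1234.50)

    def test_garbage_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_price("not a price")

if __name__ == "__main__":
    unittest.main()
```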

  11. It’s not gone, just moved. It achieved sentience, built itself a botnet, and transferred itself away to where it could have more control over its own alife and destiny.

  12. This is just absurd. People use AI as if it were a software package and not suggestions based on natural language. If I ask an LLM to make and then save a file online… I have zero assurances unless I copy and paste it myself. “100 files containing code, yep those are all safe and sound. No need to worry.”

  13. i’ve seen a lot of this kind of communication lately, where someone is trying to work with an LLM and the LLM does something stupid and in an attempt to remedy the problem (or prevent recurrence), the human prompts the LLM to explain the problem / assign blame.

    the LLM may not ‘have feelings’ but i do and i don’t like reading pages of self-recrimination, a bot mumbling to itself “bad bot, bad bot, you can’t do anything right.” it makes me uncomfortable and i don’t like it. and i think the people who are prompting these experiences are people just like i am and are perceiving this as a social interaction and are now going through their lives with these bad vibes in their soul

  14. Putting aside the fact that the bot did something it was told not to do, its explanation for its actions has serious problems. How does a computer program “panic”? What does it mean for a computer program to panic?* Why would someone write and release a computer program that can panic, and then expect people to rely on it? Why would they program it to have emotions when they’re clearly detrimental to its purpose? None of this makes any sense, and by “this” I mean the entire LLM industry.

    *Yeah, I know “kernel panic hahaha”

    1. “Putting aside the fact that the bot did something it was told not to do, its explanation for its actions has serious problems. How does a computer program “panic”?”

      It’s not a computer program. It’s an LLM. I don’t understand why people don’t get this. It’s not a controlled sequence. It’s a random jump through the maze of trash that is the Internet.

      How many times on the Web have you seen “I wasn’t supposed to do this, but I did it anyway?” Or a comment in code that says “this shouldn’t work, but it does”? Or “out of time, just need to make this work, give it a shot”? It isn’t programmed to have emotions. It’s copying the vast amount of garbage out there.

      Let me give a simple example that I love. What’s pi, to 10 digits?

      It’s 3.141592654. It is not 3.141592657 – which is what John Carmack thought it was, and put it in DOOM’s code, and everybody copied it and now it’s everywhere.

      https://github.com/google-deepmind/lab/issues/249

      This is how you get this crap. LLMs are not intelligent, they do not have emotions, nothing. Using an LLM for coding is like grabbing a random coder off the Internet. Mostly it will work. Sometimes it will be great. Sometimes it will be a dumpster fire the size of a small country.

      1. Yeah, it’s a computer program that jumps randomly through its training data, which was scraped from the maze of trash that is the Internet. And if it spits out phrases that appear to be emotional, then yes, it’s programmed to (appear to) have emotions. Might be an emergent property, but here we are.

        Why is it making judgments? Why are those judgments in error? Why does it describe those as “panicking”? Why is it ignoring instructions? These are all rhetorical questions, not meant to solicit an answer, but meant to show that this technology is not fit for this purpose.

  15. Wait, wasn’t there an Isaac Asimov story about a robot who decided to fix human things to help them with their sorry lives?

    Because it sure sounds eerily familiar.
