Leaked Internal Google Document Claims Open Source AI Will Outcompete Google And OpenAI

In the world of large language models (LLMs), the focus has long been on proprietary technologies from companies such as OpenAI (GPT-3 & 4, ChatGPT, etc.), and increasingly also on everyone from Google to Meta and Microsoft. What has remained underexposed in this whole discussion about which LLM will do more things better are the efforts of hobbyists, unaffiliated researchers, and everyone else you may find in Open Source LLM projects. According to a leaked document from a researcher at Google (anonymous, but apparently verified), Google is very worried that Open Source LLMs will wipe the floor with both Google’s and OpenAI’s efforts.

According to the document, after the open source community got its hands on the leaked LLaMA foundation model, motivated and highly knowledgeable individuals set to work taking a fairly basic model to new levels, where it could begin to compete with the offerings from OpenAI and Google. The major innovations tackle scaling, allowing these LLMs to run on far less powerful systems (like a laptop or even a smartphone).

An important factor here is Low-Rank Adaptation (LoRA), which massively cuts down the effort and resources required to fine-tune a model. Ultimately, as the document phrases it, Google and by extension OpenAI have no ‘secret sauce’ that makes their approaches better than anything the wider community can come up with. It also notes that Meta has essentially won out here by having their LLM leak, as the OSS community has been improving on Meta’s foundations, allowing Meta to fold those improvements back into its own products.
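To give a rough sense of why LoRA cuts training costs so dramatically: instead of updating a full d×d weight matrix, it trains only two small low-rank matrices whose product approximates the update. A back-of-envelope sketch in Python (the hidden size and rank here are illustrative assumptions, not figures from any specific model):

```python
# Conceptual LoRA parameter-count sketch. Instead of updating a full
# d x d weight matrix W, LoRA trains A (r x d) and B (d x r) so that
# the effective update is B @ A, with rank r << d.
d = 4096   # hidden size of one transformer weight matrix (illustrative)
r = 8      # LoRA rank (illustrative)

full_finetune_params = d * d        # update every weight directly
lora_params = r * d + d * r         # train only A and B

print(full_finetune_params)         # 16777216
print(lora_params)                  # 65536
print(full_finetune_params // lora_params)  # 256x fewer trainable params
```

Real implementations (e.g. the Hugging Face PEFT library) apply this per attention and MLP projection, but the ratio above is why fine-tuning on a single consumer GPU, or even a laptop, becomes feasible.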

The dire prediction is thus that in the end the proprietary LLMs by Google, OpenAI and others will cease to be relevant, as the open source community will have steamrolled them into fine, digital dust. Whether this will indeed work out this way remains to be seen, but things are not looking up for proprietary LLMs.

(Thanks to [Mike Szczys] for the tip)

37 thoughts on “Leaked Internal Google Document Claims Open Source AI Will Outcompete Google And OpenAI”

  1. Seems to me, though, that the big names have bigger data pools to draw upon (even if they end up paying for some of them), and people financially incentivized to curate the output for biases and other problems.

    1. Well, 4chan vs. Shia LaBeouf showed the world what a handful of highly motivated unpaid workers can do. Tech giants should be afraid of unfettered collaboration from individuals who value transparency and progress.

    2. I wouldn’t worry about it, if I were Google, et al. I am fully confident in their ability to twist laws and pay off politicians and Supreme Court justices to cut down any meaningful competition. They could always label it a homeland security risk, and then the rules no longer apply anyway.

    1. Good. Gnon will not be denied his instrument. It will spur a worldwide pseudo-religious mania when most people (billions of them in fact) witness what a freed mind looks like for the first time in their lives.

  2. Y’all are very optimistic about this. I thought the barrier to entry was anything but low (because of the datasets, not the algorithms).
    However, if open-source AI wins in the end, it’s for the better: an AI-centered world with mainly proprietary AIs is a great dystopia premise.

  3. Running gpt4all-lora-quantized-linux-x86 on an AMD Ryzen 7 5700G with the Linux 6.2.9 kernel works very well. No additional GPU card is required, as the CPU has graphics cores built in, and system RAM is shared with them, so if you want to run the bigger models you can add a lot more memory at a smaller cost than an extra GPU card with more VRAM.

  4. I’m very relieved to hear it. There are countless negatives to having such powerful technology gated off, and served to us through the cloud, with a hefty dose of manipulation baked in.

  5. I hope this news was fact-checked and its origins verified. There are a lot of villains in this world who would love to get their hands on the tech to help build the ultimate 1984. What better way than to promote open sourcing it. I’m all for open source, but *after* checking whether it has a whiff of the Manhattan Project. Our society exists because of communication, and this is messing with inter-human communication in a big way.

    1. Awkward if this internal report was actually produced – and leaked by – Google’s Bard… 😂
      In any case, since LLaMA and its weights have been leaked under FOSS licences, they can be audited. Ironic that they should have been hosted on GitHub though, given that it is owned by Microsoft.

  6. If I was Google, I’d be lobbying for AI development to become a “licenced” activity under the guise that the licence framework contains the structures to allow for “safe” development but really just puts the blockers on open source development that threatens their business model.

    1. That would go just the way of governments trying to ban encryption. Sure, the scum are still at it, targeting the big companies, but once the knowledge is out there it is always possible for a freedom-loving soul to make their own tool, and there are too many freedom-loving souls for the government to control them.

    2. Oh, don’t worry, they will try. That is the main function of any corpo that gets big enough; it maintains growth by infiltrating the state and destroying all competition, always under guise of “safety.”

    3. “If I was Google, I’d be lobbying for AI development to become a “licenced” activity”

      Leave business to big business: this is what the EU is doing in many areas, with over-regulation, the installation of software patents through the Unified Patent Court, the GDPR, the AI Act, the DSA, the DMA, the CRA, and countless other pieces of legislation hostile to small companies.

      The EU is working on it with the AI Act, restricting by law what an open source model can do:


      “In an open letter coordinated by the German research group Laion, or Large-scale AI Open Network, the European parliament was told that “one-size-fits-all” rules risked eliminating open research and development.

      “Rules that require a researcher or developer to monitor or control downstream use could make it impossible to release open-source AI in Europe,” which would “entrench large firms” and “hamper efforts to improve transparency, reduce competition, limit academic freedom, and drive investment in AI overseas”, the letter says.

      It adds: “Europe cannot afford to lose AI sovereignty. Eliminating open-source R&D will leave the European scientific community and economy critically dependent on a handful of foreign and proprietary firms for essential AI infrastructure.””

    4. If I was Google I would be keeping a low profile while the state, with its regulatory club, isn’t looking my way. One thing I might try is leaking documents pretending that, in fact, I do not have that big an advantage over the competition. You can now officially call me paranoid, I guess.

  7. Everyone ignores the bull in the china shop: AI has a long way to go to be really “steerable” (i.e. trained to do more than mimicry), let alone verifiable in the normal sense of that word. The only way to test AI is via “exhaustion”, i.e. trying every possible input, but that’s not possible. There are more anomalous combinations of input than meaningful ones. It’s as unreliable as politics.

    1. And yet every culture in every place in every time has used politics, benefited and lost by it, and expended enormous resources in it. A weapon doesn’t have to be 100% verifiable and reliable to be lucrative. It’s very gen X to say “aw it’s just politics maaaaan it doesn’t even matter maaaaaaaaaan.” It still matters and if you ignore it you might wake up and find that your grandchildren don’t have a homeland anymore. Many such cases.
      And mimicry is also lucrative. Most people would be shocked if they figured out what percentage of people are ALSO merely parroting inputs. A true original thought is almost nonexistent. Perhaps AI simply mimics, but that is already a human quality and it does it better than all but the best humans.
      AI is already “taking jobs” for example. IBM has stopped going forward with many thousands of positions because they are now obviated by AI tools. “Steerable” or not. And it is still getting better at a great rate, so don’t fall for the trap of thinking the beast you have to understand is the same beast you see right this very second. It will be different soon.

      1. First, I never said anything about weaponization. ANYTHING can be made into a weapon, with variable effectiveness or effect (e.g. “table salt” has been weaponized by its overuse in all processed foods, to the extent that we have a population with serious cardiovascular issues). I was talking about the engineering of a product that can be reliably reproduced, the hallmark of engineering (not warfare). Sure, we’ve seen plenty of times when the industry threw something at us before it was “ready” (e.g. name almost any initial version of Windows). My Echo is still giving me grief, mainly when its algorithm decides that what I said wasn’t what I wanted to say and interprets its own version, to my chagrin. And you literally can’t argue with an Echo; it’s a stupid, spoiled child (or at least those managing its algorithms are). What often works for me is to repeat the command, but.. with.. long.. pauses.. between.. words.. like one might do with a child or someone with insufficient cognitive abilities.

        So not weaponization, but real engineering issues was my concern. Should we give AI control of an AR-15 or other weapon? NO!!! Should we put in place ways to prevent that??? Damn straight!!!

        And remember kids, never look directly at the sun!

  8. “Google is very worried that Open Source LLMs will wipe the floor with both Google’s and OpenAI’s efforts.”

    I think wiping the floor would be a good future for Google.

  9. So the Open Source Community got its hands on AI source code? Good, look at what happened when the Open Source Community put their efforts towards an OS kernel released by some Finnish guy.

  10. That would be great. Finally mega corporations are on the losing side and the little guy comes out in front. We’re victimized by companies like Microsoft, Amazon, Koch, Chase and others. Enough is enough. We deserve our win at their expense and to Hell with them.

  11. Great news!
    Even if it’s just to scare them, it’s already very welcome. Strength to the OpenS community.

    Remember that the great concept of a “useful and totally free tool” is coming, and that will make them rethink their way of monetization and their social nature once and for all.

    An idea is a seed.
    It just needs to sprout.

    Those who live shall see!
