Leaked Internal Google Document Claims Open Source AI Will Outcompete Google And OpenAI

May 5, 2023

In the world of large language models (LLMs), the focus has long been on proprietary technologies from companies such as OpenAI (GPT-3 and GPT-4, ChatGPT, etc.), and increasingly from everyone else, from Google to Meta and Microsoft. What has remained underexposed in this whole discussion about which LLM does more things better are the efforts of hobbyists, unaffiliated researchers and everyone else you may find in open source LLM projects. According to a leaked document from a researcher at Google (anonymous, but apparently verified), Google is very worried that Open Source LLMs will wipe the floor with both Google’s and OpenAI’s efforts.

According to the document, after the open source community got their hands on the leaked LLaMA foundation model, motivated and highly knowledgeable individuals set to work taking a fairly basic model to new levels, where it could begin to compete with the offerings from OpenAI and Google. Major innovations address the scaling issues, allowing these LLMs to run on far less powerful systems (like a laptop or even a smartphone).

An important factor here is Low-Rank adaptation (LoRa), which massively cuts down the effort and resources required to fine-tune a model. Ultimately, as this document phrases it, Google and by extension OpenAI do not have a ‘secret sauce’ that makes their approaches better than anything the wider community can come up with. It is also noted that Meta has essentially won out here by having its LLM leak, since the OSS community has been improving on the Meta foundations, allowing Meta to benefit from those improvements in its products.
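
To give a feel for why low-rank adaptation is so much cheaper, here is a minimal sketch in plain Python. It assumes the usual formulation in which a frozen weight matrix W gets a trainable update (alpha / rank) · B · A, with B and A being two thin matrices. The layer size (4096 × 4096), the rank (8) and the function names are illustrative assumptions, not figures or code from the leaked document.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_param_counts(d_out, d_in, rank):
    """Trainable parameters for one weight matrix: full fine-tune vs. low-rank update."""
    full = d_out * d_in                # every entry of W is trained
    low_rank = rank * (d_out + d_in)   # only B (d_out x rank) and A (rank x d_in)
    return full, low_rank


def adapted_weight(w, b, a, alpha, rank):
    """Return W + (alpha / rank) * (B @ A); W itself is never modified."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    full, low_rank = lora_param_counts(4096, 4096, 8)
    print(full, low_rank, full // low_rank)  # 16777216 65536 256

    # Tiny worked example: 2x2 identity weight, rank-1 update.
    w = [[1.0, 0.0], [0.0, 1.0]]
    b = [[1.0], [2.0]]
    a = [[3.0, 4.0]]
    print(adapted_weight(w, b, a, alpha=2, rank=1))  # [[7.0, 8.0], [12.0, 17.0]]
```

With these assumed numbers, the low-rank update trains 256 times fewer parameters for that one matrix than full fine-tuning would. That is the kind of saving that puts fine-tuning within reach of hobbyist hardware.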

The dire prediction is thus that in the end the proprietary LLMs by Google, OpenAI and others will cease to be relevant, as the open source community will have steamrolled them into fine, digital dust. Whether this will indeed work out this way remains to be seen, but things are not looking up for proprietary LLMs.

(Thanks to [Mike Szczys] for the tip)

37 thoughts on “Leaked Internal Google Document Claims Open Source AI Will Outcompete Google And OpenAI”

Ostracus says:

May 5, 2023 at 7:10 pm

Seems to me though the big names have bigger data pools to draw upon (even if they end up paying for some of them), and people financially incentivized to curate the output for biases and other problem issues.

Reply
1. Andrew Peters says:
  
  May 5, 2023 at 9:33 pm
  
  Well, 4chan vs. Shia LaBeouf showed the world what a handful of highly motivated unpaid workers can do. Tech giants should be afraid of unfettered collaboration from individuals who value transparency and progress.
  
  Reply
  1. Nico the animal says:
    
    May 6, 2023 at 12:01 am
    
    4chan couldn’t even scientology. They are making inroads in lewd content, however.
    
    Reply
    1. Dude says:
      
      May 6, 2023 at 2:20 am
      
      That was 20 years ago.
      
      Reply
2. DoubleFacePalm says:
  
  May 6, 2023 at 2:40 pm
  
  I wouldn’t worry about it, if I were Google, et al. I am fully confident in their ability to twist laws, and pay off politicians, and supreme court justices to cut down any meaningful competition. Could always label it a homeland security risk, then rules no longer apply anyway.
  
  Reply
ian 42 says:

May 5, 2023 at 9:14 pm

I don’t see any way the big guys are going to ‘own’ this space – the barriers to entry are just too low, and the tech isn’t actually that hard…

Reply
1. TG says:
  
  May 6, 2023 at 7:29 pm
  
  Good. Gnon will not be denied his instrument. It will spur a worldwide pseudo-religious mania when most people (billions of them in fact) witness what a freed mind looks like for the first time in their lives.
  
  Reply
come2 says:

May 5, 2023 at 9:29 pm

Y’all are very optimistic about this. I thought the barrier to entry wasn’t low (because of the datasets, not of the algorithms).
However, if open-source AI wins in the end, it’s for the better: an AI-centered world with mainly proprietary AIs is a great idea for a dystopia.

Reply
𐂀 𐂅 says:

May 5, 2023 at 9:35 pm

Running gpt4all-lora-quantized-linux-x86 on an AMD Ryzen 7 5700G using the Linux 6.2.9 kernel works very well. No additional GPU card is required, as the CPU has a GPU built in that shares system RAM. If you want to run the bigger models, you can therefore add a lot more memory at a smaller cost than an extra GPU card with more VRAM.

Reply
Rick says:

May 5, 2023 at 10:46 pm

I’m very relieved to hear it. There are countless negatives to having such powerful technology gated off, and served to us through the cloud, with a hefty dose of manipulation baked in.

Reply
IanS says:

May 6, 2023 at 3:07 am

” Low-Rank adaptation (LoRa)”

Aaargh! Case sensitive FLA collision!

FLA = Four letter abbreviation.
LoRa = Long Range radio

Reply
1. tilk says:
  
  May 6, 2023 at 10:39 am
  
  Actually this is a misspelling (miscapitalizing?) by the post author; it’s LoRA.
  
  Reply
  1. fiddlingjunky says:
    
    May 8, 2023 at 9:18 am
    
    I’d argue that capitalization doesn’t effectively mitigate FLA.
    
    Reply
Anonymous Coward says:

May 6, 2023 at 3:37 am

I hope this news was fact-checked and its origins verified. There are a lot of villains in this world who would love to get their hands on the tech to help build the ultimate 1984. What better way than to promote open sourcing it? I’m all for open source, but *after* checking whether it has a whiff of the Manhattan Project. Our society exists because of communication, and this is messing with inter-human communication in a big way.

Reply
1. Upgrade pi-top [3] says:
  
  May 6, 2023 at 3:57 am
  
  Awkward if this internal report was actually produced – and leaked by – Google’s Bard… 😂
  In any case, since LLaMA and its weightings have been leaked under FOSS licences, they can be audited. Ironic that they should have been hosted on GitHub though, given that that is owned by Microsoft.
  
  Reply
Winston says:

May 6, 2023 at 5:31 am

https://i.pinimg.com/736x/97/e8/b7/97e8b77a21ba5c7b86b99cce378b8883--tribute.jpg

Reply
1. TG says:
  
  May 6, 2023 at 7:40 pm
  
  Let’s go
  
  Reply
Winston says:

May 6, 2023 at 5:35 am

“Low-Rank adaptation (LoRa)”

That acronym is already taken and trademarked by Semtech.

Reply
ONV says:

May 6, 2023 at 7:17 am

If I was Google, I’d be lobbying for AI development to become a “licenced” activity under the guise that the licence framework contains the structures to allow for “safe” development but really just puts the blockers on open source development that threatens their business model.

Reply
1. Barry says:
  
  May 6, 2023 at 7:43 am
  
  That would go just the way of governments trying to ban encryption. Sure the scum are still at it, targeting the big companies, but once the knowledge is out there it is always possible for a freedom-loving soul to make their own tool, and there are too many freedom-loving souls for the government to control them.
  
  Reply
2. N says:
  
  May 6, 2023 at 5:45 pm
  
  Musk already tried that right before OpenAI went public, it failed spectacularly
  
  Reply
3. TG says:
  
  May 6, 2023 at 7:40 pm
  
  Oh, don’t worry, they will try. That is the main function of any corpo that gets big enough; it maintains growth by infiltrating the state and destroying all competition, always under guise of “safety.”
  
  Reply
4. zoobab says:
  
  May 8, 2023 at 4:10 am
  
  “If I was Google, I’d be lobbying for AI development to become a “licenced” activity”
  
  Leave business to big business: this is what the EU is doing in many areas, with over-regulation, the introduction of software patents through the Unified Patent Court, the GDPR, the AI Act, the DSA, the DMA, the CRA, and countless other laws that hurt small companies.
  
  The EU is working on it with the AI Act, restricting by law what an open source model can do:
  
  https://www.theguardian.com/technology/2023/may/04/eu-urged-to-protect-grassroots-ai-research-or-risk-losing-out-to-us
  
  “In an open letter coordinated by the German research group Laion, or Large-scale AI Open Network, the European parliament was told that “one-size-fits-all” rules risked eliminating open research and development.
  
  “Rules that require a researcher or developer to monitor or control downstream use could make it impossible to release open-source AI in Europe,” which would “entrench large firms” and “hamper efforts to improve transparency, reduce competition, limit academic freedom, and drive investment in AI overseas”, the letter says.
  
  It adds: “Europe cannot afford to lose AI sovereignty. Eliminating open-source R&D will leave the European scientific community and economy critically dependent on a handful of foreign and proprietary firms for essential AI infrastructure.””
  
  Reply
5. None says:
  
  May 9, 2023 at 4:03 am
  
  If I was Google I would be keeping a low profile until the state, with its regulatory club, isn’t looking around. One thing I may try to do is leak documents that pretend that, in fact, I do not have that big an advantage over the competition. You can now officially call me paranoid, I guess.
  
  Reply
Thomas says:

May 6, 2023 at 7:34 am

Quite possibly the best “dire prediction” I’ve heard in a while. Bring on the end of proprietary (anything and everything)!

Reply
Art Mezins says:

May 6, 2023 at 7:39 am

Everyone ignores the bull in the china shop: AI has a long way to go to be really “steerable” (i.e. trained to do more than mimicry), let alone verifiable in the normal sense of that word. The only way to test AI is via “exhaustion”, i.e. every possible input, but that’s not possible. There are more anomalous combinations of input than meaningful ones. It’s as unreliable as politics.

Reply
1. TG says:
  
  May 6, 2023 at 7:36 pm
  
  And yet every culture in every place in every time has used politics, benefited and lost by it, and expended enormous resources in it. A weapon doesn’t have to be 100% verifiable and reliable to be lucrative. It’s very gen X to say “aw it’s just politics maaaaan it doesn’t even matter maaaaaaaaaan.” It still matters and if you ignore it you might wake up and find that your grandchildren don’t have a homeland anymore. Many such cases.
  And mimicry is also lucrative. Most people would be shocked if they figured out what percentage of people are ALSO merely parroting inputs. A true original thought is almost nonexistent. Perhaps AI simply mimics, but that is already a human quality and it does it better than all but the best humans.
  AI is already “taking jobs” for example. IBM has stopped going forward with many thousands of positions because they are now obviated by AI tools. “Steerable” or not. And it is still getting better at a great rate, so don’t fall for the trap of thinking the beast you have to understand is the same beast you see right this very second. It will be different soon.
  
  Reply
  1. Art Mezins says:
    
    May 7, 2023 at 8:20 am
    
    First, I never said anything about weaponization. ANYTHING can be made into a weapon, with variable effectiveness or effect (e.g. “table salt” has been weaponized by its overuse in all processed foods to the extent that we have a population that has serious cardiovascular issues). I was talking about the engineering of a product that can be reliably reproduced, the hallmark of engineering (not warfare). Sure, we’ve seen lots of times the industry threw something at us before it was “ready” (e.g. name almost any initial version of Windows). My Echo is still giving me grief (mainly when its algorithm decides that what I said wasn’t what I wanted to say and interprets its own version, to my chagrin). And you literally can’t argue with an Echo; it’s a stupid, spoiled child (or at least those managing its algorithms are). What often works for me is to repeat the command, but.. with.. long.. pauses.. between.. words.. like one might do to a child or someone with insufficient cognitive abilities.
    
    So not weaponization, but real engineering issues was my concern. Should we give AI control of an AR-15 or other weapon? NO!!! Should we put in place ways to prevent that??? Damn straight!!!
    
    And remember kids, never look directly at the sun!
    
    Reply
The Commenter Formerly Known As Ren says:

May 6, 2023 at 2:31 pm

“Google is very worried that Open Source LLMs will wipe the floor with both Google’s and OpenAI’s efforts.”

I think wiping the floor would be a good future for Google.

Reply
1. ytrewq says:
  
  May 7, 2023 at 4:17 am
  
  > I think wiping the floor would be a good future for Google.
  
  Hopefully not by acquiring iRobot.
  
  Reply
Hirudinea says:

May 6, 2023 at 3:17 pm

So the Open Source Community got its hands on AI source code? Good, look at what happened when the Open Source Community put their efforts towards an OS kernel released by some Finnish guy.

Reply
1. Ostracus says:
  
  May 7, 2023 at 5:40 am
  
  Supported by people that get PAID for what they do. In the meantime, FREE STUFF!
  
  Reply
K says:

May 6, 2023 at 5:43 pm

I don’t know if I believe it, but thank god if it’s true.

Reply
Man with no AI says:

May 7, 2023 at 7:45 pm

Thank you Maya for the very lucid and understandable writeup!
Much better than others I have seen.
https://simonwillison.net/2023/May/4/no-moat/

Reply
Mike Atencio says:

May 8, 2023 at 5:10 am

That would be great. Finally mega corporations are on the losing side and the little guy comes out in front. We’re victimized by companies like Microsoft, Amazon, Koch, Chase and others. Enough is enough. We deserve our win at their expense and to Hell with them.

Reply
fiddlingjunky says:

May 8, 2023 at 9:20 am

I’d like to believe this, but this honestly seems like a smokescreen Google is generating to shift focus from how far behind GPT their Bard is.

Reply
DFrota says:

May 11, 2023 at 9:23 am

Great news!
Even if it’s just to scare them, it’s already very welcome. Strength to the OpenS community.

Remembering that the Great Concept of “Useful and totally free tool” is coming and that will make them rethink their way of monetization and social nature once and for all.

An idea is a seed.
Just the seed to sprout.

Who lives will see!

Reply

Hackaday
