The Singularity Isn’t Here… Yet

So, GPT-4 is out, and it’s all over for us meatbags. Hype has reached fever pitch, here in the latest and greatest of AI chatbots we finally have something that can surpass us. The singularity has happened, and personally I welcome our new AI overlords.

Hang on a minute though, I smell a rat, and it comes in defining just what intelligence is. In my time I’ve hung out with a lot of very bright people, as well as a lot of not-so-bright people who nonetheless think they’re very clever simply because they have a bunch of qualifications and diplomas. Sadly the experience hasn’t bestowed God-like intelligence on me, but it has given me a handle on the difference between intelligence and knowledge.

My premise is that we humans are conditioned by our education system to equate learning with intelligence, mostly because we have flaky CPUs and worse memory, and that makes learning something a bit of an effort. Thus when we see an AI, a machine that can learn everything because it has a decent CPU and memory, we’re conditioned to think of it as intelligent because that’s what our schools train us to do. In fact it seems intelligent to us not because it’s thinking of new stuff, but merely through knowing stuff we don’t because we haven’t had the time or capacity to learn it.

Growing up and making my earlier career around a major university I’ve seen this in action so many times, people who master one skill, rote-learning the school textbook or the university tutor’s pet views and theories, and barfing them up all over the exam paper to get their amazing qualifications. On paper they’re the cream of the crop, and while it’s true they’re not thick, they’re rarely the special clever people they think they are. People with truly above-average intelligence exist, but in smaller numbers, and their occurrence is not a 1:1 mapping with holders of advanced university degrees.

Even the examples touted of GPT’s brilliance tend to reinforce this. It can do the bar exam or the SAT test, thus we’re told it’s as intelligent as a school-age kid or a lawyer. Both of those qualifications follow our educational system’s flawed premise that education equates to intelligence, so as a machine that’s learned all the facts it follows my point above about learning by rote. The machine has simply barfed up what it has learned the answers are onto the exam paper. Is that intelligence? Is a search engine intelligent?

This is not to say that tools such as GPT-4 are not amazing creations that have a lot of potential to do good things aside from filling up the internet with superficially readable spam. Everyone should have a play with them and investigate their potential, and from that will no doubt come some very interesting things. Just don’t confuse them with real people, because sometimes meatbags can surprise you.

Computers For Fun

The last couple years have seen an incredible flourishing of the cyberdeck scene, and probably for about as many reasons as there are individual ’deck designs. Some people get really into the prop-making, some into scrapping old tech or reusing a particularly appealing case, and others simply into the customization possibilities. That’s awesome, and they’re all different motivations for making a computer that’s truly your own.

But I really like the motivation and sentiment behind [Andreas Eriksen]’s PotatoP. (Assuming that his real motivation isn’t all the bad potato puns.) This is a small microcomputer that’s built on a commonly available microcontroller, so it’s not a particularly powerful beast – hence the “potato”. But what makes up for that in my mind is that it’s running a rudimentary bare-metal OS of his own writing. It’s like he’s taken the cyberdeck’s DIY aesthetic into the software as well.

What I like most about the spirit of the project is the idea of a long-term project that’s also a constant companion. Once you get past a terminal and an interpreter – [Andreas] is using LISP for both – everything else consists of small projects that you can check off one by one, that maybe don’t take forever, and that are limited in complexity by the hardware you’re working on. A simple text editor, some graphics primitives, maybe a sound subsystem. A way to read and write files in flash. I don’t love LISP personally, but I love that it brings interactivity and independence from an external compiler, making the it possible to develop the system on the system, pulling itself up by its own bootstraps.

Pretty soon, you could have something capable, and completely DIY. But it doesn’t need to be done all at once either. With a light enough computer, and a good basic foundation, you could keep it in your backpack and play “OS development” whenever you’ve got the free time. A DIY play OS for a sandbox computing platform: what more could a nerd want?

ChatGPT, Bing, And The Upcoming Security Apocalypse

Most security professionals will tell you that it’s a lot easier to attack code systems than it is to defend them, and that this is especially true for large systems. The white hat’s job is to secure each and every point of contact, while the black hat’s goal is to find just one that’s insecure.

Whether black hat or white hat, it also helps a lot to know how the system works and exactly what it’s doing. When you’ve got the source code, either because it’s open-source, or because you’re working inside the company that makes the software, you’ve got a huge advantage both in finding bugs and in fixing them. In the case of closed-source software, the white hats arguably have the offsetting advantage that they at least can see the source code, and peek inside the black box, while the attackers cannot.

Still, if you look at the number of security issues raised weekly, it’s clear that even in the case of closed-source software, where the defenders should have the largest advantage, that offense is a lot easier than defense.

So now put yourself in the shoes of the poor folks who are going to try to secure large language models like ChatGPT, the new Bing, or Google’s soon-to-be-released Bard. They don’t understand their machines. Of course they know how the work inside, in the sense of cross multiplying tensors and updating weights based on training sets and so on. But because the billions of internal parameters interact in incomprehensible ways, almost all researchers refer to large language models’ inner workings as a black box.

And they haven’t even begun to consider security yet. They’re still worried about how to construct obscure background prompts that prevent their machines from spewing hate speech or pornographic novels. But as soon as the machines start doing something more interesting than just providing you plain text, the black hats will take notice, and someone will have to figure out defense.

Indeed, this week, we saw the first real shot across the bow: a hack to make Bing direct users to arbitrary (bad) webpages. The Bing hack requires the user to already be on a compromised website, so it’s maybe not very threatening, but it points out a possible real security difference between Bing and ChatGPT: Bing gives you links to follow, and that makes it a juicy target.

We’re right on the edge of a new security landscape, because even the white hats are facing a black box in the AI. So far, what ChatGPT and Codex and other large language models are doing is trivially secure – putting out plain text – but Bing is taking the first dangerous steps into doing something more useful, both for users and black hats. Given the ease with which people have undone OpenAI’s attempts to keep ChatGPT in its comfort zone, my guess is that the white hats will have their hands full, and the black-box nature of the model deprives them of their best hope. Buckle your seatbelts.

Simultaneous Invention, All The Time?

As Tom quipped on the podcast this week, if you have an idea for a program you’d like to write, all you have to do is look around on GitHub and you’ll find it already coded up for you. (Or StackOverflow, or…) And that’s probably pretty close to true, at least for really trivial bits of code. But it hasn’t always been thus.

I was in college in the mid 90s, and we had a lab of networked workstations that the physics majors could use. That’s where I learned Unix, and where I had the idea for the simplest program ever. It took the background screen color, in the days before wallpapers, and slowly random-walked it around in RGB space. This was set to be slow enough that anyone watching it intently wouldn’t notice, but fast enough that others occasionally walking by my terminal would see a different color every time. I assure you, dear reader, this was the very height of wit at the time.

With the late 90s came the World Wide Web and the search engine, and the world got a lot smaller. For some reason, I was looking for how to set the X terminal background color again, this time searching the Internet instead of reading up in a reference book, and I stumbled on someone who wrote nearly exactly the same random-walk background color changer. My jaw dropped! I had found my long-lost identical twin brother! Of course, I e-mailed him to let him know. He was stoked, and we shot a couple funny e-mails back and forth riffing on the bizarre coincidence, and that was that.

Can you imagine this taking place today? It’s almost boringly obvious that if you search hard enough you’ll find another monkey on another typewriter writing exactly the same sentence as you. It doesn’t even bear mentioning. Heck, that’s the fundamental principle behind Codex / CoPilot – the code that you want to write has been already written so many times that it will emerge as the most statistically likely response from a giant pattern-matching, word-word completion neural net model.

Indeed, stop me if you’ve read this before.

All About USB-C: Manufacturer Sins

People experience a variety of problems with USB-C. I’ve asked people online about their negative experiences with USB-C, and got a wide variety of responses, both on Twitter and on Mastodon. In addition to that, communities like r/UsbCHardware keep a lore of things that make some people’s experience with USB-C subpar.

In engineering and hacking, there’s unspoken things we used to quietly consider as unviable. Having bidirectional power and high-speed data on a single port with thousands of peripherals, using nothing but a single data pin – if you’ve ever looked at a schematic for a proprietary docking connector attempting such a feat, you know that you’d find horrors beyond comprehension. For instance, MicroUSB’s ID pin quickly grew into a trove of incompatible resistor values for anything beyond “power or be powered”. Laptop makers had to routinely resort to resistor and one-wire schemes to make sure their chargers aren’t overloaded by a laptop assuming more juice than the charger can give, which introduced a ton of failure modes on its own.

When USB-C was being designed, the group looked through chargers, OTG adapters, display outputs, docking stations, docking stations with charging functions, and display outputs, and united them into a specification that can account for basically everything – over a single cable. What could go wrong?

Of course, device manufacturers found a number of ways to take everything that USB-C provides, and wipe the floor with it. Some of the USB-C sins are noticeable trends. Most of them, I’ve found, are manufacturers’ faults, whether by inattention or by malice; things like cable labelling are squarely in the USB-C standard domain, and there’s plenty of random wear and tear failures.

I don’t know if the USB-C standard could’ve been simpler. I can tell for sure that plenty of mistakes are due to device and cable manufacturers not paying attention. Let’s go through the notorious sins of USB-C, and see what we can learn. Continue reading “All About USB-C: Manufacturer Sins”

Copyright Data, But Do It Right

Copyright law is a triple-edged sword. Historically, it has been used to make sure that authors and rock musicians get their due, but it’s also been extended to the breaking point by firms like Disney. Strangely, a concept that protected creative arts got pressed into duty in the 1980s to protect the writing down of computer instructions, ironically a comparatively few bytes of BIOS code. But as long as we’re going down this strange road where assembly language is creative art, copyright law could also be used to protect the openness of software as well. And doing so has given tremendous legal backbone to the open and free software movements.

So let’s muddy the waters further. Looking at cases like the CDDB fiasco, or the most recent sale of ADSB Exchange, what I see is a community of people providing data to an open resource, in the belief that they are building something for the greater good. And then someone comes along, closes up the database, and sells it. What prevents this from happening in the open-software world? Copyright law. What is the equivalent of copyright for datasets? Strangely enough, that same copyright law.

Data, being facts, can’t be copyrighted. But datasets are purposeful collections of data. And just like computer programs, datasets can be licensed with a restrictive copyright or a permissive copyleft. Indeed, they must, because the same presumption of restrictive copyright is the default.

I scoured all over the ADSB Exchange website to find any notice of the copyright / copyleft status of their dataset taken as a whole, and couldn’t find any. My read is that this means that the dataset is the exclusive property of its owner. The folks who were contributing to ADSB Exchange were, as far as I can tell, contributing to a dataset that they couldn’t modify or redistribute. To be a free and open dataset, to be shared freely, copied, and remixed, it would need a copyleft license like Creative Commons or the Open Data Commons license.

So I’ll admit that I’m surprised to have not seen permissive licenses used around community-based open data projects, especially projects like ADSB Exchange, where all of the software that drives it is open source. Is this just because we don’t know enough about them? Maybe it’s time for that to change, because copyright on datasets is the law of the land, no matter how absurd it may sound on the face, and the closed version is the default. If you want your data contributions to be free, make sure that the project has a free data license.

Now ChatGPT Can Make Breakfast For Me

The world is abuzz with tales of the ChatGPT AI chatbot, and how it can do everything, except perhaps make the tea. It seems it can write code, which is pretty cool, so if it can’t make the tea as such, can it make the things I need to make some tea? I woke up this morning, and after lying in bed checking Hackaday I wandered downstairs to find some breakfast. But disaster! Some burglars had broken in and stolen all my kitchen utensils! All I have is my 3D printer and laptop, which curiously have little value to thieves compared to a set of slightly chipped crockery. What am I to do!

Never Come Between A Hackaday Writer And Her Breakfast!

OK Jenny, think rationally. They’ve taken the kettle, but I’ve got OpenSCAD and ChatGPT. Those dastardly miscreants won’t come between me and my breakfast, I’m made of sterner stuff! Into the prompt goes the following query:

"Can you write me OpenSCAD code to create a model of a kettle?" Continue reading “Now ChatGPT Can Make Breakfast For Me”