Since ELIZA was created by [Joseph Weizenbaum] in the 1960s, its success had led to many variations and ports being written over the intervening decades. The goal of the ELIZA Archaeology Project by Stanford, USC, Oxford and other university teams is to explore and uncover as much of this history as possible, starting with the original 1960s code. As noted in a recent blog post by [Anthony Hay], most of the intervening ‘ELIZA’ versions seem to have been more inspired by the original rather than accurate replicas or extensions of the original. This raises the question of what the original program really looked like, a question which wasn’t answered until 2020 when the original source code was rediscovered. Continue reading “The ELIZA Archaeology Project: Uncovering The Original ELIZA”
Software Development696 Articles
Filters Are In Bloom
If you are a fan of set theory, you might agree there are two sets of people who write computer programs: those who know what a Bloom filter is and those who don’t. How could you efficiently test to see if someone is one set or another? Well, you could use a Bloom filter. [SamWho] takes us through the whole thing in general terms that you could apply in any situation.
The Bloom filter does perform a trade-off for its speed. It is subject to false positives but not false negatives. That is, if a Bloom filter algorithm tells you that X is not part of a set, it is correct. But if it tells you it is, you may have to investigate more to see if that’s true.
If it can’t tell you that something is definitely in a set, why bother? Usually, when you use a Bloom filter, you want to reduce searching through a huge amount of data. The example in the post talks about having a 20-megabyte database of “bad” URLs. You want to warn users if they enter one, but downloading that database is prohibitive. But a Bloom filter could be as small as 1.8 megabytes. However, there would be a 1 in 1000 chance of a false positive.
Increase the database size to 3.59 megabytes, and you can reduce false positives to one in a million. Presumably, if you got a positive, you could accept the risk it is false, or you could do more work to search further.
Imagine, for example, a web cache device or program. Many web pages are loaded one time and never again. If you cache all of them, you’ll waste a lot of time and push other things out of the cache. But if you test a page URL with a Bloom filter, you can improve things quite a bit. If the URL may exist in the Bloom filter, then you’ve probably seen it before, so you might want to cache it.
If it says you haven’t, you can add it to the filter so if it is ever accessed again, it will cache. Sure, sometimes a page will show a false positive. So what? You’ll just cache the page on the first time, which is what you did before, anyway. If that happens only 0.1% of the time, you still win.
In simple terms, the Bloom filter hashes each item using three different algorithms and sets bits in an array based on the result. To test an item, you compute the same hashes and see if any of the corresponding bits are set to zero. If so, the item can’t be in the set. Of course, there’s no assurance that all three bits being set means the set contains the item. Those three bits might be set for totally different items.
Why does increasing the number of bits help? The post answers that and looks at other optimizations like a different number of hash functions and counting.
The post does a great job of explaining the filter but if you want a more concrete example in C, you might want to read this post next. Or search for code in your favorite language. We’ve talked about Python string handling with Bloom filters before. We’ve even seen a proposal to add them to the transit bus.
WoWMIPS: A MIPS Emulator For Windows Applications
When Windows NT originally launched it had ports to a wide variety of platforms, ranging from Intel’s x86 and i860 to DEC’s Alpha as well as the MIPS architecture. Running Windows applications written for many of these platforms is a bit tricky these days, which [x86matthew] saw as a good reason to write a MIPS emulator. This isn’t just any old emulator, though. It maps 32-bit Windows applications targeted at the MIPS R4000 CPU to an x86 CPU instead. Since both platforms run in a little-endian, 32-bit mode, this theoretically should be a walk in the park.
The use of the Windows PE executable format is also the same, so the first task was to figure out how to load the MIPS PE binary in a way that made sense for an x86 platform. This involved some reverse-engineering of the MIPS ntdll.dll file to figure out how relocations on that platform were handled. Following this, the mapping of the instructions of the R4000 CPU to the (CISC) x86 ISA was pretty easy. Only Floating Point Unit (FPU) support was left as a future challenge. Memory access was left as direct access, meaning no sandboxing or isolation, for simplicity’s sake.
The final task was mapping the native API calls, which call almost directly into the underlying host Windows OS’s API, with a bit of glue logic. With all of this done, Windows NT applications originally written for 1990s MIPS ran just fine on a modern-day x86_64 PC running Windows — as long as you don’t need an FPU (for now).
Make Your Bookshelf Clickable
We’ll confess that we have a fondness for real books and plenty of them. So does [James], and he decided he needed a way to take a picture of his bookshelves and make each book clickable to find more information. This is one of those things that sounds fairly simple until you decide to do it. You can try an example of the results and then go back and read about the journey it took to get there.
There are several subtasks involved. First, you want to identify each book’s envelope. It wouldn’t do to click on the Joy of Cooking and get information about Remembrance of Things Past.
The next challenge is reading the title of the book. This can be tricky. Fonts differ. The book could be upside down. Some titles go cross the spine, but most go vertically. The remainder of the task is fairly easy. If you know the region and the title, you can easily find a link (for Google Books, in this case) and build an SVG overlay that maps the areas for each book to the right link.
Developing In Pascal On The Commodore 64 With Abacus Super Pascal 64
Most people associate the Commodore 64 with Commodore BASIC and precompiled applications, but it also had a number of alternative development environments produced for it. One of these was Super Pascal 64 by Abacus. A solid introduction to this software package is provided in a video tutorial by [My Developer Thoughts] on YouTube. This uses the Abacus Super Pascal 64 software and manual from the [Lyon Labs] website, which incidentally has a lot more development environments and operating systems for the C64 listed for your perusal.
Abacus’ Super Pascal supports the official Pascal language, requiring nothing more than a Commodore 64 and two Commodore 1541 floppy disk drives to get started. One FDD is for the Super Pascal software, which boots into the development environment, the other FDD and the disks in it are the target for the current project’s source code and compiled binary. Although the lack of support for FDDs other than the 1541 is somewhat odd, this comes presumably from the operating system nature of the development environment and the 1541 being by far the most common FDD for the C64.
Continue reading “Developing In Pascal On The Commodore 64 With Abacus Super Pascal 64”
How Many Time Zones Are There Anyway?
Nowadays, it’s an even bet that your newest project somehow connects to the Internet and, thus, to the world. Even if it doesn’t, if you share your plans, someone might reproduce your creation in some far distant locale. If your design uses time, you might need to think about time zones. Easy, right? That’s what [Zain Rizvi] thought until he tried to deploy something that converted between timezones. You can learn from his misconceptions thanks to a detailed post he provides.
You might think, “What’s the big deal?” After all, there’s UTC, and then there are 12 time zones ahead of UTC and 12 time zones later. But that’s not even close to true.
As [Zain] found out, there are 27 hours in a full-day cycle if you count UTC as one hour. Why? Because some islands in the Pacific wanted to be on the wrong side of the International Date Line. So there are a few extra zones to accommodate them.
You can’t even count on time zones being offset by an hour from the previous zone. Several zones have a half-hour offset from UTC (for example, India’s standard time is 5.5 hours from UTC). But surely the offset is always either a whole number or a number where the fractional part is 0.5, right?
Um, no. Nepal wants the sun to be directly over the mountain at noon, so it offsets by 45 minutes! [Zain] wonders — as we do — what would happen if the mountain shifted over time? Until 1940, Amsterdam used a 20-minute offset. Some cities are split with one half in one time zone and another in the other.
Of course, there are the usual problems with multiple names for each zone, both because many countries want their own zone and because the exact same zone is different in different languages. Having your own zone is not just for vanity, though. Daylight savings time rules will vary by zone and even, in some cases, only in certain parts of a zone. For example, in the United States, Arizona doesn’t change to daylight savings time. Oh, except for the Navajo Nation in Arizona, which does! Some areas observe daylight savings time that starts and ends multiple times during the year. Even if you observe daylight savings time, there are cases where the time shift isn’t an entire hour.
Besides multiple names, common names for zones often overlap. For example, in the United States, the Eastern Standard Time zone differs from Australia’s. Confused? You should be.
Maybe we should have more respect for multiple time zone clock projects. We’ve noticed these problems before when we felt sorry for the people who maintain the official time zone database.
Benchmarking Latency Across Common Wireless Links For MCUs
Although factors like bandwidth, power usage, and the number of (kilo)meters reach are important considerations with wireless communication for microcontrollers, latency should be another important factor to pay attention to. This is especially true for projects like controllers where round-trip latency and instant response to an input are essential, but where do you find the latency number in datasheets? This is where [Michael Orenstein] and [Scott] over at Electric UI found a lack of data, especially when taking software stacks into account. In other words, it was time to do some serious benchmarking.
The question to be answered here was specifically how fast a one-way wireless user interaction can be across three levels of payload sizes (12, 128, and 1024 bytes). The effective latency is measured from when the input is provided on the transmitter, and the receiver has processed it and triggered the relevant output pin. The internal latency was also measured by having a range of framework implementations respond to an external interrupt and drive a GPIO pin high. Even this test on an STM32F429 MCU already showed that, for example, the STM32 low-level (LL) framework is much faster than the stm32duino one.
Continue reading “Benchmarking Latency Across Common Wireless Links For MCUs”