Living In The (LLM) Past

In the early days of AI, a common example program was the hexapawn game. This extremely simplified version of a chess program learned to play with your help. When the computer made a bad move, you’d punish it. However, people quickly realized they could punish good moves to ensure they always won against the computer. Large language models (LLMs) seem to know “everything,” but everything is whatever happens to be on the Internet, seahorse emojis and all. That got [Hayk Grigorian] thinking, so he built TimeCapsule LLM, an AI trained only on historical data.

Sure, you could tell a modern chatbot to pretend it was in, say, 1875 London and answer accordingly. However, you have to remember that chatbots are statistical in nature, so they could easily slip in modern knowledge. Since TimeCapsule only knows data from 1875 and earlier, it will happily tell you that travel to the moon is impossible, for example. If you ask a traditional LLM to roleplay, it will often hint at things you know to be true but that no one of that particular time period would have known.

Chatting with ChatGPT and telling it that it was a person living in Glasgow in 1200 limited its knowledge somewhat. Yet it was also able to hint about North America and the existence of the atom. Granted, the Norse apparently found North America around the year 1000, and Democritus wrote about indivisible matter in the fifth century. But that knowledge would not have been widespread among common people in the year 1200. Training on period texts would surely give a better representation of a historical person.

The model uses texts from 1800 to 1875 published in London. In total, there is about 90 GB of text files in the training corpus. Is this practical? There is academic interest in recreating period-accurate models to study history. Some also see it as a way to track both biases of the period and contrast them with biases found in data today. Of course, unlike the Internet, surviving documents from the 1800s are less likely to have trivialities in them, so it isn’t clear just how accurate a model like this would be for that sort of purpose.
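At its core, building such a corpus comes down to a publication-date cutoff. Here is a minimal sketch of that filtering step, assuming each document carries publication-year metadata; the actual TimeCapsule pipeline is not described here, so the names and structure below are purely illustrative:

```python
# Hypothetical sketch: keep only documents published inside the target
# window (the real TimeCapsule corpus tooling may work quite differently).
CUTOFF_YEAR = 1875

def in_period(doc: dict, start: int = 1800, end: int = CUTOFF_YEAR) -> bool:
    """Return True only for documents dated within the target window."""
    year = doc.get("year")
    return year is not None and start <= year <= end

docs = [
    {"title": "The Times leader", "year": 1860},
    {"title": "Wireless telegraphy primer", "year": 1902},  # post-cutoff: dropped
    {"title": "Undated pamphlet", "year": None},            # unknown date: dropped
]

corpus = [d for d in docs if in_period(d)]
```

Note that documents with unknown dates are dropped rather than guessed at, since a single anachronistic text could leak later knowledge into the model.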

Instead of reading the news, LLMs can write it. Just remember that the statistical nature of LLMs makes them easy to manipulate during training, too.


Featured Art: Royal Courts of Justice in London about 1870, Public Domain

23 thoughts on “Living In The (LLM) Past”

  1. If they did this training, while doing a dedicated parallel course of training on similar modern texts, with a lot of personal and biographical and (especially?) professional data about the authors of the modern training texts, it is not that far-fetched to think that a chatbot which is representative of a typical author of the antique era could be produced. A common street person, probably not so much,

    though there is plenty of sensational garbage in our culture to mirror the broadsheets and ballads of the middle ages as well, I would think.

    1. Which begs the question: how many diaries of ancestors are forgotten in attics and never digitized? Previously those had little value, but for this project they would. Certainly if you look at gender bias: diaries from average women of that time would improve the model. Hopefully those diaries will be rescued and digitized so those personal writing styles and memories won’t be lost.

    2. “it is not that far-fetched to think that a chatbot which is representative of a typical author of the antique era could be produced.”

      Uh. No. It’d produce author-like output from a typical author of that time. Authors don’t talk like they write. They’re not trying to be conversational. Even when you’re writing letters back and forth with someone, you’re not writing the way you converse.

      The best examples you could get would be from plays, but even there it’s limited.

  2. This is great, both as a DIY project in its own right, and also as a helpful way to show how the sausage is made. Too many of my technical friends are duping themselves into believing LLMs are experiencing the world, despite demonstrably neither learning nor mutating. Looking at how the training data is processed to become a model helps dismiss the illusion.

  3. I find this concept fascinating. Who knows what knowledge lies buried in pre-1900 text. If enough newspapers, books, letters, and periodicals were assembled, there might actually be enough data to train a model that performs well.

    1. I was originally going to say that this wouldn’t happen, as we’ve refined and grown our knowledge over time. However, I just recalled that someone discovered something decades before it found any use, and at the time it was only thought of as a curiosity. It was someone much later rediscovering the work and realising it could solve a massive modern problem. I’m heading to bed now; it might come to me in my sleep.

  4. I can see a lot of interesting uses for this. It’s always been fun to speculate about the “Great Man” vs “social inevitability” theories of history. We could explore with this by giving an 1800s LLM an interface it can call to “perform a physics experiment” (i.e. the 1800s LLM describes an experimental setup and a modern LLM responds with realistic experiment results). Then see whether the 1800s LLMs can “invent” quantum mechanics on their own.

    If it gets there relatively quickly, using “experiments” similar to those of Bohr, Planck, Rutherford et al., then we would have some evidence that quantum mechanics was a logical next step as opposed to a miraculous leap.

    Of course, it was difficult and expensive to get published in the 1800s, so the training corpus doesn’t exactly reflect the full milieu of the time.
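    The two-model setup proposed above could be sketched as a simple dialogue loop: the period-limited model proposes an experiment, and a modern model plays the role of the laboratory, reporting back realistic results. Both models are stubbed out with placeholder functions here, and every name is hypothetical:

```python
# Illustrative sketch of the proposed loop. Real LLM calls would replace
# these stand-in functions; nothing here reflects an actual implementation.
def period_model(transcript):
    """Stand-in for the 1800s-trained LLM: propose the next experiment."""
    return f"experiment {len(transcript) + 1}: vary the cathode voltage"

def simulator_model(proposal):
    """Stand-in for the modern LLM acting as 'nature'/the laboratory."""
    return f"observed result for '{proposal}'"

def run_dialogue(rounds=3):
    """Alternate proposals and results, accumulating a transcript."""
    transcript = []
    for _ in range(rounds):
        proposal = period_model(transcript)
        transcript.append((proposal, simulator_model(proposal)))
    return transcript
```

    Feeding the growing transcript back to the period model is what would let it (in principle) build on earlier results, which is the interesting part of the thought experiment.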

    1. +1

      The LLMs can’t simulate the wandering mind of a dreamer or a novelist, though.
      For this to happen, it would need to read between the lines and apply knowledge and experience from totally different fields.
      Including children’s stories, lullabies and fairy tales. ;)
      And even that doesn’t suffice, maybe, because some real people had real hallucinations that might have inspired them (ahem).

      As is, the resulting LLM will rather be a mirror of the scientific society of a given era, at best.
      A manifestation of things that have been accepted as valid and true by the masses of the day.
      It will be the equivalent of common knowledge of the ambitious readers of newspapers and scientific books of the given time, maybe.

      Which, on its own, can turn out to be very interesting and entertaining, though.
      Especially if it manages to talk as eloquently as these ladies and gentlemen used to do.
      Assuming that users of today still have the mental capacity to follow. 🥲

    2. as Bohr, Planck, Rutherford et al

      The fact that you’re listing multiple people here is already kinda disproving the “great man” theory…

      The argument goes that the mode of development shifts from individuals to groups as social complexity goes up. It was easier for a single inventor to make groundbreaking discoveries when humanity knew basically nothing, as opposed to today when there’s too much knowledge for any one person to hold in their head.

    3. Nope. No chance. Not a prayer.

      The reason why something like this is going to be inherently limited is that “written works of the 1800s” is not the Internet. Not just in terms of scale, but in terms of what they were intended for. We use the Internet for basic communication. Like, extremely basic. You email people that are 10 feet from you.

      Nowadays papers get published for practically anything, because it’s cheap and easy and it’s the best way to document and preserve information. They did not do that then. Most learning and development happened in person, which is why you got development in clusters and bursts. You didn’t communicate via papers. You published results in papers. But the development was either in person or private communication.

      Go and read papers from the 1800s. They’re out there, and they’ve been translated. They don’t read like modern papers. No plots. Few equations. Lots of “obviously” and “it is clear” and “can easily see.” You get tons of results that happen shortly after physical society meetings. And you don’t have records of communication there. Because they were in person. It’s even worse if you look at actual experimental descriptions and such. Good luck even understanding what they were doing. You weren’t expected to be able to replicate stuff from papers.

      I mean, LLMs just absolutely suck at understanding modern research and they’ve got the benefit of a much larger dataset. The place where they’re most useful in modern science is essentially data mining – if you try to find meaning from something that’s totally not understood by scientists now, there’s just nothing there. Because there’s nothing to find.

    4. A significant factor I believe LLMs would struggle with here is taking into account the manufacturing abilities of the time, particularly in terms of accuracy and consistency.

      Measuring anything & everything with any level of precision was so much more difficult at the time; just consider relying on the Greenwich Time Lady to set your clock.

  5. This is great. Mainstream LLMs certainly are bad at anachronisms. This would actually serve as a good model to train bigger models against to learn how to not have that problem.

    It’s also very close to another thing we need: an LLM trained only on public domain works. It just needs to be put through some proper reinforcement learning to be helpful.

  6. [list games]

    Chess
    Checkers
    Backgammon
    Poker
    Fighter Combat
    Guerilla Engagement
    Desert Warfare
    Air-To-Ground Actions
    Theaterwide Tactical Warfare
    Theaterwide Biotoxin and Chemical Warfare
    Global Thermonuclear War
