Malamud’s General Index: Research Gist, No Slap On The Wrist

November 2, 2021 by Kristina Panos

Tired of that unsettling feeling you get from looking for paywalled papers on that one site that shall not be named? Yeah, us too. But now there’s an alternative that should feel a little less illegal: this new index of the world’s research papers over on the Internet Archive.

It’s an index of words and short phrases (up to five words) culled from approximately 107 million research papers. The point is to make it easier for scientists to gain insights from papers that they might not otherwise have access to. The Index should also make computerized analysis of the world’s research easier. Call it a gist machine.

Technologist Carl Malamud created this index, which doesn’t contain the full text of any paper. Some of the researchers with early access to the Index said that it is quite helpful for text mining. The only real barrier to entry is that there is no web search portal for it: you have to download 5TB of compressed files and roll your own program. In addition to sentence fragments, the files contain 20 billion keywords and tables with the papers’ titles, authors, and DOI numbers, which will help users locate the full paper if necessary.
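To get a feel for what an index of "words and short phrases (up to five words)" looks like, here is a minimal Python sketch. It is not Malamud's pipeline and it says nothing about the actual file layout of the General Index; the function names, the tokenizer, and the placeholder DOIs are all our own assumptions for illustration.

```python
import re
from collections import Counter

def ngrams(text, max_n=5):
    """Yield every run of 1..max_n consecutive words in text."""
    # Crude tokenizer (an assumption): lowercase runs of letters and digits.
    words = re.findall(r"[a-z0-9]+", text.lower())
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])

def build_index(papers, max_n=5):
    """Map each n-gram to a Counter of {paper_id: occurrences}."""
    index = {}
    for paper_id, text in papers.items():
        for gram in ngrams(text, max_n):
            index.setdefault(gram, Counter())[paper_id] += 1
    return index

# Placeholder identifiers and text, made up for this example.
papers = {
    "doi:10.0000/example-1": "Graphene oxide films conduct heat well",
    "doi:10.0000/example-2": "Thin graphene oxide films resist corrosion",
}
index = build_index(papers)

# Which papers mention this three-word phrase?
print(sorted(index["graphene oxide films"]))
```

Note that the index records only short fragments and where they occur. No six-word run from either toy "paper" is stored, which is the same reason the real thing can't be used to reassemble a full text.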

Nature’s write-up raises a salient question: how could Malamud have made this index without access to all of those papers, paywalled and otherwise? Malamud admits that he had to get copies of all 107 million articles in order to build the thing, and that they are safe inside an undisclosed location somewhere in the US. He released the files under Public Resource, a non-profit he founded in Sebastopol, CA. But we have to wonder how different this really is from, say, the Google Books N-Gram Viewer, or Google Scholar. Is the difference that Google is big enough to say they’re big enough to get away with it?

If this whole thing reminds you of another defender of free information, remember that you can (and should) remove the DRM from his e-book of collected writings.

Via r/technology

Indexing Chuck Not Required

January 25, 2018 by Brian McEvoy

Becoming accomplished with a lathe is a powerful skillset, but it’s only half of the journey. Being clever comes later, and it’s the second part of the course. Patience is in there somewhere too, but let’s focus on being clever. [TimNummy] wants a knobbed bolt with critical parameters, so he makes his own. After the break, there is a sixty-second summary of the linked video.

Making stock hardware is a beginner’s task, but custom hardware requires ingenuity or expensive machinery. Adding finger notches to a bolthead is trivial with an indexing chuck, but one isn’t available. Instead, hex stock becomes a jig, and the flat sides are used to hold the workpiece at six evenly spaced angles. We can’t argue with the results, which look like a part that would cost a pretty penny.

Using material found in the workshop is what being clever is all about. Hex brass stock comes with tight tolerances on the sides and angles, so why not take advantage of that?

[TimNummy] can be seen on HaD for his Jeep dome light hack and an over-engineered mailbox flag. Did you miss [Quinn Dunki]’s piece on bootstrapping precision machine tools? Go check that out!

