People have gotten much savvier about computer security in the last decade or so. Most people know that sending a document with sensitive information in it is a no-no, so many people try to redact documents with varying levels of success. A common strategy is to replace text with a black box, but you sometimes see sophisticated users pixelate part of an image or document they want to keep private. If you do this for text, be careful. It is possible to unredact pixelated images through software.
It appears that the algorithm is pretty straightforward. It simply guesses letters, pixelates them, and matches the result. You do have to estimate the size of the pixelation, but that’s usually not very hard to do. The code is built using TypeScript and while the process does require a little manual preparation, there’s nothing that seems very difficult or that couldn’t be automated if you were sufficiently motivated.
You don’t see it as often as you used to, but there have been a slew of legal and government scandals where someone redacted a document by putting a black box over a PDF so it was hidden when printed but the text was still in the document. Older wordprocessors often didn’t really delete text, either, if you knew how to look at the files. The Facebook valuation comes to mind. Not to mention that the National Legal and Policy Center was stung with poor redaction techniques.