Statistics and Hacking: A Stout Little Distribution

Previously, we discussed how to apply the most basic hypothesis test: the z-test. It requires a relatively large sample size, which makes it less appealing to hackers searching for truth on a tight budget of time and money.

As an alternative, we briefly mentioned the t-test. The basic procedure still applies: form hypotheses, sample data, check your assumptions, and perform the test. This time, though, we’ll run the test programmatically rather than by hand, using real data from IoT sensors.

The most important difference between the z-test and the t-test is that the t-test uses a different probability distribution. It is called the ‘t-distribution’, and is similar in principle to the normal distribution used by the z-test, but was developed by studying the properties of small sample sizes. The precise shape of the distribution depends on your sample size.
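To get a feel for how the shape changes with sample size, here is a minimal sketch that evaluates the t-distribution's density using only the Python standard library. The function names `t_pdf` and `normal_pdf` are made up for this illustration, and it assumes a one-sample test, where the degrees of freedom are the sample size minus one.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with `df` degrees of freedom."""
    # Work in log space so large df values do not overflow the gamma function.
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


if __name__ == "__main__":
    x = 3.0  # a point three standard units out in the tail
    print(f"normal density at {x}: {normal_pdf(x):.5f}")
    for n in (3, 5, 10, 30, 100):
        df = n - 1  # assumption: one-sample test, so df = n - 1
        print(f"n = {n:3d}  t density at {x}: {t_pdf(x, df):.5f}")
```

With small samples the t-distribution puts noticeably more probability in the tails than the normal distribution does, and the gap shrinks as the sample size grows.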

Statistics and Hacking: An Introduction to Hypothesis Testing

In the early 20th century, Guinness breweries in Dublin had a policy of hiring the best graduates from Oxford and Cambridge to improve their industrial processes. At the time, it was considered a trade secret that they were using statistical methods to improve their process and product.

One problem they were having was that the z-test (a commonly used test at the time) required large sample sizes, and sufficient data was often unavailable. By studying the properties of small sample sizes, William Sealy Gosset developed a statistical test that required fewer samples to produce a reasonable result. As the story goes though, chemists at Guinness were forbidden from publishing their findings.

So he did what many of us would do: realizing the finding was important to disseminate, he adopted a pseudonym (‘Student’) and published it. Even though we now know who developed the test, it’s still called “Student’s t-test” and it remains widely used across scientific disciplines.

It’s a cute little story of math, anonymity, and beer… but what can we do with it? As it turns out, it’s something we could probably all be using more often, given the number of Internet-connected sensors we’ve been playing with. Today our goal is to cover hypothesis testing and the basic z-test, as these are fundamental to understanding how the t-test works. We’ll return to the t-test soon, with real data.

Bone Up on Your Multiplication Skills

John Napier was a Scottish physicist, mathematician, and astronomer who usually gets the credit for inventing logarithms. But his contributions to simplifying mathematics and building shorthand solutions didn’t end there. Faced with the many calculations these subjects demanded in the 1500s, Napier invented a kind of computing mechanism for multiplication. It’s a physical manifestation of an old system known as lattice multiplication or gelosia.

Lattice multiplication makes use of the multiplication table in order to multiply huge numbers together quickly and easily. It is thought to have originated in India and moved west into Europe. When the lattice method reached Italy, the Italians named it gelosia after the trellised window covering it resembled, which was commonly used to keep prying eyes away from one’s possessions and wife.
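As a rough illustration of the lattice method, the sketch below fills a grid with single-digit products, splits each product into tens and units, sums along the diagonals, and carries from right to left. The function name `lattice_multiply` is hypothetical, and the code handles non-negative integers only.

```python
def lattice_multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers using the lattice (gelosia) method."""
    top = [int(d) for d in str(a)]    # digits written across the top of the grid
    side = [int(d) for d in str(b)]   # digits written down the side of the grid

    # One diagonal per digit of the largest possible result.
    # Index 0 is the rightmost (units) diagonal.
    diagonals = [0] * (len(top) + len(side))

    for row, s in enumerate(side):
        for col, t in enumerate(top):
            # Each cell holds a times-table entry split into tens and units.
            tens, units = divmod(s * t, 10)
            # How far this cell sits from the bottom-right corner of the grid.
            k = (len(top) - 1 - col) + (len(side) - 1 - row)
            diagonals[k] += units      # lower triangle of the cell
            diagonals[k + 1] += tens   # upper triangle, one diagonal further left

    # Add up each diagonal from right to left, carrying into the next one.
    digits = []
    carry = 0
    for total in diagonals:
        carry, digit = divmod(total + carry, 10)
        digits.append(digit)

    return int("".join(str(d) for d in reversed(digits)))


if __name__ == "__main__":
    print(lattice_multiply(425, 613))
```

The only multiplications performed are single-digit ones, which is why the method needs nothing beyond the multiplication table; everything else is addition along the diagonals.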
