Sunday, January 16, 2005

Benford's Law

When I was searching around for a fix to one of the problems I was facing at work, I came across one of my favourite weird facts in maths. It was a senior who introduced me to this (Thx PS).

In any data set of base-ten numbers, one would expect to see each digit occurring at each place with equal probability, 1/10 = 10%. For the leading digit, which cannot be 0, that becomes 1/9 ~ 11%. For example, in a huge set of numbers, one would expect to see as many 2's in the leading digit as 9's. This works well with fake data generated with a random number generator, but with naturally occurring data it generally isn't true.

It comes as a great surprise that, if the numbers under investigation are not entirely random but somehow socially or naturally related, the distribution of the first digit is not uniform. More accurately, digit D appears as the first digit with frequency log10(1 + 1/D). In other words, one may expect 1 to be the first digit of a number from such a data set in about 30% of cases, 2 in about 18% of cases, 3 in 12%, 4 in 10%, 5 in 8%, etc. This is known as Benford's Law.
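To see the formula in action, here is a small Python sketch. It tabulates log10(1 + 1/D) for each digit and compares it with the leading digits of the first 1000 powers of 2, a sequence I picked purely as an illustration because it is commonly cited as following the law. The function names are my own, not from any library.

```python
import math
from collections import Counter


def benford_probability(d: int) -> float:
    """Expected frequency of d (1..9) as a leading digit under Benford's Law."""
    return math.log10(1 + 1 / d)


def leading_digit(n: int) -> int:
    """First decimal digit of a positive integer."""
    return int(str(n)[0])


# Expected frequencies for digits 1..9.
expected = {d: benford_probability(d) for d in range(1, 10)}

# Illustrative data set (an assumption of this sketch): 2^1 .. 2^1000.
sample_size = 1000
counts = Counter(leading_digit(2 ** k) for k in range(1, sample_size + 1))

for d in range(1, 10):
    observed = counts[d] / sample_size
    print(f"{d}: expected {expected[d]:.3f}, observed {observed:.3f}")
```

The nine expected values add up to 1, so they form a proper probability distribution over the leading digit.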

This amazing fact was first discovered by American astronomer Simon Newcomb in 1881, a time when logarithm books were used for all complex mathematical calculations. Newcomb observed that the pages of the logarithm book covering numbers starting with 1 were worn much more than the other pages. After analyzing several sets of naturally occurring data, Newcomb went on to derive what later became Benford's Law. Newcomb was rewarded for his effort by being ignored.

In 1938, Frank Benford arrived at the same formula after a comprehensive investigation of listings of data covering a variety of natural phenomena. (Benford's original data table can be found on Eric Weisstein's Treasure Troves of Mathematics - Benford's Law page.) The law applies to budget, income tax or population figures as well as street addresses of people listed in the book American Men of Science.

"Following Benford's Law, or Looking Out for No. 1" by Malcolm W. Browne is an interesting read on the applications of this law. Here is an excerpt from it:

“Probability predictions are often surprising. In the case of the coin-tossing experiment, Dr. Hill wrote in the current issue of the magazine American Scientist, a "quite involved calculation" revealed a surprising probability. It showed, he said, that the overwhelming odds are that at some point in a series of 200 tosses, either heads or tails will come up six or more times in a row. Most fakers don't know this and avoid guessing long runs of heads or tails, which they mistakenly believe to be improbable. At just a glance, Dr. Hill can see whether or not a student's 200 coin-toss results contain a run of six heads or tails; if they don't, the student is branded a fake.”

Interesting huh?? Why don’t you try it out? It worked for me.
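If you would rather check the claim than toss a coin 200 times, here is a rough Python sketch that computes the chance of seeing a run of six or more identical outcomes somewhere in 200 fair tosses. This is my own small dynamic-programming calculation, not the "quite involved calculation" Dr. Hill refers to, and the function name is made up for this post. It assumes a fair coin and independent tosses.

```python
def prob_run_at_least(n_tosses: int, run_len: int) -> float:
    """Probability that n_tosses fair coin tosses contain a run of at least
    run_len identical outcomes (heads or tails)."""
    if n_tosses < 1 or run_len < 2:
        raise ValueError("need n_tosses >= 1 and run_len >= 2")

    # state[k] = probability that no run of run_len has occurred so far
    # and the current run has length k (1 <= k < run_len).
    state = [0.0] * run_len
    state[1] = 1.0  # after the first toss the current run has length 1

    for _ in range(n_tosses - 1):
        nxt = [0.0] * run_len
        for k in range(1, run_len):
            p = state[k]
            nxt[1] += 0.5 * p          # next toss differs: run restarts at 1
            if k + 1 < run_len:
                nxt[k + 1] += 0.5 * p  # next toss matches: run grows by 1
            # otherwise the run reaches run_len and that probability
            # mass leaves the "no long run yet" states
        state = nxt

    return 1.0 - sum(state)


print(f"P(run of 6+ in 200 tosses) = {prob_run_at_least(200, 6):.4f}")
```

The result comes out well above 95%, which fits the "overwhelming odds" in the excerpt: a 200-toss record with no run of six is the suspicious one.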


Not everything that counts can be counted and not everything that can be counted counts - Albert. E

