In the practice of statistics, most problems involving a significance test (a z or t test), finding a probability, or determining a confidence interval require the use of normal approximations. Many populations are roughly normally distributed, which allows a random sample to be taken and tested without much concern. However, some populations simply don't fit the description of "normal." They don't follow a bell curve, and this lack of normality can leave a statistician wondering how to determine whether normal approximations are appropriate.
It is at this point that one of the most important theorems in statistics comes into play. The Central Limit Theorem, first stated in general form by Pierre-Simon Laplace in the early 19th century (building on earlier work by Abraham de Moivre), says that as long as certain important conditions are satisfied, the sampling distribution of the sample mean becomes more nearly normal as the sample size increases. An example of this is shown below. [Figure: histograms of sample means for increasing N] As the figure shows, as N increased, the distribution of sample means began to look more like a bell curve, which is just another way of saying "normal."
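The pattern described above can be sketched in a few lines of Python using only the standard library. The idea is an assumption chosen for illustration: we draw from a heavily skewed (exponential) population, compute many sample means for each sample size N, and measure the skewness of those means; as N grows, the skewness shrinks toward zero, i.e. the distribution of means becomes more symmetric and bell-shaped.

```python
# Sketch of the Central Limit Theorem: means of samples from a skewed
# population become more symmetric (skewness near 0) as N increases.
import random
import statistics

random.seed(0)

def sample_mean_skewness(n, trials=2000):
    """Skewness of the distribution of means of samples of size n,
    drawn from an exponential population (population skewness = 2)."""
    means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
             for _ in range(trials)]
    mu = statistics.fmean(means)
    sd = statistics.pstdev(means)
    return statistics.fmean(((m - mu) / sd) ** 3 for m in means)

for n in (1, 5, 30):
    print(f"N = {n:2d}: skewness of sample means = {sample_mean_skewness(n):.2f}")
```

Running this shows the skewness falling steadily as N goes from 1 to 30, which is exactly the "looking more like a bell curve" effect the figure illustrates.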
A number of conditions have been attached to the Central Limit Theorem over its history, and they are part of what makes it so reliable in practice. In order to use the theorem, the sample must be drawn randomly from the population, and the observations must be independent. A common check for independence when sampling without replacement is the 10% condition, which says that observations may be treated as independent as long as the sample makes up less than 10% of the population.
There is also the issue of deciding just how large a sample must be before normality can be assumed. As a rule of thumb, the Central Limit Theorem can be applied as long as the sample size (N) is greater than 30. If N is between 10 and 30, the theorem can still be applied, but only if the sample shows no outliers. For N under 10, a sample from a non-normally distributed population generally cannot be assumed to produce a normal sampling distribution.
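The conditions above amount to a short checklist, which can be sketched as a function. The name `clt_applies` and its exact thresholds are illustrative choices, not any standard library's API; in real work the rules of thumb are a starting point, and judgment about the actual data still matters.

```python
# A minimal checklist for the rules of thumb in the text: the 10% condition
# for independence, then the sample-size guidelines (N > 30; 10-30 only
# without outliers; under 10 only for normal populations).
def clt_applies(n, population_size, has_outliers=False, population_normal=False):
    """Rough rule-of-thumb check for using normal approximations."""
    # Independence: the 10% condition for sampling without replacement.
    if n > 0.10 * population_size:
        return False
    if population_normal:          # normal populations need no minimum N
        return True
    if n > 30:                     # large samples: CLT applies
        return True
    if 10 <= n <= 30:              # moderate samples: only without outliers
        return not has_outliers
    return False                   # small samples, non-normal population

print(clt_applies(50, 10_000))                        # large N, under 10% of population
print(clt_applies(20, 10_000, has_outliers=True))     # outliers with moderate N
print(clt_applies(1_000, 5_000))                      # violates the 10% condition
```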
There is a classroom experiment that neatly demonstrates the accuracy of the Central Limit Theorem. It involves finding the average year on samples of pennies, and the procedure is very simple. Every student in a class is given 25 pennies. To begin the experiment, each student randomly chooses 5 pennies and calculates the average year in which those pennies were minted. The class then pools its results to form a histogram of the sample means for N = 5.
After observing the shape of that distribution, the students repeat the experiment two more times, first with an N value of 10 and then with an N value of 25. Comparing the histograms for the three sample sizes, the distribution with the highest N not only has a smaller spread than the other two but is also much more normally shaped and centered near the median. The distributions in which N equaled 5 and 10 were both somewhat skewed to the right.
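The penny experiment can be simulated. The distribution of penny years here is an assumption made for illustration (mostly recent coins with an exponentially decaying tail of old ones, a right-skewed shape like real pocket change); the point is that the spread of the class's sample means shrinks as N grows from 5 to 25, just as the histograms show.

```python
# Simulated penny experiment: each "student" averages the mint years of
# n pennies; the spread of those class averages shrinks as n increases.
import random
import statistics

random.seed(1)

def penny_year():
    """One penny's mint year: mostly recent, with a long tail of old coins.
    (The exponential tail is an illustrative assumption.)"""
    return 2020 - min(int(random.expovariate(1 / 8)), 60)

def spread_of_class_means(n, students=30):
    """Std. dev. of the sample means when each student averages n pennies."""
    means = [statistics.fmean(penny_year() for _ in range(n))
             for _ in range(students)]
    return statistics.pstdev(means)

for n in (5, 10, 25):
    print(f"N = {n:2d}: spread of class means = {spread_of_class_means(n):.2f}")
```

The shrinking spread follows the familiar 1/sqrt(N) behavior of the standard deviation of a sample mean, which is why the N = 25 histogram looks so much tighter and more bell-shaped than the N = 5 one.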
The Central Limit Theorem is apparent here: an increasing N led to a more normal distribution of the sample means. The theorem has enabled many statistical discoveries that have provided the entire world with new, important information. By creating a way to perform tests and experiments on populations that lack normal distributions, Laplace, Poisson, Lindeberg, and every other mathematician who contributed managed to change the practice of statistics forever.