The central limit theorem states that even if the population is not normally distributed, the

If the population of all subscribers to the magazine were normal, you would expect its sampling distribution of means to be normal as well. But what if the population were non‐normal? The central limit theorem states that even if a population distribution is strongly non‐normal, its sampling distribution of means will be approximately normal for large sample sizes (over 30). The central limit theorem makes it possible to use probabilities associated with the normal curve to answer questions about the means of sufficiently large samples.
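The claim above can be checked with a small simulation. The sketch below is illustrative only: the exponential population, the sample size of 40, the number of repeats, and the helper name sample_means are all assumptions chosen for the demonstration, not part of the original example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable


def draw_skewed():
    """One draw from an assumed strongly right-skewed population (exponential, mean 1)."""
    return random.expovariate(1.0)


def sample_means(draw, n, repeats=5000):
    """Return the means of `repeats` independent samples of size n."""
    return [statistics.fmean(draw() for _ in range(n)) for _ in range(repeats)]


means = sample_means(draw_skewed, n=40)
centre = statistics.fmean(means)
spread = statistics.stdev(means)

# For a normal curve, roughly 68% of values lie within one standard deviation.
within_one_sd = sum(abs(m - centre) <= spread for m in means) / len(means)

print(f"mean of sample means: {centre:.3f}")
print(f"share within 1 SD:    {within_one_sd:.3f}")
```

If the sampling distribution is close to normal, the share within one standard deviation should land near 0.68 even though the population itself is far from bell-shaped.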

The mean of a sampling distribution of means is equal to the population mean. In other words, the sample mean is an unbiased estimator of the population mean.

μx = μ

Similarly, the standard deviation of a sampling distribution of means is

σx = σ / sqrt(n)

where σ is the population standard deviation and n is the sample size.

Note that the larger the sample, the less variable the sample mean. The mean of many observations is less variable than the mean of few. The standard deviation of a sampling distribution of means is often called the standard error of the mean. Every statistic has a standard error, which is a measure of the statistic's random variability.

If the population mean of number of fish caught per trip to a particular fishing hole is 3.2 and the population standard deviation is 1.8, what are the mean and standard deviation of the sampling distribution for samples of size 40 trips?

The mean of the sampling distribution equals the population mean, 3.2. The standard deviation of the sampling distribution (the standard error) is 1.8 / sqrt(40), or about 0.285.
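The fishing-hole question above can be worked through in Python. This is a minimal sketch; the function name sampling_distribution_of_mean is a hypothetical helper introduced for illustration, and the extra sample sizes in the loop are assumptions added to show how the standard error shrinks as the sample grows.

```python
import math


def sampling_distribution_of_mean(pop_mean, pop_sd, n):
    """Return (mean, standard error) of the sample mean for samples of size n."""
    return pop_mean, pop_sd / math.sqrt(n)


mean, se = sampling_distribution_of_mean(3.2, 1.8, 40)
print(f"n = 40: mean = {mean}, standard error = {se:.3f}")  # standard error is about 0.285

# Illustrative extra sample sizes: a larger sample gives a less variable mean.
for n in (10, 160):
    _, se_n = sampling_distribution_of_mean(3.2, 1.8, n)
    print(f"n = {n}: standard error = {se_n:.3f}")
```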

Central Limit Theorem

The central limit theorem states that the sampling distribution of the mean of any independent, random variable will be normal or nearly normal if the sample size is large enough.

How large is "large enough"? The answer depends on two factors.

  • The shape of the underlying population. The more closely the original population resembles a normal distribution, the fewer sample points will be required.

In practice, some statisticians say that a sample size of 30 is large enough when the population distribution is roughly bell-shaped. Others recommend a sample size of at least 40. But if the original population is distinctly not normal (e.g., is badly skewed, has multiple peaks, and/or has outliers), researchers like the sample size to be even larger.
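One way to get a feel for "large enough" is to watch the skewness of the sample means fall as the sample size grows. The sketch below is an illustration under assumptions: the exponential population, the sample sizes, the number of repeats, and the helper names skewness and skew_of_sample_means are all chosen for the demonstration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable


def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


def skew_of_sample_means(n, repeats=4000):
    """Skewness of the means of `repeats` samples of size n from an exponential population."""
    means = [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(repeats)
    ]
    return skewness(means)


skews = {n: skew_of_sample_means(n) for n in (2, 10, 30, 100)}
for n, value in skews.items():
    print(f"n = {n:>3}: skewness of sample means = {value:.2f}")
```

For a badly skewed population like this one, the sample means are still visibly skewed at small n and only approach symmetry as n increases, which is why larger samples are preferred when the population is far from normal.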

T-Distribution vs. Normal Distribution

The t distribution and the normal distribution can both be used with statistics that have a bell-shaped distribution. This suggests that we might use either the t-distribution or the normal distribution to analyze sampling distributions. Which should we choose?

Guidelines exist to help you make that choice. Some focus on the population standard deviation.

  • If the population standard deviation is unknown, use the t-distribution.

Other guidelines focus on sample size.

  • If the sample size is small, use the t-distribution.

In practice, researchers employ a mix of the above guidelines. On this site, we use the normal distribution when the population standard deviation is known and the sample size is large. We might use either distribution when standard deviation is unknown and the sample size is very large. We use the t-distribution when the sample size is small, unless the underlying distribution is not normal. The t distribution should not be used with small samples from populations that are not approximately normal.
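The mix of guidelines described above can be written down as a small decision function. This is only a sketch: the function name choose_distribution is hypothetical, and the numeric cut-offs for "large" (30) and "very large" (100) are assumptions made for illustration, since the text does not give exact thresholds.

```python
def choose_distribution(sigma_known, n, population_roughly_normal,
                        large_n=30, very_large_n=100):
    """Return 'normal', 't', 'either', or 'neither' following the guidelines above.

    large_n and very_large_n are illustrative cut-offs, not values from the text.
    """
    if n < large_n:
        # Small sample: use t, but only if the population is roughly normal.
        return "t" if population_roughly_normal else "neither"
    if sigma_known:
        # Known population standard deviation and a large sample.
        return "normal"
    # Unknown standard deviation: t by default; at very large n the two
    # distributions are nearly identical, so either may be used.
    return "either" if n >= very_large_n else "t"


print(choose_distribution(sigma_known=True, n=50, population_roughly_normal=False))
print(choose_distribution(sigma_known=False, n=12, population_roughly_normal=True))
print(choose_distribution(sigma_known=False, n=12, population_roughly_normal=False))
```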

Test Your Understanding

In this section, we offer two examples that illustrate how sampling distributions are used to solve common statistical problems. In each of these problems, the population standard deviation is known, and the sample size is large. So you should use the Normal Distribution Calculator, rather than the t-Distribution Calculator, to compute probabilities for these problems.


Example 1

Assume that a school district has 10,000 6th graders. In this district, the average weight of a 6th grader is 80 pounds, with a standard deviation of 20 pounds. Suppose you draw a random sample of 50 students. What is the probability that the average weight of a sampled student will be less than 75 pounds?

Solution: To solve this problem, we need to define the sampling distribution of the mean. Because our sample size is greater than 30, the Central Limit Theorem tells us that the sampling distribution will approximate a normal distribution.

To define our normal distribution, we need to know both the mean of the sampling distribution and the standard deviation. Finding the mean of the sampling distribution is easy, since it is equal to the mean of the population. Thus, the mean of the sampling distribution is equal to 80.

The standard deviation of the sampling distribution can be computed using the following formula.

σx = [ σ / sqrt(n) ] * sqrt[ (N - n ) / (N - 1) ] 
σx = [ 20 / sqrt(50) ] * sqrt[ (10,000 - 50 ) / (10,000 - 1) ] = (20/7.071) * (0.9975) = 2.82
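The standard error formula, including the finite population correction, can be sketched in Python as follows. The function name standard_error_mean and its parameter names are illustrative assumptions; passing no population size simply skips the correction.

```python
import math


def standard_error_mean(sigma, n, N=None):
    """Standard error of the sample mean.

    sigma: population standard deviation
    n:     sample size
    N:     population size; if given, the finite population correction is applied
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se


with_fpc = standard_error_mean(20, 50, N=10_000)
without_fpc = standard_error_mean(20, 50)
print(f"with correction:    {with_fpc:.2f}")     # about 2.82
print(f"without correction: {without_fpc:.2f}")  # about 2.83
```

Because the population is 200 times the sample size here, the correction changes the result only in the second decimal place.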

Let's review what we know and what we want to know. We know that the sampling distribution of the mean is approximately normal with a mean of 80 and a standard deviation of 2.82. We want to know the probability that a sample mean is less than 75 pounds.

Because we know the population standard deviation and the sample size is large, we'll use the normal distribution to find the probability. To solve the problem, we plug these inputs into the Normal Probability Calculator: mean = 80, standard deviation = 2.82, and normal random variable = 75. The Calculator tells us that the probability that the average weight of a sampled student is less than 75 pounds is equal to 0.038.
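The same probability can be obtained without an online calculator. The sketch below uses NormalDist from Python's standard statistics module; the variable names are illustrative.

```python
import math
from statistics import NormalDist

pop_mean, pop_sd = 80, 20   # pounds
n, N = 50, 10_000           # sample size and population size

# Standard error with the finite population correction.
se = pop_sd / math.sqrt(n) * math.sqrt((N - n) / (N - 1))

# Cumulative probability that the sample mean falls below 75 pounds.
prob = NormalDist(mu=pop_mean, sigma=se).cdf(75)
print(f"P(sample mean < 75) = {prob:.3f}")  # about 0.038
```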

Note: Since the population size is more than 20 times greater than the sample size, we could have used the "approximate" formula σx = [ σ / sqrt(n) ] to compute the standard error. Had we done that, we would have found a standard error equal to [ 20 / sqrt(50) ] or 2.83.

Example 2

Find the probability that of the next 120 births, no more than 40% will be boys. Assume equal probabilities for the births of boys and girls. Assume also that the number of births in the population (N) is very large, essentially infinite.

Solution: The Central Limit Theorem tells us that the proportion of boys in 120 births will be approximately normally distributed.

The mean of the sampling distribution will be equal to the mean of the population distribution. In the population, half of the births result in boys; and half, in girls. Therefore, the probability of boy births in the population is 0.50. Thus, the mean proportion in the sampling distribution should also be 0.50.

The standard deviation of the sampling distribution (i.e., the standard error) can be computed using the following formula.

σp = sqrt[ PQ/n ] * sqrt[ (N - n ) / (N - 1) ]

Here, the finite population correction is equal to 1.0, since the population size (N) was assumed to be infinite. Therefore, the standard error formula reduces to:

σp = sqrt[ PQ/n ] 
σp = sqrt[ (0.5)(0.5)/120 ] = sqrt[0.25/120 ] = 0.04564

Let's review what we know and what we want to know. We know that the sampling distribution of the proportion is normally distributed with a mean of 0.50 and a standard deviation of 0.04564. We want to know the probability that no more than 40% of the sampled births are boys.

Because we know the population standard deviation and the sample size is large, we'll use the normal distribution to find the probability. To solve the problem, we plug these inputs into the Normal Probability Calculator: mean = 0.5, standard deviation = 0.04564, and normal random variable = 0.4. The Calculator tells us that the probability that no more than 40% of the sampled births are boys is equal to 0.014.
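The steps of Example 2 can be reproduced in Python as a check. This is a minimal sketch using the standard library; the variable names are illustrative, and the finite population correction is omitted because the population is assumed to be effectively infinite.

```python
import math
from statistics import NormalDist

P = 0.5      # population proportion of boys
Q = 1 - P    # population proportion of girls
n = 120      # number of births sampled

# Standard error of the sample proportion.
se_p = math.sqrt(P * Q / n)

# Cumulative probability that the sample proportion is at most 0.40.
prob_p = NormalDist(mu=P, sigma=se_p).cdf(0.40)

print(f"standard error = {se_p:.5f}")            # about 0.04564
print(f"P(proportion <= 0.40) = {prob_p:.3f}")   # about 0.014
```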

Note: This problem can also be treated as a binomial experiment. Elsewhere, we showed how to analyze a binomial experiment. The binomial experiment is actually the more exact analysis. It produces a probability of 0.018 (versus a probability of 0.014 that we found using the normal distribution). Without a computer, the binomial approach is computationally demanding. Therefore, many statistics texts emphasize the approach presented above, which uses the normal distribution to approximate the binomial.
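With a computer, the exact binomial calculation is short. The sketch below sums the binomial probabilities for 0 through 48 boys (48 is 40% of 120); the function name binomial_cdf is a hypothetical helper written for this illustration.

```python
from math import comb


def binomial_cdf(k_max, n, p):
    """P(X <= k_max) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_max + 1))


n_births = 120
k_max = 48  # 40% of 120 births
exact = binomial_cdf(k_max, n_births, 0.5)
print(f"exact binomial probability = {exact:.3f}")  # about 0.018
```

The gap between the exact result and the normal approximation arises because the normal curve is continuous while the number of boys can take only whole-number values.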

What does the central limit theorem tell us about non-normal distributions?

The central limit theorem states that the sample means of moderately large samples are often well-approximated by a normal distribution even if the data are not normally distributed.

Does the population have to be normally distributed for the central limit theorem to apply?

The central limit theorem says that the sampling distribution of the mean will be approximately normal, as long as the sample size is large enough and the population has a finite variance. Whether the population has a normal, Poisson, binomial, or other such distribution, the sampling distribution of the mean will be approximately normal.

What if the population is not normally distributed?

If the population has a normal distribution, then the sample means will have a normal distribution. If the population is not normally distributed, but the sample size is sufficiently large, then the sample means will have an approximately normal distribution.

What happens if the central limit theorem is applied to a normally distributed population?

The central limit theorem states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normal.