A distribution is symmetric if one side of the histogram is a mirror image of the other side.

Histograms are one of the most common graphs used to display numeric data. Anyone who takes a statistics course is likely to learn about the histogram, and for good reason: histograms are easy to understand and can instantly tell you a lot about your data.

Here are three of the most important things you can learn by looking at a histogram.

Shape—Mirror, Mirror, On the Wall…

If the left side of a histogram resembles a mirror image of the right side, then the data are said to be symmetric. In this case, the mean (or average) is a good approximation for the center of the data, so statistical tools that rely on the mean, such as t-tests, are reasonable choices for analyzing the data.

If the data are not symmetric, then they are either left-skewed or right-skewed. If the data are skewed, the mean may not provide a good estimate of the center of the data or represent where most of the data fall. In this case, you should consider using the median, rather than the mean, to evaluate the center of the data.

Did you know...

If the data are left-skewed, then the mean is typically LESS THAN the median.

If the data are right-skewed, then the mean is typically GREATER THAN the median.
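The relationship between mean and median under skew can be checked on a small sample. The sketch below uses made-up numbers (the lists `right_skewed` and `left_skewed` are illustrative assumptions, not data from this article) and only Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical right-skewed sample: most values are small, one is large.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]

# Mirror the sample around 21 to obtain a left-skewed one.
left_skewed = [21 - x for x in right_skewed]

right_mean, right_median = mean(right_skewed), median(right_skewed)
left_mean, left_median = mean(left_skewed), median(left_skewed)

# The long tail pulls the mean toward it; the median barely moves.
print("right-skewed: mean =", right_mean, "median =", right_median)
print("left-skewed:  mean =", left_mean, "median =", left_median)
```

In this sample the single large value drags the mean above the median, and mirroring the data reverses the relationship.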

Span—A Little or a Lot?

Suppose you have a data set that contains the salaries of people who work at your organization. It would be interesting to know where the minimum and maximum values fall, and where you are relative to those values. Because histograms use bins to display data—where a bin represents a given range of values—you can’t see exactly what the specific values are for the minimum and maximum, like you can on an individual value plot. However, you can still observe an approximation for the range and see how spread out the data are. And you can answer questions such as "Is there a little bit of variability in my organization's salaries, or a lot?"
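To see why a histogram shows the range only approximately, here is a minimal sketch that bins a list of salaries. The salary values, the bin width of 10, and the helper `bin_counts` are all hypothetical choices made for illustration.

```python
# Hypothetical salaries in thousands of dollars (illustrative values only).
salaries = [31, 34, 38, 41, 42, 45, 47, 52, 58, 63, 71, 94]
bin_width = 10

def bin_counts(values, width):
    """Count how many values fall in each bin [start, start + width)."""
    counts = {}
    for v in values:
        start = (v // width) * width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))

counts = bin_counts(salaries, bin_width)

# Text histogram: one '#' per observation in each non-empty bin.
for start, n in counts.items():
    print(f"{start:>3}-{start + bin_width - 1:<3} {'#' * n}")

# From the bins alone, the range is known only to within one bin width.
approx_min = min(counts)
approx_max = max(counts) + bin_width
print("range read from the bins:", approx_min, "to", approx_max)
print("exact range from the data:", min(salaries), "to", max(salaries))
```

The bins place the data somewhere between 30 and 100, while the exact minimum and maximum are visible only in the raw values.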

Outliers (and the ozone layer)

Outliers can be described as extremely low or high values that do not fall near any other data points. Sometimes outliers represent unusual cases. Other times they represent data entry errors, or data that do not belong with the other data of interest. Whatever the case may be, outliers are usually easy to spot on a histogram and should be investigated, because they can reveal interesting information about your data.

Rewind to the mid-1980s when scientists reported depleting ozone levels above Antarctica. The Goddard Space Center had studied atmospheric ozone levels, but surprisingly didn’t discover the issue. Why? The analysis they used automatically eliminated any Dobson readings below 180 units because ozone levels that low were thought to be impossible.

Inspecting Distributions

Making a statistical graph is not an end in itself. After all, a computer or graphing calculator can make graphs faster than we can. The purpose of the graph is to help us understand the data. After you (or your calculator) make a graph, always ask, "What do I see?" Here is a general tactic for looking at graphs:

Look for an overall pattern and also for striking deviations from that pattern.

OVERALL PATTERN OF A DISTRIBUTION

To describe the overall pattern of a distribution:

• Give the center and the spread.

• See if the distribution has a simple shape that you can describe in a few words.

Figure 1.9

Section 6 will tell us in detail how to measure center and spread. For now, describe the center by finding a value that divides the observations so that about half take larger values and about half have smaller values. In Figure 1.9, the center is 1. That is, a typical team scored about 1 goal in its playoff soccer game. You can describe the spread by giving the smallest and largest values. The spread in Figure 1.9 is from 0 goals to 7 goals scored.

The dotplot in Figure 1.9 shows that in most of the playoff games, Division V soccer teams scored very few goals. There were only four teams that scored 4 or more goals. We can say that the distribution has a "long tail" to the right, or that its shape is "skewed right." You will learn more about describing shape shortly.

Is the one team that scored 7 goals an outlier? This value certainly differs from the overall pattern. To some extent, deciding whether an observation is an outlier is a matter of judgment. We will introduce an objective criterion for determining outliers in Section 6. Once you have spotted outliers, look for an explanation. Many outliers are due to mistakes, such as typing 4.0 as 40. Other outliers point to the special nature of some observations. Explaining outliers usually requires some background information. Perhaps the soccer team that scored seven goals has some very talented offensive players. Or maybe their opponents played poor defense.

Sometimes the values of a variable are too spread out for us to make a reasonable dotplot.

OUTLIERS

An outlier in any graph of data is an individual observation that falls outside the overall pattern of the graph.

Let's revisit the histogram of the presidential inauguration ages.

Here is a good interpretation of the graph.

Center:

It appears that the typical age of a new president is about 55 years, because 55 is near the center of the histogram.

Spread:

As the histogram shows, there is a good deal of variation in the ages at which presidents take office. Teddy Roosevelt was the youngest, at age 42, and Ronald Reagan, at age 69, was the oldest.

Shape:

The distribution is roughly symmetric and has a single peak (unimodal).

Outliers:

There appear to be no outliers.

More about shape

When you describe a distribution, concentrate on the main features. Look for major peaks, not for minor ups and downs in the bars of the histogram. Look for clear outliers, not just for the smallest and largest observations. Look for rough symmetry or clear skewness.

In mathematics, symmetry means that the two sides of a figure like a histogram are exact mirror images of each other. Data are almost never exactly symmetric, so we are willing to call the presidential inauguration ages histogram approximately symmetric as an overall description.

Here are more examples.

SYMMETRIC AND SKEWED DISTRIBUTIONS

A distribution is symmetric if the right and left sides of the histogram are approximately mirror images of each other.

Symmetric

A distribution is skewed to the right if the right side of the histogram (containing the half of the observations with larger values) extends much farther out than the left side. This type of distribution is also called positively skewed.

Skewed right

It is skewed to the left if the left side of the histogram extends much farther out than the right side. This type of distribution is also called negatively skewed.

Skewed left

Remember these basic shapes as they will appear throughout the course.
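The terms "positively skewed" and "negatively skewed" match the sign of a common numerical summary, the moment coefficient of skewness. That coefficient is not introduced in this text, so treat the sketch below as a supplementary illustration. The function name `sample_skewness` and all three data sets are hypothetical.

```python
def sample_skewness(data):
    """Moment coefficient of skewness: third central moment / variance**1.5."""
    n = len(data)
    m = sum(data) / n
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

# Hypothetical samples (illustrative only).
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
skewed_right = [1, 1, 1, 2, 2, 3, 4, 6, 9]
skewed_left = [10 - x for x in skewed_right]  # mirror image of skewed_right

print("symmetric:   ", sample_skewness(symmetric))
print("skewed right:", sample_skewness(skewed_right))
print("skewed left: ", sample_skewness(skewed_left))
```

A value near zero suggests rough symmetry, a positive value a longer right tail, and a negative value a longer left tail. The histogram remains the better tool for judging overall shape.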

Relative frequency, cumulative frequency, percentiles, and ogives

Sometimes we are interested in describing the relative position of an individual within a distribution. You may have received a standardized test score report that said you were in the 80th percentile. What does this mean? Put simply, 80% of the people who took the test earned scores that were less than or equal to your score. The other 20% of students taking the test earned higher scores than you did.

PERCENTILE

The pth percentile of a distribution is the value such that p percent of the observations fall at or below it.
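The definition can be applied directly by counting. The following sketch assumes a small made-up list of test scores; the helper name `percent_at_or_below` is illustrative and not taken from the text.

```python
def percent_at_or_below(data, value):
    """Percent of observations in data that are less than or equal to value."""
    return 100 * sum(1 for x in data if x <= value) / len(data)

# Hypothetical test scores (illustrative only).
scores = [52, 61, 64, 70, 73, 75, 78, 84, 90, 97]

# Eight of the ten scores are at or below 84.
standing = percent_at_or_below(scores, 84)
print(standing)  # 80.0
```

Here a score of 84 sits at the 80th percentile of this small sample, because 80% of the scores are at or below it.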

A histogram does a good job of displaying the distribution of values of a variable. But it tells us little about the relative standing of an individual observation. If we want this type of information, we should construct a relative cumulative frequency graph, often called an ogive (pronounced O-JIVE).

Recall the histogram of the ages of U.S. presidents when they were inaugurated. Now we will examine where some specific presidents fall within the age distribution.

How to construct an ogive (relative cumulative frequency graph):
Step 1:
Decide on class intervals and make a frequency table, just as in making a histogram. Add three columns to your frequency table: relative frequency, cumulative frequency, and relative cumulative frequency.

• To get the values in the relative frequency column, divide the count in each class interval by 43, the total number of presidents. Multiply by 100 to convert to a percentage.

• To fill in the cumulative frequency column, add the counts in the frequency column that fall in or below the current class interval.

• For the relative cumulative frequency column, divide the entries in the cumulative frequency column by 43, the total number of individuals.

Here is the frequency table from the presidential inauguration ages with the relative frequency, cumulative frequency, and relative cumulative frequency columns added.

Class	Frequency	Relative Frequency	Cumulative frequency	Relative Cumulative Frequency
40-44	2	2/43 = 0.047	2	2/43 = 0.047
45-49	6	6/43 = 0.140	8	8/43 = 0.186
50-54	13	13/43 = 0.302	21	21/43 = 0.488
55-59	12	12/43 = 0.279	33	33/43 = 0.767
60-64	7	7/43 = 0.163	40	40/43 = 0.930
65-69	3	3/43 = 0.070	43	43/43 = 1.000
Total	43
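The three added columns can be reproduced from the frequency column alone. This is a minimal sketch that uses the class counts from the table above; the variable names are illustrative.

```python
from itertools import accumulate

# Class intervals and counts taken from the frequency table above.
classes = ["40-44", "45-49", "50-54", "55-59", "60-64", "65-69"]
frequency = [2, 6, 13, 12, 7, 3]

total = sum(frequency)                          # 43 presidents
relative = [f / total for f in frequency]       # share in each class
cumulative = list(accumulate(frequency))        # running total of counts
relative_cumulative = [c / total for c in cumulative]

for row in zip(classes, frequency, relative, cumulative, relative_cumulative):
    print("{}\t{}\t{:.3f}\t{}\t{:.3f}".format(*row))
```

Multiplying the relative columns by 100 gives the percentages used on the vertical axis of the ogive.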

Step 2:
Label and scale your axes and title your graph. Label the horizontal axis "Age at inauguration" and the vertical axis "Relative cumulative frequency." Scale the horizontal axis according to your choice of class intervals and the vertical axis from 0% to 100%.

Step 3:
Plot a point corresponding to the relative cumulative frequency in each class interval at the left endpoint of the next class interval. For example, for the 40-44 interval, plot a point at a height of 4.7% above the age value of 45. This means that 4.7% of presidents were inaugurated before they were 45 years old. Begin your ogive with a point at a height of 0% at the left endpoint of the lowest class interval. Connect consecutive points with a line segment to form the ogive. The last point you plot should be at a height of 100%. The complete ogive is plotted below.

How to locate an individual within the distribution:

What about Bill Clinton? He was age 46 when he took office. To find his relative standing, draw a vertical line up from his age (46) on the horizontal axis until it meets the ogive. Then draw a horizontal line from this point of intersection to the vertical axis. We would estimate that Bill Clinton's age places him at the 10% relative cumulative frequency mark. That tells us that about 10% of all U.S. presidents were the same age as or younger than Bill Clinton when they were inaugurated. Put another way, President Clinton was younger than about 90% of all U.S. presidents based on his inauguration age. His age places him at the 10th percentile of the distribution.
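Reading a value off the ogive amounts to linear interpolation between the plotted points. The sketch below rebuilds those points from the cumulative counts in the frequency table and interpolates at age 46. The function name `ogive_percent` is hypothetical, and treating the ogive as exact straight segments is an assumption of this sketch.

```python
def ogive_percent(age, ages, percents):
    """Linearly interpolate the relative cumulative frequency (in %) at age."""
    if age <= ages[0]:
        return percents[0]
    if age >= ages[-1]:
        return percents[-1]
    for i in range(1, len(ages)):
        if age <= ages[i]:
            frac = (age - ages[i - 1]) / (ages[i] - ages[i - 1])
            return percents[i - 1] + frac * (percents[i] - percents[i - 1])

# Ogive points: 0% at age 40, then each cumulative count plotted at the
# left endpoint of the next class interval.
ogive_ages = [40, 45, 50, 55, 60, 65, 70]
ogive_percents = [100 * c / 43 for c in [0, 2, 8, 21, 33, 40, 43]]

clinton = ogive_percent(46, ogive_ages, ogive_percents)
print(round(clinton, 1))
```

Straight-line interpolation gives a value somewhat below 10%, in the same ballpark as the rough estimate read from the graph by eye.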

How to locate a value corresponding to a percentile:
What inauguration age corresponds to the 60th percentile? To answer this question, draw a horizontal line across from the vertical axis at a height of 60% until it meets the ogive. From the point of intersection, draw a vertical line down to the horizontal axis.

Find the center of the distribution.

Since we use the value that has half of the observations above it and half below it as our estimate of center, we simply need to find the 50th percentile of the distribution. Estimating as for the previous question, confirm that 55 is the center.
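Going from a percentile back to an age is the same interpolation run in reverse. The sketch below again assumes the ogive is made of straight segments between the points derived from the frequency table; `ogive_age` is a hypothetical helper name.

```python
def ogive_age(percent, ages, percents):
    """Linearly interpolate the age at which the ogive reaches percent."""
    for i in range(1, len(ages)):
        if percent <= percents[i]:
            frac = (percent - percents[i - 1]) / (percents[i] - percents[i - 1])
            return ages[i - 1] + frac * (ages[i] - ages[i - 1])
    return ages[-1]

# Ogive points rebuilt from the cumulative counts in the frequency table.
ogive_ages = [40, 45, 50, 55, 60, 65, 70]
ogive_percents = [100 * c / 43 for c in [0, 2, 8, 21, 33, 40, 43]]

age_60th = ogive_age(60, ogive_ages, ogive_percents)
center = ogive_age(50, ogive_ages, ogive_percents)
print("60th percentile:", round(age_60th, 1))
print("50th percentile (center):", round(center, 1))
```

Under this assumption the 60th percentile falls at roughly age 57, and the 50th percentile lands just above 55, consistent with the center stated in the text.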

Try Self Check 4

Practice Problem:

Here is an ogive of the amount spent by grocery shoppers.
(a) Estimate the center of this distribution. Explain your method.
(b) At what percentile would the shopper who spent $17.00 fall?
(c) Draw the histogram that corresponds to the ogive.

Answers:

a. To find the center of the distribution, go to 50 on the y-axis (Relative Cumulative Frequency), since the 50th percentile represents the center, and draw a horizontal line until it meets the ogive. From that point, draw a vertical line down to the x-axis (Amount Spent ($)). The estimate at this point is $27.

b. 35th percentile

What is the distribution of a histogram?

A histogram shows the distribution of the data to assess the central tendency, variability, and shape. A histogram for a quantitative variable divides the range of the values into discrete classes, and then counts the number of observations falling into each class interval.

When the left side of the distribution is a mirror image of the right side we say the distribution has which of the following characteristics?

In a symmetric distribution, the two sides of the distribution are mirror images of each other. The normal distribution is a classic example of a symmetric distribution.

Can you tell distribution from a histogram?

A frequency distribution shows how often each different value in a set of data occurs. A histogram is the most commonly used graph to show frequency distributions.

When the right half of a histogram is a mirror image of the left?

A histogram is symmetric if its right half is a mirror image of its left half. Very few histograms are perfectly symmetric, but many are approximately symmetric.
