How would you describe the relationship between two variables in a scatter plot?

5 Data Visualization 5.6 Scatter plot

In science, the scatterplot is widely used to present measurements of two or more related variables. It is particularly useful when the values of the variable on the y-axis are thought to depend on the values of the variable on the x-axis.

In a scatterplot, the data points are plotted but not joined. The resulting pattern indicates the type and strength of the relationship between two or more variables. Chart 5.6.1 is an example of a scatterplot. In it, car ownership tends to increase as household income increases, showing that there is a positive relationship between these two variables.

Data table for Chart 5.6.1

Income ($)    Percentage (%)
20,000        60
30,000        55
40,000        75
50,000        85
60,000        82
70,000        97
80,000        87
90,000        90
100,000       95
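
One way to put a number on the positive relationship in Chart 5.6.1 is the Pearson correlation coefficient. The sketch below is illustrative only: the source does not prescribe this calculation, and the helper name `pearson_r` is an assumption made for this example. The values are copied from the data table for Chart 5.6.1.

```python
from math import sqrt

# Values copied from the data table for Chart 5.6.1.
income = [20_000, 30_000, 40_000, 50_000, 60_000, 70_000, 80_000, 90_000, 100_000]
percentage = [60, 55, 75, 85, 82, 97, 87, 90, 95]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists.

    Hypothetical helper written for this example, not part of the source.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(income, percentage)
# r lies between -1 and +1; a value well above 0 indicates a positive relationship.
print(f"r = {r:.2f}")
```

A coefficient close to +1 would be consistent with the upward pattern described above, even though individual points (for example, 30,000 versus 20,000) do not increase at every step.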

The pattern of the data points on the scatterplot reveals the relationship between the variables. Scatterplots can illustrate various patterns and relationships, such as:

  • a linear or non-linear relationship,
  • a positive (direct) or negative (inverse) relationship,
  • the concentration or spread of data points,
  • the presence of outliers.

Linear or non-linear relationship

When the data points fall along a straight line on the graph, the relationship between the variables is linear, as shown in Chart 5.6.2, Part A. When the data points do not form a line, or form a line that is not straight, as in Chart 5.6.2, Part B, the relationship between the variables is non-linear.

Data table for Chart 5.6.2

Variable X    Variable Y1 (Part A)    Variable Y2 (Part B)
0             -3                      -2
7             4                       -2
13            19                      7
20            21                      3
27            34                      10
33            24                      -5
40            42                      9
47            45                      9
53            58                      22
60            58                      25
67            71                      47
73            78                      71
80            77                      100
87            85                      160
93            90                      249
100           99                      392

Note: 0 = true zero or a value rounded to zero.
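
A rough numerical check of linearity is to fit a straight line by least squares and see how much of the variation in Y it explains (the coefficient of determination, R²). The sketch below applies this to the two series from the data table for Chart 5.6.2. The function name `linear_fit_r2` is an assumption made for this example, and R² is only one possible diagnostic, not a method taken from the source.

```python
# Values copied from the data table for Chart 5.6.2.
x = [0, 7, 13, 20, 27, 33, 40, 47, 53, 60, 67, 73, 80, 87, 93, 100]
y_part_a = [-3, 4, 19, 21, 34, 24, 42, 45, 58, 58, 71, 78, 77, 85, 90, 99]
y_part_b = [-2, -2, 7, 3, 10, -5, 9, 9, 22, 25, 47, 71, 100, 160, 249, 392]


def linear_fit_r2(xs, ys):
    """Fit y = intercept + slope * x by least squares and return R^2.

    Hypothetical helper written for this example.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(xs, ys))
    # Total sum of squares around the mean of y.
    ss_tot = sum((yi - mean_y) ** 2 for yi in ys)
    return 1 - ss_res / ss_tot


r2_a = linear_fit_r2(x, y_part_a)
r2_b = linear_fit_r2(x, y_part_b)
print(f"Part A: R^2 = {r2_a:.2f}")
print(f"Part B: R^2 = {r2_b:.2f}")
```

A straight line should explain most of the variation in Part A and noticeably less in Part B, where the values rise slowly at first and then very steeply. A low R² alone does not prove a curved relationship, so the plot itself remains the primary evidence.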

Positive or negative relationship

If the points cluster around a line that runs from the lower left to upper right of the graph area, then the relationship between the two variables is said to be positive or direct (Chart 5.6.3, Part A). If the points cluster around a line that runs from the upper left to the lower right of the graph area, then the relationship is said to be negative or inverse (Chart 5.6.3, Part B).

Data table for Chart 5.6.3

Variable X    Variable Y1 (Part A)    Variable Y2 (Part B)
0             -17                     83
7             16                      103
13            20                      93
20            14                      74
27            35                      81
33            28                      62
40            46                      66
47            65                      72
53            56                      49
60            51                      31
67            62                      29
73            88                      42
80            105                     45
87            115                     42
93            108                     21
100           114                     14

Note: 0 = true zero or a value rounded to zero.
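
The direction of a relationship can also be read from the sign of the slope of a least-squares line through the points. The sketch below is a minimal illustration using the values from the data table for Chart 5.6.3. The helper names `slope` and `direction` are assumptions made for this example.

```python
# Values copied from the data table for Chart 5.6.3.
x = [0, 7, 13, 20, 27, 33, 40, 47, 53, 60, 67, 73, 80, 87, 93, 100]
y_part_a = [-17, 16, 20, 14, 35, 28, 46, 65, 56, 51, 62, 88, 105, 115, 108, 114]
y_part_b = [83, 103, 93, 74, 81, 62, 66, 72, 49, 31, 29, 42, 45, 42, 21, 14]


def slope(xs, ys):
    """Return the least-squares slope of ys on xs (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    return sxy / sxx


def direction(xs, ys):
    """Label the relationship by the sign of the fitted slope."""
    b = slope(xs, ys)
    if b > 0:
        return "positive (direct)"
    if b < 0:
        return "negative (inverse)"
    return "no linear trend"


print("Part A:", direction(x, y_part_a))
print("Part B:", direction(x, y_part_b))
```

A line rising from lower left to upper right has a slope above zero, and one falling from upper left to lower right has a slope below zero, which matches the visual description above.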

Concentration or spread of data points

Data points can be close together (Chart 5.6.4, Part A) or spread widely across the graph area (Chart 5.6.4, Part B).

Data table for Chart 5.6.4

Variable X1 (Part A)    Variable Y1 (Part A)    Variable X2 (Part B)    Variable Y2 (Part B)
44                      51                      4                       37
42                      51                      25                      32
48                      51                      64                      60
49                      46                      15                      18
38                      46                      51                      18
41                      52                      60                      54
55                      51                      20                      70
50                      58                      35                      24
54                      41                      15                      55
59                      48                      47                      62
42                      49                      62                      13
55                      49                      35                      6
52                      46                      60                      81
46                      57                      65                      16
55                      52                      70                      65
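
How tightly the points are concentrated can be summarized with the standard deviation along each axis. The sketch below compares the two parts of Chart 5.6.4 using `statistics.pstdev` from the Python standard library. Treating the standard deviation as the measure of spread is an assumption made for this example, not something stated in the source.

```python
from statistics import pstdev

# Values copied from the data table for Chart 5.6.4.
x1 = [44, 42, 48, 49, 38, 41, 55, 50, 54, 59, 42, 55, 52, 46, 55]
y1 = [51, 51, 51, 46, 46, 52, 51, 58, 41, 48, 49, 49, 46, 57, 52]
x2 = [4, 25, 64, 15, 51, 60, 20, 35, 15, 47, 62, 35, 60, 65, 70]
y2 = [37, 32, 60, 18, 18, 54, 70, 24, 55, 62, 13, 6, 81, 16, 65]

# Population standard deviation along each axis, as (x spread, y spread).
spread_a = (pstdev(x1), pstdev(y1))
spread_b = (pstdev(x2), pstdev(y2))

print(f"Part A: sd(x) = {spread_a[0]:.1f}, sd(y) = {spread_a[1]:.1f}")
print(f"Part B: sd(x) = {spread_b[0]:.1f}, sd(y) = {spread_b[1]:.1f}")
```

Smaller standard deviations on both axes correspond to the tight cluster in Part A, and larger ones to the wide scatter in Part B.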

Presence of outliers

Besides portraying relationships between the variables, a scatterplot can also show whether there are any outliers in the data. Outliers are data points that lie far from the other points in the data set, such as the two points shown in red in Chart 5.6.5.

Data table for Chart 5.6.5

Variable X    Variable Y    Symbol
0             -1            Black circle
7             1             Black circle
13            32            Black circle
15            83            Red triangle (potential outlier)
20            28            Black circle
27            5             Black circle
28            95            Red triangle (potential outlier)
33            30            Black circle
40            46            Black circle
47            29            Black circle
53            41            Black circle
60            46            Black circle
67            29            Black circle
73            54            Black circle
80            52            Black circle
87            63            Black circle
93            59            Black circle
100           82            Black circle

Note: 0 = true zero or a value rounded to zero.
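
One simple way to screen for potential outliers is to fit a least-squares line and flag points whose vertical distance from it (the residual) is unusually large. The sketch below uses the values from the data table for Chart 5.6.5. The cutoff of two standard deviations of the residuals and the names `fit_line` and `flag_outliers` are assumptions made for this example. The source does not say how the red points were identified.

```python
from statistics import pstdev

# (x, y) pairs copied from the data table for Chart 5.6.5.
points = [
    (0, -1), (7, 1), (13, 32), (15, 83), (20, 28), (27, 5),
    (28, 95), (33, 30), (40, 46), (47, 29), (53, 41), (60, 46),
    (67, 29), (73, 54), (80, 52), (87, 63), (93, 59), (100, 82),
]


def fit_line(pts):
    """Return (slope, intercept) of the least-squares line through pts."""
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def flag_outliers(pts, k=2.0):
    """Return points whose residual exceeds k standard deviations.

    The default k = 2.0 is an arbitrary, illustrative threshold.
    """
    slope, intercept = fit_line(pts)
    residuals = [y - (intercept + slope * x) for x, y in pts]
    cutoff = k * pstdev(residuals)
    return [p for p, res in zip(pts, residuals) if abs(res) > cutoff]


potential_outliers = flag_outliers(points)
print(potential_outliers)
```

With this data and cutoff, the rule flags the two points marked as red triangles in the table. A different threshold or a more robust fit could flag a different set, so any flagged point is a candidate for closer inspection, not a confirmed error.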

Date modified: 2021-09-02

How would you describe the relationship between two variables on a scatter plot?

We often see patterns or relationships in scatterplots. When the y variable tends to increase as the x variable increases, we say there is a positive correlation between the variables. When the y variable tends to decrease as the x variable increases, we say there is a negative correlation between the variables.

What are the two variables in a scatter plot called?

A scatter plot is a plot of the values of Y against the corresponding values of X:

  • Vertical axis: variable Y, usually the response variable.
  • Horizontal axis: variable X, usually some variable we suspect may be related to the response.