Value that describes a sample, usually derived from measurements of the individuals in the sample

Definitions

- Statistics: Collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions.
- Variable: Characteristic or attribute that can assume different values.
- Random Variable: A variable whose values are determined by chance.
- Population: All subjects possessing a common characteristic that is being studied.
- Sample: A subgroup or subset of the population.
- Parameter: Characteristic or measure obtained from a population.
- Statistic (not to be confused with Statistics): Characteristic or measure obtained from a sample.
- Descriptive Statistics: Collection, organization, summarization, and presentation of data.
- Inferential Statistics: Generalizing from samples to populations using probabilities. Performing hypothesis testing, determining relationships between variables, and making predictions.
- Qualitative Variables: Variables which assume non-numerical values.
- Quantitative Variables: Variables which assume numerical values.
- Discrete Variables: Variables which assume a finite or countable number of possible values. Usually obtained by counting.
- Continuous Variables: Variables which assume an infinite number of possible values. Usually obtained by measurement.
- Nominal Level: Level of measurement which classifies data into mutually exclusive, all-inclusive categories in which no order or ranking can be imposed on the data.
- Ordinal Level: Level of measurement which classifies data into categories that can be ranked. Differences between the ranks are not meaningful.
- Interval Level: Level of measurement which classifies data that can be ranked and in which differences are meaningful. However, there is no meaningful zero, so ratios are meaningless.
- Ratio Level: Level of measurement which classifies data that can be ranked, in which differences are meaningful, and in which there is a true zero. True ratios exist between the different units of measure.
- Random Sampling: Sampling in which the data is collected using chance methods or random numbers.
- Systematic Sampling: Sampling in which data is obtained by selecting every kth object.
- Convenience Sampling: Sampling in which data that is readily available is used.
- Stratified Sampling: Sampling in which the population is divided into groups (called strata) according to some characteristic. Each of these strata is then sampled using one of the other sampling techniques.
- Cluster Sampling: Sampling in which the population is divided into groups (usually geographically). Some of these groups are randomly selected, and then all of the elements in those groups are selected.
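The sampling methods defined above can be sketched in a few lines of Python. This is only an illustration: the population of 20 numbered subjects, the strata ("low" and "high"), the cluster size, and the fixed seed are all assumptions made for the example, not part of the definitions.

```python
import random

# Hypothetical population: 20 subjects identified by the numbers 1..20.
population = list(range(1, 21))
rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Random sampling: subjects are chosen by chance.
random_sample = rng.sample(population, 5)

# Systematic sampling: pick a random start, then take every kth subject.
k = 4
start = rng.randrange(k)
systematic_sample = population[start::k]

# Stratified sampling: divide into strata, then sample within each stratum.
strata = {"low": population[:10], "high": population[10:]}
stratified_sample = [x for group in strata.values() for x in rng.sample(group, 2)]

# Cluster sampling: divide into groups, randomly pick whole groups,
# and keep every element of the chosen groups.
clusters = [population[i:i + 5] for i in range(0, 20, 5)]
cluster_sample = [x for group in rng.sample(clusters, 2) for x in group]

print(random_sample, systematic_sample, stratified_sample, cluster_sample)
```

Note the difference between the last two: stratified sampling takes a few subjects from every group, while cluster sampling takes every subject from a few groups.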

Statistics refers to a set of methods and rules that researchers use for organizing, summarizing, and interpreting information.

Table of Contents

Definitions
Two basic kinds of statistics
What is a measurement from a sample called?
What is a characteristic that describes the sample?
What are the values used to summarize or describe a population called?
What is a number that describes a population called?

Two basic kinds of statistics

- Descriptive statistics are statistical procedures used to summarize, organize, and simplify data.

- Inferential statistics consist of techniques that allow us to study samples and then make generalizations about the populations from which they were selected.

- A population is the set of all individuals of interest in a particular study.

- A parameter is a value, usually a numerical value, that describes a population. A parameter may be obtained from a single measurement, or it may be derived from a set of measurements from the population.

- A sample is a set of individuals selected from a population, usually intended to represent the population in a study.

- A statistic is a value, usually a numerical value, that describes a sample. A statistic may be obtained from a single measurement, or it may be derived from a set of measurements from the sample.

Sampling error is the discrepancy, or amount of error, that exists between a sample statistic and the corresponding population parameter.
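To make the parameter/statistic distinction concrete, here is a minimal sketch. The population of 1,000 simulated scores (mean near 100, standard deviation near 15) and the sample size of 25 are assumptions chosen only for illustration.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the illustration is repeatable

# Hypothetical population of 1,000 scores.
population = [rng.gauss(100, 15) for _ in range(1000)]
parameter = mean(population)      # describes the whole population

# A sample of 25 individuals selected from that population.
sample = rng.sample(population, 25)
statistic = mean(sample)          # describes only the sample

# Sampling error: discrepancy between the statistic and the parameter.
sampling_error = statistic - parameter
print(round(parameter, 2), round(statistic, 2), round(sampling_error, 2))
```

Drawing a different sample would give a different statistic, and so a different sampling error, while the parameter stays fixed.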

Three characteristics completely describe a distribution: shape, central tendency, and variability.

Shape: In a symmetrical distribution, it is possible to draw a vertical line through the middle so that one side of the distribution is an exact mirror image of the other.

In a skewed distribution, the scores tend to pile up toward one end of the scale and taper off gradually at the other end.

The section where the scores taper off towards one end of a distribution is called the tail of the distribution.

[Figure: examples of a negatively skewed and a positively skewed distribution]

A skewed distribution with the tail on the right-hand side is said to be positively skewed (because the tail points towards positive numbers). If the tail points to the left, then the distribution is said to be negatively skewed.

Central tendency is a statistical measure that identifies a single score as representative of an entire distribution. The goal of central tendency is to find the single score that is most typical or most representative of the entire group.

There are several measures of central tendency, but we’ll only focus on the mean.

The most commonly known measure of central tendency is the arithmetic average, or the mean (note: in everyday speech, the term average actually refers to all three measures of central tendency, for examples of this see gray box 3.4, pg 90). We’ve already talked about how you would go about figuring this out from the data in a frequency distribution table.

The mean for a distribution is the sum of the scores divided by the number of scores.

The formula for the population mean is: $\mu = \frac{\sum X}{N}$

The formula for the sample mean is: $\bar{X} = \frac{\sum X}{n}$
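As a quick worked example of "the sum of the scores divided by the number of scores", consider the following sketch. The four scores are hypothetical.

```python
scores = [3, 5, 6, 10]           # hypothetical sample of n = 4 scores

n = len(scores)                  # number of scores
total = sum(scores)              # sum of the scores: 3 + 5 + 6 + 10 = 24
sample_mean = total / n          # 24 / 4

print(sample_mean)  # 6.0
```

The same arithmetic gives the population mean when the scores are the entire population; only the notation changes.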

Variability provides a quantitative measure of the degree to which scores in a distribution are spread out or clustered together. In other words, variability refers to the degree of “differentness” of the scores in the distribution. High variability means that the scores differ by a lot, while low variability means that the scores are all similar (“homogeneity”).

There are several measures of variability, but we’ll concentrate on the standard deviation.

In essence, the standard deviation measures how far off all of the individuals in the distribution are from a standard, where that standard is the mean of the distribution.

So to get a measure of the deviation we need to subtract the population mean from every score in our distribution.

$X - \mu$ = deviation score

- if the score is a value above the mean the deviation score will be positive

- if the score is a value below the mean the deviation score will be negative

Add up all the deviations and you get zero. So what we have to do is get rid of the negative signs. We do this by squaring the deviations, averaging the squared deviations, and then taking the square root of that average.

Sum of Squares = $SS = \sum (X - \mu)^2$

Population variance = $\sigma^2 = \frac{SS}{N}$

Population standard deviation = $\sigma = \sqrt{\frac{SS}{N}}$
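The computation can be followed step by step in a short sketch. The five scores below are a hypothetical population chosen so that the arithmetic is easy to check by hand.

```python
import math

scores = [1, 9, 5, 8, 7]                  # hypothetical population, N = 5
N = len(scores)

mu = sum(scores) / N                      # population mean: 30 / 5 = 6.0
deviations = [x - mu for x in scores]     # -5, 3, -1, 2, 1 (these sum to zero)
SS = sum(d ** 2 for d in deviations)      # 25 + 9 + 1 + 4 + 1 = 40
variance = SS / N                         # 40 / 5 = 8.0
sigma = math.sqrt(variance)               # square root of 8

print(sum(deviations), SS, variance, round(sigma, 3))  # 0.0 40.0 8.0 2.828
```

The printed zero is the reason the deviations are squared before they are combined.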

The standard deviation of a sample is nearly the same

- the computations are pretty much the same here

- different notation:

- s = sample standard deviation

- use $\bar{X}$ instead of $\mu$ in the computation of SS

- need to adjust the computation to take into account that a sample will typically be less variable than the corresponding population.

- if you have a good, representative sample, then your sample and population means should be very similar, and the overall shape of the two distributions should be similar. However, notice that the variability of the sample is smaller than the variability of the population.

- to account for this the sample variance is divided by n - 1 rather than just n

sample variance = $s^2 = \frac{SS}{n - 1}$

- and the same is true for sample standard deviation

sample standard deviation = $s = \sqrt{\frac{SS}{n - 1}}$

So what we’re doing when we subtract 1 from n is using the degrees of freedom (n - 1) to adjust the sample variance so that it gives an unbiased estimate of the population variance.
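The effect of dividing by n - 1 can be checked exhaustively on a tiny case. The sketch below assumes a hypothetical population of three scores and lists every possible sample of size 2 drawn with replacement, then averages the two candidate variance formulas over all of those samples.

```python
from itertools import product

population = [0, 6, 12]                              # hypothetical population
N = len(population)
mu = sum(population) / N                             # 6.0
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance: 24.0

n = 2
samples = list(product(population, repeat=n))        # all 9 samples, with replacement

def ss(sample):
    """Sum of squared deviations from the sample's own mean."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample)

# Average each version of the sample variance over every possible sample.
avg_div_n = sum(ss(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(ss(s) / (n - 1) for s in samples) / len(samples)

print(sigma2, avg_div_n, avg_div_n_minus_1)  # 24.0 12.0 24.0
```

Dividing by n underestimates the population variance on average, while dividing by n - 1 matches it on average, which is what "unbiased" means here.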

Recall that the goal of inferential statistics is to make claims about population parameters based on sample statistics. So the logic will be something like this. We can’t measure the whole population, so we take a sample. Our best estimate for the mean of the population will be the mean of our sample. (Remember that it is only an estimate because of sampling error, discussed earlier: the difference between a sample statistic and the corresponding population parameter.) It sounds simple and straightforward, but consider the following:

Suppose that you take 3 different samples from the same population. They are going to be different from one another. They will have different shapes, different means, and different variability. So how do you figure out what the best estimate of the population mean is?

How many possible samples can we take? An enormous number (remember that we are sampling with replacement). Luckily for us, the huge set of possible samples forms a simple, orderly, and predictable pattern (a sampling distribution). Because of this, we are able to base our predictions about sample characteristics on the distribution of sample means.

The distribution of sample means is the collection of sample means for all the possible random samples of a particular size (n) that can be obtained from a population.

mean: the average of all of the sample means will equal the mean of the population. The average of all of the sample means is called the expected value of $\bar{X}$. It is “expected” because a sample mean should be a value near the population mean $\mu$.

variability: the standard deviation of the distribution of sample means is called the standard error of $\bar{X}$.

standard error of $\bar{X}$ = standard distance between $\bar{X}$ and $\mu$.

In other words, this statistic describes the standard (typical/average) distance from the mean. In this case it is the distance between the sample mean $\bar{X}$ and the population mean $\mu$. The major purpose of the standard error of $\bar{X}$ is that it tells us how well the sample mean estimates the population mean. In other words, it tells us how big the sampling error is.
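For a very small population, the whole distribution of sample means can be written out. The sketch below assumes a hypothetical population of four scores and samples of size n = 2 drawn with replacement. It also compares the standard error with the population standard deviation divided by the square root of n, which is the standard formula for the standard error.

```python
import math
from itertools import product
from statistics import mean, pstdev

population = [2, 4, 6, 8]        # hypothetical population
n = 2                            # sample size

mu = mean(population)            # population mean: 5
sigma = pstdev(population)       # population standard deviation: sqrt(5)

# Every possible sample of size n (with replacement) and its mean.
sample_means = [sum(s) / n for s in product(population, repeat=n)]

expected_value = mean(sample_means)    # mean of the distribution of sample means
standard_error = pstdev(sample_means)  # its standard deviation

print(len(sample_means), expected_value)                          # 16 5.0
print(round(standard_error, 3), round(sigma / math.sqrt(n), 3))   # 1.581 1.581
```

The expected value equals the population mean, and the standard error is smaller than the population standard deviation because averaging pulls sample means toward the center.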

The numerical value of the standard error is determined by two characteristics: the variability of the population and the size of the sample.

1) The variability of the population: the bigger the variability of the population, the more variability you’ll have in the sample means.

2) The size of the sample: the larger your sample size (n), the more accurately the sample represents the population.
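Both influences can be seen with the standard formula for the standard error, the population standard deviation divided by the square root of n. In this sketch the function name `standard_error` and the particular values of sigma and n are assumptions for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean for population SD sigma and sample size n."""
    return sigma / math.sqrt(n)

# Larger population variability -> larger standard error.
# Larger sample size -> smaller standard error.
for sigma in (10, 20):
    for n in (1, 4, 25, 100):
        print(f"sigma={sigma:>2}  n={n:>3}  standard error={standard_error(sigma, n):.1f}")
```

With n = 1 the standard error equals the population standard deviation, and quadrupling the sample size cuts the standard error in half.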

Central Limit Theorem: For any population with mean $\mu$ and standard deviation $\sigma$, the distribution of sample means for sample size n will approach a normal distribution with a mean of $\mu$ and a standard deviation of $\frac{\sigma}{\sqrt{n}}$ as n approaches infinity.
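A small simulation can make the theorem more tangible. The sketch below is only an illustration under assumed settings: the population is exponential (strongly skewed, with mean 1 and standard deviation 1), the sample size is 30, and 5,000 samples are drawn with a fixed seed.

```python
import math
import random
from statistics import mean, pstdev

rng = random.Random(42)          # fixed seed so the run is repeatable
n = 30                           # sample size
replications = 5_000             # number of samples drawn

# expovariate(1.0) gives a right-skewed population with mu = 1 and sigma = 1.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(replications)
]

mean_of_means = mean(sample_means)       # should be close to mu = 1
sd_of_means = pstdev(sample_means)       # should be close to sigma / sqrt(n)
predicted_se = 1 / math.sqrt(n)

# In a normal distribution about 68% of values lie within one SD of the mean.
within_one_se = sum(abs(m - 1) <= predicted_se for m in sample_means) / replications

print(round(mean_of_means, 3), round(sd_of_means, 3), round(predicted_se, 3))
print(round(within_one_se, 3))
```

Even though the individual scores are far from normal, the sample means cluster around the population mean with a spread close to the predicted standard error.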

Hypothesis testing is an inferential procedure that uses sample data to evaluate the credibility of a hypothesis about a population.

step 1: Make a hypothesis and select a criterion for the decision

step 2: Collect your data

- randomly select individuals from a population

- randomly assign selected individuals to specific treatment groups

step 3: Compute a test statistic (more on this later in the lecture, and the course)

- things like z-scores, t-tests, F-tests (ANOVA)

step 4: Compare the test statistic to a distribution to make an inference about the parameter and hence draw a conclusion about the population.

The decision criterion = the alpha level
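The four steps can be traced with a z-score as the test statistic. This is a sketch under assumed numbers: a null hypothesis that the population mean is 100, a known population standard deviation of 15, a sample of 25 individuals with a mean of 106, and a two-tailed alpha level of 0.05. None of these values come from the text.

```python
import math
from statistics import NormalDist

# Step 1: state the hypothesis and the decision criterion (all values hypothetical).
mu0 = 100          # H0: the population mean is 100
sigma = 15         # population standard deviation, assumed known
alpha = 0.05       # alpha level, two-tailed

# Step 2: collect the data.
n = 25
sample_mean = 106

# Step 3: compute the test statistic (a z-score for the sample mean).
standard_error = sigma / math.sqrt(n)          # 15 / 5 = 3.0
z = (sample_mean - mu0) / standard_error       # 6 / 3 = 2.0

# Step 4: compare the statistic to the standard normal distribution.
critical_z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_h0 = abs(z) > critical_z

print(z, round(critical_z, 2), round(p_value, 4), reject_h0)
```

Here the z-score of 2.0 falls beyond the critical value, so H0 would be rejected at this alpha level.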

                                   Actual situation
                                   H0 is correct        H0 is wrong
Experimenter’s  Reject H0          oops! Type I error   Yay! correct
conclusions     Fail to reject H0  Yay! correct         oops! Type II error

The two kinds of error each have their own name, because they really reflect different things.

Type I error ($\alpha$, alpha): H0 is actually correct, but the experimenter rejected it

Type II error ($\beta$, beta): H0 is really wrong, but the experimenter failed to reject it

The courtroom/jury analogy

                          Actual situation
                          X is innocent        X is guilty
Jury’s       Guilty       oops! Type I error   Yay! correct
conclusions  Not Guilty   Yay! correct         oops! Type II error

Type I error - sending an innocent person to jail

Type II error - letting a guilty person go free

In scientific research, we typically take a conservative approach and set our criteria so that we minimize the chance of making a Type I error (concluding that there is an effect of something when there really isn’t). In other words, scientists focus on setting an acceptable alpha level ($\alpha$), or level of significance.

The alpha level ($\alpha$), or level of significance, is a probability value that defines the very unlikely sample outcomes when the null hypothesis is true. Whenever an experiment produces very unlikely data (as defined by alpha), we will reject the null hypothesis. Thus, the alpha level also defines the probability of a Type I error, that is, the probability of rejecting H0 when it is actually true. Note: in psychology $\alpha$ is usually set at 0.05.
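The link between the alpha level and the Type I error rate can be checked by simulation. The sketch below assumes a situation where H0 is true by construction (every sample really comes from a normal population with mean 100 and standard deviation 15), runs 10,000 hypothetical experiments with n = 25, and counts how often a two-tailed z-test at alpha = 0.05 rejects H0.

```python
import math
import random
from statistics import NormalDist

rng = random.Random(2024)                    # fixed seed so the run is repeatable
mu0, sigma, n, alpha = 100, 15, 25, 0.05     # hypothetical values
critical_z = NormalDist().inv_cdf(1 - alpha / 2)
standard_error = sigma / math.sqrt(n)

experiments = 10_000
rejections = 0
for _ in range(experiments):
    # H0 is true here: the sample comes from a population with mean mu0.
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]
    z = (sum(sample) / n - mu0) / standard_error
    if abs(z) > critical_z:
        rejections += 1                      # rejecting a true H0 is a Type I error

type_i_rate = rejections / experiments
print(round(type_i_rate, 3))
```

The proportion of rejections should land close to 0.05, since every rejection in this setup is a Type I error.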

Using tables and graphs to present your results.

What is a measurement from a sample called?

A statistic is a numerical measurement describing some characteristic of a sample.

What is a characteristic that describes the sample?

A statistic is a characteristic, usually numerical, that describes a sample.

What are the values used to summarize or describe a population called?

Tables or graphs are used to organize data, and descriptive values such as the average score are used to summarize data. A descriptive value for a population is called a parameter and a descriptive value for a sample is called a statistic.

What is a number that describes a population called?

A parameter is a number describing a whole population (e.g., population mean), while a statistic is a number describing a sample (e.g., sample mean).
