As we noted in Chapter 1, the purpose of descriptive statistics is to make a group of scores understandable. We looked at some ways of getting that understanding through tables and graphs. In this chapter, we consider the main statistical techniques for describing a group of scores with numbers. First, you can describe a group of scores in terms of a representative (or typical) value, such as an average. A representative value gives the central tendency of a group of scores. A representative value is a simple way, with a single number, to describe a group of scores (and there may be hundreds or even thousands of scores). The main representative value we consider is the mean. Next, we focus on ways of describing how spread out the numbers are in a group of scores. In other words, we consider the amount of variation, or variability, among the scores. The two measures of variability you will learn about are called the variance and standard deviation.
In this chapter, for the first time in this book, you will use statistical formulas. Such formulas are not here to confuse you. Hopefully, you will come to see that they actually simplify things and provide a very straightforward, concise way of describing statistical procedures. To help you grasp what such formulas mean in words, whenever we present formulas in this book we always also give the “translation” in ordinary English.
✪ Summary
✪ Key Terms
✪ Example Worked-Out Problems
✪ Practice Problems
✪ Using SPSS
✪ Chapter Notes
CHAPTER 2
TIP FOR SUCCESS Before beginning this chapter, you should be sure you are comfortable with the key terms of variable, score, and value that we considered in Chapter 1.
ISBN 0-558-46761-X
Statistics for Psychology, Fifth Edition, by Arthur Aron, Elaine N. Aron, and Elliot J. Coups. Published by Prentice Hall. Copyright © 2009 by Pearson Education, Inc.
Central Tendency
The central tendency of a group of scores (a distribution) refers to the middle of the group of scores. You will learn about three measures of central tendency: mean, mode, and median. Each measure of central tendency uses its own method to come up with a single number describing the middle of a group of scores. We start with the mean, the most commonly used measure of central tendency. Understanding the mean is also an important foundation for much of what you learn in later chapters.
The Mean
Usually the best measure of central tendency is the ordinary average, the sum of all the scores divided by the number of scores. In statistics, this is called the mean. The average, or mean, of a group of scores is a representative value.
Suppose 10 students, as part of a research study, record the total number of dreams they had during the last week. The numbers of dreams were as follows:
7, 8, 8, 7, 3, 1, 6, 9, 3, 8
The mean of these 10 scores is 6 (the sum of 60 dreams divided by 10 students). That is, on the average, each student had 6 dreams in the past week. The information for the 10 students is thus summarized by the single number 6.
You can think of the mean as a kind of balancing point for the distribution of scores. Try it by visualizing a board balanced over a log, like a rudimentary teeter-totter. Imagine piles of blocks set along the board according to their values, one for each score in the distribution (like a histogram made of blocks). The mean is the point on the board where the weight of the blocks on one side balances exactly with the weight on the other side. Figure 2–1 shows this for the number of dreams for the 10 students.
Mathematically, you can think of the mean as the point at which the total distance to all the scores above that point equals the total distance to all the scores below that point. Let’s first figure the total distance from the mean to all the scores above the mean for the dreams example shown in Figure 2–1. There are two scores of 7, each of which is 1 unit above 6 (the mean). There are three scores of 8, each of which is 2 units above 6. And there is one score of 9, which is 3 units above 6. This gives a total distance of 11 units (1 + 1 + 2 + 2 + 2 + 3) from the mean to all the scores above the mean. Now, let’s look at the scores below the mean. There are two scores of 3, each of which is 3 units below 6 (the mean). And there is one score of 1, which is 5 units below 6. This gives a total distance of 11 units (3 + 3 + 5) from the mean to all of the scores below the mean. Thus, you can see that the total distance from the mean to the scores above the mean is the same as the total distance from the mean to the scores below the mean. The scores above the mean balance out the scores below the mean (and vice versa).
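The balancing-point idea can also be checked with a few lines of code. The following is a minimal Python sketch, not part of the original text; the variable names (`dreams`, `distance_above`, `distance_below`) are chosen here only for illustration. It uses the dreams data from the example and adds up the distances on each side of the mean.

```python
# Number of dreams reported by the 10 students (data from the text).
dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]

# The mean: sum of the scores divided by the number of scores.
mean = sum(dreams) / len(dreams)

# Total distance from the mean to every score above it,
# and from the mean to every score below it.
distance_above = sum(x - mean for x in dreams if x > mean)
distance_below = sum(mean - x for x in dreams if x < mean)

print(mean)            # 6.0
print(distance_above)  # 11.0
print(distance_below)  # 11.0
```

The score of 6 sits exactly at the mean, so it adds nothing to either side; the two totals match, which is what the balance analogy describes.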
mean: arithmetic average of a group of scores; sum of the scores divided by the number of scores.
Figure 2–1 Mean of the distribution of the number of dreams during a week for 10 students, illustrated using blocks on a board balanced on a log. (Horizontal axis: 1 to 9; balance point at M = 6.)
central tendency: typical or most representative value of a group of scores.
Some other examples are shown in Figure 2–2. Notice that there doesn’t have to be a block right at the balance point. That is, the mean doesn’t have to be a score actually in the distribution. The mean is the average of the scores, the balance point. The mean can be a decimal number, even if all the scores in the distribution have to be whole numbers (a mean of 2.30 children, for example). For each distribution in Figure 2–2, the total distance from the mean to the scores above the mean is the same as the total distance from the mean to the scores below the mean. (By the way, this analogy to blocks on a board, in reality, works out precisely only if the board has no weight of its own.)
Formula for the Mean and Statistical Symbols
The rule for figuring the mean is to add up all the scores and divide by the number of scores. Here is how this rule is written as a formula:
$$M = \frac{\sum X}{N} \tag{2–1}$$

M is a symbol for the mean. An alternative symbol, $\bar{X}$ (“X-bar”), is sometimes used. However, M is almost always used in research articles in psychology, as recommended by the style guidelines of the American Psychological Association (2001). You will see $\bar{X}$ used mostly in advanced statistics books and in articles about statistics. In fact, there is not a general agreement for many of the symbols used in statistics. (In this book we generally use the symbols most widely found in psychology research articles.)

Σ, the capital Greek letter sigma, is the symbol for “sum of.” It means “add up all the numbers for whatever follows.” It is the most common special arithmetic symbol used in statistics.

X stands for the scores in the distribution of the variable X. We could have picked any letter. However, if there is only one variable, it is usually called X. In later chapters we use formulas with more than one variable. In those formulas, we use a second letter along with X (usually Y) or subscripts (such as $X_1$ and $X_2$).

ΣX is “the sum of X.” This tells you to add up all the scores in the distribution of the variable X. Suppose X is the number of dreams of our 10 students: ΣX is 7 + 8 + 8 + 7 + 3 + 1 + 6 + 9 + 3 + 8, which is 60.
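To see how the symbols in the formula line up with a calculation, here is a short Python sketch (an illustration added here, not from the original text). The names `scores`, `sum_x`, `n`, and `m` are assumptions chosen to mirror X, ΣX, N, and M.

```python
# X: the scores in the distribution (number of dreams for the 10 students).
scores = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]

sum_x = sum(scores)  # ΣX: add up all the scores
n = len(scores)      # N: the number of scores
m = sum_x / n        # M: the mean, ΣX divided by N

print(sum_x)  # 60
print(n)      # 10
print(m)      # 6.0
```

Each line of code corresponds to one symbol in formula (2–1), which is one way to read a formula as a recipe: work out each ingredient, then combine them.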
M: mean.
Figure 2–2 Means of various distributions illustrated with blocks on a board balanced on a log. (Four panels, each with a horizontal axis from 1 to 9; panel means shown: M = 6, M = 3.60, M = 6, M = 6.)
The mean is the sum of the scores divided by the number of scores.
Σ: sum of; add up all the scores following this symbol.
X: scores in the distribution of the variable X.
TIP FOR SUCCESS Think of each formula as a statistical recipe, with statistical symbols as ingredients. Before you use each formula, be sure you know what each symbol stands for. Then carefully follow the formula to come up with the end result.
N stands for number, that is, the number of scores in a distribution. In our example, there are 10 scores. Thus, N equals 10.¹
Overall, the formula says to divide the sum of all the scores in the distribution of the variable X by the total number of scores, N. In the dreams example, this means you divide 60 by 10. Put in terms of the formula,

$$M = \frac{\sum X}{N} = \frac{60}{10} = 6$$

Additional Examples of Figuring the Mean
Consider the examples from Chapter 1. The stress ratings of the 30 students in the first week of their statistics class (based on Aron et al., 1995) were:

8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0, 9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8

In Chapter 1 we summarized all these numbers into a frequency table (Table 1–3). You can now summarize all this information as a single number by figuring the mean. Figure the mean by adding up all the stress ratings and dividing by the number of stress ratings. That is, you add up the 30 stress ratings: 8 + 7 + 4 + 10 + 8 + 6 + 8 + 9 + 9 + 7 + 3 + 7 + 6 + 5 + 0 + 9 + 10 + 7 + 7 + 3 + 6 + 7 + 5 + 2 + 1 + 6 + 7 + 10 + 8 + 8, for a total of 193. Then you divide this total by the number of scores, 30. In terms of the formula,

$$M = \frac{\sum X}{N} = \frac{193}{30} = 6.43$$

This tells you that the average stress rating was 6.43 (after rounding off). This is clearly higher than the middle of the 0–10 scale. You can also see this on a graph. Think again of the histogram as a pile of blocks on a board and the mean of 6.43 as the point where the board balances on the fulcrum (see Figure 2–3). This single representative value simplifies the information in the 30 stress scores.
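With 30 scores, adding by hand is where slips tend to happen, so it can help to check the arithmetic by computer. The sketch below is an added illustration in Python, not part of the original text; the names `stress`, `sum_x`, `n`, and `m` are assumptions. It applies the same formula to the stress ratings and cross-checks the result against `statistics.mean` from Python's standard library.

```python
import statistics

# Stress ratings of the 30 students (data from the text).
stress = [8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0,
          9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8]

sum_x = sum(stress)  # ΣX
n = len(stress)      # N
m = sum_x / n        # M = ΣX / N

print(sum_x)        # 193
print(n)            # 30
print(round(m, 2))  # 6.43

# Cross-check with the standard library's mean function.
print(round(statistics.mean(stress), 2))  # 6.43
```

Rounding is applied only when printing, so `m` keeps its full precision for any later calculation.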
N: number of scores in a distribution.