The Classic “Geltz & Luna Question”

Copyright © Timothy Horrigan 2005


During my two years as a Measured Progress test scorer, I scored the “Geltz & Luna” question several times. It was one of Measured Progress's favorite questions. It was originally written back in the 1990s for MCAS exams in Massachusetts. The Massachusetts state department of education published the answer in a newsletter in 1999. (This was several years before other states used the exact same question.) In Massachusetts, this was a Grade 8 question. It requires students to understand the common measures of “central tendency” (mean, median, mode, etc.).

The newsletter was published as a PDF file.
(Original URL: http://www.doe.mass.edu/mcas/starting_now/spring99.pdf)

It is a fairly typical math question, although the third part of the question actually has two possible correct answers.



The coach for the All-Star Basketball Game needs to pick one of two
players for the team. The table below shows the number of points each of
the players scored in his last 10 games.

Name of player: Number of points scored in last 10 games

Geltz: 18, 32, 28, 18, 14, 28, 10, 16, 36, 20
Luna: 22, 17, 23, 8, 24, 24, 22, 20, 18, 22


a. Find the mean (average) number of points scored by each player. Show or describe how you found the means.

b. Find the median number of points scored by each player. Show or describe how you found the medians.

c. Based on the data, which player would you recommend for the All-Star team? Explain your recommendation. Use the data and include a comparison of the means and medians you calculated.

The answers are:

a. Geltz averaged 22.0 points per game; Luna averaged 20.0 points per game.



b. Geltz's median score was 19.0 points; Luna's was 22.0 points. (In case you don't remember what a median is: when there are 10 scores, the median of the data set would be the average of the 5th and 6th-highest scores.)


c. The preferred answer is that Luna is better because his higher median proves he is more “consistent.” It is also OK to state that Geltz is better because his average per game is higher (but you must still point out that Luna's median was higher). Stating that Geltz is better because he had four very high scores (two 28s, a 32 and a 36) is not considered a correct answer to part c.
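The answers to parts a and b are easy to check. The following is a minimal sketch using Python's standard `statistics` module; the variable names are my own, and the scores are copied from the table in the prompt.

```python
from statistics import mean, median

# Points per game, copied from the table in the prompt.
geltz = [18, 32, 28, 18, 14, 28, 10, 16, 36, 20]
luna = [22, 17, 23, 8, 24, 24, 22, 20, 18, 22]

results = {}
for name, scores in (("Geltz", geltz), ("Luna", luna)):
    # median() sorts a copy of the data; with 10 values it averages
    # the 5th and 6th values of the sorted list.
    results[name] = (mean(scores), median(scores))
    print(name, "mean:", results[name][0], "median:", results[name][1])
```

Geltz comes out ahead on the mean (22 versus 20) and Luna on the median (22 versus 19), which is exactly the tension part c asks the student to discuss.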



Some of the things which make this question typical are:

  1. It is tenuously linked to something the kids are interested in, i.e., basketball. However, the data provided are quite irrelevant and incomplete. A real-life basketball coach would never just look at the point totals from the last ten games and nothing else. There's nothing in here about how many minutes the players played, how many rebounds they got, how well their teams as a whole played, who won those games, etc. (Students are trained not to point out this type of flaw. The few smart-alecks who do bother to point out such flaws usually answer the question as asked, before stating at the bottom that the premise is all wrong. They don't get extra credit for doing so.)

  2. The student is asked to make a totally arbitrary choice. Here, the hypothetical coach has exactly two players to choose from and must choose exactly one of the two, based strictly on a limited amount of irrelevant and incomplete data. (In real life, a coach would be able to choose a dozen or so players out of all the players in the league, and in addition to having a lot more statistical data than this coach has, the real-life coach would have been able to see the players in action.)

  3. The question is designed to test no more than two or three closely-related skills and/or concepts. In this case, it is designed to make sure students understand the difference between means and medians.

  4. The scoring criteria will penalize the student if she tries to introduce a concept she is not being tested on. In the case of the Geltz & Luna question, an occasional student tried to treat the two sets of scores as two time series, i.e., she assumed that the game scores were in chronological order. We marked these students down, because the prompt doesn’t actually come out and say the numbers are in anything other than random order, and also because students are being tested on the skill of “central tendency measurement”, not on time series analysis. (And, in any case, time series analysis is college-level math, and we are testing junior- or senior-high math, not college math.)

  5. The answers almost always come out to be round numbers: in this case, the two means and the two medians are whole numbers.

  6. The test developer always deliberately places a distractor or two in the prompt. In this case, for example, the intended distractor is the outlier where Geltz scored 36 points in one game. (Ironically, in real life, the outliers would be useful information. A coach would see that Geltz twice scored more than 30 points in a game, while Luna never made more than 24 points in a game.) An unintended distractor might be the fact that the mode of Luna's scores (i.e., the most common value) is 22, like his median; Geltz's scores, on the other hand, have two modes, 18 and 28, neither of which matches his median of 19.

  7. The question tries to be culturally neutral (e.g., one player has a Hispanic-sounding name) but it doesn't quite succeed in being neutral, since you can't answer the question if you don't know that basketball is one of those games where the higher score is the winning one. (On the other hand, too much knowledge about basketball can be confusing, since the situation in the question is totally artificial.)
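The distractors mentioned in item 6 (the modes and the high-scoring games) can be tabulated the same way. This is only a sketch with variable names of my own choosing; `statistics.multimode` requires Python 3.8 or later, and it returns every value tied for most frequent, which matters here because two of Geltz's scores are tied.

```python
from statistics import multimode

# Points per game, copied from the table in the prompt.
geltz = [18, 32, 28, 18, 14, 28, 10, 16, 36, 20]
luna = [22, 17, 23, 8, 24, 24, 22, 20, 18, 22]

# multimode() lists all values that share the highest frequency.
geltz_modes = sorted(multimode(geltz))
luna_modes = sorted(multimode(luna))

# Games in which each player scored more than 30 points.
geltz_over_30 = [s for s in geltz if s > 30]
luna_over_30 = [s for s in luna if s > 30]

print("Geltz modes:", geltz_modes)            # [18, 28]
print("Luna modes:", luna_modes)              # [22]
print("Geltz games over 30:", geltz_over_30)  # [32, 36]
print("Luna's best game:", max(luna))         # 24
```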


In any case, here are examples of answers for each score point from 4 down to 1. The MCAS people, understandably, tried to leave the impression that these questions were scored on a much finer scale than 0 through 4. (I will arbitrarily assume that these are female students talking about male basketball players.)



4: “Advanced”

This kid nailed the question. She chose the preferred answer, which was that Luna is better because his higher median shows he was “more consistent” notwithstanding the fact that Geltz had a higher average. She could have gotten just as high a score by stating that Geltz scored more points, but she would still have had to refer to Luna's higher median.






3: “Proficient”

This kid missed out on a 4 primarily because she left out one of Luna's three scores of 22 when calculating the median. She also missed out by just stating that Geltz is better because of his higher average without saying why this is preferable. (The previous student's answer was preferable because she gave a reason, albeit a vague one, for preferring the player with the higher median over one who scored more points.)






2: “Needs Improvement”

This is a classic two: the student made it halfway to a right answer. She calculated the means correctly, but she messed up the medians by simply averaging the scores for the 5th and 6th games as shown in the (presumably chronological) order in the prompt. Also, her attempt to reconcile the different means and medians for each player by adding them up to make an overall score was deemed “inappropriate to solve this particular problem” even though she “shows some logical reasoning.”
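To see how far off this mistake lands, here is a small sketch that compares the unsorted shortcut with the real median. The helper name `middle_pair_in_given_order` is hypothetical, and the sketch assumes the student averaged the 5th and 6th values exactly as listed in the prompt.

```python
from statistics import median

# Points per game, in the order given in the prompt.
geltz = [18, 32, 28, 18, 14, 28, 10, 16, 36, 20]
luna = [22, 17, 23, 8, 24, 24, 22, 20, 18, 22]

def middle_pair_in_given_order(values):
    """Average the two middle values WITHOUT sorting (the student's shortcut)."""
    mid = len(values) // 2
    return (values[mid - 1] + values[mid]) / 2

for name, scores in (("Geltz", geltz), ("Luna", luna)):
    wrong = middle_pair_in_given_order(scores)
    right = median(scores)  # sorts first, then averages the middle pair
    print(f"{name}: prompt order {wrong}, sorted {right}")
# Geltz: prompt order 21.0, sorted 19.0
# Luna: prompt order 24.0, sorted 22.0
```

The shortcut overstates both medians by 2 points, although it happens to leave Luna ahead of Geltz.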








1 or 0: “Failing”

The MCAS people are a little too harsh on this student. The student supposedly “answered none of the parts correctly and clearly doesn’t understand any of the concepts.” Actually, she does understand that she should add up Geltz's and Luna's scores for the 10 games to see who scored the most points; so, she is more than halfway to understanding how to find the mean. She writes down all 10 of Geltz's scores, but fails to include his first two games in her addition. She arrives at a total of 170 (i.e., the correct total for Geltz's Games 3 through 10). She fails to write down one of Luna's 24-point games, and adds the remaining nine games correctly for a total of 176. And she correctly notes that a higher total (allegedly, Luna's) is better. I would be very tempted to give this student a very low 2, because she does understand how to calculate the total score, which is just as good as a mean score when the number of games is the same.

It is answers like these which cause a lot of “arbitrations” where scorers disagree by 2 or more score points. You could make a case, as I just did, that this is a low 2. You could also make a case that this is a Zero, because she didn't do the addition right, didn't divide the total by the number of games, and didn't even attempt to calculate a median. My guess is this response was actually intended to be an exemplar of a “1” score rather than a “0.”
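The student's two totals can be reproduced with a few lines of arithmetic. This is a reconstruction under my own assumptions (the variable names are invented, and it does not matter which of Luna's two 24s she dropped), not a record of her actual work.

```python
# Points per game, in the order given in the prompt.
geltz = [18, 32, 28, 18, 14, 28, 10, 16, 36, 20]
luna = [22, 17, 23, 8, 24, 24, 22, 20, 18, 22]

# Geltz: games 1 and 2 were left out of the addition.
geltz_student_total = sum(geltz[2:])

# Luna: one of the two 24-point games was never written down.
luna_without_one_24 = luna.copy()
luna_without_one_24.remove(24)  # removes only the first 24
luna_student_total = sum(luna_without_one_24)

# Correct totals for comparison.
geltz_total, luna_total = sum(geltz), sum(luna)
games = len(geltz)

print(geltz_student_total, luna_student_total)  # 170 176
print(geltz_total, luna_total)                  # 220 200

# With the same number of games, ranking by total and ranking by mean agree.
print((geltz_total > luna_total) == (geltz_total / games > luna_total / games))  # True
```

With the omissions, Luna appears to lead 176 to 170; with all 20 scores included, Geltz leads 220 to 200, so her method was sound but her bookkeeping reversed the result.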










The Forgotten Liars