
The regression trap

It is tricky and difficult to spot: even intelligent people trained in formal thinking all too often fall into the "regression trap" - the cause of many reasoning errors and misjudgments. Klaus Fiedler of the Psychological Institute of Heidelberg University explains the universal principle of regression with surprising everyday examples and shows how this serious source of error in human thinking can be recognized and avoided. For his research, Klaus Fiedler recently received the renowned Leibniz Prize of the German Research Foundation.

When psychologists speak of "regression", they usually mean a relapse to an earlier stage of development. Just as exciting and provocative as this psychoanalytic reading of the term is the concept of "statistical regression" - the subject of this article. As dry and formal as this term from mathematical statistics may appear at first, it is of great importance for understanding and overcoming errors in reasoning and misjudgments. The "regression trap" is tricky and difficult to spot; even intelligent people trained in formal thinking all too often overlook it. Mastering it requires an unfamiliar way of thinking that people intuitively resist. And that is exactly what makes this hidden trap so dangerous.

To illustrate, consider an anecdote about Francis Galton. The cousin of the famous naturalist Charles Darwin meticulously recorded the height differences between fathers and sons and found that the sons of very tall fathers were usually shorter, and the sons of very short fathers usually taller. By this logic, tall and short people should become more alike over the generations. At the same time, Galton established the opposite: very tall sons have, on average, shorter fathers, and very short sons have taller fathers. By this logic, the differences between tall and short people should keep increasing over the generations (see figure on page 18, left).

The resolution of this paradox is understandable even for non-mathematicians. Whenever the relationship between two measured variables - for example between the heights of sons and fathers - is not perfect, a plot of the second measurement as a function of the first is regressive. That is, data points that were extreme on the first measurement move toward the center on the second: initially high values come out lower on the second measurement, and low values usually come out higher. The regression line (see figure on page 18, bottom right) therefore has a slope of less than 1. The height example makes this clear: neither measurement perfectly reflects the underlying common factor - the shared genetic makeup of fathers and sons. Body height depends not only on genes but also on "disturbing factors" that affect fathers and sons differently, for example the mothers' genetic contribution or different nutritional and living conditions. Repeated measurements of empirical quantities that are partly subject to different influences can therefore diverge.

Relative devaluation of minorities through regression

From a mathematical point of view, the regression trap rests on the fact that the errors associated with high, medium and low measured values are not the same. The more extreme a measured value, the more likely it is that the true, error-corrected value is less extreme. Very high measured values tend to be overestimates, very low measured values tend to be underestimates; for average measured values, errors in both directions are equally likely. If the values of the first measurement are plotted on the abscissa (as in the figure on page 18, bottom right) and the regressive values of the second measurement on the ordinate, large values become smaller and small values larger. The strength of the effect is easy to quantify. If both measurements x and y are expressed as deviations from the mean, then in expectation y = x · R, where the reliability R measures how strongly the two variables capture something in common. If the first measurement lies 10 above the mean (x = +10) and the reliability is R = 0.60, the second measurement regresses in expectation to y = +10 · 0.60 = +6 (only 6 above the mean). Likewise, an extremely low measured value of 20 below the mean (x = -20) regresses to y = -20 · 0.60 = -12. Regression is thus stronger for extreme values than for moderate ones, and stronger for unreliable measurements (small R) than for reliable ones.
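The expected-value formula above can be sketched in a few lines of Python; the function name is invented here, and the numbers are simply the worked examples from the text:

```python
def regressed(x: float, reliability: float) -> float:
    """Expected deviation from the mean on a second measurement,
    given the deviation x on the first measurement: y = x * R."""
    return x * reliability

# The worked examples from the text:
print(regressed(10, 0.60))   # +10 above the mean regresses to +6
print(regressed(-20, 0.60))  # -20 below the mean regresses to -12
```

Note that the regressed value always lies on the same side of the mean as the original, only closer to it; regression never "overshoots" to the opposite extreme.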

Illustration of statistical regression using fictitious data on the heights of sons and fathers. Extreme (very tall or very short) fathers have, on average, less extreme sons. Conversely, extreme sons also have less extreme fathers.

What makes regressive processes so hard to grasp intuitively is that they run counter to the prevailing trend: every high measured value has room to fall, and every low measured value has room to rise. Nothing illustrates this difficulty better than the current favorite toy of human intelligence, the stock market. If a stock outperformed the market for a year, investors conclude that the stock is superior and buy it; if a stock performed worse than the rest of the market, they look for a reason in the stock's weakness. Often, however, the current rise or fall turns out to be a "stationary" random fluctuation: normal regression makes high values fall again and low values rise again. For one market segment (newsletters) it could be shown that systematically buying papers that had beaten the market as a whole yielded a profit of 95 percent over 15 years, whereas systematically buying unsuccessful papers yielded a profit of 330 percent (the market itself, incidentally, gained 550 percent over the same period).
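A minimal simulation makes the point. It assumes, purely for illustration, that yearly returns are independent noise around a common mean of 5 percent with a standard deviation of 10; none of these numbers come from the article:

```python
import random

random.seed(1)
# 1000 "stocks" whose yearly returns are pure noise around a common mean:
year1 = [random.gauss(5.0, 10.0) for _ in range(1000)]
year2 = [random.gauss(5.0, 10.0) for _ in range(1000)]

# Select the stocks that beat the market clearly in year one...
winners = [i for i, r in enumerate(year1) if r > 15.0]
mean_then = sum(year1[i] for i in winners) / len(winners)
mean_now = sum(year2[i] for i in winners) / len(winners)

# ...and observe that, as a group, they fall back toward the common mean,
# even though nothing about them has changed.
print(mean_then, mean_now)
```

Because the returns here are stationary noise, the apparent "superiority" of the winners is entirely a selection artifact; their expected second-year return is the same 5 percent as everyone else's.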

On average, the sons of very tall fathers are shorter, while the sons of very short fathers are taller. The regression line therefore has a slope of less than 1 (fictitious data).

Intuitively, people tend to infer positive traits from positive outcomes and negative traits from negative outcomes. The principle of regression, however, demands exactly the opposite. An example from child-rearing illustrates the conflict: teachers and educators infer positive characteristics from a child's positive behavior and praise the child; from bad behavior they infer bad qualities and punish the child. This prevailing tendency to infer internal characteristics distracts from the fact that children's behavior also depends on many external influences and random fluctuations. Exemplary behavior is often followed by less positive behavior; after particularly bad behavior, things can only get better. These perfectly normal regressive fluctuations create the grave illusion that punishment is more effective than reward: if bad behavior is punished, the regression that follows suggests the punishment was effective, whereas rewarding positive behavior seems only to breed ingratitude, because here regression means deterioration. That many teachers and educators incline more and more toward strict measures over time is therefore partly based on a statistical illusion.
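The illusion can be reproduced in a toy simulation in which praise and punishment have, by construction, no effect at all; the thresholds and sample size below are arbitrary choices of this sketch:

```python
import random

random.seed(0)
N = 10_000
# Each observation of behavior is pure day-to-day noise around zero;
# no praise or punishment influences anything in this model.
first = [random.gauss(0.0, 1.0) for _ in range(N)]
second = [random.gauss(0.0, 1.0) for _ in range(N)]

punished = [i for i in range(N) if first[i] < -1.5]  # very bad behavior
praised = [i for i in range(N) if first[i] > 1.5]    # very good behavior

after_punish = sum(second[i] for i in punished) / len(punished)
after_praise = sum(second[i] for i in praised) / len(praised)

# Behavior "improves" after punishment and "deteriorates" after praise,
# although neither intervention did anything - pure regression.
print(after_punish, after_praise)
```

The selected groups simply return to the average on the next observation, which a naive observer reads as punishment working and praise backfiring.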

Because there are fewer observations about another group than about one's own, the regression effect is stronger for the outgroup. The result is a less differentiated picture of the outgroup (fewer supra-threshold attributes, highlighted in dark blue).

A third example is the replication of scientific research. If an experiment produces a new result, it should be possible to replicate it. Researchers are often disappointed, or discard the innovative finding, when it no longer shows up as clearly on the second attempt. In fact, exactly this is to be expected: since empirical measurements are always error-prone, the results of a replication study are weaker in expectation.

Average frequency estimates for three diamonds, six triangles and twelve circles, in which the triangles are either treated as a single category (left graph) or split into two subcategories (right graph).

In the following, a number of psychological findings that are traditionally explained quite differently are explained alternatively by regression effects. The devaluation of minorities is a particularly memorable phenomenon traditionally attributed to completely different causes: racist prejudice, ethnic conflict, or the special attention-grabbing value of a minority's missteps. It is now known, however, that the tragedy of minorities rests to a large extent on a statistical illusion, as controlled experiments demonstrate. Test participants are shown positive and negative behaviors of members of two groups. As in the real world, positive behavior occurs more frequently than negative (norm-violating) behavior, and of course there are more observations of the larger group (the majority) than of the small one (the minority). In a typical experiment the majority shows 18 positive and 8 negative behaviors, the minority 9 positive and 4 negative. Both groups behave positively with the same relative frequency: 18 / (18 + 8) = 9 / (9 + 4) = 69 percent. Nevertheless, such experiments show that the minority is rated worse than the majority. The reason lies in the regressivity of human memory. Because the memory for the 39 = 18 + 8 + 9 + 4 observations is not error-free, normal regression occurs: information is lost, and the ratio of 69 percent positive to 31 percent negative behaviors is remembered less extremely. This regression effect, however, hits the minority harder than the majority. Because of the small number of observations of the minority, reliability is extremely low. While the remembered ratio for the majority is around 60 percent positive to 40 percent negative, for the minority the difference is almost completely lost owing to the fewer observations (see figures on page 18, top right).
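A toy model of regressive memory makes the asymmetry concrete. The assumption that reliability grows with sample size as n/(n + k), and the value k = 25, are illustrative choices of this sketch, not values taken from the experiments described:

```python
def remembered_share(p_true: float, n_obs: int, k: float = 25.0) -> float:
    """Toy model: the remembered proportion shrinks toward 50 percent.

    Reliability grows with the number of observations n_obs; the form
    n / (n + k) and the constant k are assumptions of this sketch.
    """
    reliability = n_obs / (n_obs + k)
    return 0.5 + reliability * (p_true - 0.5)

p = 18 / 26  # = 9 / 13: both groups are 69 percent positive
majority = remembered_share(p, n_obs=26)  # 18 + 8 observations
minority = remembered_share(p, n_obs=13)  # 9 + 4 observations

# The majority retains more of its positive surplus than the minority,
# although both groups behaved identically.
print(round(majority, 3), round(minority, 3))
```

With identical input proportions, the smaller sample alone pushes the minority's remembered share closer to 50:50, which is precisely the devaluation observed in the experiments.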

Minorities thus find themselves in a tragic role: even if they show well-adapted behavior at the same rate, and even if there are neither prejudices nor group conflicts in society, minorities are still devalued because of a statistical illusion. The preponderance of positive over negative behaviors is less clearly visible for minorities than for majorities, because fewer observations reduce reliability and thus cause more regression.

The same principle provides a disarmingly simple explanation for the phenomenon of discrimination against outgroups and the simultaneous upgrading of one's own group. That members of one's own gender, one's own nationality or one's own football club are rated more positively than the opposite sex, a foreign nation or a rival club is traditionally explained by self-serving motives. Under the plausible assumption that there is generally less data about outgroups than about ingroups, one must conclude that regression harms one's own group less than the other: the prevailing preponderance of positive behavior is underestimated or overlooked entirely when the observation sample is impoverished, and this is especially the case for foreign groups. The often-cited finding that prejudice and discrimination are reduced by increased contact and experience with foreign groups is fully compatible with this statistical explanation.

Another frequently discussed phenomenon is the homogenization of outgroups: the simplifying tendency to lump everything together when judging foreign groups and to differentiate less than with one's own group. This aspect of social stereotypes is usually explained motivationally (by the desire for individuality) or by qualitatively different memory representations of outgroups and ingroups. The simplest explanation of all, the principle of regression, has been almost completely overlooked; it requires no further assumptions about motivation or memory. Simply assume that outgroup and ingroup do not actually differ in their differentiation; that is, the number and variety of frequently occurring characteristics (symbolized by the number of columns in the figure on page 19, left) is the same for both groups. The recording of these characteristics, however, is not error-free, so the true expression of many characteristics is regressively underestimated in subjective perception. Since regression is particularly pronounced for foreign groups because of the small database, the subjectively perceived expression of many characteristics in the outgroup remains undetected (i.e. below a certain threshold; light blue column bases in the figure on page 19, left). The larger experience samples for the ingroup mean that a larger number of features are recognized above the threshold (dark blue column ends), so that a more differentiated, feature-rich picture emerges.
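This threshold account can be sketched in a few lines. The trait strengths, reliabilities and threshold below are invented for illustration, and perceived strength is simply shrunk toward zero, a deliberate simplification of the regression mechanism:

```python
def traits_above_threshold(true_strengths, reliability, threshold):
    """Count traits whose regressed perceived strength clears the threshold."""
    perceived = [reliability * s for s in true_strengths]
    return sum(1 for p in perceived if p > threshold)

# Both groups have exactly the same true trait profile...
traits = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

# ...but the ingroup is observed more often, so its perception is
# more reliable (less regression) than the outgroup's.
ingroup = traits_above_threshold(traits, reliability=0.9, threshold=3.0)
outgroup = traits_above_threshold(traits, reliability=0.5, threshold=3.0)
print(ingroup, outgroup)  # more traits remain visible for the ingroup
```

With identical underlying profiles, the stronger shrinkage for the outgroup pushes more of its traits below the detection threshold, yielding the impoverished, homogeneous impression the text describes.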

Regression is a universal principle that applies not only to subjective judgments but also to statistics and opinion polls. Just as erosion gradually levels the difference between mountain and valley, error influences (measurement inaccuracy, forgetting, and so on) erode the differences between high and low measured values or frequencies. The actual difference between very large and very small risks is underestimated in expectation, as is the difference between strong and weak students in teachers' assessments or the difference in quality between consumer goods. This statistical tendency does not always amount to an erosion-like leveling; it can also contribute to systematic overestimation or underestimation - and can be exploited for statistical manipulation. The "category split effect" is an example. Suppose a car dealer wants to convince a customer that the network of Japanese dealers in Europe is already very dense, and therefore asks the customer to estimate how many Japanese cars are driving on European roads today. Another dealer in the same situation asks his customers to estimate how many Mazdas, Hondas, Nissans, Mitsubishis and Suzukis there are in Europe, and then adds up the estimates together with the customer. Which of the two car dealers is more successful in convincing his customers of the density of Japanese cars? It is surely the latter, who (without knowing it) exploits the category split effect in his favor. If you split a moderately frequent or moderately rare category (Japanese cars) into a larger number of rare or very rare subcategories (Mazda, Honda, and so on), the sum of the subcategory estimates exceeds the estimate for the original category, because regression causes the smaller, rarer subcategories to be systematically overestimated relative to the larger original category.

This can easily be demonstrated in any psychology lab course. Show test subjects a longer series of geometric shapes - say twelve circles, six triangles and three diamonds - in random order, and then have them estimate the frequency of the three shapes. A typical answer might be: nine circles, six triangles and five diamonds - a perfectly normal regressive response. If, however, the six triangles are split into three equilateral and three non-equilateral ones, the estimates are typically: eight circles, five equilateral triangles, five non-equilateral triangles and five diamonds. Splitting the original six triangles into two smaller subcategories has raised the estimate to 5 + 5 = 10 triangles (see the two figures on page 19, right).
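A simple shrink-toward-the-mean model, with an assumed reliability of 0.5 (a value chosen for this sketch, not reported in the text), reproduces the effect qualitatively:

```python
def estimate(true_counts, reliability=0.5):
    """Regressive frequency estimates: shrink each count toward the mean count."""
    mean = sum(true_counts) / len(true_counts)
    return [mean + reliability * (c - mean) for c in true_counts]

# Unsplit: twelve circles, six triangles, three diamonds.
circles, triangles, diamonds = estimate([12, 6, 3])

# Split: the six triangles become two subcategories of three each.
c2, tri_a, tri_b, d2 = estimate([12, 3, 3, 3])

# The two small triangle subcategories are each overestimated, so their
# sum exceeds the single estimate for the unsplit triangle category.
print(triangles, tri_a + tri_b)
```

The mechanism is visible in the arithmetic: splitting lowers the mean of the category list, so each small subcategory sits further below the mean and is pulled up more strongly, and the pull-ups add up.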

What holds for this harmless demonstration can have serious economic and political effects. A manufacturer can increase its market share for product X by offering two similar products X1 and X2. Or the overall influence of a radical political orientation can grow (with the same voter potential) if a radical party splits into two parties. From the foregoing it should be clear that such effects are to be expected precisely when buyers or voters are not fixed in their choices but subject to random, "error-like" influences.

The list of statistical illusions created by regression could be continued for a long time. Some are of great practical importance and by no means mere academic curiosities. Unrecognized or misunderstood regression effects are at work in the casino, in the interpretation of political elections, and in naive debates about heredity and environment (where regression is often misread as evidence for the superiority of genetic influences). The examples presented here are only meant to give a first glimpse of a serious source of error in human thinking. One aim of this article is to stimulate the imagination of interested readers so that they themselves pay more attention to the many regression traps set in so many places in the social, physical, political and economic world. The benefit of being better informed about statistics is sometimes considerable. The cost of stepping into the regression trap can only be guessed at.

Author:
Prof. Dr. Klaus Fiedler
Psychological Institute, Hauptstrasse 47, 69117 Heidelberg,
Telephone (0 62 21) 54 72 70, fax (0 62 21) 54 77 45