Be wary of stylized facts

Combinations of stylized facts from various papers can often produce curious conclusions.

In The Innovation Premium to Soft Skills, Philippe Aghion documents an interesting finding. The wage premium for low-skilled workers at an innovative firm is large, but the corresponding wage premium for high-skilled workers is non-existent.


To explain this finding, Aghion develops a model in which the complementarity between low-skilled and high-skilled workers is higher in firms that innovate more. Why might this be? Workers of all types have both “hard” and “soft” skills. Hard skills can be easily verified. (Did you in fact pass the California bar exam?) Soft skills take time to recognize. Think of the trustworthy butler at a wealthy person’s country home. It takes time for the homeowner to ascertain the trustworthiness of his butler. Once he realizes that he has a good butler, he may pay him much more and be loath to lose him.

The story makes perfect sense and the model rationalizes both the story and the empirical finding.

David Autor pointed out the following empirical fact: the urban wage premium has collapsed for workers in low-skilled occupations.

If we combine Autor’s and Aghion’s empirical facts, we would naturally conclude that innovative firms do not concentrate in urban areas. If they did, then the wage premium for low-skilled workers in innovative firms would translate into an urban wage premium for the same workers. But this flies in the face of a third stylized fact: innovative firms tend to cluster in areas like San Francisco and Seattle.

Hence, our three stylized facts produce a contradiction.

There are a few ways out of this mess. The first is that Autor was looking at data in the United States and Aghion was looking at data in the UK. Maybe, then, there is no flattening of the urban wage premium for low-skilled workers in the UK. But Aghion’s model has nothing country-specific about it. If you take his model seriously, then it should also result in higher wages for low-skilled workers at innovative firms in the US, and if innovative firms cluster in urban areas, then there should be a positive urban wage premium in the US. So, if you believe all three stylized facts, you would have to conclude that Aghion has not identified the correct mechanism to explain his observation about wages in the UK.

Another way to reconcile these facts is that R&D spending may not in fact be a good proxy for how innovative a firm is. Aghion defines a firm as innovative if it engages in R&D, yet John Haltiwanger likes to point out that Walmart is one of the world’s most innovative firms even though it spends nothing on R&D. This semantic solution is not satisfactory either, however. Tech firms disproportionately engage in R&D and are disproportionately located in urban areas, so if tech firms have a high degree of complementarity between low- and high-skilled workers, then we should see wages for low-skilled workers increasing in urban areas, which is of course not what we see.

The third possibility is that urban wages for low-skilled workers would actually be even lower in the absence of urban R&D-intensive firms. In the absence of these firms, the wage gradient for low-skilled workers wouldn’t just be flat: it would slope downward as population density increases and we move from rural to urban areas. Then we would have to ask: what is pushing urban wages lower than rural wages for the same work? Have the new minimum wage laws had no effect? Or are they not showing up yet in the data?

I don’t have a good answer for how to reconcile these three stylized facts, but the puzzle shows the peril of combining conclusions that are largely true when you slice the data one way with conclusions that are largely true when you slice the data another way. It reminds me of the many examples in probability theory that violate transitivity. Consider the following puzzle:

Is it possible to have random variables X, Y, and Z for which simultaneously Prob(X>Y) > ½, Prob(Y>Z) > ½, and Prob(Z>X) > ½?

The answer is, surprisingly, yes. In fact, it’s even possible to have Prob(X>Y) = 0.6, Prob(Y>Z) = 0.6, and Prob(Z>X) = 0.6. This means that it is possible for three stylized facts to be true on average, even if they violate transitivity (and seem contradictory). But it also means that the stylized facts are not capturing a lot of other important information that paints a more detailed view of the phenomenon you are trying to explain.
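For the skeptical reader, here is one way to build such a trio. This is only a sketch of my own construction, not the canonical one: the variables are dependent, each outcome is a ranking of X, Y, and Z, and the weights 4/15 and 1/15 and the function names are assumptions chosen so that every pairwise probability comes out to exactly 0.6.

```python
from fractions import Fraction

# Each outcome is a triple (x, y, z) giving the values of X, Y, and Z.
# In the "cyclic" rankings, two of the three comparisons
# X > Y, Y > Z, Z > X hold; in the "anticyclic" rankings, only one holds.
CYCLIC = [(3, 2, 1), (1, 3, 2), (2, 1, 3)]
ANTICYCLIC = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]


def joint_distribution():
    """Illustrative joint distribution over (X, Y, Z); weights are assumed."""
    dist = {outcome: Fraction(4, 15) for outcome in CYCLIC}
    dist.update({outcome: Fraction(1, 15) for outcome in ANTICYCLIC})
    return dist


def pairwise_probabilities(dist):
    """Return (Prob(X>Y), Prob(Y>Z), Prob(Z>X)) as exact fractions."""
    p_xy = sum(p for (x, y, z), p in dist.items() if x > y)
    p_yz = sum(p for (x, y, z), p in dist.items() if y > z)
    p_zx = sum(p for (x, y, z), p in dist.items() if z > x)
    return p_xy, p_yz, p_zx


if __name__ == "__main__":
    dist = joint_distribution()
    for label, p in zip(("X>Y", "Y>Z", "Z>X"), pairwise_probabilities(dist)):
        print(f"Prob({label}) = {float(p)}")
```

Each comparison holds in two of the cyclic rankings and one of the anticyclic ones, so each probability is 2(4/15) + 1/15 = 3/5. Notice that no single outcome satisfies all three comparisons at once. The pairwise summaries are each true on average, but they hide the joint structure, which is exactly the information the stylized facts throw away.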