A germ of conventional wisdom that has emerged in the last decade is the claim that diversity in corporate boards and senior executives has been proven to improve company performance. The New York Times claims, “A growing body of research shows that more diverse teams outperform their peers.” The Nasdaq market justified diversity disclosure rules with the statement, which the SEC accepted, “There is a compelling body of credible research on the association between company performance and board diversity.”
Unlike most “studies prove” bandwagons, this one can be traced clearly to an influential 2015 study by the consulting firm McKinsey & Company. The study has been updated three times, and each time the claimed improvement is larger.
Among scientific researchers, finding that something is bigger each time you measure it is a sign you might not understand what’s going on. Only among non-scientific ideologues is continuing escalation of claims proof of their validity.
The BBC used the third incarnation, “A 2020 McKinsey & Company analysis of 1,000 US firms showed companies with more gender diversity within their leadership teams were 25% more likely to have higher profits than their peers who did not.” The 2015 original was cited in California Assembly Bill No. 979 (overturned on May 15, 2023 for being “a race-based quota”) which mandated diversity in corporate boards, “According to a report by McKinsey and Company, for every 10 percent increase in racial and ethnic diversity on the senior-executive team, earnings before interest and taxes rise 0.8 percent.”
The McKinsey tetralogy, and a similar study by one of its competitors, Boston Consulting Group, continue to be the main sources because they make the most straightforward claims. That’s possible because they do not conform to academic publication standards. There is an extensive academic literature, but none of it bears directly on the question and, in any event, the weight of it suggests diversity is either unrelated to or bad for corporate performance.
After opening by asserting they already know “intuitively” that diversity improves corporate performance, McKinsey leads with this blatant contradiction, “While correlation does not equal causation (greater gender and ethnic diversity in corporate leadership doesn’t automatically translate into more profit), the correlation does indicate that when companies commit themselves to diverse leadership, they are more successful.”
The first phrase and the parenthetical are correct; the last phrase contradicts them and is wrong. All that observational studies like the McKinsey one can show is that two things are associated: companies with greater diversity tend to be more profitable. They cannot show that when companies commit to diversity, profits increase. It could equally well be true that when companies earn higher profits they increase diversity, or that some other factor (such as having brand names, or operating internationally) leads to both diversity and profits.
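The third-factor possibility is easy to illustrate with a toy simulation. The Python sketch below uses entirely made-up numbers, not McKinsey data: a hypothetical “brand strength” variable drives both a diversity score and a profit margin, while diversity has no effect on margin at all. The variable names and the simple additive model are assumptions chosen only for illustration.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

rng = random.Random(0)  # fixed seed so the run is repeatable
n_firms = 5000          # hypothetical number of simulated firms

# Hypothetical confounder: brand strength of each simulated firm.
brand = [rng.gauss(0, 1) for _ in range(n_firms)]

# Both outcomes depend on brand plus independent noise.
# Note that margin is NOT a function of diversity anywhere.
diversity = [b + rng.gauss(0, 1) for b in brand]
margin = [b + rng.gauss(0, 1) for b in brand]

r = pearson(diversity, margin)
print(f"correlation between diversity and margin: {r:.2f}")
```

In this setup the correlation comes out clearly positive (the model implies a value near 0.5) even though forcing a simulated firm to change its diversity score would leave its margin untouched.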
Another issue is McKinsey chose a perverse measure of diversity, the Herfindahl–Hirschman index. Most people think a diverse board should reflect population percentages of individuals. But the H-H index mandates that boards and executive suites be composed of the same number of Blacks, Non-Hispanic Whites, Hispanics, Near Easterners, East Asians, South Asians, Native Americans and Other.
To see what this means, imagine a company identified 200 candidates who were a perfect cross-section of the US population for eight board seats. The company would be required to take the one Native American, regardless of qualifications. They’d have their choice of one of two Near Easterners, one of three South Asians and one each of eight East Asians and eight Others (mostly mixed race). They could pick one each from 23 Blacks, 38 Hispanics and 117 Whites.
This is the widely despised strategy of “tokenism,” not most people’s idea of diversity. And as you increase the number of groups to represent by incorporating sexual identity, religion, immigration status and other factors you end up with every individual unique, and every board equally diverse because it has no two identical members.
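To make the arithmetic concrete, here is a minimal Python sketch of the index, assuming it is computed in the standard way as the sum of squared group shares (a lower value counts as more “diverse”). The group counts come from the 200-candidate example above. The 5-2-1 board is a hypothetical rounding of population shares to eight seats, and the function name `hhi` is mine, not McKinsey’s.

```python
def hhi(counts):
    """Herfindahl-Hirschman index: sum of squared group shares."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)

# Candidate pool from the example above (200 people, eight groups).
pool = {
    "Native American": 1,
    "Near Eastern": 2,
    "South Asian": 3,
    "East Asian": 8,
    "Other": 8,
    "Black": 23,
    "Hispanic": 38,
    "White": 117,
}

boards = {
    "one member per group": [1] * 8,
    # Hypothetical rounding of population shares to eight seats.
    "roughly proportional (5 White, 2 Hispanic, 1 Black)": [5, 2, 1],
    "all from one group": [8],
}

print(f"candidate pool: HHI = {hhi(list(pool.values())):.4f}")
for label, counts in boards.items():
    print(f"{label}: HHI = {hhi(counts):.4f}")
```

Under this scoring the one-per-group board gets the lowest possible index for eight seats (0.125), lower than the candidate pool itself (about 0.395), while a board that roughly mirrors the pool scores about 0.469 and so counts as much less diverse. Any eight members who each fall in a different category get the same 0.125, however the categories are drawn.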
The McKinsey definition of corporate performance is also misleading. They use the fraction of corporate revenue left after operating and overhead expenses, which is used to pay the government and investors, and to reinvest in the business. These are the companies that in other contexts progressives are apt to denounce as “price gougers” and “employee abusers” since they take much more from customers than they pay in wages.
It's true that value investors, such as Warren Buffett, like profitable companies all else equal. But most investors prefer growth companies which keep prices down to get more customers, and spend big on new products and continuous improvements—resulting in lower profit margins but higher revenue and faster profit growth. Profit margin is a strategic choice, not a metric of business success.
Some companies compete on good quality for a low price, while others charge more but maintain name brands customers are willing to pay more for. In the beverage sector, it’s hard to charge more than competitors for generic water, so Primo Water has a 3.60% profit margin, while Coca-Cola enjoys 23.39%—not because it is run better than Primo, but because people will pay more for its name-brand formulations than similar generic products. Ford chugs along on 2.46% margins, Tesla buyers are willing to cough up 15.47%. Hyatt offers a clean, affordable room for 3.30%, Intercontinental Hotels promises luxury and earns 16.22%. Levi Strauss jeans command 4.04% profit, LVMH luxury clothing delivers 18.52%. TripAdvisor’s generic services earn 0.56%, Meta gets 29.41%.
This suggests a plausible reason for the relation McKinsey claims to have found. Luxury companies protecting brand names may choose to burnish their corporate reputations with popular board appointments, while companies focused on minimizing costs to keep customer prices low are worrying about other things.
The final problem with the McKinsey study is statistical significance. McKinsey observed that 47 of 91 companies with the lowest gender-diversity scores had higher profitability than their national-industry average, while 54 of the 91 companies with the highest gender-diversity scores did.
The first thing to note is that the lowest gender-diversity-score companies had slightly more than the expected 45.5 companies with above-median profit, so both diversity and lack of diversity seem to be good for profits. The second point is that the difference could easily be due to random chance even if diversity and profitability are unrelated. There were a total of 101 companies in the two groups that were above median. If we flipped a fair coin to assign each of them randomly to the bottom or top diversity group, there’s a 28% chance that the bottom group would get 47 or fewer of the high-profit firms. By convention you need a chance below 5% to claim a result is statistically significant.
McKinsey claims its results are statistically significant, but does not disclose the methodology or significance level used. In subsequent reports it refers to the 2015 results as significant at the 10% level (double the usual threshold), but the results fail to meet that as well.
The result for what McKinsey calls “ethnic” diversity (which is based on ancestry rather than ethnicity) is slightly better. Of the 101 companies with above-median profits, 43 were from the low-ethnic-diversity group and 58 from the high-ethnic-diversity group. This gives a p-value of 8%, so not statistically significant at the conventional level, but under the 10% threshold claimed in the later publications.
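The coin-flip figures can be checked with a few lines of Python. This is a sketch under the simple model described above: each of the 101 high-profit companies lands in the low-diversity group with probability one half, and we ask how likely it is that the low-diversity group ends up with no more than the observed count. The function name `lower_tail` is my own. These are one-sided tail probabilities, and a two-sided test would double them.

```python
from math import comb

def lower_tail(k, n):
    """Exact P(X <= k) for X ~ Binomial(n, 1/2)."""
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n

n_high_profit = 101  # above-median companies across the two groups

# Gender diversity: 47 of the 101 were in the low-diversity group.
p_gender = lower_tail(47, n_high_profit)

# "Ethnic" diversity: 43 of the 101 were in the low-diversity group.
p_ethnic = lower_tail(43, n_high_profit)

print(f"gender: one-sided p = {p_gender:.3f}")
print(f"ethnic: one-sided p = {p_ethnic:.3f}")
```

The two values come out at roughly 28% and 8%, matching the figures quoted above.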
The Boston Consulting Group study was better in some ways, worse in others. It used a small (98 company) and narrow (only Swiss, German and Austrian companies) sample. On the other hand, it defined diversity not merely by sex or ethnic identification but across six dimensions: sex, country of parents, age, education, work for other companies and work for other sectors. But it used the same flawed methodology in which “diverse” means having the same number in each category. That’s reasonable for sex if you think sex is binary, but not for age, education or country.
BCG defined corporate performance as the fraction of revenue generated from new or enhanced products across the last three years. Like profit margin, this is a strategic choice, not a measure of company performance. Some companies constantly introduce “new and improved” products, others focus on developing long-lasting designs. Some sectors have change forced by technology or fashion, other sectors have stable product lines.
As in most areas of academic research, especially for politicized questions in the social sciences, most published results are wrong. There are good papers relating diversity and performance, but none that I can find addresses the question in the sense it is generally understood in popular media stories and legislative debates, and certainly none provides clear support for the claim that diversity improves corporate performance.
It would not be hard to do a good study—or more accurately, two good studies—on this question. One interpretation of “diversity is associated with improved corporate performance” is that investors should look for diverse companies and avoid non-diverse ones. With decades of high-quality data on many thousands of stocks, it’s not hard to determine whether investors on average do better with stocks in diverse or otherwise-similar non-diverse companies.
This question does not require establishing causality. It doesn’t matter whether diversity causes outperformance, or outperformance causes diversity, or both are caused by some third factor. Investing in diverse companies either wins, loses or makes no difference.
Another advantage is you don’t have to start out with some narrow definition of diversity. You can test lots of diversity metrics to see which ones work best.
The other interpretation of the diversity/performance question is: if companies are forced to diversify, will their performance improve? This is the question relevant for California lawmakers, Nasdaq, the SEC and other bodies pushing diversity as a social goal.
The easy way to study this is to look at the effect of laws or campaigns to increase corporate diversity. Here causality does matter, which is why the studies are natural experiments rather than pure observation.
These studies do require pre-selected definitions of diversity, whichever ones are pushed by the laws or campaigns. But they allow study of many aspects of corporate performance, not just return to shareholders. Presumably the campaigns are not primarily done for shareholder benefit—after all, if shareholders want diversity they can vote for it directly.
We could study each campaign for the effect on things like job growth, wage growth, price changes, environmental policies, customer service—any metrics we think are important.
Both diversity and corporate performance are complex concepts with multiple aspects. How aspects of one interact with aspects of the other, and also with exogenous factors, is certainly going to be complex. Teasing out the more important and stable interactions would be valuable, but no one seems interested in doing it. The results would no doubt be complex and unsatisfying to political partisans. Ideologues, legislators, regulators and popular media are looking for clear, simple answers—and McKinsey and BCG are happy to supply them.
The rest of us, for now anyway, should be content to think about diversity and corporate performance separately, on their own terms, without insisting that we know one causes the other.