Danish Data Scientist: Claudine Gay’s Conclusions in Thesis, Article Questionable; Peer Review “Not That Good”

Disgraced former Harvard University President Claudine Gay, forced to resign after conservative journalists uncovered plagiarism in her academic writings and doctoral thesis, is now under fire for misusing data.

In an interview with Christopher Rufo, the City Journal writer who helped uncover Gay’s literary theft, Danish data scientist Jonatan Pallesen said the numbers in her thesis and in one of her articles don’t support the conclusions that Gay drew. And peer review of such work, Pallesen said, isn’t what it should be.

Worse, Pallesen suggested, Gay’s plagiarism and questionable data might be the result of her own inability to understand her subject matter.

Erroneous Data

Pallesen works for the Danish trade industry. He told Rufo that Gay’s thesis, Taking Charge: Black Electoral Success and the Redefinition of American Politics, and her article The Effect of Black Congressional Representation on Political Participation don’t prove what she claimed.

“The thesis and the paper claim to find that the election of black representatives causes a reduced white voter turnout,” Pallesen told Rufo. “But what they show is only a correlation, not a causal relationship.”

Pallesen explained the problem with an analogy. Consider “the relationship between the presence of black representatives and factors such as average income, the proportion of renters, and black population density,” he continued:

There is also a correlation here, but it would be incorrect to conclude that electing black representatives directly causes higher population density. Instead, it is more likely that areas with a higher black population density have a greater tendency to elect black representatives.

Pallesen said that Gay estimated white voter turnout using “ecological inference,” which Harvard professors define as “the process of extracting clues about individual behavior from information gathered at the group or aggregate level.” It appears, however, that Gay fell into an inferential trap associated with that method.

Pallesen explained where Gay fouled up:

This estimation relies on data such as the total votes cast per precinct, as well as information about the precinct, including average income and the other previously mentioned factors. In step two, a regression shows a correlation between this estimate and black representatives. The paper concludes that black representation has a causal effect on white voter turnout, based on this correlation.

But this has the same basic problem as the simple example. Factors like black population density are likely to influence the tendency to elect black representatives. And since they, by construction, also influence the estimate of the white turnout, this leads to a correlation in the data—without any causal effect from the election of black representatives.

This is very basic. For many people who work with data, such considerations about possible alternative hypotheses are the first thing we think about. But for some reason it was not considered in the paper, which means that the conclusion it makes about causality is invalid.
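Pallesen’s point about a shared underlying factor can be illustrated with a small simulation. The sketch below is a toy example only: it does not use Gay’s data and does not reproduce her ecological-inference method, and every number and name in it (`simulate`, `demo`, the 0.6 and 0.2 coefficients, the noise level) is an invented assumption. In the simulated districts, black population density drives both the chance of electing a black representative and the estimated white turnout, while the representative has no effect on turnout at all.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """Residuals of y after a least-squares straight-line fit on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def simulate(n_districts=5000, seed=0):
    """Hypothetical districts; all coefficients are invented for illustration."""
    rng = random.Random(seed)
    density, black_rep, white_turnout = [], [], []
    for _ in range(n_districts):
        d = rng.random()  # black population share, between 0 and 1
        # Denser districts elect a black representative more often.
        rep = 1.0 if rng.random() < d else 0.0
        # Estimated white turnout depends on density only; rep is deliberately absent.
        turnout = 0.6 - 0.2 * d + rng.gauss(0.0, 0.05)
        density.append(d)
        black_rep.append(rep)
        white_turnout.append(turnout)
    return density, black_rep, white_turnout


def demo(seed=0):
    """Return (raw correlation, correlation after adjusting for density)."""
    density, black_rep, white_turnout = simulate(seed=seed)
    raw = pearson(black_rep, white_turnout)
    adjusted = pearson(residuals(black_rep, density),
                       residuals(white_turnout, density))
    return raw, adjusted


if __name__ == "__main__":
    raw, adjusted = demo()
    print(f"raw correlation: {raw:.2f}")
    print(f"after adjusting for density: {adjusted:.2f}")
```

Under these assumptions, the raw correlation between black representation and white turnout comes out clearly negative even though no causal effect was built in, and it falls to roughly zero once density is accounted for. That is the pattern Pallesen describes: a correlation produced by a common factor rather than by the election of black representatives.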

But that’s not the only problem with Gay’s data. Rufo told Pallesen that Christopher Brunet, who co-authored the original piece about her plagiarism with Rufo, reported that “Gay refused to provide her raw data to researchers who wanted to verify or attempt to replicate her work.” That raises the question of whether Gay fudged her data.

Noting two instances in which Gay’s conclusions are suspect, Pallesen said she “should definitely make this data public.”

“This should be standard scientific practice to begin with, and she has a responsibility to set an example,” he told Rufo. “Additionally, in light of the fraud scandal involving Francesca Gino at Harvard Business School, there is an even greater urgency for an open data policy among researchers.”

Gino has denied the fraud allegation.

Is Gay Out of Her Depth?

“If Gay’s errors are so fundamental, how did they pass through the peer-review process?” Rufo asked Pallesen. “How did they earn her tenure at America’s most prestigious universities?”

Replied Pallesen:

Peer review is simply not that good. There are often issues that are not caught. Even so, I am still surprised that something this egregious was not noticed. It is important that we see science as an ongoing process, and not as one that concludes with peer review.

But Pallesen also suggested that Gay was in over her head and may simply not understand the subject.

“Her scientific output for tenure was thin, even when no problems had been pointed out with this paper or with plagiarism,” he told Rufo. “It is well known that universities give preferential treatment to people based on their race and gender, instead of basing their selection process on merit.”

In other words, Gay — obviously hired because she is black — is yet another victim of “the soft bigotry of low expectations”:

It is also worth considering whether the plagiarism could be a symptom of more than sloppiness. Some scientists have wondered why she didn’t just write her own dry science prose. One possibility is that she may not fully comprehend the scientific nuances in the topics she’s writing about. In such cases, there might be a greater temptation to plagiarize, to ensure the avoidance of inaccuracies. It’s noteworthy that several of the plagiarized segments are found in sections involving statistical inferences.

Pallesen also explained that leftist scholars won’t criticize Gay because “research that aligns with woke claims tends to find easier acceptance.”

Gay’s troubles began when she was accused of not fighting “antisemitism” at Harvard. But the anti-white president’s downfall came after Rufo and others uncovered the plagiarism in her thesis and other articles. She resigned in early January 2024.