The second day of the Science Writers 2009 conference marked the start of the New Horizons briefings, hosted by the Council for the Advancement of Science Writing (CASW), a 501(c)(3) non-profit organization devoted to improving the quality of science writing. My favorite sessions on day 2 were one on information technology that focused on catching plagiarism, and a statistics session that highlighted some Florida election troubles (not the election you’re thinking of).
Informatics and Plagiarism Scandal
Starting out the day was Harold “Skip” Garner, PhD, of the University of Texas. He and his team created a web-based software program, called eTBLAST, that scans databases like Medline (an online database of biomedical research papers) for instances of duplication that may indicate plagiarism. As someone who works at a research agency that is also the single-largest source of funding for medical research, I found his research interesting, and just a little bit frightening.
The modern research environment is extremely competitive. The cost of doing medical research is increasing, while the dollars available for that research have been in steady decline. Academic researchers are judged, in large part, by the number of publications bearing their name, and the number of research dollars they bring into the institution. A common mantra among scientists, in both public and private settings, is “publish or perish.” Perhaps it is this pressure to publish the next big paper, so that you can bring in the big research dollars so that you can publish the next paper, that drives some toward unethical behavior.
Previous studies cited by Dr. Garner have attempted to quantify instances of unethical behavior and found that 0.3% of researchers admitted to faking data, 1.4% admitted to some form of plagiarism, 4.7% published the same data more than once, and as many as 10% included authors that shouldn’t have been included.
Using their text analysis software, the researchers created a searchable database, called Deja Vu, which lists “highly similar” publications found in Medline. They currently list over 79,000 papers that are strikingly similar to other papers in the database. The papers must be verified by hand, and so far only 6372 have been. Mixed into that 6372 are papers that were reprinted with permission, papers that are corrections of previous publications, papers from authors who republished after expanding upon their original work, etc.
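To give a feel for how software can flag “highly similar” text at all, here is a minimal sketch in Python. To be clear, this is not eTBLAST’s actual method, which the talk did not spell out. It swaps in a common textbook technique: break each text into overlapping three-word chunks (“shingles”) and measure how many chunks two texts share (Jaccard similarity). The function names, the 0.5 threshold, and the sample abstracts are all made up for illustration.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b, k=3):
    """Share of shingles two texts have in common: 0.0 (none) to 1.0 (all)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def flag_similar(abstracts, threshold=0.5):
    """Return (id1, id2, score) for each pair scoring at or above threshold.

    The threshold is an arbitrary illustrative value, not one from the talk.
    """
    ids = sorted(abstracts)
    flagged = []
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            score = jaccard(abstracts[first], abstracts[second])
            if score >= threshold:
                flagged.append((first, second, score))
    return flagged

# Hypothetical abstracts, invented for this example.
abstracts = {
    "paper_a": "We studied the effect of aspirin on heart attack risk in adults",
    "paper_b": "We studied the effect of aspirin on heart attack risk in older adults",
    "paper_c": "Ballot design can change how many votes are counted in a race",
}

for first, second, score in flag_similar(abstracts):
    print(f"{first} vs {second}: {score:.2f}")
```

A score like this can only flag candidates. As the talk made clear, a human still has to check each pair, since reprints and corrections look just as similar as plagiarism does.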
The team followed up on 206 papers, which had, on average, 86% of the same text and 73% of the same references as an earlier paper. They sent surveys to the authors of the duplicate papers, and to the authors of the original papers. Ninety-three percent of the authors of the original papers had no idea that their work had been copied. As for the authors of the duplicate papers, 25% denied any wrongdoing, while 35% admitted it and apologized. So far, over 90 investigations have been initiated, and there have been over 50 retractions. I’m thinking that, in the “publish or perish” world, imitation is NOT considered flattering!
Nobody Does an Election like Florida
The second speaker was Arlene Ash, Ph.D., a statistician with Boston University whose recent work has focused on what she calls vote theft. Dr. Ash shared data from the 2006 Congressional election in Florida’s District 13, where Republican Vern Buchanan earned a very narrow, 369-vote victory over Democrat Christine Jennings.
The problem, according to Ash, lies in the missing votes. In Sarasota County, one of the four counties that make up the district, there were 18,000 people (15 percent) who voted in other races on the ballot, but didn’t vote in the Congressional race. Other counties had only about 3 percent missing votes, indicating a clear anomaly. Adjusting for the norm, there were 15,000 excess missing votes in Sarasota County.
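For anyone who wants to check the “adjusting for the norm” arithmetic, here is a rough sketch in Python using only the rounded figures above. The function name is my own, and because the inputs are rounded, the result is a ballpark number rather than Ash’s exact calculation.

```python
def excess_missing_votes(missing, missing_pct, baseline_pct):
    """Missing votes beyond what the baseline rate would predict.

    missing      -- ballots with no vote in the race (18,000 in Sarasota)
    missing_pct  -- those ballots as a percent of all ballots cast (15)
    baseline_pct -- the normal missing-vote percent in other counties (3)
    """
    # Ballots cast = missing / (missing_pct / 100), and the expected number
    # missing = ballots cast * (baseline_pct / 100). Subtracting simplifies to:
    return missing * (missing_pct - baseline_pct) / missing_pct

excess = excess_missing_votes(18_000, 15, 3)
print(f"Excess missing votes: about {excess:,.0f}")
```

With these rounded inputs the sketch lands at roughly 14,400, in the same neighborhood as the 15,000 quoted in the talk. The gap presumably comes from rounding in the figures I jotted down.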
If the 15,000 people would have voted in a pattern similar to the rest of the voters, then the missing votes would not matter. However, according to Ash, these votes were not lost at random, and therefore one cannot assume that they would have been distributed like the votes that were counted. Polls that showed the Democratic candidate having a strong lead prior to the election seem to confirm this suspicion.
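How lopsided would the missing votes have needed to be to change the outcome? A back-of-the-envelope sketch in Python, using the 369-vote margin and the 15,000 figure from the talk. It assumes, purely for simplicity, that every missing vote would have gone to one of the two major candidates; the function name is my own.

```python
def votes_needed_to_flip(missing_votes, margin):
    """Fewest missing votes the trailing candidate needs to come out ahead.

    If the trailing candidate gets x of the missing votes and the leader
    gets the rest, the trailing candidate wins when
    x - (missing_votes - x) > margin.
    Simplifying assumption: no missing vote goes to a third candidate.
    """
    return (missing_votes + margin) // 2 + 1

needed = votes_needed_to_flip(15_000, 369)
share = needed / 15_000
print(f"Votes needed: {needed:,} of 15,000 ({share:.1%})")
```

Under those assumptions, the trailing candidate would have needed only a little over half of the missing votes, which is why it matters so much whether the votes were lost at random.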
It is speculated that many of the votes were lost due to the design of the ballot in that Florida county (wait, this feels familiar). On the Sarasota ballot, the Congressional race did not get its own page, as many of the other races did. It was squished at the top of a page, occupying about a quarter to a fifth of the page, while the rest of the page was occupied by one other election. Also, for some reason, the section headers for the other elections were highlighted with bright colors, while this one was just white.
Ash believes that this constitutes vote theft, which, according to her, occurs when the choices of eligible voters who make reasonable efforts to vote are not counted. In other words, disenfranchisement.
Vote THEFT seems kind of harsh, but it does seem awfully fishy. Considering the high potential for error when different places have different ballots, why don’t we just have standardized ballots in this country? If we have standard passports, and are moving toward standardized IDs, why not standard ballots?