Stop, Historians! Don't Copy That Passage! Computers Are Watching

January 26, 2002

By EMILY EAKIN

These are boon times for muckrakers on the scholarship beat. In the last month alone, not one but two of the nation's most high-profile historians, Stephen Ambrose and Doris Kearns Goodwin, stand accused of plagiarism in cases that are generating headlines and hand-wringing.

Sensing an opportunity to uncover front-page-worthy fraud, journalists armed with Post-It notes - and anonymous tips about the thefts - have turned into literary gumshoes, painstakingly combing through books in the library stacks.

But the job needn't be so taxing. Over the last decade, plagiarism detection has gone high-tech. Today's software market is flooded with programs designed to rout out copycats with maximum efficiency and minimum effort.

Historians were among the first scholars to try to nail a plagiarism suspect with a computer. In 1991, in a case that became famous in academic circles, several historians filed a complaint with the American Historical Association charging Stephen B. Oates, a historian at the University of Massachusetts at Amherst and the author of a well-regarded 1977 biography of Abraham Lincoln, with plagiarism.

As evidence, Mr. Oates's accusers pointed to passages in his book that closely resembled passages in a 1952 biography of Lincoln by Benjamin P. Thomas. Mr. Oates furiously denied the charges, attributing any similarities between the two books to a reliance on the same historical sources.
Twenty-three colleagues signed a public statement calling the plagiarism charges "totally unfounded." After deliberating on the case for a year, the association ruled that Mr. Oates had "failed to give Mr. Thomas sufficient attribution for the material he used," but carefully avoided the word plagiarism.

Some of Mr. Oates's opponents were convinced he was being let off the hook too easily. One hit on the idea of having a computer judge the case and approached Walter Stewart and Ned Feder, scientists at the National Institutes of Health in Bethesda who had developed what the media dubbed a "plagiarism machine."

Mr. Stewart and Mr. Feder spent four months on the project. By the time it was over, they had scanned more than 60 books into a computer and compared them not just to Mr. Oates's Lincoln biography but to his subsequent biographies of William Faulkner and the Rev. Dr. Martin Luther King Jr. as well.

Their software followed a simple rule: each time a string of at least 30 characters in one of Mr. Oates's books matched a string of 30 characters in one of the other books, the computer made a note. (Strings of fewer than 30 characters were apt to turn up meaningless matches - including common proper names and phrases.)

In February 1993, the scientists submitted a 1,400-page report to the association, detailing what they claimed were 175 instances of plagiarism in the Lincoln biography, 200 instances in the Faulkner biography and 240 instances in the King biography, all identified by their computer.

But once again the association found no evidence of plagiarism, though it did state that Mr. Oates had depended to a degree greater than recommended "on the structure, distinctive language and rhetorical strategies of other scholars and sources." The association also took pains to dismiss Mr. Stewart and Mr. Feder's plagiarism machine, declaring that "computer-assisted identification of similar words and phrases in itself does not constitute a sufficient basis for a plagiarism or misuse complaint."

The scientists' supervisors at the National Institutes of Health were no more enthusiastic. When they caught wind of Mr. Stewart and Mr. Feder's extracurricular activities, they confiscated the plagiarism machine and had their research lab shuttered.

For the nascent plagiarism detection business, this was an inauspicious beginning, but hardly, it turned out, a major setback. Nearly 10 years later, antiplagiarism software is routinely used by dozens of colleges and universities - even high schools - on student work.

At one end of the spectrum are companies like Turnitin.com, based in Oakland, Calif., which uses a software program to check the content of student work against millions of sites around the Web and a database of papers from online term-paper mills. At the other end are companies like Glatt Plagiarism Services in Chicago, which draw on techniques from cognitive theory to verify authorship.

The Glatt Plagiarism Screening program, for example, relies on a method called the "Cloze procedure," originally used in the reading comprehension portion of standardized intelligence tests. Sample passages from a suspect work - which can range in size from a single essay to an entire book - are scanned into a computer, which, following the Cloze procedure, removes every fifth word. The sample passages are then returned to the author, who is asked to fill in the missing words.

Glatt's founder and president, Dr. Barbara Glatt, says that if the work is authentic, the author will be able to recall most of the missing words. A plagiarist, on the other hand, will invariably flunk the test, or else fess up before taking it.

"It's a tough test to pass," Dr. Glatt said. "I have never gotten 100 percent of them right."
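
As a rough illustration of the Cloze procedure described above, the following Python sketch blanks out every fifth word of a passage and scores an author's attempt to restore the missing words. It is not Glatt's software: the function names, the blank marker and the exact-match scoring are assumptions made for illustration, and the article gives no passing threshold, so none is assumed here.

```python
import string


def make_cloze(passage: str, step: int = 5, blank: str = "_____"):
    """Replace every `step`-th word with a blank; return the test text and the removed words."""
    words = passage.split()
    removed = []
    for i in range(step - 1, len(words), step):
        removed.append(words[i])
        words[i] = blank
    return " ".join(words), removed


def _normalize(word: str) -> str:
    # Assumption: ignore case and surrounding punctuation when comparing answers.
    return word.strip(string.punctuation).lower()


def score(answers, removed) -> float:
    """Fraction of removed words the author restored exactly (0.0 if nothing was removed)."""
    if not removed:
        return 0.0
    hits = sum(_normalize(a) == _normalize(r) for a, r in zip(answers, removed))
    return hits / len(removed)


# Invented sample passage, used only to show the mechanics.
test_text, key = make_cloze("one two three four five six seven eight nine ten eleven")
print(test_text)
print(score(["five", "TEN"], key))
```

An author who wrote the passage would be expected to score high on the returned text; what counts as high enough is a judgment the sketch leaves open.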
Nevertheless, she insisted, the Cloze technique is considered highly reliable. Scientists have tried removing the third and fourth words instead, she said, but with much less success. "So far," she added, "no one has ever been falsely accused by the test."

Of course, neither of these approaches seems well suited for catching scholarly plagiarists. Professional historians of the stature of Mr. Ambrose and Ms. Goodwin, both of whom deny plagiarism but concede carelessness, are unlikely to be stealing from online term-paper mills. And though Dr. Glatt's approach has the advantage of being able to detect plagiarism when the identity of the plagiarized text is unknown, it's hard to imagine scholars readily agreeing to sit through a Cloze procedure exam at their accusers' request.

The approach Mr. Stewart and Mr. Feder adopted - comparing one book to another - may still be a literary sleuth's best bet. Last year, Louis Bloomfield, a physics professor at the University of Virginia, created one such software program that he uses to run quick checks on his students' work. (When he first tried it last spring, he found 122 cases of possible cheating, leading to 15 student expulsions and voluntary departures so far.)

"It would be interesting to scan the world's libraries into electronic form and start doing these kinds of comparisons," Mr. Bloomfield said with a mischievous laugh. "I'm afraid you'd pop up all kinds of trouble."
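
The book-to-book comparison rule quoted earlier, a note made whenever a string of at least 30 characters appears in both texts, can be sketched in a few lines of Python. This is only an illustration of that rule, not Mr. Stewart and Mr. Feder's program or Mr. Bloomfield's: the function names, the lowercasing and whitespace normalization, and the greedy extension of each match are all assumptions.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks do not hide matches (an assumption)."""
    return " ".join(text.lower().split())


def shared_strings(suspect: str, source: str, min_len: int = 30):
    """Return runs of at least `min_len` characters that occur in both texts."""
    suspect, source = normalize(suspect), normalize(source)
    # Index every min_len-character window of the source text for fast lookup.
    windows = {source[i:i + min_len] for i in range(len(source) - min_len + 1)}

    matches = []
    i = 0
    while i + min_len <= len(suspect):
        if suspect[i:i + min_len] in windows:
            # Extend the match while the growing string still occurs somewhere in the source.
            end = i + min_len
            while end < len(suspect) and suspect[i:end + 1] in source:
                end += 1
            matches.append(suspect[i:end])
            i = end
        else:
            i += 1
    return matches


# Invented sentences, used only to show the mechanics.
draft = "The quick brown fox jumps over the lazy dog near the river bank today."
earlier = "Yesterday the quick brown fox jumps over the lazy dog near the old mill."
for run in shared_strings(draft, earlier):
    print(repr(run))
```

As the association's ruling suggests, a list of matching strings is a starting point for a human reader, not a verdict: shared quotations from the same historical sources would be flagged just as readily as copied prose.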
http://www.nytimes.com/2002/01/26/arts/26TANK.html?ex=1013050500&ei=1&en=ed2c3ac32c1d2b23

Copyright 2001 The New York Times Company