PUBLISHER'S NOTE: It has been called a "parlour game" and a whodunit, and it would perhaps make an interesting board game - or even a reality TV show. (Any suggestions for moderator?) The mystery: Who is the author of the bombshell anonymous New York Times op-ed? As a former member of the Toronto Star's editorial board - and an op-ed writer - I am fascinated by this story - and by all of the pseudo-linguistic analysis clogging the media. (Who can believe the sheer power of a 750-word piece?) The Toronto Star hits the subject head on in an Associated Press article published on September 6, 2018. I was caught by the claim of one of the interviewees that “The science is very good ... It’s not quite DNA. It’s actually considered by some scientists to be the second-most accurate form of forensic identification we have, because it is so good.” So I did some basic research and came up with a very thorough analysis in the New Yorker (July 23, 2012) quoting "the pioneer of forensic linguistics" as saying: “I won’t claim that we have anything remotely like DNA in this work ... but we are a whole lot better than a lot of the crazy schemes that cops are being taught.” Oh, oh! (That's not so comforting!) Apparently betting shops are now in the 'who's the op-ed author' game. I'm not into placing any bets on the basis of what has all the smell of a not-quite-ripened 'science.' (Besides, I'm not really a betting man!) Beware!
Harold Levy: Publisher, The Charles Smith Blog.
-----------------------------------------------------------
PASSAGE OF THE DAY: (Toronto Star story): "Experts use a combination of language use, statistics and computer science to help figure out who wrote documents that are anonymous or possibly plagiarized. They’ve even solved crimes and historical mysteries that way. Some call the field forensic linguistics; others call it stylometry or simply doing “author attribution.” The field is suddenly at centre stage after an unidentified “senior administration official” wrote in the Times that he or she was part of a “resistance” movement working from within the administration to curb Trump’s most dangerous impulses. “My phone has been ringing off the hook with requests to do that analysis and I just don’t have the time,” says Duquesne University computer and language scientist Patrick Juola."
-------------------------------------------------------
STORY: "Words on trial: Can professional word sleuths unveil mystery author of White House opinion piece?," by AP reporter Seth Borenstein, published by The Toronto Star on September 6, 2018.
PHOTO CAPTION: "One political scientist estimates there are about 50 people in the Trump administration who could have written an anonymous opinion piece published in the New York Times this week."
GIST: "Language detectives say the key clues to who wrote an anonymous New York Times opinion piece
 slamming President Donald Trump may not be the odd and glimmering 
“lodestar,” but the itty-bitty words that people usually read right 
over: “I,” “of” and “but.” And lodestar? That could be a red herring meant to throw sleuths off track, some experts say. Experts
 use a combination of language use, statistics and computer science to 
help figure out who wrote documents that are anonymous or possibly 
plagiarized. They’ve even solved crimes and historical mysteries that 
way. Some call the field forensic linguistics; others call it stylometry
 or simply doing “author attribution.” The
 field is suddenly at centre stage after an unidentified “senior 
administration official” wrote in the Times that he or she was part of a
 “resistance” movement working from within the administration to curb 
Trump’s most dangerous impulses. “My phone has been ringing off 
the hook with requests to do that analysis and I just don’t have the 
time,” says Duquesne University computer and language scientist Patrick 
Juola. Robert Leonard, a Hofstra University linguistics professor 
who has helped solve murders by examining language, says if experts 
could get the right number of writing samples from officials whose 
identities are known, “an analysis could certainly be done.” One
 political scientist figures there are about 50 people in the Trump 
administration who fit the Times’ description as a senior administration
 official and could be the author. The key would be to look at how they 
write, the words they use, what words they put next to each other, 
spelling, punctuation and even tenses, experts say. “Language is a
 set of choices. What to say, how to say and when to say it,” Juola 
said. “And there’s a lot of different options.” One
 of the favourite techniques of Juola and other experts is to look at 
what are called “function words.” These are words people use all the 
time but that are hard to define. Some examples are “of,” “with,” “the,”
 “a,” “over” and “and.” “We all use them but we don’t use them in 
the same way,” Juola says. “We don’t use them in the same frequency.” 
Same goes with apostrophes and other punctuation. For example, do 
you say “different from” or “different than?” asks computer science and 
data expert Shlomo Argamon of the Illinois Institute of Technology. Women tend to use first- and second-person pronouns more — “I,” “me” and “you” — and more present tense verbs, Argamon said. Men use “the,” “of,” “this” and “that” more often, he said. “You
 look for clues and you try to assess the usefulness of those clues,” 
Argamon said. But he is less optimistic that the Trump opinion piece 
case will be cracked for various reasons, including the New York Times’ 
editing for style and possible efforts to fool language detectives with 
words that someone else likes to use, such as “lodestar.” Mostly, he’s 
pessimistic because to do a proper comparison, samples from all suspects
 have to be gathered and have to be similar, such as all opinion columns
 as opposed to novels, speeches or magazine stories. Rachel 
Greenstadt at Drexel University studies when people try to throw off 
investigators with words they don’t normally use or purposeful 
misspellings. She said her first instinct is that the word “lodestar” — 
one Vice-President Mike Pence has used several times — is “a red 
herring.” It seems too deliberate. “Most people are still looking 
for sound-bite-sized features like lodestar instead of trying to get a 
handle on the whole picture,” says Hofstra’s Leonard. Greenstadt 
says language analysis “could kind of contribute to the picture” of who 
wrote the Times opinion piece, but she adds that “by itself, I’d be 
concerned to use it.” Still, with the right conditions, words matter. Juola
 testified in about 15 trials and handled even more cases that never 
made it to court. His biggest case was in 2013, when a British newspaper
 got a tip that the book The Cuckoo’s Calling by Robert Galbraith was really written by Harry Potter author J.K. Rowling. In about an hour, Juola fed two Rowling books, The Cuckoo’s Calling
 and six other novels into his computer, analyzed the language patterns 
with four different systems and concluded that Rowling did it. A couple of days later, Rowling confessed. It
 was far from the first time that language use fingered the real 
culprit. The Unabomber’s brother identified him because of his 
distinctive writing style. Field pioneers helped find a kidnapper who 
used the unique term “devil strip” for the grassy area between the 
sidewalk and road. The phrase is only used in parts of Ohio. Even in politics, words are poker tells. In 1996, the novel Primary Colors
 about a Clintonesque presidential candidate set Washington abuzz trying
 to figure out who was the anonymous author. An analysis by a Vassar 
professor and other work pointed to Newsweek’s Joe Klein and he finally 
admitted it. But the literary sleuthing goes back to the founding 
of the republic. Historians had a hard time figuring out which specific 
Federalist Papers were written by Alexander Hamilton and which were by 
James Madison. A 1963 statistical analysis figured it out: One of the 
many clues came down to usage of the words “while” and “whilst.” Madison
 used “whilst”; Hamilton preferred “while.” Juola says experts in 
the field can generally tell introverts from extroverts, men from women,
 education level, age, location, almost everything but astrological 
sign. “The science is very good,” Juola said. “It’s not quite DNA.
 It’s actually considered by some scientists to be the second-most 
accurate form of forensic identification we have, because it is so 
good.”
https://www.thestar.com/news/world/2018/09/06/can-professional-word-sleuths-unveil-author-of-mystery-white-house-opinion-piece.html
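The "function word" technique described in the Star story can be sketched in a few lines of Python. This is only a toy illustration and not the method used by Juola, Argamon or any other expert quoted above: the short word list, the function names (function_word_profile, profile_distance, closest_author) and the sample sentences are all assumptions made up for the example. Real analyses rely on far more text, far longer word lists and proper statistics.

```python
import re
from collections import Counter

# Tiny illustrative list (an assumption); it borrows a few of the words
# mentioned in the article, such as "of", "the", "while" and "whilst".
FUNCTION_WORDS = ("of", "with", "the", "a", "over", "and",
                  "i", "but", "while", "whilst")


def function_word_profile(text, words=FUNCTION_WORDS):
    """Return each function word's share of all word tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return {w: 0.0 for w in words}
    return {w: counts[w] / total for w in words}


def profile_distance(p, q):
    """Sum of absolute differences between two profiles (smaller = more alike)."""
    return sum(abs(p[w] - q[w]) for w in p)


def closest_author(unknown_text, known_samples):
    """Name the known sample whose profile is nearest to the unknown text."""
    unknown = function_word_profile(unknown_text)
    return min(
        known_samples,
        key=lambda name: profile_distance(
            unknown, function_word_profile(known_samples[name])
        ),
    )


# Hypothetical writing samples, invented for this sketch.
samples = {
    "writer_a": "Whilst the committee met, the members of the house "
                "spoke of the plan and of the cost.",
    "writer_b": "I think a plan with a budget is fine, but I worry "
                "while a vote is pending.",
}
unknown = ("Whilst the people of the state wait, the leaders of the "
           "house speak of the law.")

print(closest_author(unknown, samples))
```

On these made-up samples the unknown text lands closest to writer_a, who shares its heavy use of "the," "of" and "whilst." With only a sentence or two per writer the result means very little, which is the same caution several of the experts raise about needing enough comparable samples from every candidate.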
------------------------------------------------------------------
PASSAGE OF THE DAY: New Yorker article: "Butters said, “Forensic linguistics has not come to a place where we are mature enough to answer a lot of these questions.” Carole Chaski, the executive director of the Institute for Linguistic Evidence and the president of Alias Technology, in Georgetown, Delaware, which markets linguistic software, agrees. Chaski has been working to perfect a computer algorithm that identifies patterns hidden in syntax. With enough linguistic material to work with, she says, she can run the program and draw accurate linguistic conclusions. Her goal is to develop a standard “validated tool” that police, civil investigators, and linguists can turn to when testifying in crucial cases, such as a capital murder trial. “If this is real, these tools should be so reliable that I can automate them and somebody can use them,” she says. Chaski foresees a time when forensic-linguistic “technicians” will do what DNA technicians in crime labs do: “They learn how to run a piece of software or run a Southern blot”—a standard DNA test—“through electrophoresis and then go, ‘Here are my results.’ ” In Chaski’s view, a trail of words can be parsed to reveal its author, but that work is best done quantitatively, through brute computational force, not qualitatively, by subjective scholars. Forensic linguistics, she believes, should not be limited to a few highly credentialled experts who have been approved by the courts to testify. She warned me of the recklessness of an “academic” and an “ex-cop” hanging out a shingle, and said their methodology was “fraught with error.” In the small world of forensic linguistics, it was obvious that she meant Leonard and Fitzgerald."
ARTICLE: "Words on Trial: Can linguists solve crimes that stump the police?," by Jack Hitt, published by The New Yorker on July 23, 2012.
ILLUSTRATION CAPTION: "A suspect’s conversations and writings can be analyzed for patterns and peculiarities."
 
GIST: "The pioneer of 
forensic linguistics is widely considered to be Roger Shuy, a retired 
Georgetown University professor and the author of such fundamental 
textbooks as “Language Crimes: The Use and Abuse of Language Evidence in
 the Courtroom.” Shuy is now eighty-one years old and lives in Montana. 
When I asked him to describe the origins of forensic linguistics, he 
referred me to an Old Testament story. After a confusing battle with the
 Ephraimites, the Gileadites were able to identify the enemy by asking 
them each to pronounce the Hebrew word “shibboleth.” If they pronounced 
the first syllable in the Ephraimic dialect, “sib,” instead of in the 
Gilead dialect, “shib,” they were killed. According to Judges 12:6, some
 forty-two thousand Ephraimites failed that first linguistic test. The
 field’s more recent origins might be traced to an airplane flight in 
1979, when Shuy found himself sitting next to a lawyer. By the end of 
the flight, Shuy had a recommendation as an expert witness in his first 
murder case. Since then, he’s been involved in numerous cases in which 
forensic analysis revealed how meaning had been distorted by the process
 of writing or recording. In a bribery trial in the nineteen-eighties, 
two Nevada brothel commissioners were caught on tape in a crucial 
exchange. When they were offered a bribe, one turned to the other and, 
according to the police transcript, said, “I would take a bribe, 
wouldn’t you?” Shuy analyzed the tape and, on the stand, testified that 
the defendant had actually said the opposite: “I wouldn’t take a bribe, 
would you?” The tape was scratchy. Moreover, in conversational speech, 
the “n’t” of a contraction is barely vocalized. It was hard to hear—or, 
rather, easy to hear what the listener was primed to hear. But two facts
 were indisputable, Shuy noted: both versions of the sentence had 
exactly eight syllables, and the pause fell just before the last two 
syllables. Thus, Shuy testified, only one reading of the sentence made 
sense: “I wouldn’t take a bribe, would you?” The trial resulted in a 
hung jury. Shuy has become famous in his discipline for some of 
the field’s finest Holmesian aperçus. Early in his career, the police in
 Illinois approached him regarding a notorious kidnapping case; they had
 several suspects, and they hoped his reading of the ransom notes might 
help narrow down the list of suspects. In each note, the kidnapper 
demanded money in a semiliterate rant: “No kops! Come alone!!,” followed
 by a terse instruction—“Put it in the green trash kan on the devil 
strip at the corner 18th and Carlson.” Shuy studied the letters and then
 asked, “Is one of your suspects an educated man born in Akron, Ohio?” 
The cops were stunned. There was one who matched that description 
perfectly, and when confronted he confessed. As Shuy subsequently 
explained, “kop” and “kan” most likely were intentional misspellings by 
someone posing as illiterate. And he knew from his research that the 
patch of grass between the sidewalk and the street—sometimes known as 
the “tree belt,” “tree lawn,” or “sidewalk buffer”—is called the 
“devil’s strip” only in Akron, Ohio. In recent years, following 
Shuy’s lead, a growing number of linguists have applied their techniques
 in criminal cases, such as Chris Coleman’s, and even in major 
commercial lawsuits. An upcoming suit 
between Apple and Microsoft, slated to go before the Trademark Trial and
 Appeal Board, features two stars of the field, Rob Leonard and Ronald 
Butters, a retired Duke University linguist. At issue is: What part of 
speech is the phrase “app store”? Leonard, siding with Apple, contends 
that it is a proper noun, which is to say a trademarked expression that 
should be capitalized. Butters’s work upholds Microsoft’s view: the term
 consists of two common nouns and is not proprietary at all. Butters
 is a past president of the International Association of Forensic 
Linguists, which has some two hundred and fifty members. Most of them, 
he said, are in the United States, England, and Spain, but interest has 
spread to Australia, Japan, and China. Today, one can study forensic 
linguistics at several schools, and last year Leonard inaugurated the 
first graduate program in forensic linguistics, at Hofstra. For those 
earning a master’s degree, the field offers job prospects outside the 
courtroom. Immigration and Customs Enforcement hires language detectives
 to assist agents in evaluating asylum seekers. In such cases, forensic 
linguists interview applicants to verify that their accents and their 
use of idiom and slang match those of the country they claim to have 
fled. Increasingly in the courtroom, however, forensic linguists 
have been asked to weigh in on matters of “author identification”—not to
 determine the grammatical significance of certain words but to identify
 who said or wrote them. This trend has widened an old schism in the 
field. Given the stakes in, say, the Coleman case—a felony murder 
potentially involving the death sentence—some linguists hold the view 
that Leonard is taking forensic linguists into groundbreaking territory.
 Others, including Butters, wonder if he isn’t leading them over a 
cliff. When
 I visited Leonard one afternoon at Hofstra, he was reviewing a range of
 cases: another murder involving the killer’s letters; a libel suit that
 turned on a single, ambiguous sound; an attempt to identify a potential
 assassin of a prominent politician; and a Whirlpool Corporation lawsuit
 involving the meaning of the word “steam.” In a modest office walled 
with books, I found Leonard working at a laptop. He was noticeably 
kempt, in pressed slacks and a crisp blue button-down shirt—a Sam Spade 
of semantics. His hair was surprisingly dark for a man in his sixties; 
his eyes were playful and his smile fetching, a little bit show biz. Long
 before he emerged as one of the foremost language detectives in the 
country, Leonard had achieved a different kind of celebrity. As an 
undergraduate at Columbia in the nineteen-sixties, he and his brother 
George revolutionized the school’s a-cappella group by having everyone 
dress as faux Brooklyn thugs (white T-shirts, greased-back hair) and 
sing up-tempo arrangements of such nineteen-fifties doo-wop classics as 
“Duke of Earl” and “At the Hop.” They named the group Sha Na Na and 
became wildly popular. One of their hits was “Teen Angel,” which Leonard
 sang at Woodstock just before Jimi Hendrix, who had invited Sha Na Na, 
débuted his version of “The Star-Spangled Banner.” By 1970, 
Leonard the heartthrob had to choose between academia and show business.
 “All of our good friends were dying of drug overdoses,” he said. “I 
just decided to move on.” Leonard finished his undergraduate studies at 
Columbia; William Labov, a prominent linguist who had introduced him to 
the field, helped him earn a fellowship. Leonard pursued a scholarly 
career until 2000, when he heard Shuy give a lecture urging linguists to
 apply their training in the real world—especially in the courtroom, as 
language detectives. Leonard struck up a professional friendship with 
Shuy and has been consulting on cases ever since. As we sat in his
 office, Leonard described his recent involvement in the tabloid saga of
 Natalee Holloway. In 2005, after graduating from high school in 
Alabama, Holloway went with her friends on a chaperoned trip to Aruba 
and disappeared. The case remains unsolved. The chief suspect is a young
 Dutchman named Joran van der Sloot, who pleaded guilty in 2012 to 
charges of murdering a twenty-one-year-old woman in Peru. In Aruba, two 
young brothers, Deepak and Satish Kalpoe, were initially arrested (they 
and van der Sloot had partied with Holloway the night before she 
disappeared), but were released in the first weeks of the investigation.
 After being the subjects of a television exposé, the brothers are suing
 Dr. Phil McGraw and CBS for defamation. The Kalpoe legal team has hired
 Leonard as their expert witness in a lawsuit that could turn on the 
pronunciation of a single syllable. The “Dr. Phil” show promoted 
the exposé by claiming, “You are going to find out what he”—Deepak—“says
 he did with Natalee the night she disappeared.” An announcer adds, 
“What he said brought Natalee’s mother to tears.” On the show, viewers 
listen to the audio of Deepak being secretly videotaped by a private 
investigator named Jamie Skeeters and making an astonishing confession:
SKEETERS: I’m sure she had sex with all of you.
KALPOE: She did. You’d be surprised how simple it was.
Leonard examined the uncut version of the exchange. In it, Kalpoe denies having sex with Holloway. “Simple” refers to the fact that, from his point of view, the evening was uneventful:
SKEETERS: I’m sure she had sex with all of you, and . . . good . . .
KALPOE: No, she didn’t.
SKEETERS: O.K., well, I mean, good. If she did, fine.
KALPOE: You’d be surprised how simple it was that night.
Watching an unedited piece of footage doesn’t require a linguistics expert, but Leonard realized that there were other issues at play. During the covert interview, the microphone generated a great deal of confusing ambient sound. Moreover, the hidden camera captured only the top of Kalpoe’s head, so his face and lips weren’t visible. Amid the muffled noises, and before Kalpoe speaks, there is an odd sound—sha!—which Kalpoe appears to make just before “No.” When I met with Leonard, he had been concentrating on this sound. An expert hired by the opposing counsel was taking the position that the sha! might not be a throat-clearing or some other stray sound, as Leonard contends, but a “voiceless vowel with ‘r’ coloration.” Leonard explained: “Vowels are the most open of sounds, and when you come off a vowel and cease saying it you switch the vocal apparatus to pronounce the next sound.” In some words, like “forth,” the “r” gets full phonetic treatment, but in many words, like “bird” and “sure,” the “r” isn’t fully voiced and instead becomes a shadow of the vowel, just because it’s easier to say that way. If the lawyers for “Dr. Phil” can show that the first word Kalpoe spoke in that sentence was “sure,” and that there is no audible “n’t” at the end of “did,” then the transcript of Kalpoe’s first utterance changes from “No, she didn’t” to its opposite, “Sure, no, she did.” Like all linguists, Leonard starts from the position that meaning is delicately contingent, and that the most common way we compensate for this frailty is “redundancy.” We say the same thing more than once, or in more than one way. In his written report to the court on this case, Leonard notes that the original video of the meeting between Deepak Kalpoe and Skeeters shows Deepak “shaking his head ‘no’ from side to side,” as if to deny the accusations. The program, though, aired only a still photo of Deepak. 
The case has yet to go to trial, but when it does, Leonard says, he will argue that there is enough redundancy in the semiotic detritus of these sounds to conclude that Kalpoe’s meaning is clear: he is stating that he did not have sex with Holloway that night. It may be that the changes made to the edited interview were deliberately damaging, but forensic linguists offer another possibility: that a subtle presumption of guilt unconsciously overwhelmed the editing process and inverted the meaning of the exchange. Such inversions, linguists say, happen far more often than we might like to believe. According to Leonard, words serve as catalysts, setting off sparks of potential meaning that the listener organizes into more specific meaning by observing facial expressions, body language, and other redundant cues. We then employ another powerful tool: prior experience and the storehouse of narratives that each of us carries—what linguists call “schema.” To every exchange we bring unconscious scripts; as any given sentence unspools, we readjust the schema to make better sense of what we are hearing. One afternoon at Hofstra, Leonard explained to the twenty students in his introductory course how this works. He wrote a sentence on the board: “John was on his way to school last Friday and was really worried about the math lesson.” He quizzed the students on what they might presume about this story. John is a student, one called out; he is either on a bus or walking. “So we can just close our eyes and imagine John the schoolboy on the bus,” Leonard said. “But are we all imagining John with the same height, the same hair color?” Nothing in the sentence signals any of that information, yet each of us supplies our own variant, which awaits further verbal data for confirmation. Leonard wrote another sentence beneath the first: “Last week, he had been unable to control the class.” Who is John now? “A teacher!” someone shouted. And how is John getting to school? 
“A car!” Leonard wrote a third sentence: “It was not fair for the math teacher to leave him in charge.” Instantly, the students revelled in John’s new identity as a janitor or a substitute teacher. Meaning, Leonard noted, is constantly bent by expectation, and can be grossly distorted. Indeed, one of Shuy’s first studies, of the Abscam trials of the nineteen-eighties, reveals just how easily the meaning of linguistic evidence can be twisted by a background assumption of guilt. Abscam was an F.B.I. sting operation in which nine United States congressmen were lured to meetings with a government agent posing as an Arab oil sheikh with “Abdul Enterprises.” The initial meeting was described as a legitimate business deal. At one point, though, the agent playing the sheikh would offer the congressmen an outright bribe. Their conversations were videotaped, and some of the evidence was breathtakingly unambiguous. Representative John Jenrette, of South Carolina, accepted the money cheerfully and chirped on tape, “I’ve got larceny in my blood!”
The sting resulted in seven indictments. 
Toward the end came the trial of Senator Harrison (Pete) Williams, of 
New Jersey. Shuy listened to those tapes and became convinced that the
 Senator was innocent. Whenever the sheikh raised the issue of bribery 
or illegality, Williams steered the conversation to legal ground. At one
 point, the sheikh put the bribe directly to Williams: “I would like to 
give you . . . some money for, for permanent residence.” The first four 
words of Williams’s reply were “No, no, no, no.” A prosecution 
memo at the time stated that there was no case against Williams, but the
 judge, who, in his ruling, decried “the cynicism and hypocrisy of 
corrupt public officials,” set it aside; Williams was found guilty and 
sentenced to three years in prison. Shuy later noted that, with such 
attitudes prevalent, the schema of the “corrupt congressman” overwhelmed
 even the plainest facts pointing to Williams’s innocence. After the 
trial, the lead juror confessed that had he known all the facts he would
 not have found Williams guilty. The Senator was forced to resign his 
seat, though he declared his innocence at every opportunity. He was the 
first senator in eighty years to go to prison; President Bill Clinton 
refused to pardon him. Shortly
 after the Unabomber case was cracked, in 1996, forensic linguistics 
gained another public boost. Donald Foster, an English professor at 
Vassar, employed the most basic forensic technique—tallying word 
frequency—to unmask the anonymous author of “Primary Colors,” the 
best-selling novel about Clinton’s first Presidential campaign: Joe 
Klein. Foster analyzed dozens of pages of writing from several suspects,
 including Time’s Walter Shapiro and the former Deputy
 Treasury Secretary Roger Altman. He compiled a concordance that showed 
how frequently each writer used certain words, compared this information
 with a database of word frequency in the novel, and was able to 
identify the author.
For a short time, the potential of forensic 
linguistics seemed limitless. With enough raw data and computing power, a
 trail of words might betray its author as reliably as a set of 
fingerprints identifies an individual. Foster, though, was a professor 
of literature, not a linguist; he was not trained to use the forensic 
methods that Shuy had mastered, such as listening for unconscious 
semantic patterns and looking for distinctive phrases or unusual 
colloquialisms. Overconfident, Foster went on to identify a suspect in 
the JonBenét Ramsey murder case, only to learn that he had already been 
cleared by the police. In the days after September 11, 2001, Foster 
falsely implicated the bioweapons expert Steven Hatfill as the person 
who had sent several anthrax-laden letters around the country; the 
accusation wrecked Hatfill’s career and resulted in a settled lawsuit. 
Foster then recanted a previous claim linking a 1612 poem of dubious 
provenance to Shakespeare; another academic had shown that the analysis 
was fatally flawed. Foster has since retreated to his campus in 
Poughkeepsie. Foster’s disgrace left most forensic linguists 
feeling cautious. Now that Leonard’s work is bringing the field back 
into the realm of author identification, some are worried. Ronald 
Butters, the Duke linguist, provided expert testimony for the defense at
 the Coleman trial and challenged every aspect of Leonard’s testimony as
 “linguistically meaningless.” Butters argued that even though certain 
linguistic oddities, such as using “U” for “you” or consistently 
misplacing the apostrophe in contractions, seemed distinctive, there 
weren’t enough examples to be statistically significant. Moreover, 
Butters told me, it can be tricky to compare different genres of even a 
single person’s writing. Reading, say, a routine office e-mail alongside
 rants spray-painted on a wall makes about as much sense as comparing 
the prose in one of Wallace Stevens’s insurance riders with the cadences
 of his poem “Sunday Morning.” “Really bad linguistic testimony is
 when you go to court and say you’re pretty sure that this person wrote 
that, and yet you’re comparing apples and oranges,” Butters said. 
Leonard argues that he never claims to name a specific author but simply
 presents comparative evidence for the jury. Butters, Leonard said, “is a
 specialist in trademark cases, so I’m not sure what his experience was 
in authorship cases, and they are two quite different applications of 
linguistics.” On the stand, Butters admitted that he hadn’t read all 
Leonard’s research on the evidence; his challenge was focussed on 
Leonard’s methodology and its purported usefulness in the identification
 of individual authors. Butters said, “Forensic linguistics has not come
 to a place where we are mature enough to answer a lot of these 
questions.” Carole Chaski, the executive director of the Institute
 for Linguistic Evidence and the president of Alias Technology, in 
Georgetown, Delaware, which markets linguistic software, agrees. Chaski 
has been working to perfect a computer algorithm that identifies 
patterns hidden in syntax. With enough linguistic material to work with,
 she says, she can run the program and draw accurate linguistic 
conclusions. Her goal is to develop a standard “validated tool” that 
police, civil investigators, and linguists can turn to when testifying 
in crucial cases, such as a capital murder trial. “If this is real, 
these tools should be so reliable that I can automate them and somebody 
can use them,” she says. Chaski foresees a time when forensic-linguistic
 “technicians” will do what DNA technicians in crime labs do: “They 
learn how to run a piece of software or run a Southern blot”—a standard 
DNA test—“through electrophoresis and then go, ‘Here are my results.’ ”
In Chaski’s view, a trail of words can be parsed to reveal its author, but that work is best done quantitatively, through brute computational force, not qualitatively, by subjective scholars. Forensic linguistics, she believes, should not be limited to a few highly credentialled experts who have been approved by the courts to testify. She warned me of the recklessness of an “academic” and an “ex-cop” hanging out a shingle, and said their methodology was “fraught with error.” In the small world of forensic linguistics, it was obvious that she meant Leonard and Fitzgerald. Leonard said that Chaski’s computerized approach made him “want to take a nap.” His methods and findings are all transparent, he noted, whereas her algorithm is a proprietary “black box.” He does not believe that computer software can eliminate the need for human interpretation. “Even those algorithms have to be coded by humans,” he said; any good linguist will depend on both quantitative and qualitative analysis. “One thing we have learned about language is that it is a very human form of communication. You have to have human intelligence, human powers of inference, and human encyclopedic knowledge of the world” to make sense of it. At the end of the day, the scientific findings depend on human interpretation, Leonard said. Computers can crunch reams of words, but only people can decide what the words mean. Shuy told me that he, too, initially had doubts about author identification. “That is how I felt until Rob Leonard started working,” he said. “Rob has come up with this competing-hypothesis approach.” In the same way that DNA technicians will report only the statistical likelihood that the killer’s DNA and the DNA found on the murder weapon are the same, Leonard creates a number of opposing hypotheses and presents the evidence in light of them. 
In the Coleman trial, Leonard did not declare that Coleman was the author of the red graffiti and the threatening e-mails; rather, he testified that the language in them “is consistent with” the language in Coleman’s writings. “I don’t know any forensic linguists who will claim that they can find the answer for you,” Shuy said. “Our role is to analyze the data and give it to the triers of the facts, who have to evaluate it or issue the ultimate decision of innocence or guilt. We don’t go that far, and shouldn’t.” Shuy also noted that it was Leonard who popularized a safeguard against comparing unrelated documents, called a Community of Practice filter. For instance, Coleman’s use of “U” for “you” would be of no use in a pool of text messages, but as an unusual abbreviation in an e-mail it becomes another point of data. Recently, Leonard used this technique to question a charge, levelled at a jailed gang member, of murdering a prison guard. Prosecutors had linked the prisoner, Jarvis Masters, to a note that ultimately led to the guard’s murder, based on misspellings such as “has’nt” and “is’nt” and the use of “no” for “know.” But in his research Leonard learned that the way Masters’s gang, the Black Guerrilla Family, disciplined its members was to make them copy propaganda by hand. All the gang members had picked up the oddities pinned on Masters, Leonard determined. “Thus, when we examine the corpus of non-murder documents written by other B.G.F. members,” Leonard said, “we discover the features that may at first seem to educated writers like the prosecution to be randomly incorrect, highly idiosyncratic features were not random at all but systemic features of the B.G.F. community.” On some level, extracting meaning from linguistic evidence is what we all do intuitively every day. Forensic professionals go about the same work, with better tools and a heightened sense of how easily meaning can be misconstrued. 
As one forensic-linguistics firm, Testipro, puts it in its online promotional pitch, the field is “the basis of the entire legal system. Both Judges and Juries are using informal or unconscious FL”—forensic linguistics—“every time they weigh a witness statement or testimony document.” The field is bound to thrive on the ever-growing piles of what Shuy calls “data.” Our embrace of personal media—e-mails, text messages, voice mail, tweets—has created an avalanche of tossed-off language, an evidentiary trail that linguists are getting better and better at following. Shuy believes that forensic linguistics can do for language crimes, such as bribery, blackmail, and extortion, what DNA has done for violent crimes. It could offer a counterweight to the many old-school methods, like lineups and unrecorded police interrogations, that are heavily relied upon despite serious flaws. “I won’t claim that we have anything remotely like DNA in this work,” Shuy said, “but we are a whole lot better than a lot of the crazy schemes that cops are being taught.” Leonard offered a sobering statistic: eighty per cent of people who were later exonerated by DNA evidence had falsely confessed to their alleged crimes. “When I got into this business, I figured if there was an eyewitness or a confession, then case closed, the guy absolutely, one hundred per cent did it. But those are the two shakiest types of evidence, really.” He recalled many cases where a confession on paper turned out to be no confession at all. “The way humans perceive language is according to schemas, which lead to misperceptions as much as perceptions.” In a sense, investigators who try to extract evidence from confessions are acting as linguists, too, albeit poorly trained ones. A few weeks ago, Leonard finished testifying in the retrial of Brian Hummert, a Pennsylvania man charged with strangling his wife. 
After initial suspicions pointed to Hummert, the police received handwritten letters claiming that a serial killer, not Hummert, had committed the murder. Once again, the linguistic evidence was important to the case. The notes bore a resemblance to a series of stalker letters that preceded the killing and to the defendant’s writing. As an expert witness, Leonard testified about Hummert’s prose style, noting the rare use of what he calls “ironic repetition” in constructions such as “She tried to break it off, so I broke her neck.” And all the letters contained a linguistic habit that, Leonard testified, he had found nowhere else: a tendency to use contractions in negative statements (“I can’t”) but not in positive ones (“I am”). The jury was out for forty-five minutes and returned a verdict of guilty."
The entire article can be read at:
https://www.newyorker.com/magazine/2012/07/23/words-on-trial
------------------------------------------------------------