
Trick of Tweet: Data Tool Pinpoints Words Seen as Credible

Computers are sorting real news from fake on social media

 

Sixty-two percent of Americans get their news from social media, according to a 2016 poll by Pew Research Center. That statistic helps explain the ubiquity of fake news: when information travels via social networks, the usual editorial filters have no chance to separate the quality tweet from the chaff. Developing tools to help stop the spread of lies and false rumors will require the collaboration of computer scientists, linguists, psychologists and sociologists. A new study, to be presented this month at a conference of the Association for Computing Machinery, analyzed millions of tweets and revealed which words and phrases are considered most credible.

Tanushree Mitra, a computer scientist at Georgia Institute of Technology and the study’s primary author, says she became interested in the problem when Osama bin Laden was killed in 2011. Messages circulated about whether and how he really died, and many people first heard about the killing on Twitter. “This sort of breaking news and speculation happens on social media,” Mitra says, “and a lot of time it happens before the news reaches the traditional news media.” She and her collaborators at Georgia Tech wanted to develop automated systems for assessing whether events actually happened, based purely on how people were talking about them. These tools might help detect false rumors before they spread too far.

The researchers constructed a database of 1,377 events that took place between October 2014 and February 2015, along with their associated tweets. To assign a "credibility" score to each event, participants saw tweets about it and, based on their knowledge or additional online research, rated how "accurate" the event was. Based on the percentage of participants who deemed an event "certainly accurate," events were sorted into four categories: Perfect Credibility, High Credibility, Moderate Credibility and Low Credibility. Low Credibility events included a football player dying after a hard hit and police pepper-spraying a crowd. (Accuracy ratings were not perfect, however; the crowd in question really had been pepper-sprayed.)
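For readers who want a concrete picture of that sorting step, here is a rough Python sketch. The study's exact cutoffs are not given in this article, so the thresholds below are purely illustrative placeholders, not the researchers' actual values.

```python
# Illustrative sketch of mapping rater agreement to a credibility category.
# The cutoff values are hypothetical; the article does not give the study's
# actual thresholds.

def credibility_category(certainly_accurate_fraction: float) -> str:
    """Map the share of raters who called an event 'certainly accurate'
    to one of the four buckets used in the study (hypothetical cutoffs)."""
    if certainly_accurate_fraction >= 0.9:
        return "Perfect Credibility"
    elif certainly_accurate_fraction >= 0.7:
        return "High Credibility"
    elif certainly_accurate_fraction >= 0.4:
        return "Moderate Credibility"
    else:
        return "Low Credibility"

# Example: 8 out of 10 raters deemed an event certainly accurate.
print(credibility_category(8 / 10))  # -> High Credibility
```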


The researchers then statistically analyzed 66 million tweets pertaining to these events, looking for correlations between credibility scores and features such as words expressing uncertainty or emotion. In their study, which has not yet been published, they list several useful clues: "credible" events were more likely to be described on Twitter with hedges such as "appeared," "depending" and "guessed," whereas "incredible" events came with other hedges, such as "indicates," "certain level" and "dubious." Some of the best barometers were opinionated words: "vibrant," "unique" and "intricate" predicted high credibility, whereas "pry," "awfulness" and "lacking" predicted low credibility. (Oddly, "darn" was associated with high credibility, "damn" with low.) And although boosters such as "without doubt" and "undeniable" predicted low credibility in original tweets, they predicted high credibility in retweets.
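To give a sense of how such linguistic cues can be turned into numbers, the sketch below counts a handful of the words quoted above in a single tweet. It uses only the examples mentioned in this article, not the study's full lexicons, and the split into "high" and "low" cue lists is an illustration, not the researchers' actual feature set.

```python
# A minimal, illustrative cue counter. The word lists contain only examples
# quoted in this article; the study's actual lexicons and features are richer.
import re

HIGH_CREDIBILITY_CUES = {"appeared", "depending", "guessed",
                         "vibrant", "unique", "intricate", "darn"}
LOW_CREDIBILITY_CUES = {"indicates", "dubious",
                        "pry", "awfulness", "lacking", "damn"}

def cue_counts(tweet: str) -> dict:
    """Count how many high- and low-credibility cue words a tweet contains."""
    words = re.findall(r"[a-z']+", tweet.lower())
    return {
        "high_cues": sum(w in HIGH_CREDIBILITY_CUES for w in words),
        "low_cues": sum(w in LOW_CREDIBILITY_CUES for w in words),
    }

print(cue_counts("The source seems dubious, but the footage appeared genuine."))
# -> {'high_cues': 1, 'low_cues': 1}
```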

Beyond particular words, long quotes in retweets suggested low credibility, possibly because retweeters were reluctant to take ownership of the claim. A high number of retweets was also associated with low credibility. (These are all correlations; the researchers do not know whether, say, the number of retweets influenced the human ratings or whether retweet counts and human ratings each followed independently from features of the supposed event.)

The researchers also tested how well their computer model could predict event credibility by combining indicators like those above. If the algorithm guessed randomly, it would be right 25 percent of the time; if it always guessed High Credibility (the category with the most events), it would be right 32 percent of the time. In fact it performed significantly better, achieving 43 percent accuracy. If given half credit for being one category off (say, guessing Perfect Credibility for an event with High Credibility), the algorithm's accuracy rose to 65 percent. The researchers hope to improve its performance by combining linguistic cues with factors such as the author of a tweet or the links it cites. Mitra has done preliminary work showing that stories originating from a single person tend to fall in the Low Credibility category.
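The "half credit for being one category off" arithmetic is easy to reproduce. The sketch below scores a toy set of predictions against a toy set of true labels; the data are invented for illustration and are not the study's.

```python
# Illustrative scoring with half credit for predictions one category off.
# The labels below are made up for the example, not data from the study.

CATEGORIES = ["Low Credibility", "Moderate Credibility",
              "High Credibility", "Perfect Credibility"]

def relaxed_accuracy(true_labels, predicted_labels):
    score = 0.0
    for truth, guess in zip(true_labels, predicted_labels):
        gap = abs(CATEGORIES.index(truth) - CATEGORIES.index(guess))
        if gap == 0:
            score += 1.0   # exact match: full credit
        elif gap == 1:
            score += 0.5   # one category off: half credit
    return score / len(true_labels)

# Toy example: one prediction is off by one, one is exact, one is off by two.
truth = ["High Credibility", "Low Credibility", "Perfect Credibility"]
guess = ["Perfect Credibility", "Low Credibility", "Moderate Credibility"]
print(relaxed_accuracy(truth, guess))  # -> 0.5, i.e. (0.5 + 1.0 + 0.0) / 3
```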

The researchers see any tool that results as merely a first set of eyes, one that brings reporters' or fact-checkers' attention to accounts they should consider covering or discrediting. Such a tool might also help first responders decide what information to trust during a disaster, according to Robert Mason, a researcher at the University of Washington who studied rumors on Twitter about the Boston Marathon bombing but was not involved in the present study. Another possibility, Mason says, is to build alert systems into Twitter or Facebook that detect when people are about to pass along potentially fake stories and ask if they are sure they want to do that: "just slowing down the ease with which we spread information."

Even with artificial intelligence, stopping the spread of fake news will be difficult. Mason notes the adage that a lie can travel halfway around the world before the truth can get its boots on. Often misinformation is more gripping than the real thing. And journalists are under pressure to report news quickly. In any case, people frequently ignore the authoritativeness of the source. "In a time of social media and very fast-moving information," Mason says, "what is an authoritative source? We no longer have a Walter Cronkite or an Edward R. Murrow to say, 'And that's the way it is.' We've got multiple voices saying this is the way it is. So we get to choose."