Crap Detection 101

"Every man should have a built-in automatic crap detector operating inside him."
Ernest Hemingway, 1954

An excellent article by Howard Rheingold, published on SFGate, the San Francisco Chronicle's website.

"Unless a great many people learn the basics of online crap detection and begin applying their critical faculties en masse and very soon, I fear for the future of the Internet as a useful source of credible news, medical advice, financial information, educational resources, scholarly and scientific research."

The article is reproduced below (without its useful links).

The answer to almost any question is available within seconds, courtesy of the invention that has altered how we discover knowledge – the search engine. Materializing answers from the air turns out to be the easy part – the part a machine can do. The real difficulty kicks in when you click down into your search results. At that point, it's up to you to sort the accurate bits from the misinfo, disinfo, spam, scams, urban legends, and hoaxes. "Crap detection," as Hemingway called it half a century ago, is more important than ever before, now that the automation of crapcasting has generated its own word: "spamming."

Unless a great many people learn the basics of online crap detection and begin applying their critical faculties en masse and very soon, I fear for the future of the Internet as a useful source of credible news, medical advice, financial information, educational resources, scholarly and scientific research. Some critics argue that a tsunami of hogwash has already rendered the Web useless. I disagree. We are indeed inundated by online noise pollution, but the problem is soluble. The good stuff is out there if you know how to find and verify it. Basic information literacy, widely distributed, is the best protection for the knowledge commons: A sufficient portion of critical consumers among the online population can become a strong defense against the noise-death of the Internet.

The first thing we all need to know about information online is how to detect crap, a technical term I use for information tainted by ignorance, inept communication, or deliberate deception. Learning to be a critical consumer of Webinfo is not rocket science. It's not even algebra. Becoming acquainted with the fundamentals of web credibility testing is easier than learning the multiplication tables. The hard part, as always, is the exercise of flabby think-for-yourself muscles.

The issue of info pollution has been on my mind since at least 1994, when I wrote "The Tragedy of the Electronic Commons" about the infamous Canter and Siegel – the first Internet spammers. A few years later, I personally confronted the importance of teaching information literacy to 14-year-olds when I watched my daughter come of age at the same time online search engines became available. I sat down in front of the circa-1999 computer with my daughter and explained that most of the books she could get from the library could be counted on to be factually accurate. But when you enter words into a search engine, there is no guarantee that your search will lead you to accurate information. "You have to do some investigation before you accept anything you find online," I warned her.

"Ask a few questions and use available tools to see if you can find answers," is what I told her when she asked me how to go about investigating.

Today, just as it was back then, "Who is the author?" is the root question. If you don't find one, turn your skepticism meter to the top of the dial, and use a whois lookup to find out who owns the site. If the author provides a way to ask questions, communicate, or add comments, turn up the credibility meter and dial back the skepticism. When you identify an author, search on the author's name in order to evaluate what others think of the author – and don't turn off your critical stance when you assess reputation. Who are these other people whose opinions you are trusting? Is the site a .gov or .edu? If so, turn up the credibility a notch. If it helps, envision actual meters and dials in your mind's eye – or a thermometer or speedometer. Take the website's design into account – professional design should not be seen as a certain indicator of accurate content, but visibly amateurish design is sometimes an indicator that the "Institute of Such-and-Such" might be the work of one obsessive loner.
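The "meters and dials" heuristic above can be sketched in code. This is a toy scorer, not a real credibility test: the weights, field names, and signals are my own illustrative assumptions, and a real assessment would still require human judgment.

```python
from urllib.parse import urlparse

def credibility_signals(url, has_named_author, has_contact_info):
    """Toy scorer for the 'meters and dials' heuristic.

    Returns a small integer: higher means more reasons to trust,
    lower means turn the skepticism dial up. Weights are illustrative.
    """
    score = 0
    host = urlparse(url).netloc.lower()
    # A .gov or .edu domain earns a notch of credibility.
    if host.endswith(".gov") or host.endswith(".edu"):
        score += 1
    # "Who is the author?" is the root question; no author is a red flag.
    score += 1 if has_named_author else -2
    # A way to ask questions or add comments nudges credibility up.
    if has_contact_info:
        score += 1
    return score

credibility_signals("https://example.edu/paper", True, True)    # → 3
credibility_signals("http://blog.example.com/post", False, False)  # → -2
```

A whois lookup itself requires a network query (or a library such as `python-whois`), so it is left out of this sketch.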

More good questions to use as credibility probes: Does the author provide sources for factual claims, and what happens when you search on the names of the authors of those sources? Have others linked to this page, and if so, who are they? (Use the search term "link:http://…" and Google shows you every link to a specified page.) See if the source has been bookmarked on a social bookmarking service like Delicious or Diigo; although it shouldn't be treated as a completely trustworthy measurement, the number of people who bookmark a source can furnish clues to its credibility. All the mechanics of doing this kind of checking take only a few seconds of clicking, copying and pasting, searching, and judging for yourself. Again, the part that requires the most work is learning to do your own judging.
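Checking whether a page actually links to its claimed sources is the kind of mechanical step a few lines of code can do. Here is a minimal sketch using Python's standard-library HTML parser; the page snippet and URLs are invented for illustration.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href targets from anchor tags — a first step toward
    checking whether a page links to sources for its factual claims."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# A hypothetical fragment of the page you are evaluating.
page = ('<p>See <a href="https://example.org/study">the study</a> and '
        '<a href="https://example.edu/data">the data</a>.</p>')

collector = LinkCollector()
collector.feed(page)
# collector.links → ['https://example.org/study', 'https://example.edu/data']
```

From there you would search on each linked source — the judging part still belongs to you.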

You aren't paranoid if you suspect that some sites might even deliberately try to deceive you. Some sites insidiously cloak their real bias, for example. One site I use as an example with my students today is not owned by admirers of the late civil rights leader, but you wouldn't know that at first glance. Another, less sinister but equally sobering teaching story: "A parody site once duped the Center for International Legal Studies into believing it was the Web site of the World Trade Organization. Accordingly, a few years ago, the association arranged for someone from the parody site to speak at its annual conference. The speaker – an imposter from an activist group known as the Yes Men – offended several attendees with racist remarks. A staged pie-throwing incident followed the presentation, and the fiasco later culminated in the faked death of the speaker/imposter." A good question to ask yourself, particularly when a website asks you to download something to your own computer, is "might somebody be trying to put one over on me?"

When I began teaching my daughter how to evaluate the credibility of web pages, I started collecting rules of thumb, strategies, and tools – especially free and easy-to-use ones – for sorting the goodinfo from the badinfo. Fortunately, tools are far more powerful today than they were a decade ago; the bad news is that too many people don't know about them. In recent years, as so many more people have started to rely on the web for such vitally important forms of information as news, medical advice, scholarly research, and investment advice, the lack of general education in critical consumption of information found online is turning into a public danger. No, Bill Gates won't send you $5 for forwarding this chain e-mail, the medical advice you get in a chat room isn't necessarily better than what your doctor tells you, and the widow of the deceased African dictator is definitely not going to transfer millions of dollars to your bank account. That scurrilous rumor about the political candidate that never makes the mainstream media but circulates as email and blog posts probably isn't true. The data you are pasting into your memo or term paper may well be totally fabricated.

Use the following methods and tools to protect yourself from toxic badinfo. Use them, pass them along to others. Promote the notion that more info-literacy is a practical answer to growing info-pollution.

Although the Web undermines authority, the usefulness of authority as another clue to credibility hasn't entirely disappeared. I would add credibility points if a source is a verified professor at a known institution of higher learning, an authentic M.D. or Ph.D., but I wouldn't subtract points from uncredentialed people whose expertise seems authentic. Nor would I stop at simply verifying that the claim to be a professor is valid. The next step: use the scholarly productivity index that derives a score from the scholar's publications, citations by other scholars, grants, honors, and awards. If you want to get even more serious, download a free copy of Publish or Perish software, which analyzes scientific citations from Google Scholar according to multiple criteria. Again, don't trust just one source. Triangulate.
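One of the citation-based scores that tools like Publish or Perish compute from Google Scholar data is the h-index. As a sketch of what such a metric measures, here is a minimal implementation (the citation counts in the example are made up):

```python
def h_index(citations):
    """h-index: the largest h such that the scholar has h papers
    with at least h citations each."""
    cites = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(cites, start=1):
        if count >= rank:
            h = rank   # this paper still has at least `rank` citations
        else:
            break
    return h

h_index([10, 8, 5, 4, 3])  # → 4: four papers have at least 4 citations each
h_index([0, 0])            # → 0
```

A single number like this is exactly the kind of clue that should be triangulated, not trusted on its own.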

I got good strategy advice from John McManus, author of "Detecting Bull: How to Identify Bias and Junk Journalism in Print, Broadcast and on the Wild Web", who told me "you have to think like a detective." Think of tools like search engines, the productivity index, hoax-debunking sites, and the others I will mention later as forensic instruments, like Sherlock Holmes' magnifying glass or the crime scene investigator's fingerprint kit. The tools are only useful as the means to sleuthing out a mystery. In the case of people who stake their health on online medical information from a virtual community, their economic well-being on online financial information, or their political liberty on the news they get from Twitter, blogs, or YouTube, the stakes in this detective game are high. Triangulation is what detectives do – try to find three different ways to test a source's credibility. For example, you could Google the author's name, enter the author's name in the scholarly productivity index, and check the author's claims on a fact-checking site (several such sites research claims from all political factions).
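The triangulation rule of thumb — test a source at least three different ways — can be expressed as a small combinator. The individual checks below are placeholders standing in for a web search, a citation index, and a fact-checking lookup; their field names are invented for illustration.

```python
def triangulate(checks, source):
    """Run several independent credibility checks on a source and
    report how many pass. The detective's rule: don't trust a source
    on the strength of a single test."""
    results = [bool(check(source)) for check in checks]
    return sum(results), len(results)

# Hypothetical checks standing in for a web search, a citation index,
# and a fact-checking site lookup.
checks = [
    lambda s: s.get("author_found_elsewhere", False),
    lambda s: s.get("citation_score", 0) > 0,
    lambda s: not s.get("flagged_by_fact_checkers", False),
]

passed, total = triangulate(checks, {"author_found_elsewhere": True,
                                     "citation_score": 3})
# passed == 3, total == 3: all three independent tests agree
```

Agreement across independent tests is the point — a source that passes one check and fails two deserves the top of the skepticism dial.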

Know how to use online filters. As more people get their news online, and more people at the site of newsworthy events have Web-linked cameras and video cameras, we'll see more situations like the 250,000 "tweets" per hour that passed through Twitter during the Iranian political demonstrations of June 2009. Before Twitter came on the scene, online services like Flickr and YouTube enabled users to "tag" photographs and video with key words, making it possible to search for images tagged with those key words, revealing all the still images and videos coming in from amateur chroniclers during an event. The San Diego Union-Tribune called publicly for citizen reporters to use the same tags for their images of the 2007 wildfires. Using the search facility on Flickr or YouTube enables you to see a stream of images or videos, and automatically subscribing to that search through "RSS" means you can continue to see visual reports stream in as others upload them – in real time. At the height of the Iran demonstrations, CNN was displaying videos posted to YouTube, alerted via Twitter. Quite properly, CNN introduced the images with the disclaimer that they were as-yet unverified. As Clay Shirky has noted, we're in the age of "publish, then filter."
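An RSS subscription to a tag search is just a machine-readable feed that your reader polls. As a sketch, here is how a program would pull the items out of such a feed with Python's standard library; the feed fragment, tag, and URLs below are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A minimal RSS 2.0 fragment like the one a Flickr or YouTube tag
# search might return (contents are hypothetical).
feed = """<rss version="2.0"><channel>
  <title>Search results for tag: sandiegofire</title>
  <item><title>Smoke over the canyon</title>
    <link>https://example.com/photo/1</link></item>
  <item><title>Evacuation route</title>
    <link>https://example.com/photo/2</link></item>
</channel></rss>"""

root = ET.fromstring(feed)
# Pair each citizen report's title with its link, in feed order.
items = [(item.findtext("title"), item.findtext("link"))
         for item in root.iter("item")]
```

A feed reader repeats this parse every time it polls, which is what makes the stream of visual reports arrive "in real time."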

Again, it's up to the consumer of the information to decide which images, videos, tweets are authentic. As always happens when there is a high demand for separating signal from noise, people began to put together filters for doing that – and human tools for sorting the more trustworthy information. After the terrorist attacks in Mumbai provided both noteworthy on-the-scene reports and outright rumors, some experts started talking about "crowdsourcing the filter" by growing populations of trusted editors who would collectively identify the good stuff. Although they did not start as a filter for fast-breaking news, American Public Media's Public Insight Network is moving in the direction of a crowdsourced filter.

When the Iran demonstrations happened, people in Iran and around the world used the Twitter equivalent of a Flickr or YouTube tag, a now-famous "hashtag" – #iranelection – and for a few days flooded the world with riveting images that will probably win Pulitzers, along with shocking and politically inflammatory videos, torrents of contradictory reports, bogus rumors, apparent disinformation, both informed and ignorant political arguments and, as always, spam, porn, and porn spam. As the Iran events unfolded, Marc Ambinder wrote an astute article, "Follow the developments in Iran like a CIA analyst." Just as thinking like a detective is a strategy for trying to determine the credibility of webinfo, thinking like an intelligence analyst is a strategy for trying to gauge the credibility of online reports about breaking news events. Ambinder recommends watching for disinformation, looking for patterns in the geographic location of sources (but warns against assuming that everything that resembles a pattern really is one), examining your assumptions, and looking for sources that contradict them.

Twitter Journalism ("Where News and Tweets Converge") published a series of steps for verifying a tweet, including, among many other tips: checking a person's past tweets to see what context you can find from before the claim about a news event was tweeted; checking the bio of the twitterer who makes the claim; being wary of news tweets from someone with very few previous tweets or who joined very recently; and using Twitter's reply feature to engage the twitterer directly. I've been collecting links for journalism students who want to understand the journalistic requirements of good Twitter practice from the production side.
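Some of those warning signs are mechanical enough to sketch in code. The profile fields, thresholds, and flag wordings below are my own assumptions for illustration; a real check would pull the data from Twitter's API, and the human judgment steps (reading past tweets for context, replying to the twitterer) can't be automated.

```python
from datetime import datetime, timedelta

def tweet_red_flags(profile, now):
    """Return a list of warning signs for a tweeted news claim,
    based on the account's profile (a plain dict, fields assumed)."""
    flags = []
    if profile.get("tweet_count", 0) < 10:
        flags.append("very few previous tweets")
    if now - profile["joined"] < timedelta(days=7):
        flags.append("account created very recently")
    if not profile.get("bio"):
        flags.append("empty bio")
    return flags

suspect = {"tweet_count": 3, "joined": datetime(2009, 6, 18), "bio": ""}
tweet_red_flags(suspect, now=datetime(2009, 6, 20))
# → all three warning signs fire for this hypothetical account
```

Red flags like these don't prove a tweet is false — they tell you where to point the skepticism dial before you retweet.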

The biases of trusted sources like newspapers and television need to be examined critically, as well as those that come in from what are increasingly called "social media." Questioning Video is all about understanding the vocabulary of visual deception that can be used to distort television news. Newstrust is trying to crowdsource the filter for mainstream as well as alternative news sources by growing a bipartisan virtual community of critical news filterers who use the same set of criteria for evaluating whether a news story exhibits bias, makes factual claims that can or cannot be verified, or presents more than one interpretation of events. Fairspin's community votes on stories so that the community's aggregate judgments identify opinion disguised as fact and reflect the degree of political bias detected in stories from both the left and the right. And on the cutting edge of community-based filtering tools, Intel Labs' Dispute Finder Firefox Extension "highlights disputed claims on web pages you browse and shows you evidence for alternative points of view."

The good news about the pace of medical research is also the bad news – few medical specialists can keep up with the rate of new discoveries. That means that it's possible for the collective intelligence of a committed community – and there is nothing as committed as people who are suffering from a disease – to stay ahead of all but the most dedicated individual specialists. However, along with the latest word on cutting-edge drug trials come unsubstantiated claims, rumors, and outright quackery. When it comes to medical information, just as when it comes to information that affects political liberty, believing or forwarding badinfo can be unhealthy or fatal. Again, the critical consumer of online medical advice has a number of triangulation tools at hand. For scientific articles, Science Direct has guest access. The Health on the Net Foundation has been a steady source for finding reliable, credible health information online. They even have a browser plug-in that enables you to check health information on any website against HON's database. An astute medical student wrote a guide to checking the quality of medical information online. How much work is it to check three links before believing or passing along health information you find online? Simply Googling the name of the doctor who tried to sell do-it-yourself eye surgery kits, for example, immediately raises questions for those who are considering aiming lasers at their own retinas.

To me, the issue of information literacy could be even more important than the health or education of some individuals. Fundamental aspects of democracy, economic production, the discovery and use of knowledge might be at stake. Some of the biggest problems facing the world today seem to be far beyond the ability of any individual or community, or even the whole human race, to tackle. But the noise death of the Internet is something we can take on and win. Although large forces are at work, when it comes to the shape of online media, I believe that what people know – and how many people know – matters. Digital media and networked publics are only the infrastructure for participation – the cables and chips do no good unless people know how to use them. The collision of newly participative populations with authoritarian control is taking place every day in Teheran and Beijing, Berlin and Washington, D.C. Nobody knows how this clash will play out, but the one cost-effective measure that the participative have in their contest with central control is know-how. And the lack of know-how among the population is an asset to those who seek to put the lid back on their – our – power of expression.


(Please suggest resources in comments, and I will add them below as I receive, find, and vet them)

Teachable moment: Crap Detecting
Assessing the Credibility of Online Sources
The CRAP Test
Urban Legends
Check quality of backlinks (links TO a site)
Newstrust guide to finding good journalism online
Museum of Hoaxes
Urban Legends
Carl Sagan's "Baloney Detection Kit"
