How Google Works: Is this Relevancy?

Peter Jones, Information Ecology

A hat tip to Barry Ritholtz’s Big Picture for this reference: How Does Google Work? Barry’s is the best overall blog for Investing Plus Economics views, so given the events of last week (credit market meltdowns, hedge fund troubles, big index drops) I look to him for context. And with Barry you also always get the techie info-porn treats, such as the ref to the Google map.

My apparently Bush-formed blog question (“Is our children learning?”) refers to discovering that “How Google Works” is about the operation of Google the enterprise, not Google the search engine. There’s nothing about the puzzle of Google’s internal relevancy algorithms other than the mention of the age-old citation linking strategy it uses – they do not even mention PageRank or remind people they can find Page’s 1999 Stanford paper online. The overview remains clever and visually appealing, just not very informative. Info-porn.
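For readers who wanted the search-engine side of the story, here is a minimal sketch of the citation-linking idea behind PageRank: a page's score is built from the scores of the pages that link to it. This is a textbook-style power iteration, not Google's production algorithm. The function name `pagerank`, the toy link graph, the damping factor of 0.85, and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration (illustrative sketch only).

    links maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform score

    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of inbound links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it cites.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is cited by three pages, "d" by none.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph the most-cited page ends up on top and the uncited page at the bottom. Note what the score measures: how much a page is linked to, not whether it answers the question a particular searcher has in mind.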

There are many problems with our total reliance on Google and other purveyors of computational relevance. Relevance is inherently a cognitive concept, not a computational one. Algorithms can mine large databases and index content, but they only approximate relevancy. With wide recall (the reach and scope of a search) we can spend all day looking for something we suspect is in the corpus searched. With more powerful precision we can search all day for a better match to terms and qualifiers. As LexisNexis users have always discovered, relevance is judged internally: it lies with you and is not a property of the content. Relevance as cognitively judged is similar to Gary Klein’s Recognition-Primed Decision-Making: a searcher recognizes relevance by matching available content to a kind of mental model benchmark representing the issue of interest, the information need (usage), and the sufficiency of responses. Also, for information decision making, the timeliness of the information artifact may be extremely valuable. That’s why people pay so much for LexisNexis and Bloomberg. Validity and timeliness are worth a LOT. But relevance can only be judged by the human “in charge.”
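To make the recall and precision trade-off concrete, here is a small sketch using the standard information-retrieval definitions: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. The document IDs and the two example searches are made up for illustration, and the `relevant` set is assumed to come from a human judge, which is exactly the point: the arithmetic cannot start until a person has decided what counts as relevant.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search result set.

    retrieved: document IDs the search returned.
    relevant:  document IDs a human judged relevant to the information need.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents the search actually found
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical judgments: a person decided these four documents matter.
relevant = {"doc1", "doc2", "doc3", "doc4"}

# A broad search finds everything relevant but buries it in noise.
broad_search = ["doc%d" % i for i in range(1, 11)]
# A narrow search returns only good matches but misses half of them.
narrow_search = ["doc1", "doc2"]

broad = precision_recall(broad_search, relevant)
narrow = precision_recall(narrow_search, relevant)
print("broad:  precision=%.2f recall=%.2f" % broad)
print("narrow: precision=%.2f recall=%.2f" % narrow)
```

Both numbers shift the moment a different searcher, with a different information need, supplies a different `relevant` set for the same results.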

Don’t tell me the Semantic Web will fix this. For relevancy to an issue, other humans must also weigh in on the discussion. No one person knows the entire scope of an issue, and no scope of content, no corpus, is complete. Sometimes multiple representations of relevancy – from informed participants or experts – are necessary. Think of how to reason through and determine the relevancy of medical research to a physician needing to identify the validity and process of a new procedure. At what point does a professional believe they have sufficient relevance AND validity to make an informed individual decision when the facts and evidence are unclear, but the decision has enormous potential for helping?

These are the problems Google will not solve with search algorithms. For these types of issues and the relevancy of content to wicked problems, you need multiple perspectives, a variety of related experiences, knowledges across disciplines and corpora, and a way to pull them together. We are working on this … And it looks as though Google is also working on collaborative tech. No surprise there.

So, the other economics blogs you should read and plug are:

  • Bull / Not Bull – Michael Nystrom’s excellent aggregation & no-holds-barred commentary
  • Mish’s Global Economics – Good thinking about the underlying dynamics of markets and behavior
  • Nouriel Roubini – Because deep in your heart, you know this bearish economist has been right all along
  • And Brad DeLong – Who merges into political dynamics more than the others, thank you.