Wednesday, June 07, 2006

Truth without Consequences

Google has introduced one of those new search applications that’s just the kind of thing that makes many librarians distrust Googled information. It’s called Google Trends and it provides context-free information that seems to have no bearing outside the arcane world of search itself. But no one ever accused Google of humility. This new engine is a seductive toy that at this stage promises much more than it delivers. Here’s how it works: you search a word or short (parenthesized) phrase, put a comma after it, then select another (and another three, if you like) to compare it to. Google Trends then tells you how frequently, relative to one another, those terms were searched over a few years (on a graph with no scales), as well as what countries or cities they were searched in, and in what language (based on the web version of Google). Some frequency highlights are correlated to current events, but other than that no explanations are provided, or even suggested, as to why something is trending in any direction. Nor are we told how many searches the trend is based on – 866 or 435,205. Some students may welcome this, for who can doubt that a reason exists for a trend? And a halfway decent student can find reasons much faster than he can find facts.

What are the reasons for interesting trends? A search comparing the terms “stony brook,” “sunysb,” “stonybrook” and “stony brook university” shows us that the single word stonybrook is the second search term of choice used in the US, and that mostly English speakers use it, that India is by far the place where the second highest number of searches came from, that Chinese was easily the second most used Google site for this search, that the phrase Stony Brook is used everywhere much more than the other terms seeking “Stonybrookness”, and that outside the campus community, almost no one ever searches with the term sunysb. You can spend hours discovering similar fascinating, puzzling and potentially meaningless trends by comparing all kinds of things, places, names, even numbers. For example, try comparing “0, zero and nothing.” Or “da Vinci, da Vinci Code, Dan Brown and Leonardo.” Which of these five concepts do you think is most searched: “truth, fact, fiction, information and myth?”

۞ I forget just how I came upon this next site, The Athanasius Kircher Image Gallery, at the Stanford University Library. Probably it was randomly from a blog listing. Here are some literally fabulous rarely-seen illustrations of rarely-seen phenomena created by the great, albeit unknown, 17th-century German Jesuit polymath. What, you never heard of him? You’ll need to download the DjVu plug-in to see his work. But why were they digitized? The site has impeccable bona fides and links to some fascinating pages about the man, like The Athanasius Kircher Project at Stanford University; The Correspondence of Athanasius Kircher; The Societa Italiana di Storia della Scienza in Florence; Project Director Michael John Gorman, a lecturer in the history of science (at one time) at Stanford and now a scholar living in Dublin, Ireland; the program of a 2001 Colloquium on Kircher; a gloss on an Exhibition of his Baroque Encyclopedia; an article about Kircher scholarship from the Chronicle; and Wikipedia’s inevitable page on the man. And of course there’s The Proceedings of the Athanasius Kircher Society, which renders the obscurity behind Kircher’s genius into blog-like clarity.

About the only thing I haven’t been able to find is something that explains how his genius impacted the world - though he did make an appearance in an Umberto Eco novel and corresponded with over 750 people. Most of the web sites about Kircher seem full of superficial details. I suppose it takes a certain kind of genius to publish and illustrate dozens of books on many major and arcane subjects in his lifetime, but where’s the rest of him? I haven’t been able to see any of his actual correspondence online (server problems), but there’s no doubt that the images displayed in the Gallery are exotic, deliberately impossible, funny, skillfully drawn and suggestive. It is good to know that we can find images of his Musurgical Ark or his Tarantula and the Musical Antidote to its Poison, even if we aren’t told what they mean. Much of Kircher’s quoted text is untranslated, vague, given no context, written in strange symbols, or dependent on sources that only a grant could verify. In fact, it all would make more sense if we knew it was a hoax, but it isn’t. Still, you won’t find these images on a proprietary database. Who would expect you to pay to see his works? No one. Kircher’s fascinating inaccessibility is being paradoxically extended by something libraries will always do well: preserving the virginity of information for its marriage to scholarship.

۞ Statistics don’t lie, they just never tell the whole truth. What reference librarian hasn’t had a student come up and say something like “I need statistics on how many unmarried Japanese men under 40, over 5’8” and under 150 pounds, living abroad and earning less than $50,000 a year missed the train to work because they were shaving on Tuesday?” While there are librarians who attempt to look for answers to such questions, using the usual resources, I rarely do. Instead of telling them the correct answer - 7,387 (J. Ethnic Shaving Sociometrics) - which they never believe, I try to explain that there simply can’t be statistics for such facts. What I can’t say is that I refuse to spend months of searching, analyzing, computing, cross-linking, interpreting and translating to find out that there is no answer. But they wouldn’t accept that either. Why should they, when everyone knows there are fantastic sites offering statistics about baseball, the census, libraries, pregnancy, literacy, jobs, mortality rates, crime, food and movies? Many excellent sites gather tens of thousands of statistical studies and let you believe you can search them. Go ahead, make their day. Have you ever looked at StateMaster or Statistical Universe? If the numbers are there, someone has probably crunched them. But that’s the catch: where do the numbers come from?

It simply doesn’t occur to most people that no one sits around at a desk all day at the center of the world, keeps statistical track of everything happening in the universe at a given moment and relates it to everything else that happens. Only one being does that. For the rest of us, there’s not enough time to track what occurs in time. Maybe Google is working on it. One of the strange lessons of information science is that there are statistics for everything under the sun but what you are looking for. Absorb this lesson, and statistics become less intimidating, almost recreational. Hank Aaron has the all-time home run record? Yes, and he also grounded into double plays – 328 – more than any other player in history – except Cal Ripken, Jr. We know this because there are such things as double plays, innings, outs, and rules in a game called baseball. When consequences rule life, statistics will always come to bat. There is such a thing as an infant death, as box office receipts, as employment rates, as book circulation, as starvation. And then there’s everything else. But what about those expatriated Japanese men who shave every Tuesday? Surely they exist, surely they matter. But where are the statistics? Wasn’t anybody watching these guys? When all the information in the world in all languages - from books, articles, indexes, statistics, reports, libraries, institutes, laboratories and think tanks - is linked, will librarians be able to answer such questions? Of course not.

Paul B. Wiener

1 June 2006