Saturday, January 26, 2013

Goodbye, Anecdotes! The Age Of Big Data Demands Real Criticism


This article was written by Trevor Butterworth from The Awl. Read the original and complete article here.


If you think of all the information encoded in the universe from your genome to the furthest star, from the information that's already there, codified or un-codified, to the information pregnant in every interaction, "big" has become the measure of data. And our capacity to produce and collect Big Data in the digital age is very big indeed. Every day, we produce 2.5 exabytes of information, the analysis of which will, supposedly, make us healthier, wiser, and above all, wealthier, although it's all a bit fuzzy as to what, exactly, we're supposed to do with 2.5 exabytes of data, or how we're supposed to do whatever it is that we're supposed to do with it, given that Big Data requires a lot more than a shiny MacBook Pro to run any kind of analysis. "Start small," is the paradoxical advice from Bill Franks, author of Taming the Big Data Tidal Wave.
If you are a company, that makes sense; but if Big Data is the new big thing, can it answer any big questions about the way we live? Can it produce big insights? Will it beget Big Liberty, Big Equality or Big Fraternity, even Big Happiness? Are we on the cusp of aggregating utilitarianism into new tyrannies of scale? Is there a threshold where Big Pushpin is incontrovertibly better than small poetry, because the numbers are so big, they leave interpretation behind and acquire their own agency, as the digital age's answer to Friedrich Nietzsche, Chris Anderson, suggested in his "Twilight of the Idols" Big Data manifesto from four years ago? These are critical (one might say, "Big Critical") questions. Is Lena Dunham the voice of her generation, as every news story about her HBO show "Girls" seems to stipulate, or is this just a statistical artifact within an aggregated narrative about women that's even harder to swallow?
But because we are all disciples of enumeration, the first question is how big is Big? Well, an exabyte is a very big number indeed: one quintillion bytes, which is not as big as the exotically large quantities at the highest reaches of number crunching, numbers like yottabytes that begin to exhaust meaningful language;* but it's getting there. If you were to count from one to a quintillion, and took a second to visualize each number (audibly counting would take a lot longer), your journey would last 31.7 billion years, and you'd still be less than halfway through a day in the digital life of the world. Or, imagine if each byte occupied a millimeter of visual space: every four days our modern Bayeux tapestry would cover a light year. By way of contrast, Claude E. Shannon, the "father of information theory," estimated the size of the Library of Congress in 1949 at 12,500 megabytes, which is, by today's standards, a mere Post-it note of information in the virtual annals of human data, albeit a rather useful one.
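For readers who want to check the arithmetic in the paragraph above, here is a minimal sketch in Python. It assumes a decimal exabyte (10^18 bytes), a Julian year of 365.25 days, and a light year of roughly 9.4607 x 10^15 meters; the function names are hypothetical and chosen only for illustration. Under those assumptions the figures come out close to the ones quoted: about 31.7 billion years of counting, 40 percent of one day's output, and a little under four days to span a light year.

```python
# Back-of-the-envelope check of the essay's figures.
# Assumptions (not from the essay): decimal exabyte, Julian year,
# and the standard approximate length of a light year.

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # Julian year in seconds
EXABYTE = 10**18                        # one quintillion bytes
DAILY_BYTES = 2.5 * EXABYTE             # the essay's 2.5 exabytes per day
LIGHT_YEAR_MM = 9.4607e15 * 1000        # meters converted to millimeters


def years_to_count(n, seconds_per_number=1.0):
    """Years needed to count to n at a fixed pace per number."""
    return n * seconds_per_number / SECONDS_PER_YEAR


def fraction_of_daily_output(n_bytes):
    """Share of one day's data production that n_bytes represents."""
    return n_bytes / DAILY_BYTES


def days_to_span_light_year(mm_per_byte=1.0):
    """Days of data needed to cover a light year at mm_per_byte each."""
    return LIGHT_YEAR_MM / (DAILY_BYTES * mm_per_byte)


if __name__ == "__main__":
    print(f"Counting to a quintillion: {years_to_count(EXABYTE) / 1e9:.1f} billion years")
    print(f"Share of one day's data:   {fraction_of_daily_output(EXABYTE):.0%}")
    print(f"Days to span a light year: {days_to_span_light_year():.2f}")
```

Changing `seconds_per_number` or `mm_per_byte` shows how sensitive each image is to its starting assumption.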
This latter historical tidbit (tidbyte?) comes from a thrilling, vertiginous essay by Martin Hilbert, "How much information is there in the 'information society?'", that appeared in the August 2012 edition of Significance, the journal of the Royal Statistical Society (one of the indispensable publications of the digital age). Hilbert has lots of interesting data and, a fortiori, lots of interesting things to say; but one of his most interesting observations is that the big bang of digital data resulted in a massive expansion of text within the universe, and not, as you might have intuited, light-years' worth of YouTube videos, BitTorrent copyright violations, cat pictures and porn:
In the early 1990s, video represented more than 80% of the world's information stock (mainly stored in analogue VHS cassettes) and audio almost 15% (on audio cassettes and vinyl records). By 2007, the share of video in the world's storage devices had decreased to 60% and the share of audio to merely 5%, while text increased from less than 1% to a staggering 20% (boosted by the vast amounts of alphanumerical content on internet servers, hard disks and databases.) The multimedia age actually turns out to be an alphanumeric text age, which is good news if you want to make life easy for search engines.
The bathos is forgivable—after all, what else do we spend a vast amount of our time doing other than searching on the Internet for information? But, happily, such "good news" amounts to more than better functionality, a "gosh, how convenient!" instrumental break in the pursuit of better instrumentalism; it also means we can ask bigger questions of Big Data. We can ask what the big picture actually means, and—no less important—we can criticize those who claim to know. We can, in other words, be "Big Critics"; we can do "Big Crit."
