Saturday, May 05, 2012

Care For A Zettabyte Of Data, You Well-Read Devil?

Today, we nail jelly to the barn door in this blog once again. A bright young German studies prof at the University of Virginia sheds light on a centuries-old concern about knowledge and information. If this is (fair & balanced) digitized epistemology, so be it.

[x Hedgehog Review]
Why Google Isn’t Making Us Stupid… Or Smart
By Chad Wellmon


Last year The Economist published a special report not on the global financial crisis or the polarization of the American electorate, but on the era of big data. Article after article cited one big number after another to bolster the claim that we live in an age of information superabundance. The data are impressive: 300 billion emails, 200 million tweets, and 2.5 billion text messages course through our digital networks every day, and, if these numbers were not staggering enough, scientists are reportedly awash in even more information. This past January astronomers surveying the sky with the Sloan telescope in New Mexico released over 49.5 terabytes of information—a mass of images and measurements—in one data drop. The Large Hadron Collider at CERN (the European Organization for Nuclear Research), however, produces almost that much information per second. The world’s information base is now estimated to double every eleven hours. Just a decade ago, computer professionals spoke of kilobytes and megabytes. Today they talk of the terabyte, the petabyte, the exabyte, the zettabyte, and now the yottabyte, each a thousand times bigger than the last.

Some see this as information abundance, others as information overload. The advent of digital information and with it the era of big data allows geneticists to decode the human genome, humanists to search entire bodies of literature, and businesses to spot economic trends. But it is also creating for many the sense that we are being overwhelmed by information. How are we to manage it all? What are we to make, as Ann Blair asks, of a zettabyte of information—a one with 21 zeros after it?1 From a more embodied, human perspective, these tremendous scales of information are rather meaningless. We do not experience information as pure data, be it a byte or a yottabyte, but as filtered and framed through the keyboards, screens, and touchpads of our digital technologies. However impressive these astronomical scales of information may be, our contemporary awe and increasing worry about all this data obscures the ways in which we actually engage it and the world of which it and we are a part. All of the chatter about information superabundance and overload tends not only to marginalize human persons, but also to render technology just as abstract as a yottabyte. An email is reduced to yet another data point, the Web to an infinite complex of protocols and machinery, Google to a neutral machine for producing information. Our compulsive talk about information overload can isolate and abstract digital technology from society, human persons, and our broader culture. We have become distracted by all the data and inarticulate about our digital technologies.

The more pressing, if more complex, task of our digital age, then, lies not in figuring out what comes after the yottabyte, but in cultivating contact with an increasingly technologically formed world.2 In order to understand how our lives are already deeply formed by technology, we need to consider information not only in the abstract terms of terabytes and zettabytes, but also in more cultural terms. How do the technologies that humans form to engage the world come in turn to form us? What do these technologies that are of our own making and irreducible elements of our own being do to us? The analytical task lies in identifying and embracing forms of human agency particular to our digital age, without reducing technology to a mere mechanical extension of the human, to a mere tool. In short, asking whether Google makes us stupid, as some cultural critics recently have, is the wrong question. It assumes sharp distinctions between humans and technology that are no longer, if they ever were, tenable.

Two Narratives

The history of this mutual constitution of humans and technology has been obscured as of late by the crystallization of two competing narratives about how we experience all of this information. On the one hand, there are those who claim that the digitization efforts of Google, the social-networking power of Facebook, and the era of big data in general are finally realizing that ancient dream of unifying all knowledge. The digital world will become a “single liquid fabric of interconnected words and ideas,” a form of knowledge without distinctions or differences.3 Unlike other technological innovations, like print, which was limited to the educated elite, the internet is a network of “densely interlinked Web pages, blogs, news articles and Tweets [that] are all visible to anyone and everyone.”4 Our information age is unique not only in its scale, but in its inherently open and democratic arrangement of information. Information has finally been set free. Digital technologies, claim the most optimistic among us, will deliver a universal knowledge that will make us smarter and ultimately liberate us.5 These utopic claims are related to similar visions about a trans-humanist future in which technology will overcome what were once the historical limits of humanity: physical, intellectual, and psychological. The dream is of a post-human era.6

On the other hand, less sanguine observers interpret the advent of digitization and big data as portending an age of information overload. We are suffering under a deluge of data. Many worry that the Web’s hyperlinks that propel us from page to page, the blogs that reduce long articles to a more consumable line or two, and the tweets that condense thoughts to 140 characters have all created a culture of distraction. The very technologies that help us manage all of this information are undermining our ability to read with any depth or care. The Web, according to some, is a deeply flawed medium that facilitates a less intensive, more superficial form of reading. When we read online, we browse, we scan, we skim. The superabundance of information, such critics charge, is changing not only our reading habits, but also the way we think. As Nicholas Carr puts it, “what the Net seems to be doing is chipping away my capacity for concentration and contemplation. My mind now expects to take in information the way the Net distributes it: in a swiftly moving stream of particles.”7 The constant distractions of the internet—think of all those hyperlinks and new message warnings that flash up on the screen—are degrading our ability “to pay sustained attention,” to read in depth, to reflect, to remember. For Carr and many others like him, true knowledge is deep, and its depth is proportional to the intensity of our attentiveness. In our digital world that encourages quantity over quality, Google is making us stupid.

Each of these narratives points to real changes in how technology impacts humans. Both the scale and the acceleration of information production and dissemination in our digital age are unique. Google, like every technology before it, may well be part of broader changes in the ways we think and experience the world. Both narratives, however, make two basic mistakes.

First, they imagine our information age to be unprecedented, but information explosions and the utopian and apocalyptic pronouncements that accompany them are an old concern. The emergence of every new information technology brings with it new methods and modes for storing and transmitting ever more information, and these technologies deeply impact the ways in which humans interact with the world. Both the optimism of technophiles who predict the emergence of a digital “liquid” intelligence and the pessimism of those who fear that Google is “making us stupid” echo historical hopes and complaints about large amounts of information.

Second, both narratives make a key conceptual error by isolating the causal effects of technology. Technologies, be they the printed book or Google, do not make us unboundedly free or unflaggingly stupid. Such a sharp dichotomy between humans and technology simplifies the complex, unpredictable, and thoroughly historical ways in which humans and technologies interact and form each other. Simple claims about the effects of technology obscure basic assumptions, for good or bad, about technology as an independent cause that eclipses causes of other kinds. They assume the effects of technology can be easily isolated and abstracted from their social and historical contexts.

Instead of thinking in such dichotomies or worrying about all of those impending yottabytes, we might consider a perhaps simple but oftentimes overlooked fact: we access, use, and engage information through technologies that help us select, filter, and delimit. Web browsers, hyperlinks, blogs, online newspapers, computational algorithms, RSS feeds, Facebook, and Google help us turn all of those terabytes of data into something more useful and particular, that is, something that can be remade and repurposed by an embodied human person. These now ubiquitous technologies help us filter the essential from the excess and search for the needle in the haystack, and in so doing they have become central mediums for our experience of the world.

In this sense, technology is neither an abstract flood of data nor a simple machine-like appendage subordinate to human intentions, but instead the very manner in which humans engage the world. To celebrate the Web, or any other technology, as inherently edifying or stultifying is to ignore its more human scale: our individual access to this imagined expanse of pure information is made possible by technologies that are constructed, designed, and constantly tweaked by human decisions and experiences. These technologies do not exist independently of the human persons who design and use them. Likewise, to suggest that Google is making us stupid is to ignore the historical fact that over time technologies have had an effect on how we think, but in ways that are much more complex and not at all reducible to simple statements like “Google is making us stupid.”

Think of it this way: the Web in its entirety—just like those terabytes of information that we imagine weighing down upon us—is inaccessible to the ill-equipped person. Digital technologies make the Web accessible by making it seem much smaller and more manageable than we imagine it to be. The Web does not exist. In this sense, the history of information overload is instructive less for what it teaches us about the quantity of information than what it teaches us about how the technologies that we design to engage the world come in turn to shape us. The specific technologies developed to manage information can give us insight into how we organize, produce, and distribute knowledge—that is, the history of information overload is a history of how we know what we know. It is not only the history of data, books, and the tools used to cope with them. It is also a history of ourselves and of the environment within which we make and in turn are made by technologies.

In the following sections, I put our information age in historical context in an effort to demonstrate that technology’s impact on the human is both precedented and constitutive of new forms of life, new norms, and new cultures. The concluding sections focus on Google in particular and consider how it is impacting our very notion of what it is to be human in the digital age. Carr and other critics of the ways we have come to interact with our digital technologies have good reason to be concerned, but, as I hope to show, for rather different reasons than they might think. The core issue concerns not particular modes of accommodating new technologies—nifty advice on dealing with email or limiting screen time—but our very conception of the relationship between the human and technology.

Too Many Books

As historian Ann Blair has recently demonstrated, our contemporary worries about information overload resonate with historical complaints about “too many books.” Historical analogues afford us insight not only into the history of particular anxieties, but also into the ways humans have always been impacted by their own technologies. These complaints have their biblical antecedents: Ecclesiastes 12:12, “Of making books there is no end”; their classical ones: Seneca, “the abundance of books is a distraction”8; and their early modern ones: Leibniz, the “horrible mass of books keeps growing.”9 After the invention of the printing press around 1450 and the attendant drop in book prices, according to some estimates by as much as 80 percent, these complaints took on new meaning. As the German philosopher and critic Johann Gottfried Herder put it in the late eighteenth century, the printing press “gave wings” to paper.10

Complaints about too many books gained particular urgency over the course of the eighteenth century when the book market exploded, especially in England, France, and Germany. Whereas today we imagine ourselves to be engulfed by a flood of digital data, late eighteenth-century German readers, for example, imagined themselves to have been infested by a plague of books [Bücherseuche]. Books circulated like contagions through the reading public. These anxieties corresponded to a rapid increase in new print titles in the last third of the eighteenth century, an increase of about 150 percent from 1770 to 1800 alone.

Similar to contemporary worries that Google and Wikipedia are making us stupid, these eighteenth-century complaints about “excess” were not merely descriptive. In 1702 the jurist and philosopher Christian Thomasius laid out some of the normative concerns that would gain increasing traction over the course of the century. He described the writing and business of books as a

kind of Epidemic disease, which hath afflicted Europe for a long time, and is more fit to fill warehouses of booksellers, than the libraries of the Learned. Any one may understand this to be meant of that itching desire to write books, which people are troubled with at this time. Heretofore none but the learned, or at least such as ought to be accounted so, meddled with this subject, but now-a-days there is nothing more common, it extends itself through all professions, so that now almost the very Coblers, and Women who can scarce read, are ambitious to appear in print, and then we may see them carrying their books from door to door, as a Hawker does his comb cases, pins and laces.11

The emergence of a print book market lowered the bar of entry for authors and gradually began to render traditional filters and constraints on the production of books increasingly inadequate. The perception of an excess of books was motivated by a more basic assumption about who should and should not write them.

At the end of the century, even book dealers had grown weary of a market that seemed to be growing out of control. In his 1795 screed, Appeal to My Nation: On the Plague of German Books, the German bookseller and publisher Johann Georg Heinzmann lamented that “no nation has printed so much as the Germans.”12 For Heinzmann, late eighteenth-century German readers suffered under a “reign of books” in which they were the unwitting pawns of ideas that were not their own. Giving this broad cultural anxiety a philosophical frame, and beating Carr to the punch by more than two centuries, Immanuel Kant complained that such an overabundance of books encouraged people to “read a lot” and “superficially.”13 Extensive reading not only fostered bad reading habits, but also caused a more general pathological condition, Belesenheit [the quality of being well-read], because it exposed readers to the great “waste” [Verderb] of books. It cultivated uncritical thought.

Like contemporary worries about “excess,” these were fundamentally normative. They made particular claims not only about what was good or bad about print, but about what constituted “true” knowledge. First, they presumed some unstated yet normative level of information or, in the case of a Bücherseuche, some normative number of books. There are too many books; there is too much data. But compared to what? Second, such laments presumed the normative value of particular practices and technologies for dealing with all of these books and all of this information. Every complaint about excess was followed by a proposal on how to fix the apparent problem. To insist that there are too many books was to insist that there were too many books to be read or dealt with in a particular way and thus to assume the normative value of one form of reading over another.

Enlightenment Reading Technologies

Not so dissimilar to contemporary readers with their digital tools, eighteenth-century German readers had a range of technologies and methods at their disposal for dealing with the proliferation of print—dictionaries, bibliographies, reviews, note-taking, encyclopedias, marginalia, commonplace books, footnotes. These technologies made the increasing amounts of print more manageable by helping readers to select, summarize, and organize an ever-increasing store of information. The sheer range of technologies demonstrates that humans usually deal with information overload through creative and sometimes surprising solutions that blur the line between humans and technology.

By the late seventeenth and early eighteenth centuries, European readers dealt with the influx of new titles and the lack of funds and time to read them all by creating virtual libraries called bibliotheca. At first these printed texts were simply listings of books that had been published or displayed at book fairs, but over time they began to include short reviews and summaries intended to guide the collector, scholar, and amateur in their choice and reading of books. By summarizing individual titles, they also allowed eighteenth-century readers to avoid reading entire books.

Eighteenth-century readers also made use of an increasing array of encyclopedias. In contrast to their early modern Latin predecessors that sought to summarize the most significant branches of established knowledge (designed to present an enkuklios paideia, or common knowledge), these Enlightenment encyclopedias were produced and sold as reference books that disseminated information more widely and efficiently by compiling, selecting, and summarizing more specialized and, above all, new knowledge. They made knowledge more general and common by sifting and constraining the purview of knowledge.14

Similarly, compilations, which date from at least the early modern period, employed cut and paste technologies, rather than summarization, to select, collect, and distribute the best passages from an array of books.15 A related search technology, the biblical concordance—the first dates back to 1247—indexed every word of the Bible, facilitating its broader use in sermons and, after its translation into the vernacular, its reach to even broader audiences. Indexes, likewise, became increasingly popular and big selling points of printed texts by the sixteenth century.16

All of these technologies facilitated a consultative reading that allowed a text to be accessed in parts instead of reading a text straight through from beginning to end.17 By the early eighteenth century, there was even a science devoted to organizing and accounting for all of these technologies and books: historia literaria. It produced books about books. The technologies and methods for organizing and managing all of these books and information were embedded into other forms and even other sciences.

All of these devices and technologies provided shortcuts and methods for filtering and searching the mass of printed or scribal texts. They were technologies for managing two perennially precious resources: money (books and manuscripts were expensive) and time (it takes a lot of time to read every word).

While many overwhelmed readers welcomed these techniques and technologies, some, especially by the late eighteenth century, began to complain that they led to a derivative, second-hand form of knowledge. One of Kant’s students and a key figure of the German Enlightenment, J. G. Herder, mocked the French for their attempts to deal with such a proliferation of print through encyclopedias:

Now encyclopedias are being made, even Diderot and D’Alembert have lowered themselves to this. And that book that is a triumph for the French is for us the first sign of their decline. They have nothing to write and, thus, produce Abregés, vocabularies, esprits, encyclopedias—the original works fall away.18

Echoing contemporary concerns about how our reliance on Google and Wikipedia might lead to superficial forms of knowledge, Herder worried that these technologies reduced knowledge to discrete units of information. Journals reduced entire books to a paragraph or blurb; encyclopedias aggregated huge swaths of information into a deceptively simple form; compilations separated readers from the original texts.

By the mid-eighteenth century, the word “polymath”—previously used positively to describe a learned person—became synonymous with dilettante, one who merely skimmed, aggregated, and heaped together mounds of information but never knew much at all. In sum, encyclopedias and the like had reduced the Enlightenment project, these critics claimed, to mere information management. At stake was the definition of “true” knowledge. Over the course of the eighteenth century, German thinkers and authors began to make a normative distinction between what they termed Gelehrsamkeit and Wissen, between mere pedantry and true knowledge.

As this brief history of Enlightenment information technologies suggests, to claim that a particular technology has one unique effect, either positive or negative, is to reduce both historically and conceptually the complex causal nexus within which humans and technologies interact and shape each other. Carr’s recent and broadly well-received arguments wondering if Google makes us stupid, for example, rely on a historical parallel that he draws with print. He claims that the invention of printing “caused a more intensive” form of reading and, by extrapolation, print caused a more reflective form of thought—words on a page focused the reader.19

Historically speaking, this is hyperbolic techno-determinism. Carr assumes that technologies simply “determine our situation,” independent of human persons, but these very technologies, methods, and media emerge from particular historical situations with their own complex of factors.20 Carr relies on quick allusions to historians of print to bolster his case and inoculate himself against counter-arguments, but the historian of print to whom he appeals, Elizabeth L. Eisenstein, warns that “efforts to summarize changes wrought by printing in any simple or single formula are likely to lead us astray.”21

Arguments like Carr’s—and I focus on him because he has become the most vocal advocate of this view—also tend to ignore the fact that, historically, print facilitated a range of reading habits and styles. Francis Bacon, himself prone to condemning printed books, laid out at least three ways to read books: “Some books are to be tasted, others to be swallowed, and some few to be chewed and digested.”22 As a host of scholars have demonstrated of late, different ways of reading co-existed in the print era.23 In the Enlightenment, extensive or consultative forms of reading—those that Carr might describe as distracted or unfocused—existed alongside more intensive forms of reading—those that he might describe as deep, careful, prolonged engagements with particular texts. Eighteenth-century German Pietists read the Bible very closely, but they also consistently consulted Bible concordances and Latin encyclopedias.24 Even the form of intensive reading held up today as a dying practice, novel reading, was often derided in the eighteenth century as weakening the memory and leading to “habitual distraction,” as Kant put it.25 It was thought especially dangerous to women who, according to Kant, were already prone to such lesser forms of thought. In short, print did not cause one particular form of reading; instead, it facilitated a range of ever-newer technologies, methods, and innovations that were deeply interwoven with new forms of human life and new ways of experiencing the world.

The problem with suggestions that Google makes us stupid, smart, or whatever else we might imagine, however, is not just their historical myopia. Such reductions elide the fact that Google and print technology do not operate independently of the humans who design, interact with, and constantly modify them, just as humans do not exist independently of technologies. By focusing on technology’s capacity to determine the human (by insisting that Google makes us stupid, that print makes us deeper readers), we risk losing sight of just how deeply our own agency is wrapped up with technology. We forego a more anthropological perspective from which we can observe “the activity of situated people trying to solve local problems.”26 To emphasize a single and direct causal link between technology and a particular form of thought is to isolate technology from the very forms of life with which it is bound up.

Considering our anxieties and utopic fantasies about technology or information superabundance in a more historical light is one way to mitigate this tendency and gain some conceptual clarity. Thus far I have offered some very general historical and conceptual observations about technology and the history of information overload. In the next sections, I focus on one particular historical technology—the footnote—and its afterlife in our contemporary digital world.

The Footnote: From Kant to Google

Today our most common tools for organizing knowledge are algorithms and data structures. We often imagine them to be unprecedented. But Google’s search engines take advantage of a rather old technology—that most academic and seemingly useless thing called the footnote. Although Google continues to tweak and improve its search engines, the data that continue to fuel them are hyperlinks, those blue-colored bits of text on the Web that if clicked will take you to another page. They are the sinews of the Web, which is simply the totality of all hyperlinks. The World Wide Web emerged in part from the efforts of a British physicist working at CERN in the early 1990s, Tim Berners-Lee. Frustrated by the confusion that resulted from a proliferation of computers, each with its own codes and formats, he wondered how they could all be connected. He took advantage of the fact that regardless of the particular code, every computer had documents. He went on to work on the standards—HTML, URLs, and HTTP—that could link these documents regardless of the differences among the computers themselves. It turns out that these digital hyperlinks have a revealing historical and conceptual antecedent in the Enlightenment footnote.

The modern hyperlink and the Enlightenment footnote share a logic that is grounded in assumptions about the text-based nature of knowledge. Both assume that documents, the printed texts of the eighteenth century or the digitized ones of the twenty-first century, are the basis of knowledge. And these assumptions have come to dominate not only the way we search the web, but also the ways we interact with our digital world. The history of the footnote is a curious but perspicuous example, then, of how normative, cultural assumptions and values become embedded in technology.

Footnotes have a long history in biblical commentaries and medieval annotations. Whereas these scriptural commentaries simply “buttressed a text” that derived its ultimate authority from some divine source, Enlightenment footnotes pointed to other Enlightenment texts.27 They highlighted the fact that these texts were precisely not divine or transcendent. They located the work in a particular time and place. The modern footnote anchors a text and grounds its authority not in some transcendent realm, but in the footnotes themselves. Unlike biblical commentaries, modern footnotes “seek to show that the work they support claims authority and solidity from the historical conditions of its creation.”28 The Enlightenment’s citational logic is fundamentally self-referential and recursive—that is, the criteria for judgment are always given by the system of texts themselves and not something external, like divine or ecclesial authority. The value and authority of a text is established by the fact that other texts point to it: the more footnotes that point to a particular text, the more authoritative that text becomes.
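
A minimal sketch in Python may make this citational logic concrete in contemporary terms. The texts and citation pairs below are invented for illustration, and “authority” is reduced to a raw count of incoming citations:

    from collections import Counter

    # (citing_text, cited_text) pairs -- hypothetical examples
    citations = [
        ("Kant's essay of 1784", "Essay of December 1783"),
        ("Essay of December 1783", "Essay of September 1783"),
        ("A journal review", "Essay of September 1783"),
        ("A compilation", "Essay of September 1783"),
    ]

    # A text's authority is simply how many other texts point to it.
    authority = Counter(cited for _, cited in citations)
    for text, count in authority.most_common():
        print(f"{text}: cited {count} time(s)")

On this crude measure, the essay of September 1783 becomes the most authoritative text simply because three other texts point to it: the same principle that, suitably refined, would later power Google.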

Online newspapers and blogs are central to our public debates, but printed journals were the central medium of the Enlightenment. One of the most famous German journals was the Berlinische Monatsschrift published between 1783 and 1811. It published the most important articles to a broad and increasingly diverse reading public. In its first issue, the editors wrote that the journal sought “news from the entire empire [Reich] of the sciences”—ethnographic reports, biographical reports about interesting people, translations, excerpts from texts from foreign lands. The editors envisioned the journal as a central node in the broader world of information exchange and circulation. This editorial plan was then carried out according to a citational logic that structured the entire journal.

The journal’s first essay, “On the Origin of the Fable of the Woman in White,” centers on a fable “drawn” from another text of 1723. This citation is followed by another one citing another history, published in 1753, on the origins of the fable. The rest of the essay cites “various language scholars and scholars of antiquity” [Sprach- und Alterthumsforscher] to authorize its own claims. The citations and footnotes that fill the margins and the parenthetical directives that are peppered throughout the main text not only give authority to the broader argument and narrative, but also create a web of interconnected texts.

Even Kant’s famous essay on the question of Enlightenment, which appeared in the same journal in 1784, begins not with a philosophical argument, but with a footnote directly underneath the title, directing the reader to a footnote from another essay published in December of 1783 that posed the original question: “What is Enlightenment?” This essay in turn directs readers to yet another article on Enlightenment from September of that year. The traditional understanding of Enlightenment is based on the self-legislation and autonomy of reason, but all of these footnotes suggest that Enlightenment reason was bound up with print technology from the beginning.

One of the central mediums of the Enlightenment, journals, operated according to a citational logic. The authority, relevance, and value of a text was undergirded—both conceptually and visually—by an array of footnotes that pointed to other texts. Like our contemporary hyperlinks, these citations interrupted the flow of reading—marked as they often were by a big asterisk or a “see page 516.” Perhaps most importantly, however, all of these footnotes and citations pointed not to a single divinely inspired or authoritative text, but to a much broader network of texts. Footnotes and citations were the pointing sinews that connected and coordinated an abundance of print. By the end of the eighteenth century, there even emerged a term for all of this pointing: the language of books [Büchersprache]. Books were imagined to speak to one another because they constantly pointed to and cited one another. The possibility of knowledge and interaction with the broader world in the Enlightenment rested not only on the pensive, autonomous philosopher, but also within the links from book to book, essay to essay.

Google’s Citational Logic

The founders of Google, Larry Page and Sergey Brin, modeled their revolutionary search engine on the citational logic of the footnote and thus transposed many of its assumptions about knowledge and technology into a digital medium. Google “organizes the world’s information,” as their motto goes, by modeling the hyperlink structure inherent in the document-based Web; that is, it produces search results based on all of the pointing that hyperlinks do between digital texts. Taking advantage of the enormous scaling power afforded by digitization, Google, however, takes this citational logic to both a conceptual and practical extreme. Whereas the footnotes in Enlightenment texts were always bound to particular pages, Google uses each hyperlink as a data point for its algorithms and creates a digitized map of all possible links among documents.

Page and Brin started from the insight that the web “was loosely based on the premise of citation and annotation—after all, what is a link but a citation, and what was the text describing that link but annotation.”29 Page himself saw this citational logic as the key to modeling the Web’s own structure. Modern academic citation is simply the practice of pointing to other people’s work—very much like the footnote. As we saw with Enlightenment journals, a citation not only lists important information about another work, but also confers authority on that work: “the process of citing others confers their rank and authority upon you—a key concept that informs the way Google works.”30

With his original Google project, Page wanted to trace all of the links that connected different pages on the Web, not only the outgoing links, but also their backward paths. Page argued that pure computational power could produce a more complete model of the citational structure of the Web—a map of interlinked and interdependent documents by means of tracing hyperlinked citations. He intended to exploit what computer scientists refer to as the Web Graph—the set of all nodes, corresponding to static HTML pages, with directed hyperlinks from page A to page B. In early 1998 there were an estimated 150 million nodes joined by 2 billion links.31
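
In programming terms, the Web Graph is a directed graph: pages are nodes, hyperlinks are directed edges. A minimal sketch, with hypothetical page names standing in for URLs:

    from collections import defaultdict

    # Directed edges: (source_page, target_page) -- hypothetical links
    links = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A"), ("D", "C")]

    out_links = defaultdict(list)  # pages a given page points to
    in_links = defaultdict(list)   # the "backward paths": pages pointing in
    for src, dst in links:
        out_links[src].append(dst)
        in_links[dst].append(src)

    print(in_links["C"])  # ['A', 'B', 'D'] -- three pages "cite" page C

Tracing in_links is precisely the backward path Page was after; at 150 million nodes and 2 billion links, the data structure is the same, only vastly larger.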

Other search engines, however, had had this modeling idea before. Given the proliferation of Web pages and with them hyperlinks, Brin and Page, like all other search engineers, knew they had to scale up “to keep up with the growth of the web.”32 By 1994 the World Wide Web Worm (WWWW) had indexed 110,000 pages, but by 1997 WebCrawler had indexed over 100 million Web documents. As Brin and Page put it in 1998, it was “foreseeable” that by 2000 a comprehensive index would contain over a billion documents. They were not merely intent on indexing pages or modeling all of the links between documents on the Web, however. They were also interested in increasing the “quality of results” that search engines returned. In order for searches to improve, their search engine would focus not just on comprehensiveness, but on the relevance or quality of its results.

The insight that made Google Google was the recognition that all links and all pages are not equal. In designing their link analysis algorithm, PageRank, Brin and Page recognized that the real power of this citational logic rested not just in counting links from all pages equally, but in “normalizing by the number of links on a page.”33 The key difference between Google and early digital search technologies (like the WWWW and the early Yahoo) was that it did not simply count or collate citations. Other early search engines were too descriptive, too neutral. Brin and Page reasoned that users wanted help not just in collecting but in evaluating all of those millions of webpages. From its beginnings at Stanford, the PageRank algorithm modeled the normative value of one page over another. It was concerned not simply with questions of completeness or managerial efficiency, but of value. It exploited the often-overlooked fact that hyperlinks, like those Enlightenment footnotes, not only connected document to document, but offered an implicit evaluation. The technology of the hyperlink, like the footnote, is not neutral but laden with normative evaluations.
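
That normalization can be sketched in a few lines of Python. What follows is a toy power-iteration version of PageRank, not Google’s production system; the damping factor of 0.85 follows Brin and Page’s 1998 paper, and the graph reuses the hypothetical pages from the sketch above (dangling pages with no outgoing links would need extra handling):

    def pagerank(out_links, damping=0.85, iterations=50):
        # Collect every page that appears as a source or a target.
        pages = set(out_links)
        for targets in out_links.values():
            pages.update(targets)
        rank = {p: 1.0 / len(pages) for p in pages}
        for _ in range(iterations):
            new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
            for src, targets in out_links.items():
                # The key move: a page's vote is divided by the number of
                # links it carries ("normalizing by the number of links on
                # a page"), so not every citation counts equally.
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
            rank = new_rank
        return rank

    graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
    for page, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))

Note that the hypothetical page D, to which nothing links, ends up with only the minimal baseline score, an anticipation of the point below about pages that, for PageRank’s purposes, barely exist.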

The Algorithmic Self

In conclusion, I would like to forestall a possible concern that in historicizing information overload, I risk eliding the particularity of our own digital world and dismissing valid concerns, like Carr’s, about how we interact with our digital technologies. In highlighting the analogies between Google and Enlightenment print culture, I have attempted to resist the alarmism and utopianism that tend to frame current discussions of our digital culture, first by historicizing these concerns and second by demonstrating that technology needs to be understood in deep, embodied connection with the human. Considered in these terms, the question of whether Google is making us stupid or smart might give way to more complex and productive questions. What, for example, is the idea of the human person underlying Google’s efforts to organize the world’s information and what forms of human life does it facilitate?

In order to address such questions, we need to understand that the Web relies on us as much as we rely on it. Every time we click, type in a search term, or update our Facebook status, the Web changes just a bit. “Google might not be making us stupid but we are making it (and Facebook) smarter” because of all the information that we feed them both every day.34 The links that make up the Web are evidence of this. They not only point to other pages, but also highlight the contingency of the Web’s structure by highlighting how the Web at any given moment is produced, manipulated, and organized by hundreds of millions of individual users. Links embody the contingency of the Web, its historical and ever-changing structure of which humans are an essential element.

Thinking more in terms of a digital ecology or environment and less in a human vs. technology dichotomy, we can understand the Web, as James Hendler, Tim Berners-Lee, and colleagues recently put it, not just as an isolated machine “to be engineered for improved performance,” but as a “phenomenon with which we interact.” They write, “at the micro-scale, the Web is an infrastructure of artificial languages and protocols; it is a piece of engineering. However, it is the interaction of human beings creating, linking, and consuming information that generates the Web’s behavior as emergent properties at the macro-scale.”35

It is at this level of analysis, where the human and its technologies are inextricable and together form something like a digital ecology, that we can, for example, evaluate a recent claim of one of Google’s founders. Discussing the future of the search firm, Page described the “perfect search engine” as that which would “understand exactly what I mean and give me back exactly what I want.”36 Such an “understanding,” however, is a function of the implicit normativity of the citational logic that Google’s search engine shares with the Enlightenment footnote. These technologies never leave our desires and thoughts unmediated and unmanipulated. But Google’s search engines transform the normativity of the citational logic of the footnote in important and particular ways that have come to distinguish the digital age from the print age. Whereas an Enlightenment reader might have been able to connect four or five footnotes without much effort, Google’s search engine follows hundreds of millions of links in a fraction of a second. The embodied human can all too easily seem to disappear at such scales. If, as I have done above, the relevance of technology has to be argued for in the Enlightenment, then the inverse is the case for our digital age: today it is the relevance of the embodied human agent that has to be argued for.

On the one hand, individual human persons play a rather insignificant role in Google’s operations. When we conduct a search on Google, the process of evaluation is fundamentally different from the form of evaluation tied to the footnote. Because Google’s search engine operates at such massive scales, it evaluates and normalizes links (judges which ones are relevant) through a recursive function. PageRank is an iterative algorithm—all outputs become inputs in an endless loop. The value of something on the Web is determined simply by the history of what millions of users have valued—that is, its inputs are always a function of its outputs. It is a highly scaled-up feedback loop. A Google search can only ever retrieve what is already in a document. It can only ever find what is known to the system of linked documents. The system is defined not by a particular object, operator, or node within the system, but rather by the history of the algorithm’s own operations.

If my son’s Web page on the construction of his tree house has no incoming links, then his page, practically speaking, does not exist according to PageRank’s logic. Google web crawlers will not find it—or if they do, it will have a very low rank—and thus, because we experience the Web through Google, neither will you. The freedom of the Web—the freedom to link and follow links—is a function of the closed and recursive nature of the system, one that includes by necessarily excluding. Most contemporary search engines, Google chief among them, now share the assumption that a “hyperlink” is a marker of authority or endorsement. Serendipity is nearly impossible in such a document-centric Web. Questions of value and authority are functions of and subject to the purported wisdom of the digital crowd that is itself a normalized product of an algorithmic calculation of value and authority.37

The normative “I” that Google assumes, the “I” that Page’s perfect search engine would understand, is an algorithmic self. It is a function of a citational logic that has been extended to an algorithmic logic. It is an “I” constructed by a limited and fundamentally contingent Web marked by our own history of searches, our own well-worn paths. What I want at any given moment is forever defined by what I have always wanted or what my demographic others have always wanted.

On the other hand, individual human persons are central agents in Google’s operations because they author hyperlinks. Columnists like Paul Krugman and Peggy Noonan make decisions about what to link to and what not to link to in their columns. Similarly, as we click from link to link (or choose not to click), we too make decisions and judgments about the value of a link and thus of the document that hosts it.

Because algorithms increase the scale of such operations by processing millions of links, however, they obscure this more human element of the Web. All of those decisions to link from one particular page to the next, to click from one link to the next involve not just a link-fed algorithm, but hundreds of millions of human persons interacting with Google every minute. These are the human interactions that have an impact on the Web at the macro-level, and they are concealed by the promises of the Google search box.

Only at this macro-level of analysis can we make sense of the fact that Google’s search algorithms do not operate in absolute mechanical purity, free of outside interference. Only if we understand the Web and our search and filter technologies as elements in a digital ecology can we make sense of the emergent properties of the complex interactions of humans and technology: gaming the Google system through search optimization strategies, the decision by Google employees (not algorithms) to ban certain webpages and privilege others (ever notice the relatively recent dominance of Wikipedia pages in Google searches?). The Web is not just a technology but an ecology of human-technology interaction. It is a dynamic culture with its own norms and practices.

New technologies, be they the printed encyclopedia or Wikipedia, are not abstract machines that independently render us stupid or smart. As we saw with Enlightenment reading technologies, knowledge emerges out of complex processes of selection, distinction, and judgment—out of the irreducible interactions of humans and technology. We should resist the false promise that the empty box below the Google logo has come to represent—either unmediated access to pure knowledge or a life of distraction and shallow information. It is a ruse. Knowledge is hard won; it is crafted, created, and organized by humans and their technologies. Google’s search algorithms are only the most recent in a long history of technologies that humans have developed to organize, evaluate, and engage their world.


ENDNOTES

1. Ann Blair, “Information Overload, the Early Years,” The Boston Globe (28 November 2010).
2. Mark N. Hansen, Embodying Technesis: Technology beyond Writing (2000) 235.
3. Kevin Kelly, “Scan This Book!,” The New York Times (14 May 2006).
4. Randall Stross, “World’s Largest Social Network: The Open Web,” The New York Times (15 May 2010).
5. The most euphoric among them speak of a coming “singularity” when computer intelligence will exceed human intelligence.
6. For a less utopian and more nuanced account of a post-human era, see Friedrich Kittler, Gramophone, Film, Typewriter, trans. Geoffrey Winthrop-Young and Michael Wutz (1999).
7. Nicholas Carr, "Is Google Making Us Stupid?: What the Internet Is Doing to Our Brains,” The Atlantic (July–August 2008). See also the expansion of his argument in The Shallows: What the Internet Is Doing to Our Brains (2010).
8. Quoted in Ann Blair, Too Much to Know: Managing Scholarly Information before the Modern Age (2010) 15. The following historical account draws on Blair’s work.
9. Quoted in Stuart Brown, “The Seventeenth-Century Intellectual Background,” The Cambridge Companion to Leibniz, ed. Nicholas Jolley (1995) 61 n28.
10. Johann Gottfried Herder, Briefe zur Beförderung der Humanität (1971) II: 92–93.
11. A review of Christian Thomasius’s Observationum selectarum ad rem litterariam spectantium [Select Observations Related to Learning], volume II (1702), which was published in the April 1702 edition of the monthly British journal History of the Works of the Learned, Or an Impartial Account of Books Lately Printed in all Parts of Europe, as cited in David McKitterick, “Bibliography, Bibliophily and Organization of Knowledge,” The Foundations of Knowledge: Papers Presented at Clark Library (1985) 202.
12. Johann Georg Heinzmann, Appell an meine Nation: Über die Pest der deutschen Literatur (1795) 125.
13. Immanuel Kant, Philosophical Encyclopedia, 29:30, in Kant’s Gesammelte Schriften, ed. Königliche Preußische (later Deutsche) Akademie der Wissenschaften (1902–present).
14. See Richard R. Yeo, Encyclopaedic Visions: Scientific Dictionaries and Enlightenment Culture (2001).
15. Blair, Too Much to Know, 34.
16. Blair, Too Much to Know, 53.
17. Blair, Too Much to Know, 8.
18. Herder quoted in Ernst Behler, “Friedrich Schegels Enzyklopädie der literarischen Wissenschaften im Unterschied zu Hegels Enzyklopädie der philosophischen Wissenschaften,” Studien zur Romantik und idealistischen Philosophie (1988) 246.
19. From an interview with Nicholas Carr.
20. Kittler xxxix.
21. Elizabeth L. Eisenstein, The Printing Revolution in Early Modern Europe (2005) 332.
22. Francis Bacon, “On Studies,” Essays with Annotations (1884) 482.
23. Much of this work has been done in German-language scholarship. For an English-language overview, see Guglielmo Cavallo and Roger Chartier, eds., A History of Reading in the West (1999).
24. See Jonathan Sheehan, The Enlightenment Bible: Translation, Scholarship, Culture (2005).
25. Immanuel Kant, Anthropologie, 7:208 in Kant’s Gesammelte Schriften, ed. Königliche Preußische (later Deutsche) Akademie der Wissenschaften (1902–present).
26. Hansen 271n8.
27. Anthony Grafton, The Footnote: A Curious History (1997) 32. Footnotes of this sort go back to at least the seventeenth century. John Selden’s History of Tithes (1618) and Johannes Eisenhart’s De fide historica (1679), which emphasized the importance of citing sources, reveal the process of knowledge production.
28. Grafton 32.
29. John Battelle, The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture (2005) 72.
30. Battelle 70.
31. James Gleick, The Information: A History, a Theory, a Flood (2011) 423.
32. Sergey Brin and Lawrence Page, “The Anatomy of a Large-Scale Hypertextual Web Search Engine,” Computer Networks and ISDN Systems 30 (1998): 107–17.
33. Brin and Page.
34. Siva Vaidhyanathan, The Googlization of Everything (And Why We Should Worry) [2011] 182.
35. James Hendler, et al., “Web Science: An Interdisciplinary Approach to Understanding the Web,” Communications of the ACM 51.7 (July 2008): 60–69.
36. Larry Page, as quoted at the Google Corporate page.
37. Critics of Google’s document-centric search technologies have long been promising the advent of a semantic web that would “free” data from a document-based web. Some see social media tools like Facebook and Twitter as offering something similar. For an early vision of what this might look like, see Tim Berners-Lee, James Hendler, and Ora Lassila, “The Semantic Web,” Scientific American (17 May 2001): 34–43. Ω

[Chad Wellmon is Assistant Professor of German Studies at the University of Virginia. He is the author of Becoming Human: Romantic Anthropology and the Embodiment of Freedom (2010) and is currently finishing a book on eighteenth-century information overload and the modern research university. Wellmon received a B.A. in Political Philosophy and German from Davidson College and a Ph.D. in German Studies from the University of California at Berkeley.]

Copyright © 2012 Institute for Advanced Studies in Culture


Sapper's (Fair & Balanced) Rants & Raves by Neil Sapper is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License. Based on a work at sapper.blogspot.com. Permissions beyond the scope of this license may be available here.



Copyright © 2012 Sapper's (Fair & Balanced) Rants & Raves