Tuesday, November 16, 2004

Caveat Lector!

In my declining years at the Collegium Excellens, my students would submit Internet sources in a bibliography without any regard to the reliability or authenticity of their citations. All of that spurious nonsense rubbed off on me. Some recent posts to this blog have come from Wikipedia. I am no better than the students I flogged for their sins against scholarship. If this is a (fair & balanced) admonition, so be it.

[x Tech Central Station]
The Faith-Based Encyclopedia
By Robert McHenry

Away back about 1993, '94 -- in retrospect, the last of the halcyon days when a relatively small and rather homogeneous group of people around the globe could reasonably consider themselves as constituting the Internet community and could take a strongly proprietary view of its future development -- back then, I am recalling, a cluster of enthusiasts coalesced in an online discussion group devoted to the creation of an encyclopedia on the Internet, an Interpedia, as they called it. As one of the proponents described it,



"...The Interpedia will be a reference source for people who have connectivity to the internet. It will encompass, at the least, articles submitted by individuals, and articles gleaned from non-copyrighted material. It will have mechanisms for submission, browsing, and authentication of articles. It is, currently, a completely volunteer project with no source of funding except for the contributions of the volunteers and their respective institutions. It also has no governing structure except for a group of people who have volunteered to do specific tasks or who have made major contributions to the discussion. Everyone is encouraged to make a contribution, small or large."


The discussion group generated a great quantity of writing, none of it encyclopedic in nature. There were discussions of the software needed for authoring and databasing and registering and validating and so on; discussions of how to attract contributors and of how teams for larger articles might be organized; of how to ensure that articles were editable but at the same time protected from unauthorized alteration. Every so often there were rhapsodic explanations of why the Interpedia, as a noncommercial and collaborative project, was ipso facto superior to all existing encyclopedias, all of which were published for [shudder] profit and all of which had their origin in [shudder] print.

Every so often, as the discussion went on and on, a burst of enthusiasm would overcome one of the participants, who would post a message along the lines of "Okay, great! How do we start? What can I do, right now?" There never came an answer to that question. Instead, the discussion would begin another great swing around the circle of technical and procedural matters, to end only when another naïf would beg to be given some concrete direction. Eventually the discussion petered out, in part because some real encyclopedias developed Internet presences, and in part because the volunteer nonleaders of the ungoverned, unstructured project truly did not know where or how to begin.

But the dream did not die. A decade later, the Wikipedia project is flourishing. As of November 2004, according to the project's own counts, nearly 30,000 contributors had written about 1.1 million articles in 109 different languages, though some of these language versions of Wikipedia remained quite small. The Manx Gaelic version, for example, had only 3 articles, the Guarani 10, and the Klingon (yes, from the Star Trek series) 48. The largest, the English language version, contained over 382,000 pages that were thought "probably" to be encyclopedic articles. (The "probably" tells as much about the limits of Wikipedia's oversight as any single word possibly could.)

This is an impressive amount of work to have been accomplished in the three years since the project began, and the founders were obviously correct in believing that a vast reservoir of willing volunteers awaited just such an opportunity as Wikipedia would offer. The effort has not gone unnoticed, either. A page on the Wikipedia site lists well over a hundred positive mentions this year in the world's media, including the Economist, the Guardian, the Christian Science Monitor, the Washington Post, Slate.com, Slashdot.com, and, yes, TCS. Wikipedia is "one of the most fascinating developments of the Digital Age"; "brilliant"; an "incredible example of open-source intellectual collaboration"; and so forth.

Credit the founders, then, with having overcome the obstacles that the Interpedia nonleaders failed to surmount. They built the software (the "wiki" in Wikipedia), they attracted the needed contributors, and they generated the all-important buzz. (They also found that they needed to create a background hierarchy of administrators, sysops, bureaucrats (actually so called), and stewards, watched over by an arbitration committee and finally the founder himself, who retains ultimate authority. Even online, democracy has its limits.) The question is, however, just what have they created?

Let's first see what they intended to create. The general FAQ (Frequently Asked Questions) page tells us:

"Wikipedia's goal is to create a free encyclopedia --- indeed, the largest encyclopedia in history, both in terms of breadth and depth and also to become a reliable resource."

Note the adjectives, and the order in which they appear:



  • free
  • largest (breadth)
  • largest (depth)
    "and also"
  • reliable

This statement of purpose must be taken with at least a grain of salt, however, because it, like everything else on the Wikipedia site, is editable, by anyone. We can take it that the statement represents the view of the last person to modify it, and those of unknown others who have chosen not to modify it further or to "revert" it, in the lingo, meaning to return it to a prior state. It is entirely consonant with other statements on the site and with instructions given to volunteer editors and copy editors:

"Please remember that the original author took the trouble to write a new page for Wikipedia and that however good or bad it is, if you are taking the trouble to copy-edit it then it is probably a valuable contribution."

Again with the "probably."

The idea that animates the entire undertaking, and links it with the Interpedia of yore, is expressed in the discussion of editing policy:

"However, one of the great advantages of the Wiki system is that incomplete or poorly written first drafts of articles can evolve into polished, presentable masterpieces through the process of collaborative editing. This gives our approach an advantage over other ways of producing similar end-products. Hence, the submission of rough drafts should also be encouraged as much as possible."

In other words, the process allows Wikipedia to approach the truth asymptotically. The basis for the assertion that this is advantageous vis-à-vis the traditional method of editing an encyclopedia remains, however, unclear.

The general FAQ does offer one mild caveat:

"As anyone can edit any article, it is of course possible for biased, out of date or incorrect information to be posted. However, because there are so many other people reading the articles and monitoring contributions using the Recent Changes page, incorrect information is usually corrected quickly. Thus the overall accuracy of the encyclopedia is improving all the time as it attracts more and more contributors. You are encouraged to help by correcting articles and passing on your own knowledge."

One person's "knowledge," unfortunately, may be another's ignorance. To put the Wikipedia method in its simplest terms:



  1. Anyone, irrespective of expertise in or even familiarity with the topic, can submit an article and it will be published.
  2. Anyone, irrespective of expertise in or even familiarity with the topic, can edit that article, and the modifications will stand until further modified.
    Then comes the crucial and entirely faith-based step:
  3. Some unspecified quasi-Darwinian process will assure that those writings and editings by contributors of greatest expertise will survive; articles will eventually reach a steady state that corresponds to the highest degree of accuracy.


Does someone actually believe this? Evidently so. Why? It's very hard to say. One possibility that occurs to me is this: The combination of prolificacy and inattention to accuracy that characterizes this process is highly suggestive of the modern pedagogic technique known as "journaling." For decades (following, we are probably meant to assume, some breakthrough research at a school of education somewhere), young students have been not merely encouraged but required to fill pages of their notebooks with writing. Not stories, nor essays, nor any other defined genre of writing; just writing. The writing is judged solely on bulk: So many pages are required per week or semester, but the writing on those pages need not be grammatical or even intelligible. Even the "talented and gifted" program at my own sons' school employed journaling as a principal activity, merely raising the quota over that of standard classrooms. It may well be that the practice of journaling in the schools, along with the acceptance of "creative spelling" as a form of personal expression not to be repressed, underlies much of the success of Wikipedia.

Superimpose on this intellectual preparation the moist and modish notion of "community" and some vague notions about information "wanting" to be free, et voilà!

But conceding for a moment that this exercise in encyclopedia making is enjoyed and even believed in fervently by many thousands of participants, let us take note of someone who is absolutely central to the concept of an encyclopedia but who is hardly acknowledged at all by the Wikipedians. I mean, of course, the user. As in the reader. The person who comes to Wikipedia in search of accurate information.

I know as well as anyone and better than most what is involved in assessing an encyclopedia. I know, to begin with, that it can't be done in any thoroughgoing way. The job is just too big. Professional reviewers content themselves with some statistics -- so many articles, so many of those newly added, so many index entries, so many pictures, and so forth -- and a quick look at a short list of representative topics. Journalists are less stringent. To see what Wikipedia is like I chose a single article, the biography of Alexander Hamilton. I chose that topic because I happen to know that there is a problem with his birth date, and how a reference work deals with that problem tells me something about its standards. The problem is this: While the day and month of Hamilton's birth are known, there is some uncertainty as to the year, whether it be 1755 or 1757. Hamilton himself used, and most contemporary biographers prefer, the latter year; a reference work ought at least to note the issue.

The Wikipedia article on Hamilton (as of November 4, 2004) uses the 1755 date without comment. Unfortunately, a couple of references within the body of the article that mention his age in certain years are clearly derived from a source that used the 1757 date, creating an internal inconsistency that the reader has no means to resolve. Two different years are cited for the end of his service as secretary of the Treasury; without resorting to another reference work, you can guess that at least one of them is wrong. The article is rife with typographic errors, styling errors, and errors of grammar and diction. No doubt there are other factual errors as well, but I hardly needed to fact-check the piece to form my opinion. The writing is often awkward, and many sentences that are apparently meant to summarize some aspect of Hamilton's life or work betray the writer's lack of understanding of the subject matter. A representative one runs thus:


"Arguably, he set the path for American economic and military greatness, though the benefits might be argued."


All these arguments aside, the article is what might be expected of a high school student, and at that it would be a C paper at best. Yet this article has been "edited" over 150 times. Some of those edits consisted of vandalism, and others were cleanups afterward. But how many Wikipedian editors have read that article and not noticed what I saw on a cursory scan? How long does it take for an article to evolve into a "polished, presentable masterpiece," or even just into a usable workaday encyclopedia article?

The history page for this article reveals a most interesting story. Originally, the 1757 birth date was used. Thus the internal inconsistencies of ages and dates that I saw are artifacts of editing. Originally, the two citations of the year Hamilton resigned from the Cabinet agreed; editing has changed one but not the other. In fact, the earlier versions of the article are better written overall, with fewer murky passages and sophomoric summaries. Contrary to the faith, the article has, in fact, been edited into mediocrity.

Is this a surprising result? Not really: Take the statements of faith in the efficacy of collaborative editing, replace the shibboleth "community" with the banal "committee," and the surprise dissolves before your eyes. Or, if you are of a statistical turn of mind, think a little about regression to the mean and the shape of the normal distribution curve. However closely a Wikipedia article may at some point in its life attain to reliability, it is forever open to the uninformed or semiliterate meddler.

It is true, unfortunately, that many encyclopedia users, like many encyclopedia reviewers, have low expectations. They are satisfied to find an answer to their questions. I would argue that more serious users, however, have two requirements: first, an answer to their questions; second, that those answers be correct. Of course, this may be just me. I have had the experience of making this argument before a roomful of sales executives and marketing people and being met with looks of bafflement on the one hand and dismissal on the other.

The user who visits Wikipedia to learn about some subject, to confirm some matter of fact, is rather in the position of a visitor to a public restroom. It may be obviously dirty, so that he knows to exercise great care, or it may seem fairly clean, so that he may be lulled into a false sense of security. What he certainly does not know is who has used the facilities before him.

Robert McHenry is Former Editor in Chief, the Encyclopædia Britannica, and author of How to Know (Booklocker.com, 2004).

Copyright © 2004 Tech Central Station
