The Given

By: Matthew Battles
January 11, 2013

What is data, or, as I should put it, what are they? Data, the collective noun, is a thing we have, a thing we produce, a thing we consume; irreducible, evidentiary, unimpeachable, swarming myrmidons in the empire of essence. It’s powerful stuff: we pay for it, travel for it, fight for it; men die every day for the lack of it. We want it free, we want it big, we want it open-source. It should be flat, and you can’t copyright it (unless you can). We have data plans, which in practice cover the exchange of certain kinds of electromagnetic signals (distinct from voice and text plans, which cover other kinds of signals), the usage of which we meter, framing a space in which some comedy of freedom is acted out. Our best work is data-driven, our doubts assuaged by appeals to data. Data is the Missouri of ontology, the cybernetic bottom line.

The semantics of this word data, in short, are anything but immediately clear. Its beginnings were humble enough: a nominative usage of the neuter past participle of the Latin dare, to give. The given: that which is given is, in the singular, datum, archaically, with data taking the plural. And its ascent through the modern era, into collectivity, programmability, algorithmic transmutability, was slow. Darwin uses “data” three times in On the Origin of Species, showing a marked preference for “evidence,” which he uses seventy-five times. Data get big with the 1880 US census, which took eight years to tabulate, spurring Herman Hollerith’s development of the punchcard tabulating system (Hollerith’s company later became part of IBM). Even then the word was not in use as it is today; data took on its current colors in the context of the machines in the line of descent from Hollerith’s system.

Etymology is scant guide, of course; but the paths a word follows through time take it through thickets of norms and values, tacit assumptions and ideological fictions. The tribes who serially take up and use the tool of a word tug at its affordances, rekindle its fossil poetry, and modulate its harmonics and timbres until a new instrument is made of it. All of which is to say that a word is no colorless alembic in which we distill our dreams and schemes.

A session at this year’s convention of the Modern Language Association devoted to “Data Management in the Humanities” brought the shifting semantics of data forcibly to mind. With the rise of the digital humanities, historians and literary scholars are getting comfortable with the idea of calling the stuff of their disciplines data. The shift is no mere flourish; it indexes changes in the way we think about the material we read, interpret, and confront aesthetically. It also made the media theorist Lev Manovich dyspeptic, and he responded on Twitter. “‘Data’ is only what can be acted on by algorithms,” he tweeted. “Simply declaring that all academic notes etc is ‘data’ is useless.” Upon consideration, however, Manovich retreated a bit, collegially admitting that he was putting himself in an indefensible position tweeting about a conference he wasn’t attending. But his own formulation intrigued him, and in a subsequent tweet he framed it as a question: “yes, this maybe interesting media theory statement: algorithms create data. Will explore!”

Manovich has written with great sensitivity about the aesthetic peculiarities of databases in contrast to those of narrative, arguing that the former have expressed themselves in electric media since the early twentieth century, finding an especially congenial home in the digital computer. Thus, to Manovich, algorithmic accessibility means something with peculiar force and a special place in time. (In general terms, to be sure, nearly anything is organizable algorithmically, but clearly Manovich has the algorithms of computers in mind.) But I think data also carries modern values that emerged before the computer, in the context of the rise of science, and that when we call collections of novels or old-master paintings or ekphrastic women’s poetry by the term, we’re doing something more than evoking the rhetoric of regnant science.

It’s through data that science provokes nature to speak. Our application of the term to culture may admit that some of the dynamic processes that provoke our wonder in the physical realms (pattern and variation, self-organization, the agencies of the sublime) hold sway in culture as well. This is anathema to a prior generation of humanists, for whom nature’s values are precisely those to hold separate from the realm of the imaginal. But we may also begin to break down the barriers modernity imposed between nature and culture, barriers that both reduced the humanities and gave license to treat harshly the mere matter of the natural realm. If culture is nature, perhaps nature begins to find a voice in human affairs. Even in the imagined, the sung, and the wrought, there is much that escapes our control; we have always known this. Indeed, there is grandeur in this view of culture.