Mr. Young means to test empirically the existence of “constitutional moments,” changes occurring outside formal processes of amendment that Bruce Ackerman has posited are important elements in the American constitutional progress. To this end, Young focuses Measure on the so-called Reconstruction “moment,” from the period preceding the 1866 congressional elections through 1868, the time range within which Ackerman discerns a structured process of profound commitment to a new racially open political, legal, and institutional order. (See Bruce Ackerman, We The People: Transformations 99-252 (1998).) Measure studies the front pages of some 600 newspapers, viewing 2,000 articles published between June 1, 1866 and December 31, 1866; 2,612 articles published between June 1, 1868 and December 31, 1868; 5,000 newspaper pages on which the word “constitution” appeared between January 1, 1866 and December 31, 1868; and 15,322 newspaper front pages published between June 1 and December 31 in 1866, 1868, 1870, 1872, and 1884. All told, Young takes into account 32,544,870 words. (See Table I, P. 2021.)
In 1866 and 1868, “results indicate empirical support for the hypothesis that Americans were paying attention to constitutional-level issues during these periods.” The newspaper coverage surveyed between 1866 and 1872 and then 1884 shows “support for both the notion that constitutional issues were of high salience during this period and that sustained attention to those issues spiked during certain key moments in 1866 and 1868.” “[E]vidence of both constitutional discourse and a gradual decline in the prevalence of that discourse over time” is “consistent with the predictions of Ackerman’s theory that sustained popular attention to constitutional politics peaks during transformative constitutional moments and then declines as normal politics once again take center stage.” (P. 2053.) “Had my results indicated either no evidence of constitutional discourse, or a constant level of such discourse across time, it would have called into question the entire theoretical superstructure of Ackerman’s work.” (PP. 2053-54.)
“[F]or all the millions of words and thousands of newspaper articles this Note analyzes,” Mr. Young concedes, “this is a rather modest conclusion.” “[T]here is nothing surprising about the fact that the media was paying attention to the passage of major constitutional amendments in the aftermath of a devastating civil war.” (P. 2053.) It’s not Young’s bottom line, however, that marks his effort as important. “[M]illions of words and thousands of newspaper articles”—no law student reads this much! How did he do that?
“Algorithmic topic modeling,” his Note’s title tersely declares. Forty pages plus (out of 54 total) admirably explain what this involves. There is also an elegant technical appendix. Each newspaper front page from the period (all accessible online) is treated as a separate document and run through optical character recognition software to identify words as words. The documents are computer-converted to brute lists stripped of all original interior organization, with so-called common words deleted; the remaining identified words are counted wherever they repeat within each document. The quantified word lists are statistically analyzed (more software) as word distributions, compared with each other, and the most common clusters of words across the full set of documents extracted. These clusters provide the ultimate working material for purposes of Young’s discussion. Texts become data.
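A minimal sketch of the "texts become data" step described above may help. It assumes text that has already been through optical character recognition, and it is not Young's actual code: the stop-word list, the sample pages, and the function name are hypothetical, chosen only to show how a page is reduced to an unordered tally of its remaining words.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the Note's actual list of "common words"
# is not reproduced here.
STOP_WORDS = {"the", "of", "and", "a", "to", "in", "is", "that", "it", "was"}

def bag_of_words(page_text):
    """Reduce one page of OCR text to word counts, discarding word order."""
    tokens = re.findall(r"[a-z]+", page_text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)

# Two invented "front pages", each treated as a separate document.
pages = [
    "The Constitution and the amendment of the Constitution.",
    "Cotton stock in Nashville; cotton terms.",
]
counts = [bag_of_words(p) for p in pages]
```

Each entry of `counts` is the kind of quantified word list that a topic model takes as input; everything about layout, sentence structure, and sequence has been thrown away.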
Should we too want to do this work? At one point, Daniel Taylor Young argues with provocative flair:
In the past, gaining a sense of the public zeitgeist around key political events required immersion in thousands of documents and was subject to the interpretive proclivities of whatever historian was up to the task. [Young footnotes: As a classic of the genre, see Bernard Bailyn, The Ideological Origins of the American Revolution (1967)….] While there is extraordinary value in this kind of synthesis, it also requires an extraordinary outlay of work and effort. … By contrast, algorithmic topic modeling allows us to glean some sense of public discourse in a much more rapid fashion. While we lose the texture of professional historical analysis, topic modeling can assist close readings of primary sources in an economical fashion. (PP. 2019-20 & n.119.)
Bernard Bailyn or MacBook Air? Welcome to the twenty-first century indeed!
Actually, it’s not clear I’ve picked the right computer. Mr. Young is reticent about the mechanics of his project—about what hardware and software he used, about how much space data storage took up, about how long the processes of calculation took, etc. Watson, maybe, or some unnamed beast of a machine?
Perhaps more to the point: This is the cluster (or topic) emerging from his 1866 runs that Young treats as most evocative—“states, government, union, constitution, congress, united, national, right, amendment, people, power, would, country, shall, rebellion.” (Table 2, P. 2025.) 8.37% of the “modeled text in the corpus” included this word list. The next most frequent is 8.25%: “nashville, street, tennessee, states, cotton, diseases, union, agent, united, college, court, stock, terms, company, commission.” (Table 2, cont’d, P. 2026.)
What do we learn when we look at these lists? Young is careful to frame his effort as a search for indicators of “the salience of constitutional issues” (P. 2023), and the first cluster does indeed suggest “some sense of public discourse.” We could probably figure out—if we did enough reading of the original newspaper pages—what the second cluster was showing as salient. Should it matter that it’s likely to be something very different than constitutional debate? Why should something—whatever was evidently going on in Nashville, Tennessee—preoccupy newspapers just about as much as the constitutional cluster? There is also, for example, this 1866 cluster, figuring in 4.03% (about half as many) of the word lists: “friends, great, church, before, himself, country, ladies, young, radical, present, though, mother, christian, story, nothing.” (Table 2, P. 2025.) What is this? The accumulated clusters certainly show that we are confronting the results of autonomous machine reading and not concerted human biasing. Are we supposed to draw any conclusions from the fact that the set of 20 topics Young’s computer identifies resembles something so much like Borges’s encyclopedia? (See “The Analytical Language of John Wilkins,” Other Inquisitions, 1937-1952, p. 103.)
Mr. Young repeatedly depicts his data as word clouds. (See PP. 2024, 2029, 2030, 2033, 2034, 2047, 2048.) This is apt: imagine textual space as a large room, filled with people, all talking, repeated words sometimes seeming to gather and emerge as somehow interconnected above the clamor. But clouds, we know, are just dust and water molecules, however much they invite interpretation. Are Young’s lists similarly gossamer?
In footnotes, he provides us with glimpses of an intellectual history. The point of departure, it appears, is the 2003 article “Latent Dirichlet Allocation,” written by David Blei, Andrew Ng, and Michael Jordan, published in volume 3 of the Journal of Machine Learning Research (993-1022).
The basic idea is that documents are represented as random mixtures over latent topics, where each topic is characterized by a distribution over words. … The Dirichlet is a convenient distribution… — it is in the exponential family, has finite dimensional sufficient statistics, and is conjugate to the multinomial distribution. … [T]hese properties will facilitate the development of inference and parameter estimation algorithms. (996)
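The generative picture in this passage can be sketched in a few lines. What follows is an illustration under stated assumptions, not the authors' implementation: the two topics and their word probabilities are invented, the parameter `alpha` is an arbitrary choice, and the code only generates a document from known topics. The hard part of topic modeling, inferring the topics from observed documents, is not shown.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one point from a Dirichlet by normalizing gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topics, alpha, length, rng):
    """Generate one document as a random mixture over the given topics."""
    # Per-document topic proportions, drawn from a symmetric Dirichlet.
    topic_mix = sample_dirichlet([alpha] * len(topics), rng)
    words = []
    for _ in range(length):
        # Pick a topic for this word position, then a word from that topic.
        k = rng.choices(range(len(topics)), weights=topic_mix)[0]
        vocab, probs = zip(*topics[k].items())
        words.append(rng.choices(vocab, weights=probs)[0])
    return topic_mix, words

# Hypothetical topics: each is a distribution over words.
topics = [
    {"constitution": 0.5, "amendment": 0.3, "congress": 0.2},
    {"cotton": 0.6, "stock": 0.4},
]
rng = random.Random(0)
topic_mix, words = generate_document(topics, alpha=0.5, length=12, rng=rng)
```

Here `topic_mix` is the document's "random mixture over latent topics," and each entry in `topics` is a "distribution over words." Inference runs this story backward, starting from the words alone.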
The elaborating apparatus grows dauntingly formidable fast. (See Wikipedia’s account of latent Dirichlet allocation.)
Professor Blei, at Princeton and now Columbia, has become a leading figure in this sort of work. He summarizes its present state in an article he published earlier this year: “Build, Compute, Critique, Repeat: Data Analysis with Latent Variable Models.” From a distance anyway, the formalisms look to be (are presented as though they are) rather ad hoc. But in the second of two strikingly engrossing lectures delivered at Cambridge in September 2009, Blei stopped and turned to his students, asking “Why does this work? Really?” He alluded to a working assumption asserted by biophysicist William Bialek: “[E]fficient representation of predictive information … values all (predictive) bits equally … in some instances … filtering and in others … learning. … [W]hat determines whether we should filter or learn is … the structure of the data stream to which we have access.” (See Bialek et al., “Efficient representation as a design principle for neural coding and computation.”) Blei also tentatively suggested that “underneath” the key idea was “co-occurrence.” Topic models (if they worked) picked out and grouped words somehow associated substantively or structurally with notions too rich or elaborate or complex to be captured in language by individual terms.
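The notion of "co-occurrence" can be made concrete with a small count. This sketch is far simpler than a topic model and is offered only to illustrate the raw signal Blei pointed to: the documents and the function name are hypothetical, and the code merely tallies how many documents contain each pair of words.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(documents):
    """Count, for each pair of words, the documents in which both appear."""
    pairs = Counter()
    for doc in documents:
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(set(doc)), 2):
            pairs[(a, b)] += 1
    return pairs

# Three invented documents, already reduced to word lists.
docs = [
    ["constitution", "amendment", "congress"],
    ["constitution", "amendment", "union"],
    ["cotton", "stock", "nashville"],
]
pairs = cooccurrence(docs)
```

Words that keep turning up together ("constitution" with "amendment") score high; words that never share a page ("constitution" with "cotton") score zero. A topic model can be read as a probabilistic way of grouping words along these lines.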
* * * * *
First, it should be clear that Mr. Young’s Note is an inviting portal opening into a world of endeavor right now revealing notable breadth, ambition, and innovation. To be sure, humanities computing has been taking up and taking apart texts for some time now. The startling experiments of Franco Moretti, however unique, are one example. Topic modeling presents itself as a generally available way of adjusting the idea of reading to encompass huge numbers of documents and new surfaces to be approached—reading differently, but reading nonetheless. American law is a plausible, important recruit to the project, given its own profusion of documents, its claims to political and cultural prominence, and our own recurring, manifest, and precarious attempts to discern “the structure of the data stream.”
There are also, it is easy to see, many points of entry. State constitutional law, for example, accumulates a sequence of constitutional documents, one replacing the other, each subject to multiple amendments, also addressed by numbers of proposed but not ratified amendments. What topics recur across the whole set? What differences in topics appear if we confine analysis to particular periods in time? In U.S. constitutional law, we have not only the collection of Supreme Court opinions, but also many other document series—U.S. Courts of Appeals and District Court opinions, Attorney General opinions, state analogs, huge ranges of commentary. What would we learn if we could consider this mass whole or in large chunks?
Second, if we were to undertake well-framed versions of these and other explorations, we likely would need to work intensely and equally with statisticians and perhaps computer engineers. David Blei makes the point emphatically (from the other side, as it were): “The future of data analysis lies in close collaborations between domain experts and modelers.” (Blei, 2014: 205.) Notably, Daniel Taylor Young teamed with Brandon Stewart, already a statistical virtuoso and a Ph.D. candidate in Harvard’s Government Department. (See PP. 1993-94 & n.9; P. 2018 n.114.) Such joint effort likely won’t be one-shot. The statistician will not, most of the time I’d guess, simply do some setup work akin to tech support and then leave legal academics to make sense of the results. First runs might prompt critique, suggest revisions in models, additional runs, repeated iterations. (In this respect, we may think, the paradigm is corpus linguistics.) Notably, Blei’s 2014 review is importantly concerned with working out methods facilitating such reflexive engagement. (Blei, 2014: 223-28.) This sort of collaboration, perhaps especially if it encompasses not only scholarly exploration but teaching, might therefore prompt new attention to cross-university overlaps and the difficult negotiations of multidisciplinary work.
Third, pauses for interpretive efforts ought to be understood as integral. So too should subsequent revisions of statistical models, following attempts to test interpretive implications. Mr. Young and Mr. Stewart work hard near the end of the Note to discern hierarchies implicit in their collection of topic terms. (PP. 2039-44.) Interpretation, of course, may bring to bear what’s known outside the statistical exercise as well. The terms spinning to the top in the Note’s report do not seem to evoke much if any sense of the sequence of stages—a political and institutional ballet of considerable intricacy—that Bruce Ackerman describes as the gist of his idea of the “constitutional moment.” (E.g., Transformations 123-24.) Shouldn’t this matter? Maybe the topic model points us toward something like the collection of popularly-accessible moral exhortations that William Nelson concluded was the likely primary substance of much of the Fourteenth Amendment, a conclusion suggested in part by his own Bailynesque reading of great numbers of newspaper editorials written during the ratification period. (See William E. Nelson, The Fourteenth Amendment: From Political Principle to Judicial Doctrine 61-63 (1988).)
We might also wonder about competing constitutional moments. It ought to have been possible within the framework of the study to assess topic models revealed in Southern newspapers considered separately. Ackerman’s book briefly acknowledges the strategy of “masterly inactivity,” the preferred passive-aggressive approach of “moderate” Southern elites opposed to the proposed Fourteenth Amendment that Michael Perman studied at length forty years ago. (Reunion Without Compromise 229-65 (1973).) Did Southern newspaper front pages make use of constitutional terms markedly less often? Or collect different terms? Maybe more significantly, we might ask whether the Note’s modeling, if adjusted, could search out in both North and South topics evidencing the sorts of insurgent, horrific white violence against blacks and associated white popular support, ultimately serving successfully across decades as a terrorist undergirding and at times immediate enforcement of the counter-constitutional racial order largely displacing the Fourteenth Amendment. It is entirely possible, if appallingly ironic, that an infernal constitutional moment emerged with real help from the Fourteenth Amendment ratification politics—that the stalemated interplay of Southern moderates and their congressional counterparts that Perman depicts, set against the backdrop of only intermittently yielding white terrorist insurgency, contributed substantially to the shaping of popular and judicial acknowledgement of the “New South” regime. Bruce Ackerman writes at one point: “To put it mildly, it is easy to tell a story that ends unhappily for the Fourteenth Amendment.” (Transformations, p. 250.)
Fourth, if topic modeling should be of interest to us, it should be so because of its exploratory value. Straightforwardly fact-finding experiments like Young and Stewart’s may not—as theirs may not—show us especially much. Attempts to understand the expectations or presuppositions of the statistical frameworks, however—the implicit jurisprudence, as it were, of Dirichlet distributions and associated devices—may from time to time position us to notice otherwise obscure attributes of even familiar legal materials, and therefore maybe to change in interesting ways what we write and teach. It is this possibility that Daniel Taylor Young (and Brandon Stewart) have shown us. It is why we should appreciate very much what they have done.