A Corpus Linguistic Analysis of “Foreign Tribunal”

Volume 108

108 Va. L. Rev. Online 207
*Assistant Professor of Law, Fowler School of Law, Chapman University; Associate Professor, Applied Linguistics, Northern Arizona University. We thank three coders, all law school graduates in the last few years, for their assistance, as well as the help of Chapman research librarian Tami Carson.

Introduction

In March 2022, the United States Supreme Court heard a case on whether a private arbitration panel in another country is covered by the statutory phrase “foreign or international tribunal.”[1. See Oral Argument, ZF Auto. US, Inc. v. Luxshare, Ltd., No. 21-401 (U.S. argued Mar. 23, 2022), https://www.oyez.org/cases/2021/21-401. That case is consolidated with AlixPartners, LLC v. Fund for Protection of Investor Rights in Foreign States, No. 21-518 (U.S. argued Mar. 23, 2022). However, the latter case involves a slightly different question: whether 28 U.S.C. § 1782 applies to investor-state arbitrations pursuant to international treaties. This paper will not address the underlying linguistic questions raised by AlixPartners.] The statutory language, enacted in 1964, authorizes a federal district court to order witness testimony or production of evidence “for use in a proceeding in a foreign or international tribunal” if the witness or holder of the material resides or is found in the district.[2. 28 U.S.C. § 1782(a).] The Respondent here seeks to invoke this statutory authorization to assist it in a private arbitration held in a foreign country.

Whether Respondent can so rely on this statute is no small matter. In the case, the Respondent, Luxshare, Ltd., plans to initiate private arbitration proceedings in Germany against Petitioner ZF Automotive US, Inc. The German arbitration arises out of a business dispute involving hundreds of millions of dollars in alleged damages,[3. See Luxshare, Ltd. v. ZF Auto. US, Inc., 547 F. Supp. 3d 682, 686–87 (E.D. Mich.), cert. granted 142 S. Ct. 637 (2021).] under a private agreement calling for private commercial arbitration overseen by arbitrators who are private citizens selected and paid for by the parties.

At its core, this dispute hinges on a linguistic question: what did the term foreign tribunal mean in 1964? Petitioners argue that a foreign tribunal only refers to entities imbued with government or quasi-government authority. Respondent takes a broader view, arguing that foreign tribunal refers to any entity in a foreign country that can enter a decision and bind parties, even if that entity is purely private. The parties devote large chunks of their briefs to the underlying linguistic question, looking to dictionaries and various legal materials to support their position. But the parties’ attempts to divine the meaning of foreign tribunal suffer from shortcomings common to legal interpretation. This article turns to a tool that avoids these shortcomings and provides a more rigorous, objective, and transparent answer to the question at hand. That tool? Corpus linguistics.

Increasingly, our courts (including the U.S. Supreme Court) have looked to corpus linguistics to better answer the linguistic questions that judges face in interpreting the words of the law.[4. Carpenter v. United States, 138 S. Ct. 2206, 2238–39 n.4 (2018) (Thomas, J., dissenting) (running a search in the Corpus of Founding-Era American English); Lucia v. S.E.C., 138 S. Ct. 2044, 2056 (2018) (Thomas, J., concurring, joined by Gorsuch, J.) (citing Jennifer Mascott, Who Are “Officers of the United States?” 70 Stan. L. Rev. 443 (2018)); Facebook, Inc. v. Duguid, 141 S. Ct. 1163, 1174 (2021) (Alito, J., concurring) (citing Thomas R. Lee & Stephen C. Mouritsen, Judging Ordinary Meaning, 127 Yale L. J. 788 (2018)); Bostock v. Clayton County, 140 S. Ct. 1731, 1769 n.22 (2020) (Alito, J., dissenting) (citing James C. Phillips, The Overlooked Evidence in the Title VII Cases: The Linguistic (and Therefore Textualist) Principle of Compositionality 3 (unpublished manuscript) (May 11, 2020), https://ssrn.com/abstract=3585940).] Understandably, judges use economic tools to tackle economic questions and historical tools to answer historical questions. Should they not use linguistic tools for linguistic questions? “[W]ords are . . . the material of which laws are made. Everything depends on our understanding of them.”[5. Garson Kanin, Conversations with Felix, Reader’s Digest, June 1964, at 116, 117 (replying to counsel who said a question from the bench was just a matter of semantics).] We can and should use the right tools for seeking this understanding.

This article will proceed in three parts. Part I presents the linguistic debate as framed by the parties, highlighting shortcomings of the traditional tools they employ. Part II explains how the tools of corpus linguistics can address these shortcomings. And Part III presents a corpus linguistic analysis of the terms foreign tribunal and foreign tribunal(s). This approach, more rigorous than that undertaken by the parties, can provide data on the linguistic question that undergirds the legal issue: which reading of the statute is more probable than the other. After all, a “problem in [legal interpretation] can seriously bother courts only when there is a contest between probabilities of meaning.”[6. Felix Frankfurter, Some Reflections on the Reading of Statutes, 47 Colum. L. Rev. 527, 528 (1947).] Corpus linguistics can help with that contest.

I. Background

The parties frame the linguistic debate at issue here as a question of the ordinary meaning of the statutory terms. They thus point to various sources to support their preferred reading of the statute, including dictionaries, ordinary usage, and legal usage. Some of these tools are a good start. But they do not provide a sufficiently objective, transparent basis for resolving the contest between dueling senses of the statutory terms at issue because they do not fully answer the linguistic question, instead requiring linguistic intuition to fill in the gaps.

A. The Linguistic Debate at Issue Here

1. Dictionaries

Both the petitioners and the respondent turn to dictionaries contemporaneous to the statute’s enactment to proffer a definition that supports their litigating position. They frame their reliance on dictionaries as a quest for the ordinary meaning of the statutory language. For example, ZF Automotive cites four contemporaneous ordinary dictionaries and one contemporaneous legal dictionary for the meaning of tribunal.[7. See Brief for the Petitioners at 18, ZF Auto. US, Inc. v. Luxshare, Ltd., No. 21-401 (Jan. 24, 2022).] Respondent Luxshare likewise quotes two ordinary dictionaries and two legal dictionaries for tribunal, though strangely two of these dictionaries are of recent vintage (2019 and 1996), calling into question their utility. From these dictionaries emerge the following definitions. First, the narrower sense:

  • “[t]he seat of a judge;”[8. Tribunal, Black’s Law Dictionary (4th ed. 1951); Webster’s Third New International Dictionary of the English Language Unabridged 2441 (1961) [hereinafter Webster’s Third (1961)].] “the bench on which a judge and his associates sit for administering justice”[9. Webster’s Third (1961), supra note 8, at 2441.]
  • “[t]he whole body of judges who compose a jurisdiction”[10. Tribunal, Black’s Law Dictionary (4th ed. 1951).]
  • “a court or forum of justice:”[11. Webster’s Third (1961), supra note 8, at 2441; Merriam-Webster’s Dictionary of Law 503 (1996).] “[a] seat or court of justice”[12. The American Heritage Dictionary of the English Language 1369 (1969).]; “a judicial court”[13. Tribunal, Black’s Law Dictionary (4th ed. 1951).]
  • “a judicial assembly”[14. 11 The Oxford English Dictionary 341 (1933).]

The 1969 edition of Ballentine’s Law Dictionary, which the parties did not cite, also defined tribunal as “[a] court. The seat or bench for the judge or judges of a court.”[15. Ballentine’s Law Dictionary 1300 (1969).]

Second, the broader sense:

  • “[a] court of justice or other adjudicatory body”[16. Black’s Law Dictionary 1814 (11th ed. 2019).]
  • “a person or body of persons having to hear and decide disputes so as to bind the parties”[17. Merriam-Webster’s Dictionary of Law 503 (1996).]
  • “[a]nything having the power of determining or judging”[18. The American Heritage Dictionary of the English Language 1369 (1969).]
  • a “person or body of persons having authority to hear and decide disputes so as to bind the disputants”[19. Brief for the Petitioners, supra note 7, at 19 (quoting Webster’s Third (1961), supra note 8, at 2441).]

At least one other dictionary not cited by the parties, Funk & Wagnalls New Standard Dictionary of the English Language (1960), included the narrow sense. It is unclear whether it also included the broad sense, because the example it uses to illustrate, which at first seems like the broader sense, may actually refer to an international tribunal that has government authority: “1. A court of justice; any judicial body, as a board of arbitrators. 2. The seat set apart for judges, magistrates, etc.”[20. Funk & Wagnalls New Standard Dictionary of the English Language 1340 (1960).]

Thus, dictionaries reveal that, around 1964, there were at least two senses of tribunal. One sense, common to every dictionary we or the parties could find, legal or ordinary, was narrow in nature and referred mostly to courts. The other, found in two (maybe three) ordinary dictionaries (and two later legal dictionaries that we are not giving weight to, given their date of publication), was broad in nature and could cover private arbitration bodies. One could be tempted from this evidence to infer that the narrow sense was the more common of the two senses. But as described below, such an inference, based merely on dictionary frequencies, would be a mistake. Likewise, parties sometimes refer to a “lead legal definition[],” “primary definition[],” or “secondary definition.”[21. Brief for the Petitioners, supra note 7, at 19–20.] As described below, such labels are mistaken when derived from dictionaries.[22. It is worth noting that no contemporaneous legal dictionary included the broader sense of tribunal. This could indicate a divergence between the ordinary and the legal meanings of the word.]

None of the dictionaries defined the actual statutory terms, leaving the parties to look up their constituent words in dictionaries. Thus, the parties also looked up the definition of foreign.[23. See Brief for the Petitioners, supra note 7, at 19; Brief for the Respondent at 12, ZF Auto. US, Inc. v. Luxshare, Ltd., No. 21-401 (filed Feb. 23, 2022).] “Putting these definitions together,” the petitioners argued that the statutory terms “most naturally refer[] to a court or other governmental adjudicative or quasi-adjudicative body convened to render justice.”[24. See Brief for the Petitioners, supra note 7, at 19.] Thus, the terms do “not encompass a private arbitral panel whose authority derives solely from the contractual agreement of private parties rather than any government, and which is not composed of government adjudicators.”[25. Id.] Respondent never puts the two terms together to create a definition for foreign tribunal, but rather uses dictionaries to argue that private commercial arbitration panels in foreign countries satisfy both the definition for foreign and the definition for tribunal.[26. See Brief for the Respondent, supra note 23, at 12–14.]

2. Ordinary Usage

The parties claim to look at “ordinary” usage to support their legal positions. Hence, in rejecting a definition of foreign that could mean just located in a foreign country and instead embracing a definition that means belonging to another country, ZF Automotive presented examples such as “foreign leader,” “foreign official,” “foreign flag,” “foreign law,” and “foreign country.”[27. Brief for the Petitioners, supra note 7, at 20–21.] From this, the petitioners concluded that “[w]hen the word ‘foreign’ modifies a noun with potential governmental or sovereign connotations—like ‘tribunal’—it typically indicates that the noun belongs to the sovereign entity.”[28. Brief for the Petitioners, supra note 7, at 20.] However, neither party actually presented any evidence of ordinary usage of the term foreign tribunal. And Luxshare’s evidence of ordinary meaning was “dictionaries [some being legal dictionaries], judicial opinions, and other legal sources.”[29. See Brief for the Respondent, supra note 23, at 13.] Legal sources are not very good indicators of ordinary meaning.

3. Legal Usage

Finally, the parties turned to legal usage. Thus, petitioners looked at the use of the term foreign as a modifier in other portions of the 1964 Act, how Congress has both used the term tribunal and described private arbitration, and how federal courts and legal scholars have used the terms foreign tribunal and arbitral tribunal.[30. See Brief for the Petitioners, supra note 7, at 21–25.] Likewise, respondent turned to a recent (2021) legal treatise, recent caselaw (2004 & 1997), and German law in defining foreign. Then, it used federal judicial usage (both recent and contemporaneous to 1964), the same recent legal treatise, various arbitration bodies’ rules, the Geneva Treaties, and legal commentary and scholarship to support its reading of tribunal.[31. See Brief for the Respondent, supra note 23, at 13.]

B. The Weakness with the Parties’ Evidence & Methodologies

1. The Limitations of Dictionaries

a. Non-compositionality

Dictionaries generally define single words, not multi-word terms or phrases. Thus, if relying on dictionaries, one has to slice and dice statutory text rather than looking up the whole operative phrase. But this is deeply problematic. That is because of the linguistic phenomenon of non-compositional expression, wherein “a particular word sequence should be considered a single lexical item.”[32. Alan Cruse, Meaning in Language: An Introduction to Semantics and Pragmatics 82 (3d ed. 2011).]

Normally, the principle of compositionality applies. Linguists define compositionality as when “[t]he meaning of a semantically complex expression is a compositional function of the meanings of its semantic constituents.”[33. Id. at 84.] In other words, often what you see is what you get: cherry pie is a pie made from cherries.

But sometimes, “the combination of words has a meaning of its own that is not a reliable amalgamation of the components at all,” such as for good or at all.[34. Alison Wray, Why Are We So Sure We Know What a Word Is?, in The Oxford Handbook of the Word 725, 737 (John R. Taylor ed., 2015).] In short, a phrase may be more (or less) than the sum of its parts. Related to “non-compositionality” is the idiom principle: “a language user has available to him or her a large number of semi-preconstructed phrases that constitute single choices [in communication], even though they might appear to be analysable into segments.”[35. John McH. Sinclair, Collocation: A Progress Report, in 2 Language Topics: Essays in Honour of Michael Halliday 319, 320 (Ross Steele & Terry Threadgold eds., 1987).] Take, for example, of course or in fact. Looking up their constituent words separately will not tell one the idiomatic meaning of the combined phrase. Non-compositional expressions come in several varieties, such as phrasal idioms (pulling someone’s leg), cliches, grammatical idioms (by and large), and frozen metaphors (the ball’s in your court).[36. See Cruse, supra note 32, at 86–91.]

The Supreme Court has recognized this linguistic phenomenon, observing that “two words together may assume a more particular meaning than those words in isolation.”[37. FCC v. AT&T Inc., 562 U.S. 397, 406 (2011).] In fact, in a different area of law, trademark law, the Court has noted this principle for over a century, and it has come to be known as the Anti-Dissection Rule.[38. See 2 McCarthy on Trademarks and Unfair Competition § 11:27 (5th ed.) (“Under the anti-dissection rule, a composite mark is tested for its validity and distinctiveness by looking at it as a whole, rather than dissecting it into its component parts.”).] This same principle can and should be applied to statutory interpretation so that the meaning of a multi-word term or phrase should be “derived from it as a whole, not from its elements separated and considered in detail”; “it should be considered in its entirety.”[39. Est. of P.D. Beckwith, Inc., v. Comm’r of Pats., 252 U.S. 538, 545–46 (1920).] Judge Frank Easterbrook perhaps put this most colorfully when he observed in a trademark case involving a church’s name:

[T]he World Church produced . . . nothing but a dictionary. It did not offer any evidence about how religious adherents use or understand the phrase as a unit. It offered only lexicographers’ definitions of the individual words. That won’t cut the mustard, because dictionaries reveal a range of historical meanings rather than how people use a particular phrase in contemporary culture. (Similarly, looking up the words “cut” and “mustard” would not reveal the meaning of the phrase we just used.)[40. TE-TA-MA Truth Found.—Fam. of URI, Inc. v. World Church of the Creator, 297 F.3d 662, 666 (7th Cir. 2002) (emphasis added).]

Thus, looking up the words foreign and tribunal in dictionaries may not give us a complete and accurate meaning of foreign tribunal. Yet because the parties relied heavily on dictionaries, that is exactly what they resorted to here. This same criticism can be leveled at the parties for looking at the usage in legal materials of just the words foreign, international, and tribunal.

b. Dictionaries as “museums of words” and linguistic intuition

Relatedly, dictionaries are not always very useful for dealing with context. That is because dictionaries are just “museum[s] of words.”[41. Frank H. Easterbrook, Text, History, and Structure in Statutory Interpretation, 17 Harv. J.L. & Pub. Pol’y 61, 67 (1994).] That is, they are “historical records (as reliable as the judgment and industry of the editors) of the meanings with which words have in fact been used by writers of good repute.”[42. Henry M. Hart, Jr. & Albert M. Sacks, The Legal Process: Basic Problems in the Making and Application of Law 1375 (1994).] Hence, dictionaries “are often useful in answering hard questions of whether, in an appropriate context, a particular meaning is linguistically permissible,” not what is linguistically probable in a given context.[43. Id. at 1375–76.]

Thus, when lawyers, scholars, or jurists favor one dictionary definition over another as the ordinary meaning of a word or phrase, that tells us more about their linguistic intuition than about the dictionary, because that intuition is the analytical bridge from dictionary evidence to the interpretive conclusion. After all, dictionaries do not indicate which sense of a word is the ordinary sense; that depends on context. And besides a lack of transparency, that intuition has at least two pitfalls stemming from the fact that an individual’s linguistic intuition is informed by her exposure to language over her lifetime. The first limitation of linguistic intuition, at least for most lawyers, scholars, and judges, is that they are seldom representative of ordinary members of society, tending to hail from more elite social circles with much more education. These demographic factors influence the language to which they are exposed.

Second, even if an attorney, academic, or judge were just an ordinary person who ran in ordinary circles with an ordinary level and source of education, she would still be a product of her time. And that time confines, even distorts, her ability to properly intuit meaning from a time during which she did not live. That is due to the reality of linguistic drift. If the English language were static, then statutes written in an earlier time would not pose challenges to a later person’s linguistic intuition. But English is not static. Over time, meanings can change, sometimes dramatically and quickly. Take the constitutional term domestic violence. From the 1770s through the 1970s, the term consistently meant insurrection, rebellion, or rioting within a state.[44. Thomas R. Lee & James C. Phillips, Data-Driven Originalism, 167 U. Pa. L. Rev. 261, 298–300 (2019).] But starting in the 1980s, that began to change, and by the 1990s, domestic violence almost always meant “violent or aggressive behavior within the home, esp[ecially] violent abuse of a partner.”[45. Domestic Violence, Oxford English Dictionary Online (Mar. 2006), https://www.oed.com/view/Entry/56663?redirectedFrom=domestic+violence#eid41827739 [https://perma.cc/A5ZN-RQRV]; Lee & Phillips, supra note 44, at 300.] The previous sense that dominated for two centuries has now almost completely fallen out of use. And that shift occurred within less than two decades. Thus, someone relying on her own linguistic intuition formed in a time after a statute was adopted may miss that linguistic drift has occurred and inaccurately understand a statutory word or phrase.

Yet, when the parties, namely their well-educated and arguably upper-class lawyers, propose ordinary usage terms, like “foreign leader” or “foreign flag,” they are relying on their linguistic intuition formed by language exposure long after the statute was enacted.

c. “Lexicographical prescriptivism”

In the 1960s, Webster’s Third International Dictionary made a move deemed controversial in the world of lexicography: it decided to define words according to actual usage rather than proper usage.[46. James Sledd & Wilma R. Ebbitt, Dictionaries and That Dictionary 79 (1962) (quoting the editor-in-chief of Webster’s Third as stating that “the dictionary’s purpose was to report the language, not to prescribe what belonged in it”). Because of this move, Justice Scalia rejected Webster’s Third, preferring Webster’s Second. See James J. Brudney & Lawrence Baum, Oasis or Mirage: The Supreme Court’s Thirst for Dictionaries in the Rehnquist and Roberts Eras, 55 Wm. & Mary L. Rev. 483, 508–09 (2013) (noting that Scalia’s reliance “on Webster’s Second and American Heritage—identified as belonging to the prescriptive camp—far more than Webster’s Third, the poster child for descriptive dictionaries,” is a “preference” that “is not inadvertent: Scalia has disparaged Webster’s Third in his opinions . . . and in his recent book”). Scalia’s rejection of Webster’s Third is ironic given his purported aim of understanding words in legal texts according to how people at the time would have understood them.] This move to descriptive definitions rather than normative ones was a break from the past as “[l]exicographical prescriptivism in the United States is exactly as old as the making of dictionaries, because of the role played by the dictionary in a society characterized by a great deal of linguistic insecurity.”[47. Henri Béjoint, Tradition and Innovation in Modern English Dictionaries 116 (1994).]

Normative, or prescriptive, dictionaries “establish[] what is right in meaning and pronunciation,” providing users with what the lexicographer deems the “proper” usage of each word.[48. Sledd & Ebbitt, supra note 46, at 57.] Therefore, “the prescriptive school of thought relie[d] heavily on the editors of dictionaries to define and publish the proper meaning and usage of the terms.”[49. Samuel A. Thumma & Jeffrey L. Kirchmeier, The Lexicon Has Become a Fortress: The United States Supreme Court’s Use of Dictionaries, 47 Buff. L. Rev. 227, 242 (1999).] In contrast, “[t]he editors of a descriptive dictionary describe how a word is being used and, unlike their prescriptive counterparts, do not decide how a word should be used.”[50. Id.] To the extent any dictionary is prescriptive, it is less useful for determining how people actually used language, and dictionaries before and during the 1960s, outside of Webster’s Third, tend to be of the prescriptive variety.[51. Granted, to the extent people rely on dictionaries, even a prescriptive definition could somewhat reflect how people understood language, though it is second-best evidence.] And these are many of the very dictionaries relied on by the parties here.

d. Relying on dictionary sense-ordering

Dictionaries list senses in numerical order. This sometimes gives rise to what has been called the “sense-ranking fallacy.”[52. See Stephen C. Mouritsen, The Dictionary Is Not a Fortress: Definitional Fallacies and a Corpus-Based Approach to Plain Meaning, 2010 BYU L. Rev. 1915, 1926–29 (2010).] That fallacy is to deem a sense listed before another as being more “primary.” Justice Breyer did this in Muscarello v. United States.[53. 524 U.S. 125 (1998).] In looking at the verb carry, Justice Breyer deemed one sense as “primary” and another as “special,” in part because he observed that the “primary” sense occurred first in three dictionaries, whereas the “special” sense was numerically ranked lower.[54. Id. at 128–31.] This sense-ordering caused Justice Breyer to consider the sense listed sooner as more ordinary.

Such a conclusion is flawed because dictionaries do not claim that the ordering of senses is based on which are more common, frequent, or ordinary.[55. As has been noted elsewhere, the one exception to this is The Random House Dictionary of the English Language. See Lee & Mouritsen, supra note 4, at 808 n.89 (observing that the dictionary’s front matter declares “a general policy of putting the most frequently used meanings . . . at the beginning of the entry, followed by other senses in diminishing usage, with archaic, and obsolete senses coming last”) (citing Random House Dictionary of the English Language—Unabridged, at viii (2d ed. 1987) [hereinafter Random House]). However, that dictionary was not cited by the parties here (and would only provide half of the relevant term), and as Lee and Mouritsen note, there are “grounds for skepticism of these sorts of claims” given the way dictionaries are constructed, with even Random House conceding that “sense ranking based on frequency holds only ‘generally.’” Id. (quoting Random House, supra, at xxii).] Rather, senses are either ordered based on when they were deemed to have historically entered the lexicon,[56. 1 The Oxford English Dictionary xxix (2d ed. 1989) (“[T]hat sense is placed first which was actually the earliest in the language: the others follow in order in which they appear to have arisen.”).] or they are admittedly “an arbitrary arrangement or rearrangement.”[57. Webster’s Third New International Dictionary of the English Language Unabridged 19a (1971).] Thus, at least based on the order senses appear in dictionaries, there is no “primary,” “lead,” or “secondary” sense, as some of the parties argued here.

e. Sense frequency across dictionaries

Another common mistake is to deem a sense that occurs more often across multiple dictionaries as the more common, ordinary, or primary sense.[58. See John Mikhail, The Definition of ‘Emolument’ in English Language and Legal Dictionaries, 1523–1806, at 8–10 (July 12, 2017) (unpublished manuscript) (surveying 50 founding-era dictionaries and concluding that because 100% of the entries included at least one element of the broad definition of emolument, and only 8% of the entries included an office or employment-related definition, the word must have been understood at the founding in its broad sense); see also James Cleith Phillips & Sara White, The Meaning of the Three Emoluments Clauses in the U.S. Constitution: A Corpus Linguistic Analysis of American English from 1760–1799, 59 S. Tex. L. Rev. 181, 196–97 (2017) (critiquing Mikhail for this analysis).] This misses the fact that the very “‘system of separating senses’ is ‘only a lexical convenience.’”[59. Lee & Mouritsen, supra note 4, at 809 n.90 (quoting Webster’s Third (1971), supra note 57, at 19a).] And dictionaries do not agree as to where to draw the line. That is because “[l]exicographers tend to fall into one of two categories when it comes to writing definitions: lumpers and splitters.”[60. Kory Stamper, Word by Word: The Secret Life of Dictionaries 119 (2017); see also The Routledge Handbook of Corpus Linguistics 433–34 (Anne O’Keeffe & Michael McCarthy eds., 2010) (discussing “lumpers” and “splitters”).] A lumper “tend[s] to write broad definitions that can cover several or more minor variations on that meaning.”[61. Stamper, supra note 60, at 119.] By contrast, a splitter “tend[s] to write discrete definitions for each of those minor variations.”[62. Id.]

Additionally, “[t]he history of English lexicography usually consists of a recital of successive and often successful acts of piracy.”[63. Sidney I. Landau, Dictionaries: The Art and Craft of Lexicography 35 (1984).] This tendency, at least historically, for dictionaries to use the definitions of other dictionaries, “can create a false consensus whereby it looks like all of the dictionaries independently agree, and thus reflect contemporaneous linguistic reality, but in actuality only reflect the views . . . of a few dictionary makers.”[64. Phillips & White, supra note 58, at 191.] To what extent lexicographical piracy was occurring in the 1950s and 60s is uncertain. Many of the dictionaries the parties cite here have identical or near identical definitions, though. At the very least, extreme caution is warranted in surmising anything from the frequency of senses when surveying multiple dictionaries.[65. See Lee & Mouritsen, supra note 4, at 810 n.98 (“[T]he methods that [dictionaries] use to sample language use don’t create a reliable sample—aggregating dictionaries isn’t going to accomplish anything if none of them has a reliable sample of language usage.”).]

2. Non-Systematic Usage Sampling

To overcome the limitations of dictionaries, one can sample actual usage of the complete term at issue. The parties do this, but not in a systematic way or in sufficient numbers to inspire much confidence. Like dictionaries, these examples of language usage can suffer from the defect commonly attributed to reliance on legislative history: looking out among the crowd and calling on one’s friends. Or, to put it more bluntly, cherry-picking examples that support one’s position. The parties present only a handful of samples of usage, and often they rely on the usage of just one of the words of the multi-word term. Much more is needed to have any confidence in the results. And the sampling must either be random (if there are enough examples to require sampling) or weighted towards the usage that is closest in time to the relevant date (here, 1964). What is more, parties are prone to read the data in a way favorable to their position, even if only subconsciously through confirmation bias or motivated reasoning. Our methods below help overcome these shortcomings.
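
The two sampling options just described can be made concrete with a short Python sketch. It is offered only as an illustration under stated assumptions, not as the procedure used in this article: the function name sample_usages, the (year, text) input format, and the weighting rule 1 / (1 + |year − target year|) are all hypothetical choices introduced here. With no target year the draw is a simple random sample; with a target year of 1964, usages nearer that date are more likely to be drawn. A fixed seed lets another researcher reproduce the same draw.

```python
import random


def sample_usages(hits, k, target_year=None, seed=0):
    """Draw k usage examples without replacement.

    hits: list of (year, text) tuples (a hypothetical search-result format).
    target_year: if None, every hit is equally likely (simple random sample);
        otherwise hits closer in time to target_year are favored.
    seed: fixed so that the same inputs always give the same sample.
    """
    rng = random.Random(seed)
    if k >= len(hits):
        # Too few examples to need sampling: review all of them.
        return list(hits)
    if target_year is None:
        return rng.sample(hits, k)

    # Assumed weighting rule (illustrative only): weight falls off with
    # distance in years from the target date.
    def weight(year):
        return 1.0 / (1 + abs(year - target_year))

    # Weighted sampling without replacement (Efraimidis-Spirakis method):
    # give each hit the key u ** (1 / w) and keep the k largest keys.
    keyed = [(rng.random() ** (1.0 / weight(year)), (year, text))
             for year, text in hits]
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [hit for _, hit in keyed[:k]]
```

Because neither the sample size nor the selection depends on which reading a hit supports, a draw of this kind leaves less room for calling on one’s friends than hand-picked examples do.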

II. A Brief Introduction to Corpus Linguistics

Due to the above-noted limitations with traditional statutory interpretive methodology and tools, something better is needed. Corpus linguistics has the potential to be that something better.[66. For a broader discussion of this, see generally Lee & Mouritsen, supra note 4 (arguing that corpus linguistics can provide answers to questions regarding statutory interpretation).] In the words of Law Professor Larry Solum, it has the potential to “revolutionize statutory . . . interpretation.”[67. Amanda K. Fronk, Big Lang at BYU, BYU Magazine (Summer 2017), https://magazine.byu.edu/article/big-lang-at-byu/ [https://perma.cc/23QK-W3GJ].] In this sense, corpus linguistics is akin to a paradigm-shifting technology or tool like the Hubble Telescope. Certainly, astronomers could glimpse the heavens from earth before the Hubble was launched. But the increased clarity and scope the Hubble brought to astronomic inquiries was revolutionary. What is more, corpus analysis brings transparency: researchers, courts, and parties can access the corpus and perform the same searches to analyze the data for themselves.

While corpus linguistics and corpora may sound exotic, they are not. A language corpus is similar in some regards to a corpus (or body) of precedent. Moreover, corpora are used in the construction of most modern dictionaries.68 68.Hans Lindquist, Corpus Linguistics and the Description of English 52 (2009) (observing that “today all major British dictionary publishers have their own corpora . . . . The editors use concordances to find out the typical meanings and constructions in which each word is used, and try to evaluate which of these are worth mentioning in the dictionary. Many dictionaries also quote authentic examples from corpora, either verbatim or in a slightly doctored form.”).Show More Corpus linguistics—a robust empirical methodology within the field of linguistics—provides a variety of methods for analyzing a corpus to answer legal interpretive questions.

A. The Purpose of Corpus Linguistics

Corpus linguistics is the empirical study of language using samples (or bodies) of texts called corpora (in the plural). A corpus is constructed in order to study a particular register (variety of texts associated with a situational context) or speech community (group of language users who share the same dialect or language norms).69 69.Tony McEnery & Andrew Hardie, Corpus Linguistics: Method, Theory and Practice 1–2 (2012).Show More Corpus linguistics is premised on the idea that “the best way to find out about how language works is by analyzing real examples of language as it is actually used.”70 70.Paul Baker et al., Glossary of Corpus Linguistics 65 (2006).Show More In studying naturally occurring language use, corpus linguistics can avoid the observer’s paradox—the phenomenon whereby people tend to change their behavior when they are aware they are being studied (i.e., the Hawthorne Effect).71 71.Henry A. Landsberger, Hawthorne Revisited: Management and the Worker, Its Critics, and Developments in Human Relations in Industry 14–15, 23 (1958).Show More

Corpus linguistics is founded on two premises: (1) that a corpus of texts can be constructed to be sufficiently representative of a particular register or speech community, and (2) that one can “empirically describe patterns of language use through analysis of that corpus.”72 72. The Cambridge Handbook of English Corpus Linguistics 1 (Douglas Biber & Randi Reppen eds., 2015). So corpus linguistics “depends on both quantitative and qualitative analy[sis].”73 73. Douglas Biber, Corpus-Based and Corpus-Driven Analyses of Language Variation and Use, in The Oxford Handbook of Linguistic Analysis 160 (Bernd Heine & Heiko Narrog eds., 2010). And corpus linguistics results “in research findings that have much greater generalizability and validity than would otherwise be feasible.”74 74. Id. at 159. Because “a key goal of corpus linguistics is to aim for replicability of results, researchers and data creators have an important duty to discharge in ensuring the data they produce is made available to analysts in the future.”75 75. McEnery & Hardie, supra note 69, at 66.

B. Corpora

A corpus can be made of any kind of naturally occurring texts. Common examples include collections of samples of newspaper articles, books, or legal documents. The utility of a corpus will depend on the degree to which it represents the target language domain of interest. Corpus representativeness depends on two key considerations: “what types of texts should be included in the corpus and how many texts are required.”76 76. Jesse Egbert et al., Designing and Evaluating Language Corpora: A Practical Framework for Corpus Representativeness (2022). What is true for computing is true for corpus linguistics: “garbage in, garbage out,” as corpus-based results can be no better than the corpus being used (and they can be worse if the corpus data is not properly analyzed).77 77. United States v. Esquivel-Rios, 725 F.3d 1231, 1234 (10th Cir. 2013) (Gorsuch, J., majority opinion) (“Garbage in, garbage out. Everyone knows that much about computers: you give them bad data, they give you bad results.”). If a corpus does not adequately represent the texts used within the register or by the speech community one wants to make observations about, then other features of the corpus, such as its size, will make little difference. For example, a corpus composed of the transcripts of the television show Game of Thrones will not tell us much about language usage among early 20th century Ethiopian children, no matter how big the corpus is. The corpus must match and represent the register or group about which one wants to draw inferences. Otherwise, one cannot make generalizations about the larger register or speech community of interest. Hence, using Google for corpus linguistics research is arguably not very effective because the searchable web represents a wide range of registers and speech communities.78 78. See Douglas Biber & Jesse Egbert, Register Variation Online 6–7 (2018).

C. Corpus Linguistic Methods

A large number of linguistic methods have been developed and applied to corpus data. We first introduce a selection of methods that have been used for legal interpretation. Then we briefly introduce several other methods that are used within the larger field of corpus linguistics. Perhaps the most basic method for quantitatively analyzing corpus data is frequency: measuring how often, for instance, a word is used over time or in different types of texts (i.e., registers or genres).79 79. Tony McEnery & Andrew Wilson, Corpus Linguistics: An Introduction 82 (2d ed. 2001).

Another corpus method commonly used in legal interpretive research is concordance line analysis. Concordance lines are excerpts from texts centered on a search term. They can be used for qualitative analysis or to obtain frequency data. Where a corpus query returns many hits, researchers can extract a random sample of concordance lines from the corpus.
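As a rough illustration of the two steps just described (pulling concordance lines and then drawing a random sample of them), the following Python sketch may help. It is not the tool used for this study; the function names, the 40-character context width, the default search pattern, and the fixed seed are all assumptions made only for the example.

```python
import random
import re

def concordance(text, pattern=r"\bforeign tribunals?\b", width=40):
    """Return keyword-in-context lines as (left context, hit, right context).

    `width` is the number of characters kept on each side of the hit
    (an illustrative choice, not a standard).
    """
    lines = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append((left, match.group(0), right))
    return lines

def sample_lines(lines, k, seed=0):
    """Draw a reproducible random sample of at most k concordance lines."""
    rng = random.Random(seed)  # fixed seed so others can repeat the draw
    return rng.sample(lines, min(k, len(lines)))

if __name__ == "__main__":
    demo = ("The judgment of a foreign tribunal was offered in evidence. "
            "Litigation before foreign tribunals was thought burdensome.")
    for left, hit, right in sample_lines(concordance(demo), k=2):
        print(f"{left:>40} [{hit}] {right}")
```

Fixing the seed is one simple way to serve the replicability goal discussed above, since anyone rerunning the query on the same corpus would draw the same sample.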

To get meaning out of the concordance lines often requires classifying (or “coding”) the search results. We recommend that researchers base concordance line coding on the best practices and principles of content analysis and survey methodologies.80 80.See James C. Phillips & Jesse Egbert, Advancing Law and Corpus Linguistics: Importing Principles and Practices from Survey and Content-Analysis Methodologies to Improve Corpus Design and Analysis, 2017 BYU L. Rev. 1589, 1608 (2017) (“Law and corpus linguistics can learn from the methodologies employed, and the reasons driving those methodologies, in fields that use content-analysis, such as media studies. Specifically, these methodologies can inform and improve what, how, and who codes search results from corpus analysis.”).Show More For instance, one could search for a particular word, then classify each result presented in a concordance line according to a particular sense of that word. Additionally, if greater context than one sentence is needed, one can expand the size of the text excerpt surrounding the search hit to account for more context. In this way, one could analyze the results to determine something a dictionary cannot usually convey: which sense is more common in a given context (i.e., the distribution of senses). This particular exercise, using concordance lines to classify senses, has proven to be an effective method for addressing questions regarding the meaning of words and phrases in legal texts. Further, the nature of the search results prevents one from cherry-picking examples. Of course, classifying senses involves a measure of subjectivity in considering the context to properly classify (or code) a sense. But as explained further below, we have taken measures to minimize this subjectivity.

Another tool found in most corpora is collocation. Some words “co-locate” more frequently than other words. One can think of this phenomenon as “word neighbors.” These semantic patterns of word association can sometimes be intuitive: we expect dark to appear more often in the same semantic environment as night than with perfume. But sometimes the patterns are surprising. This linguistic phenomenon has long been implicitly recognized in the law in the canon of construction called noscitur a sociis: “it is known by its associates.”81 81.Noscitur a sociis, Black’s Law Dictionary (10th ed. 2014).Show More Linguists just put it a slightly different way: “[y]ou shall know a word by the company it keeps!”82 82.John Rupert Firth, A Synopsis of Linguistic Theory, 1930–1955, in Studies in Linguistic Analysis 11 (1957).Show More

By seeing which words are collocates of each other, we can sometimes get additional insight into how people understand those words. This can be done in a corpus by searching for a word and indicating (1) how many words to the left or right (or both) of the search term one wants to examine, and (2) which statistical measure (e.g., frequency, MI score, T score) will be used to measure the strength of association.83 83. See Jesse Egbert, Tove Larsson & Douglas Biber, Doing Linguistics with a Corpus: Methodological Considerations for the Everyday User 25–29 (2020). In this way, researchers are able to estimate how common it is for words to co-occur in close proximity. We can also use collocate analysis to see how usage patterns change. For instance, one of us in an earlier paper noted that the top five collocates (in raw frequency) of the term domestic violence from 1760–1979 were (1) against, (2) state(s), (3) protect, (4) convened, and (5) invasion.84 84. Lee & Phillips, supra note 44, at 298 tbl.1. This reflects the sense as used in the Constitution of a rebellion or insurrection within a state. But the top five collocates of domestic violence from 1980–2009 showed a radical shift: (1) women, (2) abuse(d), (3) honor, (4) national, and (5) victims.85 85. Id. These collocates reflect the sense of violence against a member of one’s household.
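The two choices described above (the window size and the association measure) can be made concrete with a short Python sketch. It is illustrative only: the function names and the four-word default window are assumptions, and the MI formula shown, the base-2 log of observed over expected co-occurrences, is one common formulation rather than the exact calculation any particular corpus interface performs.

```python
import math
from collections import Counter

def collocates(tokens, node, left=4, right=4):
    """Count the words appearing within `left`/`right` tokens of each `node` hit."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            window = tokens[max(0, i - left):i] + tokens[i + 1:i + 1 + right]
            counts.update(window)
    return counts

def mi_score(tokens, node, collocate, left=4, right=4):
    """Mutual information: log2 of observed over expected co-occurrences.

    Expected = f(node) * f(collocate) * span / corpus size, which is one
    common formulation; other tools may normalize differently.
    """
    freq = Counter(tokens)
    observed = collocates(tokens, node, left, right)[collocate]
    if observed == 0:
        return float("-inf")  # the pair never co-occurs in the window
    expected = freq[node] * freq[collocate] * (left + right) / len(tokens)
    return math.log2(observed / expected)

if __name__ == "__main__":
    toy = "the dark night was dark and the night was long".split()
    print(collocates(toy, "night", left=1, right=1).most_common())
    print(mi_score(toy, "night", "was", left=1, right=1))
```

Raw frequency tends to favor very common words, while MI tends to favor rare but tightly associated ones, which is why the choice of measure is reported alongside the window size.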

Besides analysis at the word or phrasal level, through a corpus search one can consider grammatical context by looking at a term or phrase in a specific syntactic structure (e.g., a noun modified by a particular adjective). For example, in a recent paper, one of us applied grammatical analysis of corpus data to determine whether language users use the term vehicle to refer to scooters.86 86. Daniel Keller & Jesse Egbert, Hypothesis Testing Ordinary Meaning, 86 Brook. L. Rev. 489, 505–32 (2021). To do this, the authors identified 230 instances where scooter occurred in close proximity with vehicle, and then classified each of these into one of three categories: (1) scooters are referred to as vehicles, (2) scooters are not referred to as vehicles, and (3) inconclusive. For each of these categories, they established a number of grammatical structures that clearly indicated the category. Based on this, they found that scooters are referred to as vehicles in 87% of the cases where the data is conclusive.

There are other methods in corpus linguistics that we have not discussed in this section. Among these are methods that have been used in previous legal scholarship (e.g., n-grams87 87.Lee & Phillips, supranote 44, at 304 & tbl.3.Show More or lexical bundles88 88.Douglas Biber, Susan Conrad, & Viviana Cortes, If you look at . . .: Lexical Bundles in University Teaching and Textbooks, 25 Applied Linguistics 371 (2004).Show More), as well as many others—such as dispersion,89 89.See Jesse Egbert, Brent Burch, & Douglas Biber, Lexical Dispersion and Corpus Design, 25 Int’l J. Corpus Linguistics 89–90 (2020); Stefan Th. Gries, Dispersions and Adjusted Frequencies in Corpora, 13 Int’l J. Corpus Linguistics 403 (2008).Show More keyword analysis,90 90.Jesse Egbert & Douglas Biber, Incorporating Text Dispersion into Keyword Analyses, 14 Corpora 77–78 (2019); Mike Scott, PC Analysis of Key Words—And Key Key Words, 25 System 233 (1997).Show More collostructional analysis,91 91.Stefan Th. Gries, & Anatol Stefanowitsch, Extending Collostructional Analysis: A Corpus-Based Perspective on ‘Alternations’, 9 Int’l J. Corpus Linguistics 97 (2004).Show More text type analysis,92 92.Douglas Biber & Edward Finegan, An Initial Typology of English Text Types, in Corpus Linguistics II: New Studies in the Analysis and Exploitation of Computer Corpora 19 (Jan Aarts and Willem Meijs eds., 1986).Show More multi-dimensional analysis,93 93.Douglas Biber, Variation Across Speech and Writing 24 (1988).Show More—that could potentially be used to address legal interpretative questions as research at the intersection of corpus linguistics and legal interpretation continues to grow.

III. Corpus Linguistic Analysis

A. Selecting a Corpus

While the parties never pointed to an instance of the term foreign tribunal(s) being used in a source of ordinary American English, the parties did argue that the term should be understood according to its ordinary meaning. To look at this, we turned to the Corpus of Historical American English, or COHA (pronounced koh-uh).94 94.Corpus of Historical American English, (2021) [hereinafter COHA] https://www.english-corpora.org/coha/ [https://perma.cc/K3VN-JFJD].Show More COHA “is the largest structured corpus of historical English.”95 95.Id.Show More It contains more than 475 million words from 115,000 texts ranging from the 1820s to the 2010s.96 96.Id.Show More It is balanced by genre within each decade, with texts from four types of genres (or registers): fiction, magazines, newspapers, and non-fiction. COHA is also “balanced across decades for sub-genres and domains as well (e.g., by Library of Congress classification for non-fiction; and by sub-genre for fiction—prose, poetry, drama, etc.)”97 97.Id.Show More Further, “[t]his balance across genres and sub-genres allows researchers to examine changes and be reasonably certain that the data reflects actual changes in the ‘real world,’ rather than just being artifacts of a changing genre balance.”98 98.Id.Show More

While claiming they were looking at ordinary meaning, the parties also looked at various legal sources: cases, statutes, and legal scholarship. For cases, we first turned to the Corpus of Supreme Court Opinions of the United States, which “includes all opinions in the United States Reports and opinions published by the Supreme Court through the 2017 term,” resulting in a corpus of about 98 million words and 62,000 texts.99 99.See Corpus of Supreme Court Opinions of the United States (hereinafter COSCO-US), https://lawcorpus.byu.edu/coscous/concordances [https://perma.cc/Y9L4-8EVG].Show More As there are no other corpora created for the remaining sources of legal documents the parties relied on, for federal cases we turned to Westlaw, for U.S. statutes we turned to HeinOnline’s U.S. Code, and for legal scholarship we turned to HeinOnline’s Core U.S. journals database.

B. Best Coding Practices

Given the subjective nature of coding (reading samples of language usage to try to classify that usage into a sense of a word or term) and the tendency of people to read evidence to confirm their pre-existing position or in light of their own biases, we implemented some best practices for the sense-coding portion of our analysis.100 100. See generally James C. Phillips & Jesse Egbert, Advancing Law and Corpus Linguistics: Importing Principles and Practices from Survey and Content-Analysis Methodologies to Improve Corpus Design and Analysis, 2017 BYU L. Rev. 1589, 1613–14 (2017). We did this to pursue the twin pillars of good social science research: reliability and validity. Reliability, which could also be called replicability, is the ability of others to replicate the results. Validity is the accuracy of the results in measuring the phenomena claimed to be measured.

To achieve reliability and validity, we took three steps. First, we used two coders, with both coders coding all of the material independently of each other. We did this so we could see the rate of agreement between the coders. A low rate could mean the material is too difficult to code or that one coder is providing an idiosyncratic view of the material. Having two coders with a high rate of agreement provides greater confidence that the results are accurate and that others will reach similar results. Second, at least one of the coders, if not both, was completely blind to what the authors thought the results would be, thus eliminating any thumbs on the scale, so to speak. If coders think a certain outcome is expected or more likely, they may lean that way in their coding, so having the coders “blind” to such information helps mitigate confirmation bias or motivated reasoning, increasing both validity and reliability. Third, we coded only one instance of a term in a document: the first. We did this because multiple uses of a term in the same document are likely to take on the same sense, thus biasing the overall numbers if they are counted as separate instances. Public opinion pollsters do something similar, randomly sampling households rather than individuals since the opinions of members of the same household are highly correlated.
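Two of these practices lend themselves to a short worked example: keeping only the first hit per document, and computing the rate of agreement between two coders. The Python sketch below is illustrative rather than a record of how the coding was actually tabulated. The function names are hypothetical, and it assumes that “rate of agreement” means simple percent agreement (the share of items on which both coders chose the same category), with no correction for chance agreement.

```python
def first_hit_per_document(hits):
    """Keep only the first hit from each document.

    `hits` is an iterable of (doc_id, excerpt) pairs in corpus order.
    """
    seen = set()
    kept = []
    for doc_id, excerpt in hits:
        if doc_id not in seen:
            seen.add(doc_id)
            kept.append((doc_id, excerpt))
    return kept

def percent_agreement(codes_a, codes_b):
    """Share of items that two coders placed in the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of items")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

if __name__ == "__main__":
    # Sense codes reported for the six COHA instances in Part III.C.1.
    coder_1 = ["government"] * 6
    coder_2 = ["government"] * 4 + ["unclear"] * 2
    print(f"{percent_agreement(coder_1, coder_2):.0%}")
```

Because the first coder assigned the government sense to all six COHA instances, the two coders can only have agreed on the four instances that the second coder also coded as government, so the order of the items in the demo does not affect the result.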

C. “Foreign Tribunal”

1. Corpus of Historical American English (COHA)

To determine what the term foreign tribunal, in both its singular and plural form, meant in “ordinary” American English, we turned to COHA. We searched the more than 298 million words found in the corpus through 1964 (the cut-off year for our search).101 101. COHA, supra note 94. To calculate this number, we subtracted the number of words from the 1970s–2010s, as well as half of the words for the 1960s, a combined total of 176,666,079 words, from the total words in COHA (475,031,831), resulting in a total of 298,365,752 words. The term showed up only six times in six documents, and never again after 1895.102 102. Searching foreign tribunal in COHA yields both singular and plural results. At the very least, this means that the term foreign tribunal(s) is a rare one in “ordinary” American English, and it may mean that there is no ordinary meaning of the term and that it only has a legal meaning.103 103. And one of the hits from COHA came from a legal source: Kent’s Commentaries on American Law. See James C. Phillips & Jesse Egbert, Appendices to a Corpus Linguistic Analysis of “Foreign Tribunal,” at app. 1 (Mar. 20, 2022) [hereinafter Appendices], https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4052959 [https://perma.cc/KYR3-3CS2]; James Kent, Commentaries on American Law, 24 N. Am. Rev. 345, 358 (1827).
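The word-count arithmetic in note 101 can be checked in a few lines, and the same figures allow the six hits to be expressed as a normalized frequency. The sketch below is only a check on the reported numbers. The per-million-words rate is a conventional way of normalizing corpus frequencies and is not a figure reported in this Essay, and the variable and function names are assumptions for illustration.

```python
# Figures taken from note 101 and the surrounding text.
TOTAL_WORDS = 475_031_831      # all of COHA
EXCLUDED_WORDS = 176_666_079   # 1970s-2010s plus half of the 1960s
HITS_THROUGH_1964 = 6          # coded instances of "foreign tribunal(s)"

def per_million(hits, corpus_words):
    """Normalize a raw count to hits per million words."""
    return hits / corpus_words * 1_000_000

words_through_1964 = TOTAL_WORDS - EXCLUDED_WORDS
rate = per_million(HITS_THROUGH_1964, words_through_1964)

print(f"{words_through_1964:,} words through 1964")
print(f"{rate:.3f} hits per million words")
```

On these figures the term appears roughly 0.02 times per million words, which puts a number on how rare it is in the general-purpose corpus.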

We only coded one of the instances in the document that had two for reasons noted above, resulting in six hits.104 104. See Appendices, supra note 103, at app. 1. These six instances of the term were each independently coded by two coders. Coders determined the sense of foreign tribunal being used. They were given the following options and directions:

  • Government sense: a tribunal that operates under government authority, such as a court
  • Private/non-government: a tribunal that operates under non-governmental/private authority, such as private arbitration
  • Other: if the term is being used to describe something that does not fit into the first two categories
  • Unclear: you cannot tell, which could be because there is not enough information or because you are not sure whether the tribunal mentioned fits into the government or non-government category

The first coder classified all six instances as falling under the government-authority sense. The second coder classified four of the instances as invoking the government-authority sense and two of the instances of the term as being unclear. Not once did a coder deem a use of foreign tribunal(s) in COHA to invoke the private-authority sense, nor did either coder deem any other sense as being used.

We also asked the coders to record the specific type of tribunal being discussed, such as a court, a legislature, arbitration, etc. For the six COHA instances, one coder deemed five references as being to a court and one reference as unclear, while the other deemed four of the six to be to a court, one to be to a state legislature, and the other to be unclear. Not once did a coder conclude the use of the term foreign tribunal(s) referred to arbitration. Of course, having only six instances of the term, and none after 1895, severely limits the conclusions we can draw from the findings. But at the very least, there is no clear evidence that the term foreign tribunal(s) as used in ordinary American English invoked the private/non-governmental sense and applied to arbitration.

2. Corpus of Supreme Court Opinions of the United States (COSCO-US)

We next looked to a corpus of U.S. Supreme Court opinions: COSCO-US.105 105. COSCO-US, supra note 99. We limited the search to cases up through 1964. We also only coded the first instance of the term foreign tribunal(s) in a case, even if it appeared more than once. This resulted in forty-three instances, ranging from 1808 to 1958. Two coders independently coded all of these instances.106 106. See Appendices, supra note 103, at app. 2. They first coded the following sense categories (the same as coded in COHA, though described here in abbreviated form):

  • government-authorized sense
  • private, non-government-authorized sense
  • any other sense
  • unclear

The coders agreed 88% of the time, a sufficiently high rate of agreement. In the chart below are the results.

Sense Distribution of Foreign Tribunal(s) in Supreme Court Cases, 1789–1964

At least 90% of the time, coders found that the government-authority sense of tribunal was being invoked for the term foreign tribunal. The rest of the time, it was unclear which sense was being used. And not once did a coder find that the U.S. Supreme Court was using the private/non-government-authority sense.

The coders were also asked to record the type of tribunal being referenced. The first coder found that all but one of the instances were referring to a court, the one outlier being a legislature. The second coder concluded that thirty-six of the forty-three instances were referencing a court, six were unclear, and one referenced a surveyor general.107 107. We note that sometimes the term foreign tribunal in referring to courts referred to courts outside of a state’s jurisdiction but not in a foreign country. Thus, to a Maryland state court, a New York state court is sometimes referred to as a foreign tribunal. This usage seemed to occur most often in the context of personal jurisdiction. See, e.g., Hanson v. Denckla, 357 U.S. 235, 250–51 (1958) (“As technological progress has increased the flow of commerce between States, the need for jurisdiction over nonresidents has undergone a similar increase. At the same time, progress in communications and transportation has made the defense of a suit in a foreign tribunal less burdensome.”). This evidence indicates that, before the statute was enacted, the Supreme Court consistently used foreign tribunal in the narrow, government-authority sense to refer to courts, not arbitration.

3. Westlaw Federal Court Opinions

A corpus of federal court decisions does not exist outside of the Founding Era.108 108. See Corpus of U.S. Caselaw, https://lawcorpus.byu.edu/cusc;showCorpusInfo=true/concordances [https://perma.cc/ZVG9-QCLW]. But for this type of analysis, where one is coding concordance lines in a corpus, a digital database without the additional tools of a linguistic corpus will still work. So, we searched in Westlaw for “foreign tribunal” to capture the terms foreign tribunal and foreign tribunals. We limited the search under “Filters by Jurisdiction” to “Federal Courts of Appeal” and “Federal District Courts.” We also limited the search to any cases prior to 10/03/1964, the date the new statutory language at issue here was enacted. We then ordered the results by date with the most recent listed first since caselaw closer to 1964 would be more relevant and less likely to be influenced by linguistic drift. We coded the first 100 cases that had a valid hit (some had to be discarded because the term foreign tribunal(s) appeared in a headnote rather than in the body of the opinion). This resulted in cases from 1868 to 1964.109 109. See Appendices, supra note 103, at app. 3.
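The selection procedure just described (filter by date, discard headnote-only hits, sort most recent first, keep the first 100) can be summarized in a brief Python sketch. This is an illustration of the logic only. The search itself was run through Westlaw’s own interface, and the `Hit` record, its field names, and the `select_cases` function are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Hit:
    case_name: str
    decided: date
    in_body: bool  # False when the term appears only in a headnote

# Enactment date of the statutory language, as stated in the text.
ENACTED = date(1964, 10, 3)

def select_cases(hits, limit=100, cutoff=ENACTED):
    """Return up to `limit` valid hits decided before `cutoff`, newest first."""
    eligible = [h for h in hits if h.decided < cutoff and h.in_body]
    eligible.sort(key=lambda h: h.decided, reverse=True)
    return eligible[:limit]

if __name__ == "__main__":
    demo = [
        Hit("Case A", date(1921, 5, 1), True),
        Hit("Case B", date(1963, 2, 2), True),
        Hit("Case C", date(1960, 1, 1), False),  # headnote only, discarded
    ]
    for hit in select_cases(demo):
        print(hit.decided.year, hit.case_name)
```

Sorting newest-first before truncating is what weights the sample toward usage closest to 1964, in place of a random draw.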

The coding was for one of four categories:

  • government-authorized sense
  • private, non-government-authorized sense
  • any other sense
  • unclear

The coders coded the material independently of each other, resulting in an agreement rate of 98% for the senses of tribunal, a very high agreement rate. The findings are in the chart below.

Sense Distribution of Foreign Tribunal(s) in Federal Cases

The coders determined that the government sense was being invoked between 98% and 100% of the time. Twice the second coder determined that the private sense of tribunal was invoked. In the first instance, the district court judge appeared to be referring to arbitration performed by a court in Spain, which would be more consistent with the government sense.110 110. See The Ciano, 58 F. Supp. 65, 66–67 (E.D. Pa. 1944) (“I am persuaded to the views set forth in The Edam case, supra, as it seems to me that these provisions are not in a true sense, clauses providing for arbitration, but rather clauses and agreements which attempt to give preference to one court over another, and to attempt to construe then as real agreements for arbitration within the purview of the Arbitration Act would be to confer exclusively jurisdiction as here on a foreign tribunal . . . .”). The second case coded as invoking the private sense does refer to arbitration, but appears to do it in contrast to a foreign tribunal: “Arbitration clauses are found in virtually all the standard forms of charter parties and are particularly favored by shipping men as a means of avoiding litigation in distant countries before foreign tribunals.”111 111. Atl. Fruit Co. v. Red Cross Line, 276 F. 319, 322 (S.D.N.Y. 1921). In other spots in the opinion, the court appears to be contrasting arbitration and litigation, so this use of the term foreign tribunals is likely referring to courts in a foreign country, and thus the government-authority sense.112 112. Id. at 321–22. It appears, then, that the second coder may have been mistaken in finding two instances of the private/non-governmental sense.

Further, 99% of the time the first coder classified the entity being referred to as a foreign tribunal as a court, with the lone other instance being where the entity was a patent office. The second coder deemed 98% of the entities being referred to as a foreign tribunal were courts, with the other 2% referencing arbitration, though these were the same two cases just discussed above, leading us to believe these references were mistaken. Thus, it appears federal courts used the term consistent with how the Supreme Court used the term during that time—in the narrow, government-authority sense and usually referring to courts.

4. U.S. Code

We next looked at the United States Code as found in HeinOnline. We searched in “All Titles” under U.S. Code, limiting our search to instances of the terms foreign tribunal and foreign tribunals that occurred up through 1964. After eliminating duplicates and only sampling the first instance if the term appeared more than once in a particular document, we were left with twelve results.113 113. See Appendices, supra note 103, at app. 4. The first coder found all twelve instances to refer to the narrow, government-authority sense. The second coder determined that eleven of the twelve used the narrow, government-authority sense, with the other instance being unclear. Not once could we find an example of the private/non-government sense. As for the type of entity that was referred to as a foreign tribunal, the first coder deemed all twelve instances to be courts, while the second coder found that eight of the twelve were courts, and the other four were unclear. We did not find an example of an arbitration body being referred to as a foreign tribunal. This usage is consistent with how the courts were using the term.

5. U.S. Law Reviews

Finally, we looked at HeinOnline’s Core U.S. Journals database to see how foreign tribunal(s) was used in legal scholarship. Given how many times the terms occurred, we limited the years to 1950–1964, which resulted in 201 hits. We eliminated any result quoting another source, any duplicates, or any articles that were merely titles of statutes with no context. If foreign tribunal(s) appeared multiple times in the document, we only sampled it once—the first time it was listed, unless that first instance was eliminated for the reasons just noted. This resulted in ninety-eight instances of foreign tribunal(s) that we coded.114 114.See Appendices, supra note 103, at app. 5.Show More The coding was for one of four categories:

  • government-authorized sense
  • private, non-government-authorized sense
  • any other sense
  • unclear

Two coders coded the material independently of each other, resulting in an agreement rate of 96% for the senses of tribunal, a very high rate of agreement. In the figure below, we report the percentages for each category coded:

Sense Distribution of Foreign Tribunal(s) in U.S. Law Reviews

The results are very clear and very stark. Almost every single time the terms foreign tribunal or foreign tribunals were used in the decade and a half before 1964 in U.S. legal scholarship, the term took on the government-authorized sense. Arguably only once did it take on the private sense. For that one instance, the coders disagreed, with one classifying it as taking on the government sense and the other coding it as being the private sense. The context was the trial in Israel of the infamous Nazi Adolf Eichmann. The sentence in which the term appeared was, “While arrangements were made for the taking of affidavits and for cross-examination before foreign tribunals, the understandable reluctance of former Nazis to appear before the court largely derogated from whatever direct applicability the territorial theory might have had to the Eichmann case.”115 115. Vanni E. Treves, Jurisdictional Aspects of the Eichmann Case, 47 Minn. L. Rev. 557, 562–63 (1962). Given this is in the context of a criminal case, it seems unlikely that the term foreign tribunals would cover private entities in other countries. The coder who coded this instance as involving the private/non-governmental sense was likely mistaken. That coder also classified the type of foreign tribunal here to be a court, which is in tension with it being the private/non-governmental sense and further supports the government sense.116 116. See Appendices, supra note 103, at app. 5. Hence, it appears the private sense of tribunal did not occur even once in our sample of U.S. law reviews.

What is more, in determining what type of foreign tribunal was being discussed, the coders never found anything other than courts being referenced.117 117. One coder deemed that in every instance a court was being referenced. The other coder determined that in eighty-five of the ninety-eight instances, a court was referenced, and the other thirteen instances the coder could not tell what kind of tribunal was being referred to. This usage in legal scholarship is consistent with how Congress, the Supreme Court, and lower federal courts used the term. Furthermore, this legal usage is consistent with the ordinary usage.

* * *

The data are about as one-sided as we have ever seen in doing corpus linguistic analysis. In 259 instances of the use of the term foreign tribunal or foreign tribunals across ordinary American English, U.S. Supreme Court opinions, federal court opinions, the U.S. Code, and U.S. legal scholarship, we found only three debatable instances of the use of a private/non-government-sense of tribunal—and those three were probably mistakenly coded. We also only found two possible instances where foreign tribunal(s) may have been referencing arbitration, but we also think those were probably mistakes. That is about as linguistically lopsided as it can get. Of course, we are not saying that it is impossible for foreign tribunal(s) to refer to a private, commercial arbitration panel. No doubt one could find an instance if one looked long and hard enough, just as one could probably find a few Republicans who would vote for Bernie Sanders for President. We are just saying that, based on the data we sampled, such usage was uncommon.118 118.It is also possible that our coders may have been mistaken on a few of the results they coded, but that would only change our numbers at the margins. Of course, people may look for themselves at the data in our appendices.Show More
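The totals in this summary can be reconstructed from the per-source counts reported in Part III.C. The short Python sketch below simply adds them up and expresses the three debatable private-sense codings (two in the federal cases and one in the law reviews) as a share of the whole. The dictionary labels and variable names are illustrative assumptions.

```python
# Coded instances per source, as reported in Part III.C.
instances = {
    "COHA": 6,
    "COSCO-US": 43,
    "Westlaw federal cases": 100,
    "U.S. Code": 12,
    "U.S. law reviews": 98,
}

# Codings of the private/non-government sense that the text treats as
# debatable: two federal cases and one law review article.
debatable_private = 3

total = sum(instances.values())
share = debatable_private / total

print(f"{total} instances in total")
print(f"{share:.1%} coded, debatably, as the private sense")
```

Even counting all three debatable codings at face value, the private sense would account for only a little over 1% of the 259 sampled instances.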

D. Alternative Explanation

1. Real-world Frequency

There is an alternative explanation for frequency data in a corpus. Such data may not reflect linguistic reality but, assuming the corpus is properly constructed, could instead reflect non-linguistic reality. In other words, it could reflect how frequently certain phenomena occur in the real world.119 Thus, if one looks in the corpus at the word car, one is more likely to find instances of Fords or Toyotas than Ferraris because there are just many more Fords and Toyotas in existence than Ferraris. But that does not mean a Ferrari is not a car. And to confirm that, one could look to see if every time a Ferrari showed up in the corpus, it was described as a car. Is the fact that the term foreign tribunal almost never shows up as referring to a private, non-government-authorized tribunal or to arbitration merely a reflection of how much less arbitration occurs as compared to government-authorized tribunals and courts?

One way to get some leverage on this question would be to know how many lawsuits are filed in courts each year versus how many arbitration proceedings are instituted. Of course, one would need that historical data for the time periods analyzed here (pre-1965). We do not have that data. But it does not appear that the data we have sampled could be entirely driven by courts and lawsuits being more prevalent in the real world than arbitration, because that would imply that arbitration almost never occurred.

2. Arbitration Analysis

To look at this difference between linguistic frequency and real-world frequency from another angle, we decided to sample 100 instances of the word arbitration from COHA, to capture more ordinary language, and COSCO-US, to capture more legal meaning. We recorded the general word used for the entity conducting the arbitration proceeding (panel, body, tribunal, commission, etc.). We did so to see whether, when the term arbitration is used, the entity conducting it is predominantly referred to as a tribunal or as something else. If that entity was predominantly referred to as something other than a tribunal, then it would be further evidence that the frequency of arbitration in the real world is not what is driving the frequency data we see in our analysis of foreign tribunal(s), though we recognize this type of analysis is less direct evidence of the meaning of foreign tribunal(s).120

3. COHA

We searched for the terms arbitration and arbitrations in COHA texts from 1950 to 1964, finding 192 documents. We only took the first instance if there were multiple instances from the same document.121 This reduced our total to 117.122 The overwhelming majority (74%) of the hits did not reveal the type of entity performing the arbitration. Below are the results we found when we could determine the entity type.123

Type of Entity Performing Arbitration in COHA, 1950–1964

Entity Type    Total    % (see note 124)
board(s)       19       63.3%
commission     4        13.3%
committee      1        3.3%
court          3        10.0%
panel          2        6.7%
tribunal       1        3.3%

As is evident, it is possible to refer to the entity that is performing arbitration as a tribunal: in this instance, a tribunal to handle disputes over the Suez Canal consisting of one member named by Egypt, one by the complaining party, and the third by both together or by the International Court of Justice in The Hague.125 (The coder deemed the source of this arbitration tribunal’s authority to be governmental in nature.)126 But from 1950–1964 in the representative sample of more “ordinary” American English we examined, it was rare to refer to an entity performing arbitration as a tribunal.
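The first-instance-per-document rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure we actually ran: the `Hit` record, its field names, and the toy hits are hypothetical, and the sketch does not model the Letters to the Editor exception mentioned in note 121.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One concordance hit (hypothetical record layout)."""
    doc_id: str    # identifier of the source document
    position: int  # offset of the hit within that document
    context: str   # surrounding text shown in the concordance line


def first_hit_per_document(hits):
    """Keep only the earliest hit from each document.

    Documents are returned in the order in which they first appear in `hits`.
    """
    earliest = {}
    for hit in hits:
        kept = earliest.get(hit.doc_id)
        if kept is None or hit.position < kept.position:
            earliest[hit.doc_id] = hit
    return list(earliest.values())


# Toy data, invented purely for illustration.
hits = [
    Hit("doc_a", 40, "... submitted to arbitration before the board ..."),
    Hit("doc_a", 12, "... an arbitration clause ..."),
    Hit("doc_b", 7, "... compulsory arbitration ..."),
]
sample = first_hit_per_document(hits)
```

Sampling one hit per document keeps a single text that uses the word repeatedly from dominating the counts, which is the same concern that led us to forgo collocate analysis.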

4. COSCO-US

We performed the same analysis in COSCO-US to see what type of entity the U.S. Supreme Court referenced as performing arbitration. We only sampled the first instance if the term arbitration was used more than once in an opinion, treating majority and separate opinions as distinct. We also limited our results to opinions from 1789 to 1964. This resulted in 88 instances,127 though again, an overwhelming majority (73%) did not reveal the entity type performing the arbitration. Below are the results we found when we could determine the entity type.

Type of Entity Performing Arbitration in COSCO-US, 1789–1964

Entity Type    Total    % (see note 128)
association    1        4.2%
board          12       50.0%
body           1        4.2%
commission     3        12.5%
committee      2        8.3%
tribunal       5        20.8%

Here we see that the Supreme Court refers to the entity that performs arbitration as a tribunal about a fifth of the time, though it is not the most common term, which is board. Of these five instances of tribunal, in one the Court referred to the entity both as a tribunal and as a commission.129 In another, it referred to the entity as both a court and a tribunal and seemed to be referring to a court proceeding as arbitration.130 The other three instances all seem to refer to an international tribunal of arbitration between the United States and Great Britain that was created by treaty and convened in Geneva, Switzerland to handle claims that arose out of the Civil War.131
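For readers who wish to check the arithmetic behind the two tables, the percentages can be recomputed from the raw counts. The following Python sketch is offered only as a convenience. The counts and sample sizes are copied from the tables and text above, while the function and variable names are our own and purely illustrative.

```python
# Counts copied from the COHA and COSCO-US tables above.
coha_counts = {
    "board(s)": 19, "commission": 4, "committee": 1,
    "court": 3, "panel": 2, "tribunal": 1,
}
cosco_counts = {
    "association": 1, "board": 12, "body": 1,
    "commission": 3, "committee": 2, "tribunal": 5,
}

# Sample sizes reported in the text (one hit per document or opinion).
COHA_SAMPLED = 117
COSCO_SAMPLED = 88


def shares(counts):
    """Percentage of identified entity labels, rounded to one decimal place."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


def unidentified_pct(counts, sampled):
    """Whole-number percentage of sampled hits with no identifiable entity type."""
    identified = sum(counts.values())
    return round(100 * (sampled - identified) / sampled)


coha_shares = shares(coha_counts)
cosco_shares = shares(cosco_counts)
```

The denominators here are the thirty and twenty-four hits for which an entity type could be identified, not the full samples, so the tribunal percentages describe only the subset of hits that named an entity at all.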

In sum, whether in more ordinary American English or in legal American English, at least as used by the U.S. Supreme Court, entities performing arbitration are unlikely to be referred to as a tribunal. This is further evidence that our findings for foreign tribunal are not driven by something other than linguistic usage.

Conclusion

In ZF Automotive US v. Luxshare, the parties have presented the Court with what Justice Frankfurter would call a “contest between probabilities of meaning.”132 But the methodologies and evidence presented by the parties to resolve that contest (dueling dictionaries and small samples of usage of the individual words of a multi-word term) were inadequate. Our sample of 259 usages of the terms foreign tribunal and foreign tribunals, drawn from collections of texts using both ordinary and legal American English (including U.S. Supreme Court and federal court opinions, the U.S. Code, and U.S. legal scholarship), overwhelmingly shows that the term foreign tribunal(s) was used in the sense of an entity using government authority to resolve a dispute, almost always a court. While there may be additional considerations the Court should take into account in resolving the legal question before it, the linguistic question is very clear: the term foreign tribunal seldom referred to a private arbitration body in American English prior to 1965, and the entity that was referred to as conducting arbitration was usually called something other than a tribunal.133

  1. See Oral Argument, ZF Auto. US, Inc. v. Luxshare, Ltd., No. 21-401 (U.S. argued Mar. 23, 2022), https://www.oyez.org/cases/2021/21-401. That case is consolidated with AlixPartners, LLC v. Fund for Protection of Investor Rights in Foreign States, No. 21-518 (U.S. argued Mar. 23, 2022). However, the latter case involves a slightly different question: whether 28 U.S.C. § 1782 applies to investor-state arbitrations pursuant to international treaties. This paper will not address the underlying linguistic questions invoked by AlixPartners.
  2. 28 U.S.C. § 1782(a).
  3. See Luxshare, Ltd. v. ZF Auto. US, Inc., 547 F. Supp. 3d 682, 686–87 (E.D. Mich.), cert. granted 142 S. Ct. 637 (2021).
  4. Carpenter v. United States, 138 S. Ct. 2206, 2238–39 n.4 (2018) (Thomas, J., dissenting) (running a search in the Corpus of Founding-Era American English); Lucia v. S.E.C., 138 S. Ct. 2044, 2056 (2018) (Thomas, J., concurring, joined by Gorsuch, J.) (citing Jennifer Mascott, Who Are “Officers of the United States?” 70 Stan. L. Rev. 443 (2018)); Facebook, Inc. v. Duguid, 141 S. Ct. 1163, 1174 (2021) (Alito, J., concurring) (citing Thomas R. Lee & Stephen C. Mouritsen, Judging Ordinary Meaning, 127 Yale L. J. 788 (2018)); Bostock v. Clayton County, 140 S. Ct. 1731, 1769 n.22 (2020) (Alito, J., dissenting) (citing James C. Phillips, The Overlooked Evidence in the Title VII Cases: The Linguistic (and Therefore Textualist) Principle of Compositionality 3 (unpublished manuscript) (May 11, 2020), https://ssrn.com/abstract=3585940).
  5. Garson Kanin, Conversations with Felix, Reader’s Digest, June 1964, at 116, 117 (replying to counsel who said a question from the bench was just a matter of semantics).
  6. Felix Frankfurter, Some Reflections on the Reading of Statutes, 47 Colum. L. Rev. 527, 528 (1947).
  7. See Brief for the Petitioners at 18, ZF Auto. US, Inc. v. Luxshare, Ltd., No. 21-401 (Jan. 24, 2022).
  8. Tribunal, Black’s Law Dictionary (4th ed. 1951); Webster’s Third New International Dictionary of the English Language Unabridged 2441 (1961) [hereinafter Webster’s Third (1961)].
  9. Webster’s Third (1961), supra note 8, at 2441.
  10. Tribunal, Black’s Law Dictionary (4th ed. 1951).
  11. Webster’s Third (1961), supra note 8, at 2441; Merriam-Webster’s Dictionary of Law 503 (1996).
  12. The American Heritage Dictionary of the English Language 1369 (1969).
  13. Tribunal, Black’s Law Dictionary (4th ed. 1951).
  14. 11 The Oxford English Dictionary 341 (1933).
  15. Ballentine’s Law Dictionary 1300 (1969).
  16. Black’s Law Dictionary 1814 (11th ed. 2019).
  17. Merriam-Webster’s Dictionary of Law 503 (1996).
  18. The American Heritage Dictionary of the English Language 1369 (1969).
  19. Brief for the Petitioners, supra note 7, at 19 (quoting Webster’s Third (1961), supra note 8, at 2441).
  20. Funk & Wagnalls New Standard Dictionary of the English Language 1340 (1960).
  21. Brief for the Petitioners, supra note 7, at 19–20.
  22. It is worth noting that no contemporaneous legal dictionary included the broader sense of tribunal. This could indicate a divergence from the ordinary and the legal meanings of the word.
  23. See Brief for the Petitioners, supra note 7, at 19; Brief for the Respondent at 12, ZF Auto. US, Inc. v. Luxshare, Ltd., No. 21-401 (filed Feb. 23, 2022).
  24. See Brief for the Petitioners, supra note 7, at 19.
  25. Id.
  26. See Brief for the Respondent, supra note 23, at 12–14.
  27. Brief for the Petitioners, supra note 7, at 20–21.
  28. Brief for the Petitioners, supra note 7, at 20.
  29. See Brief for the Respondent, supra note 23, at 13.
  30. See Brief for the Petitioners, supra note 7, at 21–25.
  31. See Brief for the Respondent, supra note 23, at 13.
  32. Alan Cruse, Meaning in Language: An Introduction to Semantics and Pragmatics 82 (3d ed. 2011).
  33. Id. at 84.
  34. Alison Wray, Why Are We So Sure We Know What a Word Is?, in The Oxford Handbook of the Word 725, 737 (John R. Taylor ed., 2015).
  35. John McH. Sinclair, Collocation: A Progress Report, in 2 Language Topics: Essays in Honour of Michael Halliday 319, 320 (Ross Steele & Terry Threadgold eds., 1987).
  36. See Cruse, supra note 32, at 86–91.
  37. FCC v. AT&T Inc., 562 U.S. 397, 406 (2011).
  38. See 2 McCarthy on Trademarks and Unfair Competition § 11:27 (5th ed.) (“Under the anti-dissection rule, a composite mark is tested for its validity and distinctiveness by looking at it as a whole, rather than dissecting it into its component parts.”).
  39. Est. of P.D. Beckwith, Inc., v. Comm’r of Pats., 252 U.S. 538, 545–46 (1920).
  40. TE-TA-MA Truth Found.—Fam. of URI, Inc. v. World Church of the Creator, 297 F.3d 662, 666 (7th Cir. 2002) (emphasis added).
  41. Frank H. Easterbrook, Text, History, and Structure in Statutory Interpretation, 17 Harv. J.L. & Pub. Pol’y 61, 67 (1994).
  42. Henry M. Hart, Jr. & Albert M. Sacks, The Legal Process: Basic Problems in the Making and Application of Law 1375 (1994).
  43. Id. at 1375–76.
  44. Thomas R. Lee & James C. Phillips, Data-Driven Originalism, 167 U. Pa. L. Rev. 261, 298–300 (2019).
  45. Domestic Violence, Oxford English Dictionary Online (Mar. 2006), https://www.oed.com/view/Entry/56663?redirectedFrom=domestic+violence#eid41827739 [https://perma.cc/A5ZN-RQRV]; Lee & Phillips, supra note 44, at 300.
  46. James Sledd & Wilma R. Ebbitt, Dictionaries and That Dictionary 79 (1962) (quoting the editor-in-chief of Webster’s Third as stating that “the dictionary’s purpose was to report the language, not to prescribe what belonged in it”). Because of this move, Justice Scalia rejected Webster’s Third, preferring Webster’s Second. See James J. Brudney & Lawrence Baum, Oasis or Mirage: The Supreme Court’s Thirst for Dictionaries in the Rehnquist and Roberts Eras, 55 Wm. & Mary L. Rev. 483, 508–09 (2013) (noting that Scalia’s reliance “on Webster’s Second and American Heritage—identified as belonging to the prescriptive camp—far more than Webster’s Third, the poster child for descriptive dictionaries,” is a “preference” that “is not inadvertent: Scalia has disparaged Webster’s Third in his opinions . . . and in his recent book”). Scalia’s rejection of Webster’s Third is ironic given his purported aim of understanding words in legal texts according to how people at the time would have understood them.
  47. Henri Béjoint, Tradition and Innovation in Modern English Dictionaries 116 (1994).
  48. Sledd & Ebbitt, supra note 46, at 57.
  49. Samuel A. Thumma & Jeffrey L. Kirchmeier, The Lexicon Has Become a Fortress: The United States Supreme Court’s Use of Dictionaries, 47 Buff. L. Rev. 227, 242 (1999).
  50. Id.
  51. Granted, to the extent people rely on dictionaries, even a prescriptive definition could somewhat reflect how people understood language, though it is second-best evidence.
  52. See Stephen C. Mouritsen, The Dictionary Is Not a Fortress: Definitional Fallacies and a Corpus-Based Approach to Plain Meaning, 2010 BYU L. Rev. 1915, 1926–29 (2010).
  53. 524 U.S. 125 (1998).
  54. Id. at 128–31.
  55. As has been noted elsewhere, the one exception to this is The Random House Dictionary of the English Language. See Lee & Mouritsen, supra note 4, at 808 n.89 (observing that dictionary’s front matter declares that “a general policy of putting the most frequently used meanings . . . at the beginning of the entry, followed by other senses in diminishing usage, with archaic, and obsolete senses coming last”) (citing Random House Dictionary of the English Language—Unabridged, at viii (2d ed. 1987) [hereinafter Random House]). However, that dictionary was not cited by the parties here (and would only provide half of the relevant term), and as Lee and Mouritsen note, there are “grounds for skepticism of these sorts of claims” given the way dictionaries are constructed, with even Random House conceding that “sense ranking based on frequency holds only ‘generally.’” Id. (quoting Random House, supra, at xxii).
  56. 1 The Oxford English Dictionary xxix (2d ed. 1989) (“[T]hat sense is placed first which was actually the earliest in the language: the others follow in order in which they appear to have arisen.”).
  57. Webster’s Third New International Dictionary of the English Language Unabridged 19a (1971).
  58. See John Mikhail, The Definition of ‘Emolument’ in English Language and Legal Dictionaries, 1523–1806, at 8–10 (July 12, 2017) (unpublished manuscript) (surveying 50 founding-era dictionaries and concluding that because 100% of the entries included at least one element of the broad definition of emolument, and only 8% of the entries included an office or employment-related definition, the word must have been understood at the founding in its broad sense); see also James Cleith Phillips & Sara White, The Meaning of the Three Emoluments Clauses in the U.S. Constitution: A Corpus Linguistic Analysis of American English from 1760–1799, 59 S. Tex. L. Rev. 181, 196–97 (2017) (critiquing Mikhail for this analysis).
  59. Lee & Mouritsen, supra note 4, at 809 n.90 (quoting Webster’s Third (1971), supra note 57, at 19a).
  60. Kory Stamper, Word by Word: The Secret Life of Dictionaries 119 (2017); see also The Routledge Handbook of Corpus Linguistics 433–34 (Anne O’Keeffe & Michael McCarthy eds., 2010) (discussing “lumpers” and “splitters”).
  61. Stamper, supra note 60, at 119.
  62. Id.
  63. Sidney I. Landau, Dictionaries: The Art and Craft of Lexicography 35 (1984).
  64. Phillips & White, supra note 58, at 191.
  65. See Lee & Mouritsen, supra note 4, at 810 n.98 (“[T]he methods that [dictionaries] use to sample language use don’t create a reliable sample—aggregating dictionaries isn’t going to accomplish anything if none of them has a reliable sample of language usage.”).
  66. For a broader discussion of this, see generally Lee & Mouritsen, supra note 4 (arguing that corpus linguistics can provide answers to questions regarding statutory interpretation).
  67. Amanda K. Fronk, Big Lang at BYU, BYU Magazine (Summer 2017), https://magazine.byu.edu/article/big-lang-at-byu/ [https://perma.cc/23QK-W3GJ].
  68. Hans Lindquist, Corpus Linguistics and the Description of English 52 (2009) (observing that “today all major British dictionary publishers have their own corpora . . . . The editors use concordances to find out the typical meanings and constructions in which each word is used, and try to evaluate which of these are worth mentioning in the dictionary. Many dictionaries also quote authentic examples from corpora, either verbatim or in a slightly doctored form.”).
  69. Tony McEnery & Andrew Hardie, Corpus Linguistics: Method, Theory and Practice 1–2 (2012).
  70. Paul Baker et al., Glossary of Corpus Linguistics 65 (2006).
  71. Henry A. Landsberger, Hawthorne Revisited: Management and the Worker, Its Critics, and Developments in Human Relations in Industry 14–15, 23 (1958).
  72. The Cambridge Handbook of English Corpus Linguistics 1 (Douglas Biber & Randi Reppen eds., 2015).
  73. Douglas Biber, Corpus-Based and Corpus-Driven Analyses of Language Variation and Use, in The Oxford Handbook of Linguistic Analysis 160 (Bernd Heine & Heiko Narrog eds., 2010).
  74. Id. at 159.
  75. McEnery & Hardie, supra note 69, at 66.
  76. Jesse Egbert et al., Designing and Evaluating Language Corpora: A Practical Framework for Corpus Representativeness (2022).
  77. United States v. Esquivel-Rios, 725 F.3d 1231, 1234 (10th Cir. 2013) (Gorsuch, J., majority opinion) (“Garbage in, garbage out. Everyone knows that much about computers: you give them bad data, they give you bad results.”).
  78. See Douglas Biber & Jesse Egbert, Register Variation Online 6–7 (2018).
  79. Tony McEnery & Andrew Wilson, Corpus Linguistics: An Introduction 82 (2d ed. 2001).
  80. See James C. Phillips & Jesse Egbert, Advancing Law and Corpus Linguistics: Importing Principles and Practices from Survey and Content-Analysis Methodologies to Improve Corpus Design and Analysis, 2017 BYU L. Rev. 1589, 1608 (2017) (“Law and corpus linguistics can learn from the methodologies employed, and the reasons driving those methodologies, in fields that use content-analysis, such as media studies. Specifically, these methodologies can inform and improve what, how, and who codes search results from corpus analysis.”).
  81. Noscitur a sociis, Black’s Law Dictionary (10th ed. 2014).
  82. John Rupert Firth, A Synopsis of Linguistic Theory, 1930–1955, in Studies in Linguistic Analysis 11 (1957).
  83. See Jesse Egbert, Tove Larsson & Douglas Biber, Doing Linguistics with a Corpus: Methodological Considerations for the Everyday User 25–29 (2020).
  84. Lee & Phillips, supra note 44, at 298 tbl.1.
  85. Id.
  86. Daniel Keller & Jesse Egbert, Hypothesis Testing Ordinary Meaning, 86 Brook. L. Rev. 489, 505–32 (2021).
  87. Lee & Phillips, supra note 44, at 304 & tbl.3.
  88. Douglas Biber, Susan Conrad, & Viviana Cortes, If you look at . . .: Lexical Bundles in University Teaching and Textbooks, 25 Applied Linguistics 371 (2004).
  89. See Jesse Egbert, Brent Burch, & Douglas Biber, Lexical Dispersion and Corpus Design, 25 Int’l J. Corpus Linguistics 89–90 (2020); Stefan Th. Gries, Dispersions and Adjusted Frequencies in Corpora, 13 Int’l J. Corpus Linguistics 403 (2008).
  90. Jesse Egbert & Douglas Biber, Incorporating Text Dispersion into Keyword Analyses, 14 Corpora 77–78 (2019); Mike Scott, PC Analysis of Key Words—And Key Key Words, 25 System 233 (1997).
  91. Stefan Th. Gries & Anatol Stefanowitsch, Extending Collostructional Analysis: A Corpus-Based Perspective on ‘Alternations’, 9 Int’l J. Corpus Linguistics 97 (2004).
  92. Douglas Biber & Edward Finegan, An Initial Typology of English Text Types, in Corpus Linguistics II: New Studies in the Analysis and Exploitation of Computer Corpora 19 (Jan Aarts & Willem Meijs eds., 1986).
  93. Douglas Biber, Variation Across Speech and Writing 24 (1988).
  94. Corpus of Historical American English (2021) [hereinafter COHA], https://www.english-corpora.org/coha/ [https://perma.cc/K3VN-JFJD].
  95. Id.
  96. Id.
  97. Id.
  98. Id.
  99. See Corpus of Supreme Court Opinions of the United States [hereinafter COSCO-US], https://lawcorpus.byu.edu/coscous/concordances [https://perma.cc/Y9L4-8EVG].
  100. See generally James C. Phillips & Jesse Egbert, Advancing Law and Corpus Linguistics: Importing Principles and Practices from Survey and Content-Analysis Methodologies to Improve Corpus Design and Analysis, 2017 BYU L. Rev. 1589, 1613–14 (2017).
  101. COHA, supra note 94. To calculate this number, we subtracted the number of words from the 1970s–2010s, as well as half of the words for the 1960s, a combined total of 176,666,079 words, from the total words in COHA (475,031,831), resulting in a total of 298,365,752 words.
  102. Searching foreign tribunal in COHA yields both singular and plural results.
  103. And one of the hits from COHA came from a legal source: Kent’s Commentaries on American Law. See James C. Phillips & Jesse Egbert, Appendices to a Corpus Linguistic Analysis of “Foreign Tribunal,” at app. 1 (Mar. 20, 2022) [hereinafter Appendices], https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4052959 [https://perma.cc/KYR3-3CS2]; James Kent, Commentaries on American Law, 24 N. Am. Rev. 345, 358 (1827).
  104. See Appendices, supra note 103, at app. 1.
  105. COSCO-US, supra note 99.
  106. See Appendices, supra note 103, at app. 2.
  107. We note that sometimes the term foreign tribunal in referring to courts referred to courts outside of a state’s jurisdiction but not in a foreign country. Thus, to a Maryland state court, a New York state court is sometimes referred to as a foreign tribunal. This usage seemed to occur most often in the context of personal jurisdiction. See, e.g., Hanson v. Denckla, 357 U.S. 235, 250–51 (1958) (“As technological progress has increased the flow of commerce between States, the need for jurisdiction over nonresidents has undergone a similar increase. At the same time, progress in communications and transportation has made the defense of a suit in a foreign tribunal less burdensome.”).
  108. See Corpus of U.S. Caselaw, https://lawcorpus.byu.edu/cusc;showCorpusInfo=true/concordances [https://perma.cc/ZVG9-QCLW].
  109. See Appendices, supra note 103, at app. 3.
  110. See The Ciano, 58 F. Supp. 65, 66–67 (E.D. Pa. 1944) (“I am persuaded to the views set forth in The Edam case, supra, as it seems to me that these provisions are not in a true sense, clauses providing for arbitration, but rather clauses and agreements which attempt to give preference to one court over another, and to attempt to construe then as real agreements for arbitration within the purview of the Arbitration Act would be to confer exclusively jurisdiction as here on a foreign tribunal . . . .”).
  111. Atl. Fruit Co. v. Red Cross Line, 276 F. 319, 322 (S.D.N.Y. 1921).
  112. Id. at 321–22.
  113. See Appendices, supra note 103, at app. 4.
  114. See Appendices, supra note 103, at app. 5.
  115. Vanni E. Treves, Jurisdictional Aspects of the Eichmann Case, 47 Minn. L. Rev. 557, 562–63 (1962).
  116. See Appendices, supra note 103, at app. 5.
  117. One coder deemed that in every instance a court was being referenced. The other coder determined that in eighty-five of the ninety-eight instances, a court was referenced, and the other thirteen instances the coder could not tell what kind of tribunal was being referred to.
  118. It is also possible that our coders may have been mistaken on a few of the results they coded, but that would only change our numbers at the margins. Of course, people may look for themselves at the data in our appendices.
  119. See Lawrence M. Solan & Tammy Gales, Corpus Linguistics as a Tool in Legal Interpretation, 2017 BYU L. Rev. 1311, 1315 (2017); Thomas R. Lee & Stephen C. Mouritsen, The Corpus and the Critics, 88 U. Chi. L. Rev. 275, 340 (2021).
  120. While one could also do collocate analysis here (i.e., seeing which words collocate most frequently with arbitration), we did not because we felt the results would be too muddied by multiple hits from the same document.
  121. We were not always sure whether a Letter to the Editor was multiple letters or one, so we left all of those in the data.
  122. See Appendices, supra note 103, at app. 6.
  123. Given this did not involve such a subjective judgment as determining which sense was being used, but rather just whether a word was being used, we only used one coder for this coding.
  124. This is the percentage of the total times we were able to identify an entity type, which was thirty.
  125. See Sailing on a Pledge, Time, May 6, 1957, at 22.
  126. See Appendices, supra note 103, at app. 6.
  127. See Appendices, supra note 103, at app. 7.
  128. This is the percentage of the total times we were able to identify an entity type, the total of which was twenty-four.
  129. See Frelinghuysen v. Key, 110 U.S. 63, 73 (1884).
  130. See Proprietors of the Charles River Bridge v. Proprietors of the Warren Bridge, 36 U.S. (11 Pet.) 420, 473, 568 (1837).
  131. See United States v. Realty Co., 163 U.S. 427, 441 (1896); Williams v. Heard, 140 U.S. 529, 531 (1891); United States v. Weld, 127 U.S. 51, 52 (1888).
  132. Frankfurter, supra note 6, at 528.
  133. Our study was discussed during oral argument. For our response, see Eugene Volokh, Corpus Linguistics in the Supreme Court, Reason: The Volokh Conspiracy (Mar. 24, 2022, 12:28 PM), https://reason.com/volokh/2022/03/24/corpus-linguistics-in-the-supreme-court/ [https://perma.cc/3YWM-QB8Q].
