
How do judges learn from precedent?

Scott Baker and Anup Malani1

Introduction

Federal appellate judges cite cases decided by sister circuits. Why? Common wisdom holds that judges look to out-of-circuit cases for credible legal arguments, that is, as persuasive precedent. But hard-core adherents of the attitudinal model (Segal and Spaeth 2002), along with certain cynical realists and critical legal studies scholars,2 might argue that judges decide in accordance with their policy or political preferences and cite cases only to cover up that fact or to legitimize acting on those preferences.

In this paper we address that dispute while asking some more basic questions about how judges learn from prior, persuasive precedent. We ask whether judges put any weight on non-binding precedents or decide based only on the case before them and/or their own preferences; whether judges weight all precedents equally or adjust based on inferences about how confident prior judges were in their opinions or about the quality of those judges; and whether judges' opinions convey to future jurists all the information that their authors gleaned from the cases before them, i.e., whether judges have full information about everything prior judges learned.

These questions matter because the manner in which judges learn affects not only how we model them and predict their behavior, but also their legitimacy and how much authority we ought to give them. For example, if judges place weight on prior non-binding precedent, then it becomes less plausible that they are the purely political actors that the strongest version of the attitudinal model suggests. Alternatively, suppose we find that judges are subject to information cascades: they rely on prior opinions that do not convey all the information those opinions' authors had. In that case, we may want either to limit the jurisdiction of judges or do the opposite and empower them.
Specifically, we may want to allow judges even more leeway not to publish. With that leeway, judges can self-censor and avoid sparking a cascade when they think a decision rests on a weak evidentiary basis, i.e., it could easily have been decided the other way.

Our approach first presents a number of models of judicial learning, many taken from the existing literature, that offer different answers to these questions. We then test their conflicting predictions in order to discriminate between them. For our tests, we employ nearly 1,000 sequences of federal appellate cases that address a common legal issue, e.g., whether the Family Educational Rights and Privacy Act allows private rights of action or the Coal Act permits successor liability. Each sequence contains at least one circuit split. We code the different decisions made by the circuit courts in a sequence as A, B, C, etc., depending on whether the circuit agreed or disagreed with the circuits that considered the issue previously. We end up with sequences like ABA, AAABB, ABBC, etc.

1 Washington University and University of Chicago, respectively. We thank Kaushik Vasudevan, Kevin Jiang, Sarah Wilbanks, Bridget Widdowson and Ray Mao for outstanding research assistance. We also thank Charles Barzun, William Hubbard, Richard McAdams, Tom Miles, Bruce Owen, workshop participants at Washington University School of Law, the Max Planck Institute in Bonn, Northwestern Law School, and ZTH, and participants at the Harvard Conference on Blinding, the L&E Theory Conference at Yale, the American Law & Economics Association, and the Center for Empirical Legal Studies for helpful comments.

2 For a nice overview of legal realism and a response to its crudest characterization, see Leiter 2003.
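The coding step described above can be illustrated with a short sketch. This is our own simplified illustration, not the paper's actual data pipeline; the function name `code_sequence` and the input format (an ordered list of raw outcomes, one per circuit) are hypothetical.

```python
def code_sequence(decisions):
    """Map each distinct outcome to A, B, C, ... in order of first
    appearance, so the first circuit's decision is always 'A', the first
    disagreeing decision is 'B', and so on."""
    labels = {}
    coded = []
    for d in decisions:
        if d not in labels:
            labels[d] = chr(ord("A") + len(labels))
        coded.append(labels[d])
    return "".join(coded)
```

For example, a sequence of raw outcomes `["no right", "no right", "right", "no right"]` would be coded as "AABA", regardless of which substantive answer came first.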
A typical sequence might look as follows:

    Circuit             Decision
    Ninth Circuit       A
    Eleventh Circuit    A
    Fifth Circuit       B
    Seventh Circuit     B
    Fourth Circuit      A

For each legal question the dataset contains the decision reached by each circuit and the order in which the circuits reached those decisions. In the example above, the dataset reveals that the Fourth Circuit decided the issue as an "A." We also know that the Fourth Circuit had access to four prior decisions. Those prior decisions split: two As (circuits that agreed with the Fourth Circuit) and two Bs (circuits that reached the opposite result). Finally, we know the order of the past decisions: two As followed by two Bs. Depending on how, if at all, judges learn, they might pay attention to both the number of decisions on each side of an issue and the order of those decisions.

These data are ideal for our purpose. They take advantage of the sequencing of cases to reveal how judges learn from prior cases. Moreover, because they contain splits, they present judges with the flexibility to weight prior cases in different ways, giving us substantial leverage to rule different models of judicial learning in or out.

Our first model posits that judges do not rely at all on prior precedents. This model replicates the hard-core attitudinal model, where judges decide based on their personal preferences, with no deference to the opinions or decisions of other courts. This assumption implies that decisions on the same legal issue should be independent of one another. We test this implication by looking for runs, sequences of two or more consecutive identical decisions (e.g., two or more As in a row), in sequences of cases addressing a common legal issue. We use bootstrap methods to generate an empirical distribution of runs under the assumption that decision order is random and compare the actual number and length of runs in our data to this empirical distribution.
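The runs test just described can be sketched as follows. This is a minimal illustration of the idea, not our estimation code: the names `count_runs` and `runs_pvalue` are our own, and the sketch uses a single-sequence permutation p-value where fewer runs than expected (more clustering) counts as evidence against independence.

```python
import random

def count_runs(seq):
    """Number of maximal blocks of identical consecutive decisions,
    e.g. 'AAABB' has 2 runs and 'ABAB' has 4."""
    return 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)

def runs_pvalue(seq, n_boot=10_000, seed=0):
    """One-sided permutation p-value: the share of random reorderings of
    the same decisions that produce at least as few runs as observed.
    A small value suggests the observed clustering is unlikely under
    independence (random decision order)."""
    rng = random.Random(seed)
    observed = count_runs(seq)
    letters = list(seq)
    hits = 0
    for _ in range(n_boot):
        rng.shuffle(letters)
        if count_runs(letters) <= observed:
            hits += 1
    return hits / n_boot
```

A sequence like "AAAABBBB" has only 2 runs, which very few random shufflings of four As and four Bs can match, so its p-value is small; "ABAB" alternates maximally and yields a p-value near 1.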
We are able strongly to reject the hypothesis that the decisions in our data are independent of one another.

After rejecting the strong political model, we next consider models of judges who learn from prior opinions. Such models vary on two dimensions. The first is whether the quality of the new information made available to each judge during adjudication, including information from lower court decisions, the factual record, and the litigants and their lawyers, and the quality of each judge herself are constant or vary across cases, i.e., whether cases are of "variable quality". The second dimension is whether judicial opinions reveal all the new, relevant information that their authors gleaned during adjudication, i.e., whether judges consulting prior cases have "full information" about the underlying reasons the prior judges decided the way they did. The structure of our analysis replicates models by Daughety and Reinganum (1999), who suggest that judges may not have full information and may thus be subject to information cascades, and Talley (1999), who suggests that judges do have full information and are thus protected from cascades. Variable quality adds a twist to the debate because it suggests that judges can mitigate the impact of cascades, to the extent they exist, by self-censoring, that is, declining to publish opinions that are of low quality and that conform to the cascade (Baker and Malani 2014).

Turning to empirics, suppose that case quality does indeed vary from judge to judge and from circuit to circuit. What should we expect to see in the circuit split data? Take a judge consulting prior precedent that contains a balanced history: a number of pairs of opposite decisions, like AB or ABAB. If case quality varies, this judge should be more likely than not to agree with the last case in the sequence. The reason is the inference about what the prior judges must have known when they decided their own cases.
If the immediate predecessor judge disagreed with the majority of earlier cases, then she must have had a great deal of confidence in the information from her case; otherwise, why create a split? Understanding as much, the judge examining a balanced history would weigh the decision of his immediate predecessor more heavily. Of course, such inferences are ruled out if, by assumption, all prior cases are of the same quality. When we test this non-parametrically in our sequences, we find that judges are indeed more likely to follow the last case in a balanced history, suggesting that case quality is variable.

We find it more challenging to test whether prior opinions convey to judges full information from prior cases. The reason is that information can be conveyed in a number of ways. At one extreme, a judicial opinion might simply report all the relevant information the judge gleaned from her own case, the litigation at hand. At the other extreme, an opinion might report a posterior belief about what the correct answer is. That belief would reflect all the information the judge gleaned from her own case, combined with all the information she culled from the prior cases. In the latter setting, we obtain a potentially testable prediction: judges' opinions should depend on the immediately prior opinion in a sequence, but not on earlier ones. That is, opinions should follow an AR(1) process, but not an AR(k), k > 1, process. Even this prediction is a challenge to test because, while it is possible to code decisions, it is harder to code, or quantify, opinions, or the value of the posterior belief each future judge could have extracted from those opinions. Nevertheless, we implement empirical tests using vector autoregressions on both decisions and dissents and with conditional tetrachoric correlations among decisions. The first of these tests suggests that judges do indeed have full information. The second test is still under construction.
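The balanced-history check described above can be sketched as follows. This is a simplified illustration under our own assumptions, not the paper's actual non-parametric test: the function name `follow_last_rate` is hypothetical, and the sketch pools every position whose prior history contains equal numbers of As and Bs (e.g. AB or ABAB) and asks how often the next decision matches the immediately preceding one. Under equal-quality models that share should be about one half; a share above one half is consistent with variable quality.

```python
def follow_last_rate(sequences):
    """Across many coded case sequences, find every position whose prior
    history is balanced (equal counts of A and B, no other letters) and
    return the share of next decisions agreeing with the last prior one."""
    follows = total = 0
    for seq in sequences:
        for i in range(2, len(seq)):
            hist = seq[:i]
            a, b = hist.count("A"), hist.count("B")
            if a == b and a + b == i:  # balanced, A/B-only history
                total += 1
                follows += seq[i] == hist[-1]
    return follows / total if total else float("nan")
```

For instance, given the toy sequences ["ABB", "ABA", "ABABA"], the balanced histories are "AB" (three times) and "ABAB" (once), and only "ABB" follows its last prior decision, giving a rate of 0.25.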
The remainder of our paper is organized into two halves. The first presents our various models of judicial learning and the testable predictions from each. The second presents our empirical tests of each prediction.