Ruminate on this!

No doubt you enjoy puzzles:

There is a group of animals, one third of which are sheep. One animal was examined and found to have four legs. The scientist who did that examination tells us that the result “four legs” was three times more likely if the examined animal was a sheep than if it was not a sheep. What is the probability that the examined animal was a sheep?

Most trial lawyers should be able to correctly answer that almost instantly. I have posted the answer in my comment on my alternative site.


Legitimate presuppositions about guilt and innocence

Before the evidence involving the defendant’s acts in a case is considered, an assumption may be made about the likelihood that the defendant is guilty. This is different from the presumption of innocence, which simply means that the prosecutor must prove that the defendant is guilty and that the defendant does not have to prove innocence (see R v Wanhalla 24/8/06, CA321/05, [2007] 2 NZLR 573, (2006) 22 CRNZ 843 at [49], mentioned here on 25 August 2006). But – at least on a mathematical approach to conditional probabilities – there must be some starting assumption about whether the defendant is guilty in order that the effect of evidence can be determined. The ultimate decision, the verdict, will depend on how the evidence has affected the prior likelihood of guilt.

The “priors” can be expressed as a likelihood of guilt compared to a likelihood of innocence, each assessed before the evidence as to what the defendant did is considered. A ratio of probabilities is another way of expressing this comparison of likelihoods. Are there appropriate numbers for making this comparison?

For some people – in the absence of evidence on the point given at trial – a fair starting point may be a probability of guilt of 0.02 (that is, two chances out of a hundred) and a corresponding probability of innocence of 0.98. Almost certainly innocent, but recognising that there could be room for error about that, seems like a fair starting point. This might be the same ratio of priors as for anyone chosen at random. Indeed, a criterion of random selection can lead to very small probabilities of guilt, for example if the population of a large city is taken as the reference group.

Other people might say, well, the defendant is either guilty or not guilty, so an equal chance of each alternative is a neutral starting point. For these people a probability of guilt of 0.5, and the same probability of innocence, is a fair starting point. Refusing to start with an inclination either way seems fair.

Currently, it is not routine for this to be mentioned in a trial. This may be because it is by no means clear that a starting point is necessary: why not just listen to the evidence and get on with it? The reason is that logical errors are likely to occur. A fact-finder will naturally ask, how much more consistent with guilt than with innocence is this evidence? This is the same as asking, what is the probability of the evidence existing on the assumption that the defendant is guilty, compared with the probability of that evidence existing on the assumption that the defendant is innocent? Having estimated that ratio, it would be tempting, but wrong, to conclude that the ratio expressed the defendant’s probability of guilt compared to probability of innocence. For example, if the issue to be determined was whether an unseen animal was a sheep, and the evidence was that it had four legs, the probability of getting the evidence that it had four legs if it was a sheep (P = 1) is not the same as the probability that it was a sheep if all that is known is that it was a four-legged animal. The error is called transposing the conditional.
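The difference between the two conditional probabilities can be seen with a small count. The numbers below are invented purely for illustration (10 sheep, all four-legged, among 70 four-legged animals) and do not come from any case or from the puzzle above.

```python
# Hypothetical counts, chosen only to illustrate the point.
sheep_total = 10          # every sheep in the group has four legs
sheep_four_legged = 10
other_four_legged = 60    # four-legged animals that are not sheep

four_legged_total = sheep_four_legged + other_four_legged

# P(four legs | sheep): how likely the evidence is if the animal is a sheep.
p_evidence_given_sheep = sheep_four_legged / sheep_total

# P(sheep | four legs): how likely the animal is a sheep, given the evidence.
p_sheep_given_evidence = sheep_four_legged / four_legged_total

print(p_evidence_given_sheep)            # 1.0
print(round(p_sheep_given_evidence, 3))  # 0.143
```

On these invented figures the evidence is certain if the animal is a sheep, yet the animal is a sheep only about one time in seven.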

Another reason for the priors not being mentioned at trial may be that there is no need to do so. Some evidence setting the scene, background evidence, is likely to have been given as part of the narrative. For example, if a crime was committed by a person in a building, video surveillance evidence may be that only 10 people were in the building around the relevant time, including the defendant. This supports priors of P’(G) / P’(NG) = 0.1 / 0.9. Another example is where it is conceded by the prosecutor that only one of two people could have committed the crime, the defendant being one. It would be intuitive to think that this gave equal priors of P’(G) = P’(NG) = 0.5. But the prior likelihood of each suspect being the offender may not be equal, and the question becomes to what extent should the fact-finder be given evidence of the unevenness of the respective prior likelihoods.

To get from the evidential likelihood ratio P(E|G) / P(E|NG) [which is read as: the probability of getting the evidence, given that the defendant is guilty, compared with the probability of getting the evidence, given that the defendant is innocent] to the ultimate issue ratio of P(G|E) / P(NG|E) – that is, to legitimately achieve the transposition – it is necessary to multiply the likelihood ratio for the relevant issue by the priors. The need to do this comes from mathematical logic, in a rule known as Bayes’ Rule or Bayes’ Theorem. A form of the rule useful for lawyers is the “odds form of Bayes’ Rule” described, for example, in Bernard Robertson, GA Vignaux and Charles EH Berger, Interpreting Evidence – Evaluating Forensic Science in the Courtroom (2nd ed, John Wiley and Sons Ltd, Chichester, 2016) at 189, [A.2.7]. The logic applies to all forms of conditional probability evidence, not just to scientific evidence. And anything, the probability of occurrence of which varies according to context, can be expressed in terms of conditional probability.
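As a rough sketch of the odds form of the rule, the multiplication can be written in a few lines of Python. The function names are hypothetical, and the likelihood ratio of 18 is an invented figure; only the prior of 0.1 echoes the building example above.

```python
def posterior_odds(prior_guilt: float, likelihood_ratio: float) -> float:
    """Odds form of Bayes' Rule: posterior odds = likelihood ratio x prior odds."""
    prior_odds = prior_guilt / (1.0 - prior_guilt)
    return likelihood_ratio * prior_odds


def odds_to_probability(odds: float) -> float:
    """Convert odds in favour of guilt into a probability of guilt."""
    return odds / (1.0 + odds)


# Prior of 0.1 (one of ten people in the building) and, as an assumption,
# evidence 18 times more likely if the defendant is guilty than if innocent.
odds = posterior_odds(prior_guilt=0.1, likelihood_ratio=18.0)
print(round(odds, 3))                       # 2.0
print(round(odds_to_probability(odds), 3))  # 0.667
```

On these assumed figures, evidence that looks strong on its own still leaves the probability of guilt at only about two in three, because the priors started at one in ten.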

This ratio of priors is the starting point mentioned above, and the problem is, how should it be assessed? The risk is that individual jurors might choose different starting points and indeed may choose any position between the alternatives mentioned above. This is why sufficient evidence needs to be given to establish the prior probabilities.

People who think that the priors should be P’(G) = P’(NG) = 0.5 have the advantage of being able, without error of logic, to say that P(G|E) / P(NG|E) = P(E|G) / P(E|NG). This is because, for them, the ratio of the priors is 1 and so does not affect the calculation. Using Bayes’ formula reveals that to find the defendant guilty, a person who starts by understanding the priors to mean P’(G) = P’(NG) = 0.5 will only need the combined (that is, multiplied) likelihood ratios of the other evidence in the case to be about 50 to 1: meaning that the combined evidence is 50 times more likely to have been obtained if the defendant is guilty than if the defendant is innocent.

But a person who understands the priors to mean P’(G) = 0.02 and P’(NG) = 0.98, will, to find the defendant guilty, require the evidence to be about 2400 times more likely to have been obtained if the defendant is guilty than if the defendant is innocent. Leaving the assessment of the priors to individual jurors has obvious dangers.
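The figures of about 50 and about 2400 can be checked by rearranging the odds form of Bayes’ Rule. The sketch below assumes that proof of guilt corresponds to a probability of guilt of roughly 0.98; that threshold is an assumption made here because it reproduces the two figures, and the function name is hypothetical.

```python
def required_likelihood_ratio(prior_guilt: float, threshold: float) -> float:
    """Likelihood ratio needed to lift the prior to the threshold probability.

    Rearranges posterior odds = likelihood ratio x prior odds.
    """
    threshold_odds = threshold / (1.0 - threshold)
    prior_odds = prior_guilt / (1.0 - prior_guilt)
    return threshold_odds / prior_odds


THRESHOLD = 0.98  # assumed probability of guilt needed for a guilty verdict

print(round(required_likelihood_ratio(0.5, THRESHOLD)))   # 49
print(round(required_likelihood_ratio(0.02, THRESHOLD)))  # 2401
```

The two jurors apply the same standard of proof, but the one who starts at 0.02 needs evidence roughly 49 times stronger than the one who starts at 0.5.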

In a civil case, for example an action for compensation for wrongful conviction, the ultimate issue must be proved to a probability of at least just over 0.5. Again, the level of proof required of the evidence depends on the priors. In civil cases it is especially tempting to think that priors of 0.5 each way are fair. To succeed in a claim for compensation the former defendant (now, plaintiff) would have to prove that the evidence in the criminal trial was slightly more likely to have been obtained if the defendant had been innocent than it was to have been obtained if the defendant had been guilty. But it still may be objected that the prior assumption of a probability of guilt of 0.5 is too high and that the probability attaching to a randomly chosen person should be used.

So a person who has been found not guilty, even on the assumption that the priors are 0.5 each way, may nevertheless fail to obtain compensation: this is because, although the evidence was less than 50 times more likely to have been found if the defendant was guilty than it was to have been found if the defendant was innocent, it may have still been more likely to have been found if the defendant was guilty than if the defendant was innocent.
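A short numerical sketch may make that gap concrete. It assumes equal priors, an invented likelihood ratio of 10, a criminal threshold of about 0.98 (an assumption, chosen to match the 50 to 1 figure) and a civil threshold of just over 0.5; the function name is hypothetical.

```python
def probability_of_guilt(prior_guilt: float, likelihood_ratio: float) -> float:
    """Posterior probability of guilt from the odds form of Bayes' Rule."""
    odds = likelihood_ratio * prior_guilt / (1.0 - prior_guilt)
    return odds / (1.0 + odds)


# Assumed figures: equal priors and evidence 10 times more likely under guilt.
p_guilt = probability_of_guilt(prior_guilt=0.5, likelihood_ratio=10.0)

acquitted = p_guilt < 0.98            # falls short of the assumed criminal standard
compensated = (1.0 - p_guilt) > 0.5   # innocence must be more probable than not

print(round(p_guilt, 3), acquitted, compensated)  # 0.909 True False
```

On these assumed numbers the defendant is acquitted, yet as plaintiff cannot show innocence on the balance of probabilities.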

The point is that to make presuppositions about the defendant’s guilt or innocence legitimate, those probabilities must be assessed from evidence given at trial.

It is appropriate to ask whether assessment of evidence outside a trial context should attract the same logic. For example, does the logic apply to assessing the sufficiency of evidence to meet a requirement of reasonable grounds to suspect that evidence will be found in a search? As may be illustrated by the case I discussed here on 31 July 2017, some judges might think it does, some that it doesn’t. Judicial explanations do not go far enough for us to be sure.

I should add that when mentioning “guilt” in the above discussion I am referring to single-issue cases (for example, who did it, or was it done intentionally?). Where several issues are at play in a case, guilt on each will need to be considered separately. That will avoid the swamping effect of a large likelihood of evidence being obtained on one issue (for example DNA evidence proving the defendant’s presence) overwhelming proof of another issue (such as the defendant’s state of mind).