Three Observations about Justifying AI

Written by: Anantharaman Muralidharan, G Owen Schaefer, Julian Savulescu
Cross-posted with the Journal of Medical Ethics blog

Consider the following kind of medical AI. It consists of two parts. The first part is a core deep machine learning algorithm. Such blackbox algorithms may be more accurate than human judgment or interpretable algorithms, but are notoriously opaque about the basis on which a decision was made. The second part is an algorithm that generates a post-hoc medical justification for the core algorithm’s output. Algorithms like this are already available for visual classification. When the primary algorithm identifies a given bird as a Western Grebe, the secondary algorithm provides a justification for this decision: “because the bird has a long white neck, pointy yellow beak and red eyes”. The justification goes beyond a mere description of the provided image or a definition of the bird in question: it links the information in the image to the features that distinguish the bird. The justification is also sufficiently fine-grained to account for why the bird in the picture is not a similar bird such as the Laysan Albatross. It is not hard to imagine that such an algorithm will soon be available for medical decisions, if it is not already. Let us call this type of AI “justifying AI” to distinguish it from algorithms which try, to some degree or other, to wear their inner workings on their sleeves.

It might turn out that the medical justification given by the justifying AI sounds like pure nonsense. Rich Caruana et al. present a case in which an algorithm deemed asthmatics to be at lower risk of dying from pneumonia. As a result, it prescribed less aggressive treatments for asthmatics who contracted pneumonia. The key mistake the primary algorithm made was that it failed to account for the fact that asthmatics who contracted pneumonia had better outcomes only because they tended to receive more aggressive treatment in the first place. Even though the algorithm was more accurate on average, it was systematically mistaken about one subgroup. When incidents like these occur, one option is to disregard the primary AI’s recommendation. The rationale is that, by intervening in cases where the blackbox gives an implausible recommendation or prediction, we could hope to do better than we would by relying on the blackbox alone. The aim of having justifying AI is to make it easier to identify when the primary AI is misfiring. After all, we can expect trained physicians to recognise a good medical justification when they see one, and likewise to recognise bad justifications. The thought here is that the secondary algorithm generating a bad justification is good evidence that the primary AI has misfired.

The worry here is that our existing medical knowledge is notoriously incomplete in places. It is to be expected that there will be cases where the optimal decision vis-à-vis patient welfare does not have a plausible medical justification, at least based on our current medical knowledge. For instance, lithium is used as a mood stabilizer, but the reason why it works is poorly understood. This means that ignoring the blackbox whenever a plausible justification in terms of our current medical knowledge is unavailable will tend to lead to less optimal decisions. Below are three observations that we might make about this type of justifying AI.

  1. AI and the need for justification

Despite the above, there seems to be a persistent intuition: there are at least some instances, especially when others make decisions that adversely impact us, wherein we are owed a justification for those decisions. Furthermore, in at least some of these cases, we find the following sort of response unsatisfactory: “While we can’t tell you exactly why, according to our algorithm, you are not eligible for chemotherapy and our algorithm is rarely wrong”. An even more extreme response: “According to our best theories you are eligible for chemotherapy, but the algorithm which we believe to be very reliable says otherwise”. In short, the thought here is that in some cases where the outputs of algorithm and theory conflict, or where theory cannot justify the algorithm’s output, complying with the algorithm’s decision is unjust. This intuition needs more theoretical grounding. The basic claim here is that failing to offer a justification is problematic because it indicates that the decision is not justifiable. That is, the decision would not be rational for the agent from the moral point of view. It is the latter which is wrong.

Partly, this would be a matter of failing to account for the patient’s well-being, autonomy or considerations of distributive justice. Suppose the chemotherapy would not prolong life but would improve quality of life. It might nevertheless be that the algorithm regarded the patient as ineligible for the treatment by considering only whether life would be prolonged, without considering other factors. It seems unjust to make decisions about the patient without accounting for their values. This lines up with a central Kantian thought that in order to treat people as persons, we need to respect their capacity to reason.

Part of this is also a matter of evidential support. Even if the algorithm is fairly reliable in general, if the connection between the evidence and the decision is inexplicable, then the decision in this particular case might be poorly grounded. The thought here is that there is a distinctive requirement to make justifiable decisions (i.e. ones well-grounded by the evidence) and this requirement falls out of the requirements of rationality. If it is wrong for me to serve you a glass containing petrol instead of whiskey, it is also wrong to serve you a glass which I’m justified in believing contains petrol.

Demands for justifiability of AI decisions fall out of this more general duty to make justifiable decisions. AI-assisted decisions are not, in this regard, special. They are merely being held to the same standards we ought to have for our other decisions.

What does justifiability require? It does not require that we know the inner workings of the algorithm. It only requires physicians be able to give good reasons for their decisions. AI systems must be designed to enable physicians to justify their treatment recommendations. Justifying AI may be sufficient to do this. Any competent medically trained professional should be able to distinguish spurious medical justifications from genuine ones.

  2. Is justifiability only instrumental to good outcomes?

All this talk about justifiability seems to suppose that good outcomes alone can never justify a decision. Why isn’t knowledge that the algorithm is extremely reliable sufficient to satisfy our evidence-relative obligations? Why are medical justifications required?

One answer to this question is that if we rely purely on the result of blackbox algorithms, we will not advance our medical knowledge. Physicians typically learn and improve their skills over time. Senior clinicians are more knowledgeable than their juniors. By providing no justification for why a given treatment is recommended, blackbox AI threatens the ability of physicians to learn. This will in turn be detrimental to future reliability. But not everyone would be satisfied with this answer. After all, the primary duty of physicians is to treat the patient in front of them.

Such objections are perhaps a bit too quick. While physicians indeed ought to do their best to treat the patient in front of them, this is constrained by the diagnostic and treatment options that are available to them in a given institutional context. As a question of institutional design, the addition of blackbox AIs to the physician’s toolkit may be detrimental in the long term.

Another reason why medical justifications have independent value is that a good medical justification does more than merely make an accurate diagnosis or prescribe an optimal treatment regimen. A good medical justification also reflects medical understanding. Medical understanding of a patient’s condition allows the physician to help the patient across a variety of counterfactual or unexpected scenarios. Good medical understanding thus provides a degree of robustness to the reliability of certain medical expertise. Medical expertise is robust as well as reliable if and only if it is not just reliable in the current situation, but also reliable across a range of counterfactual or unexpected scenarios. Otherwise, the reliability is fragile.

This is important because the physician-patient relationship is one that involves a certain degree of trust. Patients trust that their physician would be able to give the right advice (within reason) across a range of situations. This trust in the physician is therefore warranted only if the accuracy of the physician’s diagnosis (or the goodness of the medical decision) is robust to a certain degree. The worry with blackbox AI is that even if we know the AI is reliable under conditions relevantly similar to our own, this reliability may very well be fragile.

The problem, as it were, is that medical understanding is far from perfect. Even so, robustness of the reliability provides an independent desideratum to weigh against degree of reliability. There could thus be some situations where we could trade off the reliability of an algorithm against our ability to provide medical justifications.

  3. Is consent sufficient to resolve the tension between reliability and justification?

The third observation focuses on the following type of case. Suppose that a physician consults a justifying AI. The core algorithm outputs a result for which the secondary algorithm cannot give a satisfactory medical justification. The physician then says the following: “There are two treatment options X and Y that could potentially relieve your symptoms, each with similar side-effects. According to my clinical judgment, X is the most likely to relieve your symptoms for reasons R1, R2 and R3. However, when I inputted your data into this algorithm, the result suggested Y was most likely to relieve your symptoms. I cannot explain why, but I can point to various studies that show this algorithm is more reliable than the average clinician’s judgment.” The physician then lets the patient choose between X and Y.

Given that there is a tension between requirements to make highly reliable decisions and requirements to make justifiable decisions, can letting patients choose between the more reliable (but less justifiable) option and the less reliable (but more justifiable) option be a way out of the dilemma?

This approach clearly has limitations:

  1. In some situations, patients may lack the mental capacity to choose between different options, for instance, when a patient is wheeled unconscious into A&E and urgent treatment is needed.
  2. In triage situations where we are choosing which of two patients to treat, it does not make sense to ask the patients to choose.
  3. Patients in general tend to have a limited ability to make adequately informed decisions even if they have the capacity to consent to treatment. An aggravating factor in this case is that patients may be especially poorly positioned to evaluate the comparative merits of the new AI technology.


What ties these three observations together is that they spring from a concern with reasons. Firstly, once we get clear on how reasons relate to normativity, the justifiability requirement becomes well grounded. Secondly, looking at how we actually use and aspire to use reasons helps explain why relying on the blackbox alone is insufficient. Thirdly, looking at how reasons relate to autonomy and consent helps us address one potential solution to a dilemma. Nevertheless, one thing is clear: questions around ethical deployment of AI in medicine and beyond are themselves dependent on more fundamental features of human rationality, which require further theoretical investigation.


One Response to Three Observations about Justifying AI

  • Ian says:

    Thank you for an interesting article. When the police started using computerised systems many years ago to identify and justify the charges to lay against an offender, together with the correct evidential standards and requirements, the system appeared to create the possibility of reducing the memory requirements and expertise within the human officers. I am not aware of how those systems have developed since that time, but the issues of expertise you raise seem very similar, i.e. do you rely upon the expertise of the programmers (including their own biases) and on the data which informs the system being input correctly, or upon the human element attempting to interpret any output?
    Nick Bostrom’s Superintelligence: Paths, Dangers, Strategies (Oxford University Press) may be of interest, as it maps out many of the issues you raise and further extends some thinking in different directions. For example, where in order to extend the AI’s capabilities further it programmes itself, or they programme themselves, as a sort of virtuous circle of AI with all the potential possibilities and dangers there.
    For myself, the strict distinction between rationality and reason that you appear to use is not really reflective of what actually happens in thinking most of the time, and so AI which incorporates those strict distinctions would become flawed in its reflection of the wider human element. The focus becomes too refined/blinkered, eventually resulting in issues similar to the global ones being suffered today.
