AI in government: considerations for ethics and responsibility
A roundtable discussion with Mind Foundry.
Decisions made by governments and other public sector organisations affect the lives of large numbers of people in profound ways every day. If ethics and responsibility are not considered while an AI solution is being designed, built, and implemented, the consequences can be unintended, unanticipated, and far-reaching.
Mind Foundry recently collaborated with the Scottish Government on a project that uses AI in the public sector to improve human outcomes at scale. We participated in this project through the CivTech programme and were asked to share our thinking on how to apply AI ethically and responsibly in the public sector as part of the CivTech5 Demo Week in February 2021.
Brian Mullins (Mind Foundry CEO), Dr Davide Zilli (Mind Foundry VP Applied Machine Learning), Dr Alessandra Tosi (Mind Foundry Senior researcher and specialist in AI Ethics) and Alistair Garfoot (Mind Foundry Product Owner) participated in the discussion on the importance of ethics and responsibility in AI.
What follows is a recap of that roundtable discussion.
What role do ethics and responsibility play in Mind Foundry’s approach to AI?
BRIAN MULLINS: This all comes back to what we call our pillars. They sit at the centre of what we make, and we think they lead to the right types of considerations as well.
First Principles Transparency is the idea that considerations for transparency can’t be made after the fact; they have to start from first principles, before you make any of your technology decisions and before you create your architecture. This is critical to having a system that you can rely on and understand.
Our second pillar is Human-AI Collaboration. We mean this both in the sense of intuitive design, which makes systems more intuitive for the user and easier to understand, and in the specific technologies we use to coordinate the interaction between humans and synthetic intelligence technologies — like human-agent collectives and active learning, which allow humans to work alongside AI in a way that accentuates the strengths and abilities of each, contributing to the end objective.
The final pillar, Continuous Meta-learning, refers to a suite of technologies that learn, and learn about the learning process, to help models adapt to changes in the world and in the shape of the data; to continuously learn and improve the models as the world changes so that nothing gets left behind. Continuous meta-learning not only helps humans and AI improve over time; it is a safeguard that helps us prevent the emergence of unintended consequences.
What are the problems associated with deploying AI ethically and responsibly in the Public Sector?
MULLINS: Understand first that it’s hard to do it the right way — harder than moving quickly. These technologies can seem very seductive when you see short-term, high-speed gains, but we hope a better understanding leads to the realisation that it doesn’t have to be a compromise. In fact, if you consider the total cost over the lifecycle of a system, making the right choices and having a fundamental understanding from the beginning can protect against the unforeseen costs of unanticipated outcomes produced by methods that were not understood, especially when deployed at the scale of the public sector. If you think responsible decisions with AI are expensive and take a long time to get right, look at the cost of the irresponsible ones.
How do you begin with ethical considerations?
MULLINS: One of the things that we do as an organisation is we look at cautionary tales.
The first one I’m going to talk about is immediately recognisable — the issues with A-level student grading in the summer of 2020. The two biggest problems were the chosen methods and the fact that the data was not representative of what they were trying to predict. This is a recurring issue that we see a lot. When the separation between the data and what it represents is not considered, the resulting outcomes are likewise not considered and become disassociated from the individuals in a way that is obviously problematic.
But there’s a second part to this, in that the right methods could have been chosen…and weren’t. If the decision isn’t made by the person who should make it, in conjunction with the AI, you lose a tremendous amount of context. For us, it’s not enough simply to condemn the method or to refuse to use it. There’s another choice, and one of the things we’d like to share is how that decision could be made in collaboration with the AI, in a way that leverages the context and the expertise of the people most involved.
In this case, the educator should make the decision. You don’t want to oversimplify the prediction to a binary pass or fail; you want the educator to understand the overarching context of the predictions and see where an intervention could be applied, in the context of that student, to their betterment.
Another cautionary tale illustrates where unintended bias can appear. Let’s say I have a model that’s been trained to understand natural language, can listen to the statements people make about a potential crime, and then tries to determine who the good guys and the bad guys are in those statements. That would be helpful, right?
Well, imagine that it overheard someone saying “Sherlock Holmes enters a stage.”
That statement, in and of itself, is not enough to know whether Sherlock is a good guy or a bad guy. But if this model has been trained by reading books that include the works of Sir Arthur Conan Doyle, the model probably already has a preconceived notion of what kind of person Sherlock Holmes is and will take that into consideration when predicting whether, in this particular instance of him “entering a stage”, Sherlock is a good guy or a bad guy. And it would be wrong.
Or, if it was right, it would be right for the wrong reasons, and that’s no good either.
Now, this is just a thought experiment to illustrate the considerations we have to make, but it isn’t purely hypothetical — the track record of predictive policing is one of using statistical methods to predict who will be a criminal in the future.
Getting it wrong with predictive policing, or getting it right but for the wrong reasons, has had a devastating impact on the lives of many individuals and the communities they’re a part of. We can, and must, do better.
What do we mean when we think about ethics in AI? Is there a best practice? Is it like the Hippocratic Oath in medicine, where we simply have to have our best intentions? Or is it more than that?
ALESSANDRA TOSI: Ethics is defined as a system of moral principles that govern a person’s behaviour, and we want to apply a set of moral principles to govern AI behaviour. So, there is a set of questions we need to ask here. First of all, what are these principles? There may be no agreement on the answer to this question, just as there is no agreement in general ethics in philosophy, so it’s an open discussion. The other important question is: how do we encode those principles into an AI system?
There’s no single agreed-upon way to do it, but there are definitely best practices, and we must keep asking these types of questions in each application.
ALISTAIR GARFOOT: People have attempted to encode those kinds of rules or laws into AI and robotic systems before. Take Isaac Asimov’s three laws of Robotics: a robot may not injure a human being or, through inaction, allow a human being to come to harm; a robot must obey the orders given it by human beings except where such orders would conflict with the First Law; and a robot must protect its own existence as long as such protection does not conflict with the First or Second Law. Is that the right approach?
MULLINS: Certain methods of AI have the potential to go wrong, and oversimplifying the considerations is likely to be problematic. The Asimov example is a great case in point: as many people fail to mention, the three laws of robotics were actually created to illustrate how such rules go wrong. Asimov created those rules to lull us into thinking they were a perfectly elegant solution, just so he could then show how a solution that looks perfectly elegant on the surface can still have unintended consequences.
In his stories, the three laws fail when actions within the correct parameters of each law contradict one another and produce emergent (and harmful) behaviour.
Asimov’s three laws are a great way to look at how, when a system is implemented, an over-simplification of methods can defeat itself with emergent complexity when the pieces come together.
It’s not enough to address ethical considerations in your project plan and then move forward; they have to be assessed on an ongoing basis as the world changes and as the system evolves. Likewise, it’s important to say that there is no such thing as an ethical algorithm or an ethical architecture. Claiming otherwise would be like asking, “Is there an ethical hammer?”.
You need to consider the entirety of the system, its output, and the direction of the intervention it produces when weighing the ethical considerations. We’re not going to solve the problems of philosophy or philosophical ethics; if humans haven’t done that in thousands of years, we’re not going to do it instantaneously. But we can use those frameworks to make better decisions and do the best we can, and keep an open mind so that when we learn something new, we can adapt, adjust, and update what we consider ethical, just as we would the technology.
It’s impossible to have a “one size fits all” approach to ethics. Every project must start from scratch. Does explainability solve these problems? Is unexplainable or complex modelling inherently bad?
DAVIDE ZILLI: Complex modelling is not inherently bad, and we shouldn’t shy away from complex models. In fact, most of the recent advances in Machine Learning and AI have been driven by these complex models. We as humans are not perfect ourselves, so expecting perfection from a Machine Learning model or an AI system is difficult. At the moment, we don’t live in a perfect world with perfect AI systems.
What we need to do is understand how these systems are constructed, so that we can be the arbiter of the performance of that system. Complex models can be powerful. Things like Siri on our phones and Alexa do something very advanced that we couldn’t do without those complex models. But, while we do this, we have to keep in mind that it’s a process in which we are learning, and so we need interpretability as a means to achieve something greater in the future.
GARFOOT: Humans are inherently uninterpretable, and humans have the ability to lie, whereas a Machine Learning model probably doesn’t have the intention to lie or to intentionally misdirect. Still, we trust humans with making these decisions, because humans are what we have to work with. In many cases, we’ve seen and demonstrated that humans are inaccurate in making these decisions, and that’s where the opportunity to build AI systems that can help the human is greatest — in places where the human is prone to making mistakes.
We generally make more mistakes on things that are very repetitive and require the same judgment over time, things that are away from our core skills of creativity and imagination. Those are the situations where we should elicit the help of an AI system. On the other hand, there are plenty of places where a human is still better at making the decision. That collaboration is where we’re going to get the most value in the next few years.
ZILLI: The combination of humans and AI is obviously very important to us at Mind Foundry. Alessandra, can you address the question of the role of humans in an AI system? How does that collaboration between humans and AI look?
TOSI: People have three key roles in an AI system. First, they design the AI systems. Second, they act on the outputs of these AI systems. Finally, they are also often the ones who receive the impact of the AI system’s actions when it is released into society, whether in healthcare or any other area that impacts human lives. When discussing ethics, there are some key questions humans have to ask before they start designing a system.
The outcome of an AI system is a prediction: some value or output in digital form. This alone is not something that will usually have an impact on society. The human works together with the algorithm, with the system, to interpret the outcome and, as a consequence, to decide on an action. When deciding which action to take, the human tries to maximise revenue or minimise risk, depending on whether the concerns are financial, healthcare, or insurance-related.
What is often overlooked is the fact that AI systems are not perfect: they have to deal with incomplete data sources, and they often produce outputs with some level of uncertainty. It’s the person’s role to interpret these outputs and take the uncertainty into account in order to make decisions. That’s the important role of the human.
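The decision-making role described above can be sketched in code. This is a minimal, hypothetical illustration (not a Mind Foundry implementation): the model's prediction is acted on automatically only when its probability is confidently near 0 or 1; anywhere in the uncertain middle band, the case is routed to a human reviewer.

```python
# Hypothetical sketch: route a model's prediction to a human reviewer
# whenever the predicted probability is too uncertain to act on alone.
# The threshold and actions are illustrative assumptions.

def decide(prob_positive, confident_band=0.2):
    """Return an action given a model's predicted probability.

    If the probability is close to 0 or 1 we act on it directly;
    anywhere in the uncertain middle band, a human makes the call.
    """
    if prob_positive >= 1.0 - confident_band:
        return "act: positive"
    if prob_positive <= confident_band:
        return "act: negative"
    return "refer to human reviewer"

for p in (0.95, 0.50, 0.05):
    print(p, "->", decide(p))
```

The width of the band encodes how much uncertainty the organisation is willing to automate over; widening it sends more cases to people, narrowing it automates more.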
If the algorithms and systems are designed together with the final users, the decision-makers, this can have a huge impact. That’s why working together with governments to produce AI systems that can then be deployed at scale in key sectors, like healthcare and education, is important and something one has to plan for from the beginning.
We cannot underestimate how important human-AI collaboration is in the solutions we build today.
In most cases, humans must act as the decision-makers working with the AI system because, the truth is, most of the AI systems available today are not mature enough to be left to act independently, or completely autonomously in society.
If an algorithm is simply replicating what humans are already doing, and if there is some bias in society today, which unquestionably exists in all manner of ways, is that an issue? Will that bias be conveyed to the algorithm?
TOSI: A bias is a behaviour of a prediction algorithm whose outcome systematically deviates from the expected or correct outcome. As humans, we know that we have bias in the form of prejudice, or preconceived notions, but bias also has a statistically precise definition that we can quantify in an AI system: bias is the difference between the expected value of an estimator and the true value of the quantity it estimates. This seems quite technical, but it’s important that we have this technical definition, because it allows us to identify and quantify the bias mathematically inside our AI algorithm.
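To make the statistical definition concrete, here is a small illustrative sketch (not drawn from the discussion) using the textbook example of a biased estimator: computing sample variance by dividing by n systematically underestimates the true variance, and the gap between the estimator's average value and the true value is exactly the bias being defined.

```python
import random

random.seed(0)

# Draw many small samples from a known distribution (true variance = 4.0)
# and compare two estimators of variance: dividing by n (biased) versus
# dividing by n - 1 (unbiased).
def sample(n):
    return [random.gauss(5.0, 2.0) for _ in range(n)]

def var_biased(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)        # divides by n

def var_unbiased(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)  # divides by n - 1

trials = [sample(5) for _ in range(20000)]
avg_biased = sum(var_biased(t) for t in trials) / len(trials)
avg_unbiased = sum(var_unbiased(t) for t in trials) / len(trials)

# The true variance is 4.0. The biased estimator's average sits below it;
# that gap is the bias: E[estimator] minus the true value.
print(f"biased: {avg_biased:.2f}, unbiased: {avg_unbiased:.2f}, true: 4.00")
```

Because the deviation is systematic rather than random, no amount of extra data makes it vanish on its own; it has to be identified and corrected, which is why having a quantifiable definition matters.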
GARFOOT: If we take a traditionally human bias and encode it into a machine algorithm or AI system, then that AI system will behave the same way. One opportunity here is to have the AI system help us reason about that bias and expose it in a more objective way. Judging other humans can be difficult, but judging a computer system is easier and can be done in a more structured and quantifiable way. We can train our AI systems so that we remove or expose some of that bias and it doesn’t affect us in an undesired way. It may be more difficult to change people, but gradually changing AI systems to make them less biased and fairer is a much easier thing to do.
TOSI: One strategy to avoid discrimination in our deployed AI algorithms is to enhance the explainability of the system, which is currently done in various ways. There are various metrics we can use to assess whether an algorithm is fair or not. And we must examine not just the algorithmic and system parts of the Machine Learning and AI systems we work with, but the data as well, because sometimes the bias comes directly from the data sources. There’s quite a common example in the healthcare domain.
For example, suppose I have an extensive study on the effects of a certain drug or vaccine in a population. If the study population does not represent the full population where I want to deploy, then the vaccine or new drug might not be good, because side effects may arise in parts of the population that I have not tested on. So bias in data collection is something we need to look at carefully for a fair outcome. The algorithm might be unbiased, but if the data are biased, the final AI system might be biased as well.
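The data-collection point can be simulated in a few lines. This is a hypothetical illustration (the groups and rates are made up, not taken from any real study): a side-effect rate estimated from a study that only samples one sub-population can look reassuringly low, even when the rate in the full deployment population is much higher.

```python
import random

random.seed(1)

# Hypothetical side-effect rates that differ between two sub-populations:
# group A reacts 2% of the time, group B 10% of the time.
RATES = {"A": 0.02, "B": 0.10}
POPULATION_MIX = {"A": 0.5, "B": 0.5}  # the deployment population is half A, half B

def has_side_effect(group):
    return random.random() < RATES[group]

# A study that only recruits participants from group A...
study = [has_side_effect("A") for _ in range(100_000)]
study_rate = sum(study) / len(study)

# ...versus the true rate across the full deployment population.
true_rate = sum(POPULATION_MIX[g] * RATES[g] for g in RATES)

print(f"study estimate: {study_rate:.3f}, population rate: {true_rate:.3f}")
```

The study's estimate is perfectly accurate for the people it sampled; the problem is entirely in who was sampled, which is why an unbiased algorithm over biased data still yields a biased system.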
Would explainability of an AI system actually solve that kind of issue?
ZILLI: If you train your system on data that is inherently racist and discriminatory, then your final system will inherit those properties as well. Transparency may not be the end goal, but it is certainly a good way to expose that and to help us correct those problems in the system we’re building.
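One of the simplest transparency checks of this kind, offered here as an illustrative sketch rather than a description of Mind Foundry's tooling, is to measure whether a system's positive-decision rate differs across groups (the demographic-parity gap). The decisions and group labels below are invented for the example.

```python
# Hypothetical audit: compare a system's positive-decision rate across
# two groups. A large gap is a signal worth investigating, not proof of
# discrimination on its own. All data here is made up for illustration.

decisions = [1, 0, 1, 1, 0, 1, 0, 0, 0, 0]  # 1 = favourable decision
groups    = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

def positive_rate(group):
    picked = [d for d, g in zip(decisions, groups) if g == group]
    return sum(picked) / len(picked)

gap = positive_rate("A") - positive_rate("B")
print(f"group A rate: {positive_rate('A'):.2f}, "
      f"group B rate: {positive_rate('B'):.2f}, gap: {gap:.2f}")
```

Demographic parity is only one of several fairness metrics, and the metrics can conflict with one another, so which check is appropriate depends on the application; the value of transparency is that the question can be asked and answered in a structured, quantifiable way at all.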
MULLINS: Those are the questions you need to keep in mind throughout the design process. Having an understanding of accountability is intimidating when you think about the scale at which governments apply these tools. You know you have bias in human decision-making, and we have checks in the system for it; we’re used to it. The AI models coming in now? The checks don’t exist for them yet, and they’re also super efficient at spreading their bias. So, if there’s a mistake, that mistake can get made over and over at scale. That’s dangerous, but at the same time it comes with an opportunity, because if we can make something iteratively better, it’ll do that at scale, too. We don’t need to be seduced by the idea that every decision will be 100% better. Let’s make them just a little bit better and scale that across whole populations, and the outcomes for people will be so much better. At the end of the day, it’s that potential and that desire to improve human outcomes and make the future of this world one that benefits everyone that drives everything we do at Mind Foundry.
And we can do it with AI. But we can’t do it alone.
We understand that Governments have uniquely demanding requirements for scalability, precision, reliability, compliance, and more.
We also understand the responsibility you carry when your decisions can affect the lives of all of your citizens.
Mind Foundry works with Governments to solve some of the most challenging and impactful problems. Contact our Government specialists to learn more about working with Mind Foundry.
Mind Foundry is an Oxford University company.
Operating at the intersection of innovation, research, and usability, we empower teams with AI built for the real world.
Founded by Professors Stephen Roberts and Michael Osborne, pioneers in the field of AI and Machine Learning, the mission of Mind Foundry is to create a future where AI and Humans work together to solve the world’s most important problems.
Mind Foundry has developed technology and products that help people bring machine learning closer to their work. Our platform is a new kind of Machine Learning platform: powerful enough to be trusted by experts and easy enough to be used by people throughout your organisation.
Built upon a foundation of scientific principle, organisations use Mind Foundry to empower their teams in entirely new ways.