Until recently, machine ethics was a topic reserved for science fiction. Machines don’t rule our society yet, at least not in the way science fiction would have us believe. Yet, driven by easy access to data and the low cost of storage and computing, the use of machine-driven intelligence has grown significantly over the last two decades. Machine learning algorithms, or AI, now affect our lives in innumerable ways, making decisions that can have a material impact on the lives of the individuals affected by them.
These decisions range from whether an applicant is eligible for a mortgage, to whether a patient’s scan shows cancer, to the route you should take for your commute, and even to which jar of peanut butter you buy based on the results of a recommender engine. It’s therefore no surprise that the performance of the machine learning models behind these applications can significantly affect the lives of the individuals they make predictions about.
When machine learning is used to solve any of these problems, the model being deployed needs to be trained on historical data: past mortgage decisions, say, or scans labelled for cancer. Depending on which users are covered, or rather not covered, in the training data, and on any historical biases present in it, the model can perform better for one user group than for another. This introduces biases into the model’s predictions, which can then lead to discriminatory decisions by the application that uses the model.
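A minimal sketch of what such a gap can look like in practice: comparing a model’s accuracy separately for each user group. The data and group labels here are entirely made up for illustration, not taken from the book.

```python
def accuracy_by_group(y_true, y_pred, group):
    """Compute prediction accuracy separately for each group in the data."""
    totals, correct = {}, {}
    for yt, yp, g in zip(y_true, y_pred, group):
        totals[g] = totals.get(g, 0) + 1
        correct[g] = correct.get(g, 0) + (yt == yp)
    return {g: correct[g] / totals[g] for g in totals}

# Illustrative labels and predictions for two hypothetical user groups.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 1, 0, 1]
group  = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

print(accuracy_by_group(y_true, y_pred, group))  # → {'A': 1.0, 'B': 0.0}
```

Aggregate accuracy here is 50%, which looks mediocre but unremarkable; only the per-group breakdown reveals that the model works perfectly for one group and fails entirely for the other.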
The responsibility of building the model often lies with the data scientists in the team. However, the responsibility to make the model and the resulting software product fair and ethical should belong to all roles within the team. The process should really begin when we first describe the business problem we want to solve and not when the data scientists write the loss function.
In our book, we propose that the responsible AI lifecycle is not separate from the standard data science lifecycle. It consists of investigations and actions we need to perform in order to identify biases and discrimination, such as:
- examining proxy features and their impact on fairness,
- defining the parameters that AI model training needs to optimise,
- choosing the metrics that help us evaluate model performance,
- addressing drift over time in both the data and the model,
- adding privacy protections to the data as well as the model,
- identifying the actions we can take when we find an issue, with explainability at every stage.
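To make one of these investigations concrete, here is a small sketch of a common group-fairness check: the demographic parity difference, i.e. the gap in positive-prediction rates between groups. The function name, data, and group labels are illustrative assumptions, not the book’s implementation.

```python
def demographic_parity_difference(y_pred, group, positive=1):
    """Largest gap in positive-prediction rate between any two groups."""
    rates = {}
    for g in set(group):
        preds = [p for p, gg in zip(y_pred, group) if gg == g]
        rates[g] = sum(p == positive for p in preds) / len(preds)
    vals = sorted(rates.values())
    return vals[-1] - vals[0]  # 0.0 means equal rates; larger means more disparity

# Illustrative predictions: group A receives far more positive outcomes than B.
y_pred = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
group  = ["A"] * 5 + ["B"] * 5

print(demographic_parity_difference(y_pred, group))  # → 0.6 (0.8 for A vs 0.2 for B)
```

A gap of 0.6 would be a strong signal to dig into the training data and features before deployment; libraries such as Fairlearn provide production-ready versions of metrics like this.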
We also acknowledge that many teams will only consider responsible AI after their product is live. For them, responsible AI involves performing similar investigations, followed by actions that reduce discrimination and bias in the model output and introduce privacy protections, together with ongoing monitoring.
Achieving responsible AI requires contributions from every role in the product team; from product owner or manager to business analysts and data experts, everyone needs to come together to create a fair and ethical AI-driven solution.
For all the investigations proposed in the book, we go into detail about how to use them, the scenarios that make one option preferable to the others, and multiple ways of applying them. We hope you will enjoy reading it as much as we’ve enjoyed writing it, and that it provides practical tools you can use to integrate responsible AI into your data science lifecycle and build products that improve their users’ lives.