Responsible Data Science & AI — Challenges and First Steps Towards a Practical Implementation

Oct 5, 2020

After major incidents, such as the Cambridge Analytica scandal and the alleged racial bias in the COMPAS system that assessed recidivism risk in the US, the call for responsible data science & AI frameworks grew louder. Books such as Weapons of Math Destruction, The Black Box Society, Automating Inequality, and Against Prediction also helped to raise awareness of the (unintentional) adverse effects of AI and data science systems. In the last year, interest in the ethical risks of data science has peaked: the Gartner 2020 hype cycle even mentions ‘responsible AI’ as an emerging technology. The major technology companies have published ethical guidelines on the use of data science and AI, and governance bodies and governments have proposed high-level principles. The Ethics Guidelines for Trustworthy AI by the High-Level Expert Group on AI is an important and leading example in Europe. The Dutch Central Bank has also published a discussion paper containing guidelines that are specific to the financial sector.

However, operationalizing the proposed frameworks within a corporate organizational structure appears to be challenging. The high-level principles stated in these frameworks often include items such as ‘transparency’, ‘explainability’, and ‘fairness’. Such principles are, of course, very appealing. However, there is no free lunch. Imposing ‘fairness’, for example, will in general lead to AI and data science systems that are less accurate than their unrestricted versions. Many companies will not have an explicit process or policy in place to facilitate decisions on such trade-offs. In fact, we suspect that data science departments often make such decisions themselves, while it could be argued that organization-wide strategic decision-making is actually required.

Zooming in on the ‘fairness’ principle, we see that implementing it in a data science workflow is, in fact, rather subtle. ‘Thou shalt not discriminate’ is a popular saying. However, discriminating between customers is often the main purpose of a data science system! Take loan applications as an example. On the basis of the applicant's data, models assess creditworthiness. Applications with a low assessed creditworthiness are rejected, or the applicant has to pay an additional risk premium. The question arises which variables may be used in models and how we should quantify the (un)fairness of a system. The academic literature has proposed several (mathematical) definitions, which are (in general) incompatible: satisfying one measure of fairness generally means violating others. As the anti-discrimination laws (in Europe) are not very explicit, it is very complicated to ensure that your data science systems fully comply with them. Within a company, the question arises which department should be accountable and responsible for translating these laws into concrete modeling choices. In addition, which departments have the required competencies? If one would like to incorporate additional ethical constraints, the situation becomes even more complex. What methods and processes can we use to identify whether there are groups that need to be ‘protected’ against discrimination? How do we ensure company-wide consistency in choices? Which departments should be involved in this process?
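
To make the incompatibility between fairness definitions a bit more tangible, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two groups "A" and "B", the toy repayment numbers, and the helper names are hypothetical and are not taken from the framework discussed in this post. The sketch compares two commonly used measures, demographic parity (equal approval rates per group) and equal opportunity (equal approval rates among applicants who would repay), for a hypothetical model that predicts repayment perfectly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Applicant:
    group: str      # hypothetical protected attribute: "A" or "B"
    repaid: bool    # true outcome (would the applicant repay the loan?)
    approved: bool  # decision made by the model


def selection_rate(applicants, group):
    """Share of applicants in the group whose loan is approved."""
    members = [a for a in applicants if a.group == group]
    return sum(a.approved for a in members) / len(members)


def true_positive_rate(applicants, group):
    """Share of repaying applicants in the group whose loan is approved."""
    repaying = [a for a in applicants if a.group == group and a.repaid]
    return sum(a.approved for a in repaying) / len(repaying)


def demographic_parity_gap(applicants):
    """Absolute difference in approval rates between groups A and B."""
    return abs(selection_rate(applicants, "A") - selection_rate(applicants, "B"))


def equal_opportunity_gap(applicants):
    """Absolute difference in true positive rates between groups A and B."""
    return abs(true_positive_rate(applicants, "A") - true_positive_rate(applicants, "B"))


def make_group(group, n_repaid, n_defaulted):
    """Toy data for a perfectly accurate model: approve exactly those who repay."""
    return ([Applicant(group, True, True)] * n_repaid
            + [Applicant(group, False, False)] * n_defaulted)


# Assumed toy data: the repayment rate differs between the groups (80% vs. 50%).
applicants = make_group("A", 8, 2) + make_group("B", 5, 5)

dp_gap = demographic_parity_gap(applicants)
eo_gap = equal_opportunity_gap(applicants)

print(f"Demographic parity gap: {dp_gap:.2f}")
print(f"Equal opportunity gap:  {eo_gap:.2f}")
```

In this toy setting the equal opportunity gap is zero, because every repaying applicant is approved in both groups, while the demographic parity gap is not, because the repayment rates differ between the groups. Closing the demographic parity gap would require changing some decisions, which here would also introduce errors. Which measure should take priority is precisely the kind of choice that cannot be settled by the data science department alone.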

As a financial institution, de Volksbank is familiar with governance frameworks for quantitative modeling. Building on this knowledge, and in line with the social mission of de Volksbank, we have developed the first version of a framework for the responsible use of data science and AI by the data science departments of de Volksbank.

Original post here.

Written by ODSC - Open Data Science