When we talk about racist or sexist AI, what are we talking about exactly?
What are these artificial intelligences doing?
Stéphane d’Ascoli: We are starting to have plenty of examples. There have been cases in recruiting, including an AI at Amazon that penalized women's résumés even when their skills were equal. There are also Face ID-style facial recognition tools that work less well on Black people. That is problematic, and it can be even more so when the same performance gap appears in AIs that are supposed, for example, to make a medical diagnosis. The way some AIs process and translate sentences can also reflect a very biased worldview: ask for the feminine of “doctor” and you get “nurse”. The fact that certain neural networks discriminate against certain populations is all the more problematic now that they influence potentially sensitive decisions: getting a loan, a job, a court ruling, and so on.
Why can AI become racist or sexist?
Stéphane d’Ascoli: We tend to imagine that AIs are cold, objective, and perfectly rational, but that is not the case. They learn from our data, and our data is biased. If, for ten years, women have been disadvantaged in a company's recruitment process and that company uses this data to train an AI, the AI is likely to deduce that women's CVs are less relevant to the business, and to keep disadvantaging them. Artificial intelligences are not critical; they do not question what we teach them. They are therefore quite conservative: AI reflects the world as it is, not the world as we would like it to be. There is no notion of justice or social progress with them. Another factor that can lead an artificial intelligence to treat users differently is a lack of data. If we train a facial recognition AI on photo databases made up mostly of white people, it may work less well on Black people. And let's not forget that algorithms can also reflect the biases of their designers.
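The hiring scenario described above can be sketched in a few lines. This is a deliberately naive toy model with invented data, not anything from the interview: a "model" that learns only the historical hire rate per gender will faithfully reproduce the bias baked into its training set.

```python
# Toy sketch (invented data): a naive model trained on biased hiring
# history learns to reproduce that bias. Here, equally skilled women
# were hired less often in the historical records.
from collections import defaultdict

# Historical decisions: (skill_score, gender, hired)
history = [
    (8, "M", True), (8, "F", False),
    (7, "M", True), (7, "F", False),
    (9, "M", True), (9, "F", True),
    (6, "M", False), (6, "F", False),
]

def train_hire_rates(data):
    """Learn P(hired | gender) -- the only 'feature' this toy model uses."""
    counts = defaultdict(lambda: [0, 0])  # gender -> [hired, total]
    for skill, gender, hired in data:
        counts[gender][0] += int(hired)
        counts[gender][1] += 1
    return {g: h / n for g, (h, n) in counts.items()}

rates = train_hire_rates(history)
# Given two identical CVs, the trained model now scores them
# differently based on gender alone: it has inherited the bias.
print(rates)  # {'M': 0.75, 'F': 0.25}
```

The model has no notion of fairness; it simply extrapolates the past, which is exactly the conservatism described above.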
How to prevent an AI from becoming racist or sexist?
Stéphane d’Ascoli: It's not easy. First, we have to identify the bias and deal with it, and that can be very complicated. Before, we had fully specified algorithms; we knew precisely what they were doing. But, more and more, AIs select the important features themselves. That is the whole point of deep learning, but it makes them much harder to control.

A rather funny example on the subject is that of two hospitals which provided X-ray images of patients. An AI had to determine whether or not an image showed a fracture. The problem is that the hospitals had very different case rates: 90% of patients had a fracture at Hospital A versus only 10% at Hospital B. It turned out that, to decide whether an image showed a fracture, the AI was not analyzing the image at all: it only checked the hospital logo, because Hospital A simply had many more fracture cases! This type of bias is difficult to spot and also very hard to correct: even if we remove the logo, each hospital's images certainly have their own specificities (different resolutions, etc.), so the AI can continue to identify the hospital in other ways.

The same problem arises with sexist or racist biases: we can remove the gender or ethnicity field, but an AI can often infer that parameter from other data. Still, we must try as much as possible to identify discriminatory biases and find a way to remove them. To prevent an AI from becoming racist or sexist, it is also important to ensure that the datasets it is trained on are balanced and diverse; that part is not very complicated to do. And once the AI is operational, it is imperative to test it to verify that it treats users identically.
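The point about inferring a removed attribute can be illustrated concretely. This is a minimal sketch with an invented, deliberately crude proxy feature (not from the interview): even after the gender column is dropped, a correlated feature left in the data lets a model reconstruct it most of the time.

```python
# Toy sketch (invented data): deleting the protected attribute is not
# enough, because a correlated proxy can leak it back. Each record is
# (proxy_feature, true_gender); the gender column is what we "removed".
records = [
    (True, "F"), (True, "F"), (True, "F"), (False, "M"),
    (False, "M"), (False, "M"), (True, "M"), (False, "F"),
]

def proxy_accuracy(data):
    """How often the proxy alone recovers the removed gender column."""
    correct = sum((proxy and gender == "F") or (not proxy and gender == "M")
                  for proxy, gender in data)
    return correct / len(data)

# Well above the 50% of random guessing: the 'removed' attribute leaks.
print(proxy_accuracy(records))  # 0.75
```

This is why auditing a deployed model's outputs across groups, as recommended above, matters more than simply checking which columns were deleted from the training data.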