Microsoft last week said it will stop selling software that guesses a person’s mood by looking at their face, saying the technology could be discriminatory. Computer vision software, which is used in self-driving cars and facial recognition, has long had problems with errors that come at the expense of women and people of color. Microsoft’s decision to halt the system entirely is one way of dealing with the problem.
Something similar is happening in AI training, which depends on carefully labeled data. Until now, such software has typically been trained on thousands or millions of images of real people. But because those datasets often leave out large sections of the population, and assembling them can be time-consuming, tech firms are taking a different approach: synthetic data.
The idea is a bit like training pilots. Instead of practicing in unpredictable, real-world conditions, most will spend hundreds of hours in flight simulators designed to cover a broad array of scenarios they could encounter in the air.
The growth of synthetic data is a step in the right direction, not least because it avoids using people’s personal data. But synthetic data won’t eliminate bias entirely, says Julien Cornebise, an honorary associate professor of computer science at University College London.
“Bias is not only in the data. It’s in the people who develop these tools with their own cultural assumptions,” he says. “That’s the case for everything man-made.”