To address problems of bias in artificial intelligence, computer scientists from Princeton University and Stanford University have developed methods for building fairer data sets of images of people. The researchers propose improvements to ImageNet, a database of more than 14 million images that has played a key role in advancing computer vision over the past decade.
ImageNet, which includes images of objects and landscapes as well as people, serves as a source of training data for researchers creating machine learning algorithms that classify images or recognize elements within them. Because of its unprecedented scale, ImageNet relied on automated image collection and crowdsourced image annotation. Although the research community has rarely used the database’s person categories, the ImageNet team has been working to address biases and other concerns in images featuring people, which arose as unintended consequences of how ImageNet was constructed.
“Computer vision now works really well, which means it’s being deployed all over the place in all kinds of contexts,” said co-author Olga Russakovsky, an assistant professor of computer science at Princeton. “This means that now is the time for talking about what kind of impact it’s having on the world and thinking about these kinds of fairness issues.”