POLITICO Connected: An AI Community: AI and search — Sector-by-sector potential — Summit flashback

May 8, 2018 at 06:02AM

What are the first images that come up when you use Google to search “baby”? It’s likely to be photos of white infants below the age of 1. What about “CEO”? Elderly men, mostly. Try “couple,” and you’ll get a tableau of heterosexual couples.

Search engines have become the No. 1 tool for navigating the web. The companies that run them (above all Google, which handles some 74 percent of global searches) not only determine what content reaches a broad audience, but also shape what a society considers normal, thanks to algorithms powered by state-of-the-art artificial intelligence technology.

Showing up in the first row of Google results is the new normal, if you will.

The trouble is that, by design, these algorithms “are set up to discriminate,” Danah Boyd, principal researcher at Microsoft Research, said at the re:publica conference in Berlin last week.

Let’s take a step back: What does Google do during one of the more than 1 billion queries it processes per day? The search engine first looks at certain factors – most importantly, where you’re located – and then its algorithm comes up with what it believes are the most helpful search results for you.

To do that, it also uses machine learning, the technique at the heart of much of today’s state-of-the-art AI.

But machine learning relies heavily on clustering data collected in the “real world,” with all its biases, such as racism, homophobia and misogyny. Once that data is clustered and the algorithm draws its conclusions, its output will almost inevitably reflect those biases as well.
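The mechanism can be illustrated with a deliberately simplified sketch. Everything here is invented for illustration – the toy data, the function, the tags – and bears no relation to Google’s actual system; the point is only that a model ranking results purely by frequency in skewed training data will reproduce that skew.

```python
from collections import Counter

# Hypothetical toy "training data": (query, image_tag) pairs whose
# distribution is skewed, standing in for biased real-world web data.
training_pairs = [
    ("ceo", "older man"), ("ceo", "older man"), ("ceo", "older man"),
    ("ceo", "young woman"),
    ("baby", "white infant"), ("baby", "white infant"),
    ("baby", "black infant"),
]

def top_result(query, pairs):
    """Return the tag most often paired with the query in the data.

    A model that simply mirrors frequencies in its training data
    reproduces whatever imbalance that data contains.
    """
    counts = Counter(tag for q, tag in pairs if q == query)
    return counts.most_common(1)[0][0]

print(top_result("ceo", training_pairs))  # the majority tag wins: "older man"
```

No one wrote “prefer older men” into this code; the skew comes entirely from the data, which is what makes such bias hard to spot and hard to remove.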

Pandu Nayak, vice president of search at Google, answered a question about the issue last week in Berlin, saying that “diversity in our search results is centrally important to us,” and a “diverse set of search results is actually better for users.”

He pointed out that there were “several hundred” factors – or “signals,” as Google calls them – that determine the ranking of results, adding that “these range from document-specific signals to user-specific signals,” such as the structure of a website, or the language a user is likely to understand, judging from his or her location.
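In the abstract, ranking by many signals can be pictured as a weighted scoring function. The sketch below is a pure illustration of that idea: the signal names, weights and data are all made up, and Google’s real system, with its several hundred non-public signals, looks nothing like this.

```python
# Hypothetical ranking sketch: combine document-specific signals
# (site structure) with user-specific signals (likely language).
# All signal names and weights are invented for illustration.
def score(doc, user):
    signals = {
        "title_match":    1.0 if user["query"] in doc["title"].lower() else 0.0,
        "well_structured": 0.5 if doc["has_headings"] else 0.0,
        "language_match": 0.8 if doc["lang"] == user["lang"] else 0.0,
    }
    return sum(signals.values())

docs = [
    {"title": "CEO salaries in 2018", "has_headings": True, "lang": "en"},
    {"title": "Vorstandsgehälter",    "has_headings": True, "lang": "de"},
]
user = {"query": "ceo", "lang": "en"}  # location suggests an English speaker

ranked = sorted(docs, key=lambda d: score(d, user), reverse=True)
print(ranked[0]["title"])  # "CEO salaries in 2018"
```

Even in this toy version, the ordering depends entirely on which signals exist and how they are weighted – choices made by people, which is where critics locate the accountability question.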

In addition, Google has thousands of search quality raters worldwide to evaluate the search results of the algorithm, Nayak said, adding that “to help the raters make that judgement,” the company has come up with a 160-page guideline document.

Google results for ‘black girls’ yielded porn

Let’s go back to the white babies and the straight couples: Without knowing the exact inner workings of Google’s algorithm, it’s fair to say that the reasons why such images come up first are numerous.

What’s known is that they include technical factors, for example, the way professional stock photos are tagged. More importantly, however, they’re a symptom of a problem that reaches even deeper than “just” privileging majorities over minorities.

In her new book Algorithms of Oppression, University of Southern California communications professor Safiya Noble writes that, for a long time, when she searched for “black girls” from her computer in the U.S., the first results to come up were pornography.

This led her to write the book, in which she concludes that the monopoly status of search engines and their business model of selling targeted ads have led to a situation in which people now navigate the web primarily with the help of a biased set of algorithms that discriminates against people of color, particularly women.

Against this backdrop, a growing number of critics have stepped forward to demand that Google face up to the fact that its search engine is not as neutral as the company would like people to think – and that, instead, some old stereotypes persist in its search algorithms.

“We really need to figure out how to hold people who develop machine learning accountable,” said whistleblower and U.S. Senate hopeful Chelsea Manning last week in Berlin. “It is more than just hype. It’s dangerous.”

Janosch Delcker