May 23, 2018 at 12:12PM
via Wired
Nicholas Thompson: Let’s get cracking. You guys have rolled out a ton of stuff since December 2016: you’ve rolled out the fact-checking initiative, you’ve shrunk images on suspect posts, you’ve rolled out machine-learning tools for fact-checking, and machine-learning tools for clickbait headlines. I’m curious: what’s been the most effective of the many things you’ve introduced?
John Hegeman: I think this is a space where there isn’t a silver bullet. We can name one or two things that have been really effective but, for any one thing, it only covers part of the problem and there are ways around it. I think a lot of this is really about how the different pieces fit together. Thinking more broadly, we weren’t necessarily only targeting fake news. It was part of our broader work on things like quality and integrity overall—doing things like taking down fake accounts more aggressively, enforcing community standards. There’s a strong correlation between the people who are posting things like false news, and people who are violating these other types of policies. So a lot of it comes down to the basics of blocking and tackling and of really enforcing the rules as precisely as we can.
Tessa Lyons: I agree with John, and the one thing I’d add is that so much of the false news we see on Facebook is financially motivated. We knew that going after those financial incentives and really working to disrupt them was a big part of the problem, and our efforts in that area have helped us have an impact on all of these different components.
Thompson: I’ve seen that in interviews with people who run the false news sites. When the ad networks were cut off in December 2016, that had a big effect. What were the other steps you took to disrupt the financial benefits fueling false news?
Lyons: One of the things that we did, and that you’re referencing, is that when we identified that a publisher was repeatedly sharing false news, we cut off their ability to advertise or monetize. But I think even more important than that is the work that we’ve done to identify some of the common tactics of financially motivated bad actors. One example is clickbait. If you’re constantly posting clickbait, because you’re trying to drive people off of Facebook to your website, we use those predictions to help reduce the distribution that content gets in News Feed. That’s valuable not only because we’re reducing the distribution of that specific piece of content, but because it changes the whole incentive structure. If that content is not being viewed, it’s not being monetized, and the incentives for creating it in the first place have changed. Now, like any part of this, it’s adversarial, so it’s not as if we’re done and we get to check the box on that. But that’s an area that we invested a lot in.
Thompson: Are there other things? I know that labeling something as false and flagging that fact checkers had disputed it had the inverse effect of what everyone expected, and you rolled that back. Has there been anything else that has had a surprising effect, where it’s been less effective than you expected or more effective?
Michael McNally: One comment about that: it’s not that it necessarily had a negative effect, it’s that we had a superior effect by showing related articles instead. So we basically traded up from something that worked to some degree to something that worked more efficiently.
Thompson: Ok. Are there other things that had surprising impacts?
Hegeman: I think, you know, one thing that has been a little surprising in this space is just the difference you sometimes see between the direct effect of something and then the second-order effects after people respond to the new incentives of the system. So a good example of that would be the work on clickbait. Like Tessa was mentioning, we saw some reduction of clickbait when we rolled out improvements to the classifiers that we were making, but we actually saw a bigger reduction after that, once publishers had a chance to realize: OK, this new policy is in place; it’s actually more effective to stop publishing things using these tactics and to write headlines in a better way.
Thompson: As a publisher I am well aware of the way that publishers adapt to Facebook announcements. Two of the things you guys have mentioned briefly here and also in the video but that I haven't seen sophisticated articles about yet are the machine-learning system for fact checking and the machine-learning system for identifying clickbait. Can you explain a little bit about the models that were used? How they were trained? What they do?
McNally: With clickbait, we define what it is as a policy statement. And then we have raters look at large volumes of material, and they label it as clickbait or not. And then we have deep neural networks that do, indeed, train on the text itself and learn the patterns. We also look at things like social connections or user behavior or things that aren't in the text itself but they all become part of the predictive model. And so that gives us the probability that something’s clickbait.
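[For illustration, here is a minimal sketch of the kind of supervised text classifier McNally describes: rater-labeled examples in, a probability of clickbait out. The headlines, labels, and scikit-learn model below are invented for this example; Facebook’s production systems are deep neural networks that also use social and behavioral features, which this toy does not reproduce.]

```python
# Minimal sketch of a supervised clickbait classifier (illustrative only).
# Rater-labeled headlines go in; a probability of clickbait comes out.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical rater-labeled examples: 1 = clickbait, 0 = not clickbait.
headlines = [
    "You won't believe what happened next",
    "Doctors hate this one weird trick",
    "Senate passes annual budget resolution",
    "Local council approves new bus routes",
]
labels = [1, 1, 0, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(headlines, labels)

# The classifier outputs the probability that a new headline is clickbait.
prob = model.predict_proba(["This celebrity secret will shock you"])[0][1]
print(f"P(clickbait) = {prob:.2f}")
```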
Thompson: I wrote a story last September on Instagram’s efforts to make everybody nice, which seems like a very similar thing. They brought people into Instagram to rate comments: for example, this is mean, this is cruel. They fed that data into DeepText, trained it, re-trained it, re-trained it until it was ready to go live. Is that more or less what you did here?
McNally: Yes that’s a very common process. So what we did is quite similar.
Adam Mosseri: So, I think it would be good to back up a little bit. Any classifier—you could be trying to figure out, is this a photo of a kitten? Or, is this article headline clickbait?—requires a handful of things. One is that you have some policy or definition of what’s a cat or, in this case, what’s clickbait, right. And then you need a training data set, which is ideally tens of thousands, if not hundreds of thousands, of examples, both positive and negative. So the way this works in clickbait is we get, I think, tens of thousands of examples: this is clickbait, this is not, this is clickbait, this is not. And then you have a bunch of features, so just things that you can look at. So if it’s a photo, you can look at shapes and colors and textures and whatever. If it’s text, it’s the words, the combination of words, etcetera. And then what you do is you train the classifier: you write code that can predict a likelihood of the outcome, so in this case, a likelihood that a photo is a cat or an article headline is clickbait, based on the patterns it sees in the features.
So having a clean data set to begin with is paramount, otherwise you’ve done nothing. And then you can also use that data set—not the exact same data set, but the labeling guidelines—to see how well your classifier is doing. So you can just say, “Oh, for this new headline that we didn’t use in the training data set, the algorithm said it probably is clickbait, and it is clickbait. How often are we right and how often are we wrong?” So this is valuable not only to train, so that you can learn, but also to evaluate what we call precision and recall: how often you’re right, and what share of the actual clickbait you catch. That is standard for machine-learning classification, no matter what you classify.
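[A sketch of the evaluation step Mosseri describes: score held-out examples that were not used in training and measure precision and recall. The labels and predictions below are hypothetical.]

```python
# Hypothetical held-out set: 1 = clickbait, 0 = not clickbait.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]   # rater labels
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]   # classifier decisions on the same items

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

precision = tp / (tp + fp)  # of the items we flagged, how often were we right
recall = tp / (tp + fn)     # of the true clickbait, what fraction did we catch
print(f"precision={precision:.2f} recall={recall:.2f}")
```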
Thompson: And then you tune it, right? And you say, if there’s a 90 percent chance of clickbait or 95 or 85, depending on how you feel.
Mosseri: Yeah. You add new features and you tune the model, you do all these things to get more accurate, so that’s called prediction accuracy. But then what you can also tune is, ok now you have a number, let’s say it’s pretty accurate—that this is 90 percent clickbait—what do you want to do with that? And so we, you know, you have to decide, are you going to just demote things above a certain threshold? These are all things that we tune over time just to try and be more effective.
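[A sketch of the thresholding Thompson and Mosseri discuss: the model’s probability is turned into a ranking action. The threshold and demotion factor below are invented values, not Facebook’s, and the idea that a demotion multiplies a ranking score is an assumption for illustration.]

```python
# Illustrative policy layer on top of a clickbait classifier.
CLICKBAIT_THRESHOLD = 0.90  # tuned over time, not fixed; invented here
DEMOTION_FACTOR = 0.5       # hypothetical: halve the story's ranking score

def apply_clickbait_policy(ranking_score: float, p_clickbait: float) -> float:
    """Demote a story's ranking score if its clickbait probability is high."""
    if p_clickbait >= CLICKBAIT_THRESHOLD:
        return ranking_score * DEMOTION_FACTOR
    return ranking_score
```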
Sara Su: And just to add to Adam’s description, I think this highlights one of the challenges of classifying misinformation versus classifying clickbait and why it’s really important for us to use a combination of algorithms and humans. So most false news is designed to look like real news, and so training based on examples gets us part of the way there, but that’s why it’s important for us to also partner with third-party fact checkers to make that final determination. So I think Tessa can probably speak a little bit more to that process and then I think, Henry, you can also speak more to the details of how we scale this.
Henry Silverman: One of the things that I think is important to know is that we continue this labeling effort. It’s not something that we stop, because we want to make sure that if the ecosystem adapts, we adapt with it. So, you know, the way Adam described clickbait, we are still continuing to label clickbait, because we established these principles about what clickbait is, and we label for it. And maybe our model predicts what clickbait was in 2017, but say clickbait becomes different in 2018, we still want to know that. So we’re always evaluating these classifiers against the current ecosystem.
Thompson: Fact checking is a harder problem, right? Because it’s not just a headline it’s the entirety of the text.
Lyons: I was going to say, selfishly, the reason I thought it was helpful to talk about the clickbait part first is because it’s helpful to draw some distinctions. And so one of the distinctions is that for clickbait or kittens, you can develop a lot of training data. And we can hire people to develop that training data pretty quickly. One of the challenges in the misinformation space is there’s not a database that you can go to and say, “Everything here is absolutely true and everyone absolutely agrees. And everything here is absolutely false and everyone absolutely agrees.” And so actually determining how you get the training data to start training a model is one of the challenges.
So what we’ve done is we’ve used our partnership with fact checkers and the data that we get from fact checking, and some of the features that we focus on are at this point less about the content and more about some of the behavioral signals. So, for example, on every piece of content in News Feed, you can give feedback as a user that it is false news. So that’s one piece of information that we get. The other thing that people do is leave comments expressing reactions to what they’re reading, and we found that comments that express disbelief can be a good predictor of potentially false news stories. But we are also constantly working to increase the amount of training data that we have, working with fact checkers and starting to explore other systems, and also working to expand the number of features or signals that we’re able to use.
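[A sketch of the behavioral signals Lyons mentions, turning user “false news” reports and disbelief comments into features that could feed a model trained on fact-checker verdicts. The feature names and the phrase list are invented for illustration; the real signals are not public.]

```python
# Illustrative feature extraction from user reports and comments.
DISBELIEF_PHRASES = ("fake", "hoax", "not true", "debunked", "false")

def story_features(report_count: int, view_count: int, comments: list[str]) -> dict:
    """Turn raw behavioral counts into simple per-story features."""
    disbelief = sum(
        1 for c in comments if any(p in c.lower() for p in DISBELIEF_PHRASES)
    )
    return {
        "report_rate": report_count / max(view_count, 1),
        "disbelief_comment_rate": disbelief / max(len(comments), 1),
    }

# These features would be inputs to a model trained on fact-checker labels.
print(story_features(12, 4000, ["This is fake", "wow", "totally debunked"]))
```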
Thompson: So you’re not actually looking at the text, and then comparing it to Wikipedia or checking dates. You’re just looking at comments, headlines, fact checking, right? Or are you analyzing the body of the article?
Lyons: So right now we are analyzing the body of the article to the extent that we’re trying to identify duplicates and near duplicates. One of the things we’ve seen, and that’s been covered a lot actually, is that an individual false news story will be copied and pasted by a bunch of other people to try to create versions that are very similar, with maybe a few nuances. The joke that I heard recently is that the only thing cheaper than creating fake news is copying fake news. And so when you think back to those financial incentives, we have to go after not just the first post but all the duplicates. So we do use a lot of natural language processing to predict those similarities between different articles. But for actually predicting individual pieces of false news, we’re relying a lot on signals from people and on the behavioral signals that we know about a piece of content: how it goes viral, who’s shared it, what that pattern of growth might look like, and also predictors like who’s shared or reported this type of content in the past. So, for example, if something is posted by a Page that has a history of sharing a lot of false news, that’s an obvious signal.
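[A sketch of near-duplicate matching in the spirit of what Lyons describes: once a fact checker rates one article false, find copies and near-copies. TF-IDF cosine similarity stands in for whatever natural language processing Facebook actually uses, and the similarity threshold is invented.]

```python
# Illustrative near-duplicate detection between a fact-checked article
# and candidate articles, using TF-IDF cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def near_duplicates(rated_false: str, candidates: list[str], threshold: float = 0.8):
    """Return candidate texts whose similarity to the rated-false text is high."""
    vec = TfidfVectorizer().fit([rated_false] + candidates)
    base = vec.transform([rated_false])
    sims = cosine_similarity(base, vec.transform(candidates))[0]
    return [c for c, s in zip(candidates, sims) if s >= threshold]
```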
Thompson: So there are different kinds of fake news stories, which have different civic importance. I was just looking at a list of fake news stories, and something like “Woman Falls Asleep at Morgue and Is Cremated” doesn’t actually affect how America’s democracy functions. “Trump Executes All of the Turkeys Obama Pardoned” is political but doesn’t matter. “Trump Arrests All the Mayors of Sanctuary Cities” actually matters, right. Do you guys figure out how much it matters civically when you’re weighting these things? Or do you count it all just the same?
Lyons: One of the things that we are thinking about is that if you’re going after individual pieces of content, you’re always going to be behind, right. So there’s an important role to play for fact checking individual pieces of content, which we need to do and we need to get faster at, and we can talk about that at length. But really what we’re trying to do is change the incentives. And we talked about the financial incentives, but there are other incentives too. You know, if you’re trying to build an audience for ideological reasons, or you’re just trying to make money, whatever the incentives might be, all of these different types of content might be helping you achieve the growth that you’re trying to have in your audience and the objectives that you’re trying to reach. So while it might seem like a trivial story isn’t as important as a story about real-world events, actually knowing that that story is false, and understanding the Pages that have shared it and how it’s grown, and being able to take action not just against that content but against all those actors, is important for stopping the spread of the really serious stuff as well.
Thompson: That makes sense. But you could weight your machine-learning algorithms differently for different segments, right? You could say: anything that’s above a 97 percent chance of clickbait, if it’s a joke, knock it out. But if it’s above 80 percent on politics, knock it out, right. Do you do that?
McNally: It’s possible to combine separate signals additively. So if there’s a demotion or a penalty that comes from something being clickbait, another one that comes from ad farms, another one that comes from misinformation risk, yeah, they could be additively combined in some way.
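[A sketch of the additive combination McNally describes: each integrity classifier contributes its own penalty and the penalties are summed into one demotion. The signal names, weights, and the way the penalty folds into the ranking score are all invented for illustration; the interview does not specify the functional form.]

```python
# Illustrative additive combination of separate integrity penalties.
def combined_penalty(p_clickbait: float, p_ad_farm: float, p_misinfo: float) -> float:
    """Sum per-signal penalties; weights here are made up."""
    penalties = {
        "clickbait": 1.2 * p_clickbait,
        "ad_farm": 1.5 * p_ad_farm,
        "misinformation": 2.0 * p_misinfo,
    }
    return sum(penalties.values())

def demoted_score(base_score: float, **probs) -> float:
    """Fold the combined penalty into a base ranking score (one possible form)."""
    return base_score - combined_penalty(**probs)

# e.g. demoted_score(10.0, p_clickbait=0.9, p_ad_farm=0.1, p_misinfo=0.2)
```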
Mosseri: We don’t have varying thresholds for different types of content, just to answer your question very clearly. I think there are pros and cons to doing so. I don’t think civic content is necessarily the only content where you’ve got real risk of harm. And then you also complicate the metric, you complicate how you measure success, it can slow down the teams, etcetera. If you’re particularly interested in civic content, the good news and the bad news is that political content is wildly overrepresented in most problematic content types, be it clickbait, or withholding content, or false things, etcetera, because tactics that play on people’s emotions in politics are one of the most effective ways of getting people riled up. But no, we don’t weight them differently right now. I think we could consider that in the future, but with this type of integrity work I think it’s important to get the basics done well, make real strong progress there, and then become more sophisticated as sort of a second or third step.
Thompson: Let’s go to the academia stuff that you guys are announcing. What kind of data do you think you’re going to give to researchers that you haven’t given to them before?
Lyons: A group of us was at Harvard a few weeks ago meeting with academics who study misinformation from around the world. And we literally sat down and spent a day and a half drawing out what kinds of data sets we would need. But what we started with were the kinds of questions that we actually need to be able to answer. And so what we identified in that time is that across academia, there is no consensus on the definition of misinformation, false news, fake news, the different buckets, whatever you want to call it. And there’s also a lot of discussion about the right way, once you even have a definition, to measure the thing that you’re focused on, whether that’s the number of people who’ve seen something or the overall prevalence. So one of the things we wanted to do as part of the work with this election research commission is work with them on misinformation specifically, to help provide data to answer some of those questions, and from there, we’ll be able to go on and answer more and more. So the type of data we’ll provide to them, in this privacy-protective way, will be data where they’ll be able to do that kind of analysis themselves. So they’ll have information about the links on Facebook, for example, the amount of views that they’re getting, and other signals about them. And they’ll be able to answer the types of research questions related to those topics.
Thompson: So, what specifically? Like what’s a data set that people want?
Lyons: I need to check, because the data scientist pulling the data isn’t actually in the room and I don’t want to speak out of turn. But you can imagine that if, as an external academic, you were trying to determine the number of views for a subset of domains that you’ve identified as false news domains, you need to know, across all of those domains, how many views they got on Facebook over whatever period of time you’re looking at. And right now, there are a lot of efforts, many of which I’m sure you’ve seen, that have tried to do this with data external to Facebook, where they’ve used a third-party vendor that looks at interaction data or publicly available data, but we want to work with academics to get more accurate answers to some of these different research questions. So those are the types of things that would be included.
Thompson: Is that data harder to get? Because I know the Russia data has all been deleted, so you actually can’t go back and get data on the Russia ads because it’s gone.
Lyons: So I don’t want to speak to the ads side, because I don’t understand those data systems as well, but in this case, certainly, if we’re trying to pull data from a very, very long time ago, that will be harder to do with this committee. But we’ll be able to ask them what different data points they want to have in order to measure the different questions that they have. We’ll work with them to give them data in a privacy-protected way and figure out what that means in terms of how far back we can go, but certainly what it means in terms of what we can do going forward.
Thompson: And how do you do it in a privacy protected way?
Eduardo Ariño de la Rubia: I was just going to say it’s really pretty straightforward. It’s URL, views, date. Or URL, views, Likes, date. What we don’t do is provide personal information about the user IDs of the people that have viewed it or anything like that. You know, that’s not something that is important to share, and we don’t share it.
Mosseri: So either anonymization or aggregation, which effectively also anonymizes things. So like this URL, you might not know the million people that saw it, but you know that a million people saw it and a hundred thousand people liked it.
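[A sketch of the aggregation Ariño de la Rubia and Mosseri describe: per-URL, per-day totals with user IDs dropped before anything is shared. The input schema here is hypothetical.]

```python
# Illustrative aggregation of raw events into researcher-facing totals.
from collections import defaultdict

def aggregate_for_researchers(events):
    """events: iterable of (url, date, user_id, action) tuples (hypothetical schema)."""
    totals = defaultdict(lambda: {"views": 0, "likes": 0})
    for url, date, _user_id, action in events:  # user_id is dropped on purpose
        if action == "view":
            totals[(url, date)]["views"] += 1
        elif action == "like":
            totals[(url, date)]["likes"] += 1
    return [
        {"url": url, "date": date, **counts}
        for (url, date), counts in totals.items()
    ]
```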
Thompson: There are a hundred signals in the News Feed or maybe thousands. Some of them, in my view, incentivize publishers to do high quality content. So the ratio of shares after a story to before is a really good one, time spent reading is a good one. Some of them are neutral. Meaningful interactions pushes it in a good direction. But some of them don’t correlate with creating a high quality information ecosystem, like likes and shares. Or maybe it weakly correlates. How has the sort of overall structure of News Feed changed to combat disinformation, false news? Like the changes that have been put through to the core News Feed algorithm, obviously trustworthiness is one, meaningful social interactions is another. But what are the other things? Have you re-weighted other parts of it to fight this stuff?
Mosseri: I think it would be good to back up a little bit. So there are hundreds of thousands of signals, but there are only maybe a few dozen predictions, just to be clear. So a signal would be: Oh, what time is it right now? How fast is the internet connection? Who posted this? Do people tend to like and comment on her things? Etcetera. A prediction would be: How likely are you to like? How likely are you to comment? How likely is an article to be clickbait? In general, over the last couple of years, I think you’ve seen us move more and more weight in the value model from lighter-weight interactions like clicks and likes, etcetera, to heavier-weight things like how long do we think you’re going to watch a video? Or how long do we think you’re going to read an article? Or how informative would you say this article is if we asked you? Or now we’re getting into things like broad trust, etcetera. So you’ve seen weight shift in that direction, which is, I think, our way of shifting towards quality.
But this is an area where I think we need to be really careful. Because there are certain ways where I think it’s appropriate for us to get involved in quality, so within news we focus on informative content, broadly trusted content, and local content. And there are certain ways where I think it would be inappropriate, which would be to say, “Oh, we like this person’s writing style.” Or, like, we think that this ideology is more important than this other one, or we side with this political point of view. And so that is a common area of tension and an interesting subject of conversation, usually with people who work in the industry, because it's just a very different way of doing things.
Now if you’re trying to improve the quality of the ecosystem, I think you can do two things: You can try to nurture the good more and address the bad more. And you have to do both. But I think it’s important to correct a common misconception, which is that sometimes people think nurturing the good is going to address the really dramatic edge cases like false news, and it usually doesn’t. So I’ll give you an example: broad trust. I really believe it helps improve the quality of information in the ecosystem. I think it does very little, if anything, to reduce the chances that a hoax will go viral. Because that’s essentially—it’s an edge case, it’s an anomaly. Broad trust, by the way, only applies to publishers for which we have enough data, and it currently is only in the US. And so you just can’t rely on that if you have an acute problem that you need to address. And so we do a lot of stuff to try and nurture the good more, and I’m proud of that work and we’ll do more and I think we have a long way to go, but I don’t think it, by and large, does too much for the acute integrity problems. You need to actually define those problems and try to address them head on.
Thompson: That’s fascinating. Can you say a little bit more about how you re-weighted toward the heavy stuff? Or toward the serious stuff?
Mosseri: We’ve been adding these things, right. Like, we didn’t use to predict how long you would read an article for, we didn’t use to have a sense for how broadly trusted a domain was, we didn’t predict how long you would watch a video for. We call these things “p something,” p comment, p informative—how likely are you to comment, how likely are you to see this story as informative—so as we’ve added those over time, just by adding other predictions and outcomes, that shifts the weight from the lighter weight things to the heavier weight things. Local is another one we launched in January.
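[A sketch of the value model as Mosseri describes it: a handful of predictions, from lighter-weight interactions to heavier-weight outcomes like predicted read time, informativeness, and broad trust, combined by weights into one ranking score. The prediction names echo the interview; the weights and the simple weighted sum are invented for illustration.]

```python
# Illustrative "value model": a weighted sum over a few dozen predictions.
def value_model(preds: dict) -> float:
    weights = {
        "p_click": 0.2,                   # lighter-weight interactions
        "p_like": 0.3,
        "p_comment": 1.0,
        "predicted_read_seconds": 0.05,   # heavier-weight outcomes
        "p_informative": 2.0,
        "broad_trust": 1.5,
    }
    return sum(weights[k] * preds.get(k, 0.0) for k in weights)

# e.g. value_model({"p_like": 0.4, "p_informative": 0.7, "broad_trust": 0.9})
```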
Hegeman: I think his last point about just having more of these signals is just really, really important. Because, you know, you pick out any one of these things, and you’re going to be able to point out cases where it goes wrong. Because they all do some of the time. But each one is still additive to the overall picture. And so part of this is just about, we need to have more and more predictors that add more and more nuance to the picture about overall quality and how much people want to see something.
Thompson: And none of them is a perfect indicator. We joke at WIRED that the best way to have someone spend a long time reading your article is for it to be really clean and beautiful and then have a terribly edited ending. So people get flummoxed there.
[Laughter]
Mosseri: This is what it’s like to work on ranking though because there is no black and white. Everything you come up with, not only externally but internally, someone will be like, here’s a use-case where that backfires. And you have to be like, yes but does it work? Does it add more value than it creates problems? Are the problems it creates not particularly expensive? And you deal in the gray all day every day.
Thompson: So there was a chart that circulated recently and it showed the news sites that had done the best since the trustworthy stuff was put out there. And I think Fox was at the top. It was just not what you expected. Was that chart A) wrong, B) right and I don’t understand why it’s right, or C) it shows that this isn’t working exactly as expected?
Mosseri: So that chart wasn’t about—they talked about the trustworthy change—but it wasn’t about the trustworthy change. It was about what traffic are these publishers getting on this day and this other day.
Thompson: Oh right. So there could be factors that are massively more important than trustworthy, right. They just have, like, better writers and editors over the last three months.
Su: I think that in addition to the thousands of signals and dozens of predictions that we’re constantly adding to, there’s also just fluctuation in the ecosystem. So some days there’s just more news, or people are just more engaged with the news. And I think John touched on this earlier: there’s this vicious or virtuous cycle, depending on how you see it, of publishers reacting to the changes. So I think what all of that adds up to is that it’s really hard for us to just take a snapshot. But we’re really lucky to have a really strong data science team, led by Eduardo, to help us tease apart: What are the contributions of the individual changes that we’re making, how do they interact with one another, and how do they interact with these ecosystem effects?
Tucker Bounds: And, not to pile on, but if you look, that was a March to April comparison. If you were to make the exact same comparison January to April, CNN is way up in that.
Mosseri: So these are the things you should always look for, whenever you get to comparisons...
Thompson: It was fake news.
[Laughter]
Mosseri: There are some standard things. Like if you’re comparing two dates, you have to make sure you’re looking at the right dates, because things are so volatile in the ecosystem in general that you can easily mistakenly pick a peak or a trough and make it look really bad or really good depending on what you want to say. I’m not saying they did that on purpose. But you need to look at the rolling averages or the long-term trend lines, otherwise you can misinterpret the data really easily.
Ariño de la Rubia: Misinterpreting data happens literally all the time. I mean if you pick any arbitrary dates and they happen to have April Fools Day in them, then suddenly you’re going to go, “Oh, look at all of these lies that are spreading.” If they happen to have Valentine's Day in them, you’re going to be like, “Oh, the world is falling in love.” There are these massive macro trends that make picking dates hard.
Mosseri: Yeah, we pick two rolling averages. We’ll pick two months and compare two months. Or look at longer term trends. By the way, we make the same mistake internally.
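[A sketch of the comparison Mosseri recommends: rolling averages or whole-month means rather than two cherry-picked days. The window sizes are arbitrary and any traffic numbers fed in would be hypothetical.]

```python
# Illustrative rolling-average and month-over-month comparisons of daily traffic.
def rolling_average(daily_values, window=28):
    """Trailing mean over the last `window` days for each day."""
    out = []
    for i in range(len(daily_values)):
        chunk = daily_values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def month_over_month(daily_values, days_per_month=30):
    """Mean of the most recent month minus the mean of the month before it."""
    last = daily_values[-days_per_month:]
    prev = daily_values[-2 * days_per_month:-days_per_month]
    return sum(last) / len(last) - sum(prev) / len(prev)
```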
Su: We’re still really grateful to have folks externally doing these analyses, because it is really hard to get it right. And so the more different methodologies we’re trying internally and externally, the better chance we have at getting it right. And just a callback to the partnership with academics: I think it’s also really important there to have independent folks helping us identify the unknown unknowns, because the process that we described earlier, of identifying the principles and the guidelines, labeling data according to those guidelines, training a classifier, tuning a classifier, and then using that to make ranking changes, requires us to have the definitions and know what we’re looking for. And there are always going to be new things that our adversaries try. They’re very creative, they’re very motivated, so we need lots of folks watching this and helping us identify where to go next.
Ariño de la Rubia: For them adversarial excellence is existential. They have to be that good.
Thompson: I have never understood how commercial relevance works as a signal in the News Feed algorithm. How does Facebook use commercial relevancy in figuring out how the core algorithm works? And does that have any impact on this problem?
Mosseri: What do you mean by commercial relevancy?
Thompson: So, if I put up a post and it’s something where an advertisement next to it is likely to be clicked on, because of some psychological effect of the post, does that make the post appear more frequently in my friends’ feeds or the people who follow my page’s feeds?
Multiple people: No.
Dan Zigmond: Unless there was some weird feedback where, because ads were doing well next to it, people were then spending more time on feed and so more people were seeing it and interacting… I mean, there would have to be some really complicated, indirect relationship. Within News Feed, all we do is reserve certain real estate for ads, and then another team works on filling that real estate.
Thompson: So the way that the post interacts with the ads has no bearing?
Multiple people: No.
Thompson: Somebody just said to me that they were at a meeting at Facebook and were told about it.
Hegeman: So there’s a little bit of nuance there that maybe we could tease out because I suppose there could have been some confusion. So, the ads aren’t going to have an effect on which posts get shown in the organic, regular News Feed, that’s just based on what people want to see and trying to understand what’s going to be high quality, informative. I suppose it is true that which posts you see, which normal posts you see from the pages or people you are friends with, could have some influence on which ads get shown next, or which exact position that an ad gets shown after that. So I suppose there’s probably some potential for influence in that direction if I’m trying to think through all the details. Maybe that’s where some of the confusion came from…
Mosseri: Or a different context than feed. So there’s, like, in related videos, there’s definitely—in feed, all the research we’ve done suggests that people don’t think of it as one place, they think of it as a bunch of different stories that they’re scrolling through. Whereas if you’re showing an ad in Instant Articles or in a video channel, then there’s much more—people think of it as, like… the issues that you’re bringing up come up much more from advertisers than from publishers. So they might have been talking about a different context than News Feed. But the vast majority of ads are in News Feed.
Zigmond: And just very specifically, the ranking of stories is determined before we know which ads are going to show. That happens second, so there just isn’t a way for the causation to work in that direction.
Hegeman: There are only a number of different things we’re predicting. None of those things represents how much more we would make from ads that get shown next as a result of that…
McNally: It’s literally different people.
Lyons: John was one of them!
Thompson: Yeah, didn’t you build the ad model?
Hegeman: Yeah I mean, there are some similarities. So the advertising system also tries to take into account what people want to see, what’s going to be relevant. Those are things, principles, types of values that feed into both systems. But that doesn’t change the fact that they’re separate.
Thompson: One theory I have, and it might be a false theory, is that much of false information comes out of Groups. It starts in a Group of like-minded people, and it’s either people who have self-selected, or sometimes it will be a Page that has used custom audiences to build an audience, which is effectively building a group around custom audiences. And then false information starts in the group and then spreads to the core News Feed. So one way to stop this, you know, the nuclear thing, would be to block custom audiences and block segmentation. A second, non-nuclear way to do it would be to limit custom audiences and limit segmentation on segments where there’s likely to be a lot of false information. Do you guys do this? Have you thought about this? Am I wrong at every level of this analysis?
Mosseri: I want to separate Groups, and custom audiences and targeting. I get that they’re like thematically related and that there’s a lowercase “g” group of people, but Groups with a capital “G,” there’s a canonical representation on Facebook …
Thompson: So let’s split them out. Is there a way to adjust the way Groups are formed to limit the way disinformation spreads in them? Or if you eliminate Groups would you stop disinformation? And then custom audiences same question.
Mosseri: If you eliminate Groups you would not stop the spread of disinformation.
Thompson: Would you slow it?
Mosseri: Uh, maybe. But you would also slow a whole bunch of other things.
Thompson: What if you eliminated Groups that are highly likely to spread false information or have a tradition of it?
Mosseri: But that’s what we do do. [Facebook does take action against false news that is born from Groups and appears in News Feed, but it doesn’t eliminate Groups unless they violate the platform’s terms of service or community standards.] You wouldn't want to say, “Oh, anything that is political is going to get less distribution. Any political group is going to get less distribution.” Because now you’re impeding speech just because you think you’re going to reduce the spread of one false news story, a small percentage, but you’re also going to reduce a whole bunch of healthy civic discourse. And now you’re really destroying more value than problems that you’re avoiding. Same thing with custom audiences, by the way. I think that targeting doesn't exist really on the feed side, it exists on the ad side. But I think it’s really useful. You don’t really want to see an ad about diapers unless you’ve got kids. So that’s actually a useful thing. And you wouldn’t want to like, all of a sudden, get much less relevant ads because you’re trying to make this problem slightly less easy. We find that it’s much more effective to go after it specifically, so we do—if we think that a group or page is sharing a lot of misinformation or false news, we definitely go after its distribution directly.
Ariño de la Rubia: But I do want to challenge that. Misinformation is born in many places. It doesn’t just come from Groups, it doesn’t just come from Pages. Sometimes it comes from individuals, sometimes it comes out of nowhere and you have this moment where a bunch of people share the same or related misinformation at the same time. That’s literally the challenge here, is like, whenever we look at the data and we say, you know, is there a silver bullet? There isn’t. It’s adversarial and misinformation can come from any place that humans touch and humans can touch lots of places.
Thompson: It definitely can. But does it not come more from Groups? The smartest people I know who have looked at this are all reasonably convinced that Groups are where the stuff starts. There’s an anti-vaccine group and that’s where the like, vaccines-cause-autism stuff will start to spread. And then it will come out.
Mosseri: Do you mean specifically capital G Groups?
Thompson: Yes, capital G Groups.
Silverman: And we do act against that. I do want to make that clear, that it’s not just for false news. This is for misinformation, clickbait, and ad farms. If you are a Page that repeatedly, you know, conducts yourself in a certain way that we think is less valuable to our users, we will go after that entity in some way.
Hegeman: I think this is a good example, too, where I think there’s just lots of nuance here. There’s lots of different things you could mean by fake news, lots of different types. Like for some types I’m sure what you’re saying could be true to some extent. And I think this is why we want to have this partnership where we start to dig into this and try to get nuanced answers to these questions.
Mosseri: But we’re not just going to reduce the distribution of all page content because most false news comes from Pages. That just seems like you’d be destroying way more value than you’d be creating. And I don’t think any publisher would want us to do that either.
Thompson: Ok, another topic. And Antonia I think it was you in the video who said video is harder than text. Are you guys going to be able to apply this? As the web goes to more video and then as it goes to VR and then as we go to like neural link, are the same rules about like how to stop misinformation manipulation going to apply? Seriously, disinformation sucks right now on the web. What’s it going to be like when they’re fucking with our brains? And that’s going to be, like, four years away if you guys succeed in the whole thing Regina Dugan used to run. Is this going to apply to Oculus?
Silverman: Well, one thing this gets back to is Tessa’s earlier comments about the types of signals we use. And so some of those signals are going to apply equally in both of these cases. So thinking about things like people commenting on a post and saying that they don’t believe it, or reporting it and saying that it’s false. Those things apply equally across these different types of content, which means that we’re going to do a fair amount just based on that.
Antonia Woodford: I was going to say there are short-term actions that we’re trying to take and then long-term investments we’re trying to make. So in the short term, we’re starting to pilot, in a couple of countries, the ability to verify photos and videos, working with the same fact-checking partners that we already have for links. And we’re starting to try to predict what might be misinformation in a photo or video using the same kinds of signals that we already use today, that Tessa spoke to earlier and that John also mentioned. But we’re also aware that as technology develops, there will be more and more sophisticated kinds of misinformation. So there’s been a fair amount of speculation lately about deepfake videos, and what those will mean, and it’s sometimes really hard for someone to tell with the human eye if it’s real or fake. That’s where we’re working really hard with our artificial intelligence teams elsewhere in the company to try to get ahead of those trends and be able to start detecting those algorithmically.
Thompson: So are you guys going to gradually, will people on your team move from text to video to VR to…?
McNally: We are, in some sense, already moving some people along that stack.
Mosseri: More photo and video, I think VR is still a little far out...
Thompson: Can you just say what is the best data on how successful you’ve been? I know you knocked out a lot of accounts, but what’s the percentage of content on Facebook that was false in August 2016 versus May 2018? Where are we?
Lyons: So, we know that it was a small number to begin with and we know that it’s declining. One of the reasons that I am really excited about this collaboration that we’re doing with academics is that the thing that’s made it hard to share that number is: who is defining what’s false for August 2017, and who’s defining what’s false for August 2018? Or whatever the points in time are that you’re choosing. So we are committed to sharing prevalence data, reach data, whatever the metrics are that we, in collaboration with this academic community, come up with that will help measure not just our progress over time, which is really important, but ideally become ways that we can measure broader progress across the internet, across social media, over time. Which we need not just to show the progress, but also so that we can understand, when things are spiking, what’s happening, so that we can engage this broader set of stakeholders in helping to fight these challenges.
Zigmond: The other thing I would say is, I worked on this a fair amount, I mean, a point you made earlier is not all misinformation is the same. Some has more real-world consequences, some has very little. And so it’s not strictly a numbers game. And I think our perspective also is that any amount is too much. And so, you know, reducing it by 10 percent, 50 percent, even 99 percent, would be great, but there’s still harm that can come from that little bit that’s still seeping through.
Thompson: But if you could reduce it by 90 percent you wouldn’t need to have this many smart, important people who could be working on other projects, working on it. This is clearly a huge priority for the company. You wouldn’t have an 11-minute video, you’d have a two-minute video.
Mosseri: Because there’s still new tactics, right. Because if you get it down by 90 percent and then you stop working, you should assume that it’s going to grow again.
Ariño de la Rubia: And if we killed 90 percent, but the only false news that we remove is the false news that doesn’t have societal impact, like some story about a celebrity dying or loving donuts, it doesn’t matter, because the 10 percent that we left could be the harmful 10 percent. It’s really not about the numbers. It’s the numbers times the potential for harm times the vectors of possible distribution.
Thompson: I know there are smart people who have looked at this and who say that anyone who thinks fake news changed a single vote is an idiot. And there is an argument that it’s an explanation for why Trump won. Where are you on that spectrum?
Mosseri: I think the important thing to focus on is—take the election completely out of it. It’s still a problem, it’s still important, it still threatens all sorts of things that we value and that people who use our product value, and so we have to address it. And you can argue a lot about whether or not it affected the election; lots of things affect elections. I almost think that whole argument is just a red herring and doesn’t actually…
Thompson: It may be worse than a red herring because it turned Trump against fake news, which turned him against the media ever more.
Mosseri: It got pretty complicated pretty fast. But for us, honestly, it’s a problem. We are responsible for getting the spread of false news on our platform as close to zero as humanly possible, and we’re going to pursue that.
Thompson: Is there anything I missed that we haven’t talked about?
Lyons: One thing that’s important to keep in mind is this is a global challenge, that’s been true forever when it comes to misinformation. But it's certainly true today, and the way in which this problem manifests globally and the tools that we have to fight it globally are in some cases different. And so we all spend a lot more of our time than was represented in this conversation thinking about those components.
Thompson: Are there other elections, like, are you guys currently focused on the Mexico election?
Many voices: All the upcoming elections.
Lyons: But also all the non-election times. Particularly in some countries, outside of an election, misinformation can be just as damaging as anything else, and so we’re very globally focused right now.
Zigmond: Two billion people around the world are counting on us to fix this, and that would be true regardless of what happened in the last election, and so this is something that’s very important to us and that I think we’re going to be working on for a very long time.
Thompson: Thanks, everybody, this was super interesting! I’m so glad you guys all took the time. That was very generous.