research-article

Leveraging the Crowd to Detect and Reduce the Spread of Fake News and Misinformation

Authors:
Jooyeon Kim

Korea Advanced Institute of Science and Technology, Daejeon, South Korea

Behzad Tabibian

Max Planck Institute for Intelligent Systems & Max Planck Institute for Software Systems, Stuttgart, Germany

Alice Oh

Korea Advanced Institute of Science and Technology, Daejeon, South Korea

Bernhard Schölkopf

Max Planck Institute for Intelligent Systems, Stuttgart, Germany

Manuel Gomez-Rodriguez

Max Planck Institute for Software Systems, Kaiserslautern, Germany


WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, February 2018, Pages 324–332. https://doi.org/10.1145/3159652.3159734

Published: 02 February 2018

ABSTRACT

Online social networking sites are experimenting with the following crowd-powered procedure to reduce the spread of fake news and misinformation: whenever a user is exposed to a story through her feed, she can flag the story as misinformation and, if the story receives enough flags, it is sent to a trusted third party for fact checking. If this party identifies the story as misinformation, it is marked as disputed. However, given the uncertain number of exposures, the high cost of fact checking, and the trade-off between flags and exposures, the above-mentioned procedure requires careful reasoning and smart algorithms which, to the best of our knowledge, do not exist to date. In this paper, we first introduce a flexible representation of the above procedure using the framework of marked temporal point processes. Then, we develop a scalable online algorithm, CURB, to select which stories to send for fact checking and when to do so to efficiently reduce the spread of misinformation with provable guarantees. In doing so, we need to solve a novel stochastic optimal control problem for stochastic differential equations with jumps, which is of independent interest. Experiments on two real-world datasets gathered from Twitter and Weibo show that our algorithm may be able to effectively reduce the spread of fake news and misinformation.
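As an illustration of the flag-count procedure described in the first sentences of the abstract, the following minimal sketch simulates a single story. It is not the CURB algorithm, whose details are not given on this page. It assumes that exposures arrive as a homogeneous Poisson process and that each exposure produces a flag independently with a fixed probability. The function name `simulate_threshold_flagging` and all of its parameters are hypothetical.

```python
import random


def simulate_threshold_flagging(exposure_rate, flag_prob, flag_threshold,
                                horizon, seed=0):
    """Simulate one story under a fixed flag-count rule (illustrative only).

    Assumptions (not taken from the paper): exposures follow a homogeneous
    Poisson process with rate `exposure_rate`, and every exposure is flagged
    independently with probability `flag_prob`. The story is sent for fact
    checking as soon as it has collected `flag_threshold` flags.
    """
    rng = random.Random(seed)
    t = 0.0
    exposures = 0
    flags = 0
    while True:
        # Inter-arrival times of a Poisson process are exponential.
        t += rng.expovariate(exposure_rate)
        if t > horizon:
            # The observation window ended before enough flags arrived.
            return {"sent_at": None, "exposures": exposures, "flags": flags}
        exposures += 1
        if rng.random() < flag_prob:
            flags += 1
            if flags >= flag_threshold:
                # Enough flags: the story goes to the fact checker now.
                return {"sent_at": t, "exposures": exposures, "flags": flags}


result = simulate_threshold_flagging(
    exposure_rate=10.0, flag_prob=0.05, flag_threshold=3, horizon=24.0)
print(result)
```

The number of exposures accumulated before the story is sent reflects the trade-off between flags and exposures that the abstract mentions: a higher threshold saves fact-checking effort but lets the story reach more users first.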


Published in
WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining
February 2018
821 pages
ISBN:9781450355810
DOI:10.1145/3159652
General Chairs: Yi Chang (Jilin University, Huawei Inc.), Chengxiang Zhai (University of Illinois Urbana-Champaign)
Program Chairs: Yan Liu (University of Southern California), Yoelle Maarek (Amazon)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
crowdsourcing
fact-checking
fake news
misinformation
social networking sites
stochastic differential equation
stochastic optimal control
temporal point processes
Acceptance Rates
WSDM '18 paper acceptance rate: 81 of 514 submissions (16%). Overall acceptance rate: 498 of 2,863 submissions (17%).