DOI: 10.1145/3159652.3159734 (WSDM conference proceedings)
research-article

Leveraging the Crowd to Detect and Reduce the Spread of Fake News and Misinformation

Published: 02 February 2018

ABSTRACT

Online social networking sites are experimenting with the following crowd-powered procedure to reduce the spread of fake news and misinformation: whenever a user is exposed to a story through her feed, she can flag the story as misinformation and, if the story receives enough flags, it is sent to a trusted third party for fact checking. If this party identifies the story as misinformation, it is marked as disputed. However, given the uncertain number of exposures, the high cost of fact checking, and the trade-off between flags and exposures, the above-mentioned procedure requires careful reasoning and smart algorithms which, to the best of our knowledge, do not exist to date. In this paper, we first introduce a flexible representation of the above procedure using the framework of marked temporal point processes. Then, we develop a scalable online algorithm, CURB, to select which stories to send for fact checking and when to do so, in order to efficiently reduce the spread of misinformation with provable guarantees. In doing so, we need to solve a novel stochastic optimal control problem for stochastic differential equations with jumps, which is of independent interest. Experiments on two real-world datasets gathered from Twitter and Weibo show that our algorithm may be able to effectively reduce the spread of fake news and misinformation.
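To make the crowd-powered procedure in the first sentence concrete, the following is a minimal sketch of the simple flag-threshold baseline only. It is not CURB, which the abstract describes as an online algorithm derived from a stochastic optimal control problem. The names `Story`, `record_exposure`, `flag_threshold`, and `fact_check` are assumptions introduced here for illustration and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Story:
    """Per-story counters (hypothetical structure, for illustration only)."""
    story_id: str
    exposures: int = 0
    flags: int = 0
    sent_for_checking: bool = False
    disputed: bool = False


def record_exposure(
    story: Story,
    flagged: bool,
    flag_threshold: int,
    fact_check: Callable[[Story], bool],
) -> bool:
    """Register one exposure and, optionally, one flag from the exposed user.

    Returns True if this call sent the story to the third party.
    `fact_check` stands in for the trusted third party and returns True
    when it judges the story to be misinformation.
    """
    story.exposures += 1
    if flagged:
        story.flags += 1

    # Assumed rule: send the story once, as soon as it has enough flags.
    if not story.sent_for_checking and story.flags >= flag_threshold:
        story.sent_for_checking = True
        story.disputed = fact_check(story)
        return True
    return False


if __name__ == "__main__":
    story = Story("s1")
    # Five exposures, three of which come with a flag.
    for flagged in [False, True, False, True, True]:
        record_exposure(story, flagged, flag_threshold=3,
                        fact_check=lambda s: True)
    print(story)
```

A fixed threshold like this ignores how many more exposures a story is likely to receive and how costly each fact check is, which is the trade-off the abstract says calls for a more careful algorithm.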

