Multimodal Prediction of Suspicious News Types on Twitter
Presentation Type
Poster
Abstract
Social media platforms are a convenient, fast, and popular source of information about current events. About 62% of American adults report using social media such as Twitter to stay informed. Not every news report, however, is vetted and verified before it is posted, and manually separating suspicious accounts from reliable sources is time-consuming and tedious. The goal of this work is therefore to build a combined image and text model that automatically classifies tweets as verified or deceptive news at several levels of granularity. Our research also aims to answer two questions: 1) Can we improve the accuracy of existing text-based methods by including images? 2) Can we identify the qualities that determine the class of a post?
Our model consists of two sub-networks, one for text and one for images, that are trained independently. The two networks are then merged and fed as input to a third neural network. The text sub-network contains an embedding layer followed by a Long Short-Term Memory (LSTM) recurrent neural network. We also extract tweet-level linguistic cues to enrich the inputs to the LSTM layer. These linguistic features encode biased terminology, moral foundation cues, and the subjectivity of a tweet. Our image features are 2048-dimensional vectors extracted with pretrained ImageNet weights. We designed and tested three independent classification tasks: suspicious vs. verified (binary); 4-way classification of clickbait, conspiracy, hoax, and satire; and 3-way classification of verified, disinformation, and propaganda. Experiments are ongoing.
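The merge step described above can be illustrated with a minimal sketch. It is not the poster's implementation. It assumes that each sub-network has already reduced a tweet to a fixed-length vector, that merging means simple concatenation, and that the third network is a single dense softmax layer. The names TASKS, TEXT_DIM, FusionClassifier, and classify are hypothetical, the text vector size of 64 is an arbitrary choice, and the weights are random and untrained, so the predictions carry no meaning. Only the 2048-dimensional image vector and the three label sets come from the abstract.

```python
import math
import random

# Label sets for the three classification tasks named in the abstract.
TASKS = {
    "binary": ["suspicious", "verified"],
    "four_way": ["clickbait", "conspiracy", "hoax", "satire"],
    "three_way": ["verified", "disinformation", "propaganda"],
}

TEXT_DIM = 64     # assumed size of the text sub-network's output vector
IMAGE_DIM = 2048  # image feature size stated in the abstract


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class FusionClassifier:
    """One dense softmax layer over the concatenated text and image vectors."""

    def __init__(self, n_classes, input_dim, seed=0):
        rng = random.Random(seed)
        # Small random weights stand in for trained parameters.
        self.weights = [
            [rng.gauss(0.0, 0.01) for _ in range(input_dim)]
            for _ in range(n_classes)
        ]
        self.biases = [0.0] * n_classes

    def predict_proba(self, text_vec, image_vec):
        # Merge step: concatenate the two sub-network outputs.
        fused = list(text_vec) + list(image_vec)
        scores = [
            sum(w * x for w, x in zip(row, fused)) + bias
            for row, bias in zip(self.weights, self.biases)
        ]
        return softmax(scores)


def classify(task, text_vec, image_vec, seed=0):
    """Return the most probable label for a task and the full distribution."""
    labels = TASKS[task]
    model = FusionClassifier(len(labels), len(text_vec) + len(image_vec), seed)
    probs = model.predict_proba(text_vec, image_vec)
    return labels[probs.index(max(probs))], probs
```

In this sketch the same fused vector serves all three tasks and only the size of the output layer changes, which is one plausible reading of "three independent classification tasks".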
Start Date
10-5-2018 12:00 PM
End Date
10-5-2018 2:00 PM
Genre/Form
posters
Subjects - Topical (LCSH)
Online social networks--Research; Information resources; Mass media--Objectivity; Social media--Research
Type
Event
Format
application/pdf
Language
English