Abstract
The rapid growth of fake news on social media is a serious concern for our society. Fake news is usually created by manipulating images, text, audio, and video, which indicates the need for a multimodal fake news detection system. Although multimodal fake news detection systems exist, they tend to approach the problem through an additional sub-task, such as an event discriminator or finding correlations across modalities. Their fake news detection results are heavily dependent on the sub-task: in the absence of sub-task training, detection performance degrades by 10% on average. To address this issue, we introduce SpotFake, a multimodal framework for fake news detection. Our proposed solution detects fake news without taking any other sub-task into account. It exploits both the textual and visual features of an article: text features are learned using language models (such as BERT), and image features are learned from VGG-19 pre-trained on the ImageNet dataset. All experiments are performed on two publicly available datasets, Twitter and Weibo. The proposed model outperforms the current state of the art on the Twitter and Weibo datasets by 3.27% and 6.83%, respectively.
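As a rough illustration of the fusion the abstract describes, the sketch below concatenates a text embedding and an image embedding into one vector for binary classification. The feature dimensions (768 for BERT-base, 4096 for a VGG-19 fully connected layer), the projection sizes, and the random weights are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for pre-extracted features (dimensions assumed):
# BERT-base sentence embeddings are 768-d; VGG-19's fc layers are 4096-d.
text_feat = rng.standard_normal(768)
image_feat = rng.standard_normal(4096)

# Project each modality to a common size, then concatenate -- the kind of
# late fusion a two-branch text+image model relies on (sizes illustrative).
W_text = rng.standard_normal((32, 768)) * 0.01
W_img = rng.standard_normal((32, 4096)) * 0.01
fused = np.concatenate([np.tanh(W_text @ text_feat),
                        np.tanh(W_img @ image_feat)])

# A single logistic output unit yields the fake/real probability.
w_out = rng.standard_normal(64) * 0.1
p_fake = 1.0 / (1.0 + np.exp(-(w_out @ fused)))
print(fused.shape)
```

In a trained system the random projections would be learned layers and the two feature extractors would be the actual BERT and VGG-19 networks; the sketch only shows the shape of the fusion.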