Social media plays a vital role in information sharing during disasters. Unfortunately, the overwhelming volume and variety of data generated on social media make it challenging to sift through such content manually and determine its relevance. Most automated approaches for classifying crisis data by relevance are based on classic statistical features. However, such approaches do not adapt well when applied to a new crisis event or to a language that the model was not trained on. In crisis situations, training a new model for each particular crisis or language is not viable. In this paper, we introduce a hybrid semantic-statistical approach for classifying data by its relevance to a given crisis. We demonstrate how this approach outperforms the baselines in scenarios where the model is trained on one type of crisis and language and tested on new crisis types and additional languages.
               