BiSAL - A bilingual sentiment analysis lexicon to analyze Dark Web forums for cyber security.

Khalid Al-Rowaily, Muhammad Abulaish, Nur Al Hasan Haldar, Majed A. AlRubaian

Research output: Contribution to journal > Article > peer-review

45 Citations (Scopus)

Abstract

In this paper, we present the development of a Bilingual Sentiment Analysis Lexicon (BiSAL) for the cyber security domain. It consists of a Sentiment Lexicon for ENglish (SentiLEN) and a Sentiment Lexicon for ARabic (SentiLAR), which can be used to develop opinion mining and sentiment analysis systems for bilingual textual data from Dark Web forums. For SentiLEN, a list of 279 sentiment-bearing English words related to cyber threats, radicalism, and conflicts is identified, and a process is devised to unify their sentiment scores obtained from four different sentiment data sets. For SentiLAR, sentiment-bearing Arabic words are identified from a collection of 2000 message posts from the Alokab Web forum, which contains radical content. SentiLAR provides a list of 1019 sentiment-bearing Arabic words related to cyber threats, radicalism, and conflicts, along with their morphological variants and sentiment polarity. For polarity determination, three Arabic language experts perform a semi-automated analysis, and their ratings are combined using aggregate functions. A Web interface is developed to access both lexicons (SentiLEN and SentiLAR) of the BiSAL data set online, and a beta version is available at http://www.abulaish.com/bisal.
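The abstract does not say which aggregate functions are used to combine the three experts' ratings. As a minimal sketch only, the following assumes each expert assigns a polarity of -1, 0, or +1 to a word, and combines the ratings with two common choices: a strict majority vote and an arithmetic mean. The function name, the rating scale, the tie-breaking rule, and the sample words are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from statistics import mean


def aggregate_ratings(ratings):
    """Combine per-expert polarity ratings (-1, 0, +1) for a single word.

    Assumption: this is an illustrative aggregation, not the paper's method.
    """
    counts = Counter(ratings)
    top_label, top_count = counts.most_common(1)[0]
    # Use the majority label only if more than half of the experts agree;
    # otherwise fall back to neutral (0), a hypothetical tie-breaking rule.
    majority = top_label if top_count > len(ratings) / 2 else 0
    return {"majority": majority, "mean": mean(ratings)}


# Hypothetical ratings from three experts for three placeholder words.
ratings_by_word = {
    "word_a": [-1, -1, 0],
    "word_b": [1, 1, 1],
    "word_c": [-1, 0, 1],
}

lexicon = {word: aggregate_ratings(r) for word, r in ratings_by_word.items()}

for word, scores in lexicon.items():
    print(word, scores)
```

The majority vote yields a discrete polarity label, while the mean keeps a graded score that reflects how strongly the experts agree; which of these (if either) matches the published lexicon would need to be checked against the paper itself.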
Original language: English
Pages (from-to): 53-62
Number of pages: 10
Journal: Digital Investigation
Volume: 14
Publication status: Published - 2015
Externally published: Yes
