National Security and Social Media - International Communication | Học viện Ngoại giao Việt Nam


This item was submitted to Loughborough's Research Repository by the author.
Items in Figshare are protected by copyright, with all rights reserved, unless otherwise indicated.
National security and social media monitoring: a presentation of the emotive
and related systems
PLEASE CITE THE PUBLISHED VERSION
http://dx.doi.org/10.1109/EISIC.2013.38
PUBLISHER
© IEEE
VERSION
AM (Accepted Manuscript)
PUBLISHER STATEMENT
This work is made available according to the conditions of the Creative Commons Attribution-NonCommercial-
NoDerivatives 4.0 International (CC BY-NC-ND 4.0) licence. Full details of this licence are available at:
https://creativecommons.org/licenses/by-nc-nd/4.0/
LICENCE
CC BY-NC-ND 4.0
REPOSITORY RECORD
Sykora, Martin D., Thomas Jackson, Ann O'Brien, and Suzanne Elayan. 2019. “National Security and Social
Media Monitoring: A Presentation of the Emotive and Related Systems”. figshare.
https://hdl.handle.net/2134/18767.
23:42 29/7/24
National Security and Social Media Monitoring - A Presentation of the Emotive a…
about:blank
1/5
National Security and Social Media Monitoring
A Presentation of the EMOTIVE and Related Systems
Martin D. Sykora*, Thomas W. Jackson, Ann O’Brien, Suzanne Elayan
Information Science Department
Loughborough University
Loughborough, United Kingdom
*M.D.Sykora@lboro.ac.uk
Abstract—Today social media streams, such as Twitter,
represent vast amounts of ‘real-time’ daily streaming data.
Topics on these streams cover the full range of human
communication, from banal banter to serious reactions
to events and information sharing regarding any imaginable
product, item or entity. It has now become the norm for news of
publicly visible events to break on social media streams first, and
only then to be picked up by mainstream media.
It has been suggested in the literature that social media are a valid,
valuable and effective real-time tool for gauging subjective public
reactions to events and entities. Due to the vast volume of data
generated daily on social media streams, monitoring
and gauging public reactions has to be automated and, above all,
scalable, i.e. monitoring by human experts is generally infeasible.
In this paper the EMOTIVE system, a project funded jointly by
the DSTL (Defence Science and Technology Laboratory) and
EPSRC, which focuses on monitoring fine-grained emotional
responses relating to events of national security importance, will
be presented. Similar systems for monitoring national security
events are also presented and the primary traits of such national
security social media monitoring systems are introduced and
discussed.
Keywords—social media monitoring; national security;
information retrieval; natural language processing; Twitter
I. INTRODUCTION
User generated content comes in many forms; however,
vast quantities of informal text based messages shared by users
of social media applications, such as on micro-blogging
applications, present an exciting and potentially highly
valuable live stream of opinionated, emotional and informative
datasets. For instance, the crisis response community has used
Twitter and similar social media with some success to help deal
with crisis management during conflicts, natural or manmade
disasters [1, 2]. Numerous businesses are now monitoring
social media in order to gauge their brand’s popularity and that
of their products [3]. Governments and law enforcement
organisations are now also actively seeking ways to monitor
and anticipate violence, and to analyse public response to
various events of national security importance [4]. References
[5] and [6] point out the importance of gauging the public
response to terrorism events from social media, and specifically
highlight the importance of automatic sentiment detection in
Tweets. There is also a need to geo-locate tweets and organise
the vast social streams in order to provide the most relevant
output in a user interface for monitoring tasks [7]. In [8] social
media monitoring was presented as a useful new source for
situational awareness during terrorism events; however, the
authors were careful to point out the ethical issues with sharing
on Twitter: as they showed, oversharing during times of terror
can aid the attackers’ agenda. In this paper we
review several recent social-media monitoring systems from
the literature, in the security domain. The primary elements of
these systems are briefly presented, a gap in sentiment analysis
capabilities is highlighted and our own EMOTIVE system, a
project funded by DSTL (United Kingdom based Defence
Science and Technology Laboratory) and EPSRC, which
predominantly focused on fine-grained emotion extraction, is
briefly introduced within this context.
The remainder of the paper is structured as follows. Section
II introduces some background considerations for social stream
monitoring systems. Section III compiles the essential features
expected from most national security monitoring systems, and
subsequently reviews five specific monitoring systems from the
literature. The EMOTIVE system, developed by the authors, is
introduced in Section IV, and the paper is concluded in Section V.
II. BACKGROUND
Social media is often the first source to break the news,
only then followed by mainstream news outlets. This was the
case as early as the 2008 Mumbai attacks, where
individuals on location broke the news via Twitter [9], and in the
July 2009 Jakarta bombings, where Twitter also broke the news
[10]. Interestingly, it was found that even earthquakes of
seismic intensity scale 3 or more were reported more quickly
by Twitter users than by the relevant Japanese agencies
[11]. Social streams such as Twitter were also found
to be a valuable source for gauging or polling public opinion
during relatively uneventful time periods [12]. Unfortunately,
there is a very wide range of different content and styles of
messages communicated on Twitter [13], which requires
selective interpretation of the messages. Nevertheless, the
potential value of social media streams is evident from
the literature highlighted above. There is, however, a need for
automated tools and techniques to monitor and help analyse the
vast social-media content [14]. Automated analysis and
monitoring are necessary in order to deal with the enormous
amounts of social-media content being generated every day, as
purely human-based monitoring is not feasible. In contrast to
product-, service- and business-related social media monitoring
applications [15], throughout this paper we focus on the needs
of monitoring and analysis applications relevant to national
security, a domain understood here to encompass unplanned natural
or manmade disasters, terrorist attacks, violent incidents, and
scheduled, expected events of some national significance.
III. PRIMARY ELEMENTS OF A MONITORING SYSTEM
In order for a system to facilitate monitoring of a certain
event or entity of interest, efficient extraction of messages,
geo-location, emotion evaluation, clustering and organization
of the tweet messages, and an intuitive user-interface are
necessary [16, 14]. Specifically, in the field of terrorism
informatics, [5] suggested a framework for tweet filtering and
extraction, followed by sentiment detection, exploration of
demographics and, finally, reporting and presentation of the
enriched tweet datasets. We argue, based on prior work [16,
17, 7, 18, 19, 20], that a system design which integrates and
unifies different techniques to facilitate effective monitoring in
various situations should hence be composed of at least the
following system steps:
1. Keyword / Keyphrase monitoring or first event
detection, filtering and extraction
2. Accurate geo-location detection
3. Emotion detection and evaluation
4. Tone of tweet message detection, further semantic
enrichment and organisation
5. User-interface visualisation
The idea behind these system steps is that in step 1, sparse
text messages are monitored and retrieved, and some spam is
filtered out at this early stage. Some approaches require manual
input of keywords to monitor; others infer them automatically.
For instance, in [5] keywords were based on automated named
entity recognition applied to regularly re-checked trending Twitter
topics. On the other hand, in [21] monitoring was performed on
an unfiltered stream of tweets (the tweet firehose), using first
story detection. Essentially, tweets that are similar are grouped
together using a nearest-neighbour or clustering
technique; to allow scalability to “Big Data”,
Locality-Sensitive Hashing was used, which appears to be a
feasible solution to the problem. In steps 2, 3 and 4 further
analysis substantially enriches the retrieved text messages by
providing context through extracting location details,
communicated emotions, and various features from the tone of
the messages, which can be used to help organize the tweets.
These three steps can also be understood to simply be steps in
automated semantic enrichment [22]. The amount of semantic
enrichment may vary from system to system, but for instance
in the EMOTIVE project (section IV) the dominant focus was
step 3, i.e. fine-grained emotion detection, although some geo-
location and simple tone detection was also performed in
EMOTIVE. The work in [23] provides further support for why
this kind of information and the resulting situational awareness
is very useful in security incident monitoring. In step 5, the
visualisation, possibly with a faceted user interface, allows the analyst to
explore the enriched sparse text message data (steps 2, 3 and 4).
Thanks to step 2, map-based visualisations become possible,
with the option of overlaying various dimensions of
information on a map interface. All system steps discussed
must be adapted to (near) real-time monitoring scenarios, and
require the use of efficient algorithms and data structures to
handle big data, irrespective of the parallelisation architectures and
scalable frameworks that could speed up the techniques in their
own right.
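The grouping of similar tweets with Locality-Sensitive Hashing described above can be illustrated with a minimal sketch. It assumes a sign-of-random-projection signature over a bag of words, which is one common LSH family for cosine similarity and is not claimed to be the exact scheme used in [21]. The names `signature`, `bucket_tweets`, `hamming` and `NUM_BITS` are hypothetical and chosen only for illustration.

```python
import hashlib
from collections import Counter, defaultdict

NUM_BITS = 16  # signature length; an illustrative choice, not taken from the paper


def token_weight(token: str, bit: int) -> int:
    """Deterministic pseudo-random +1/-1 weight for a (token, bit) pair."""
    digest = hashlib.md5(f"{bit}:{token}".encode("utf-8")).digest()
    return 1 if digest[0] & 1 else -1


def signature(text: str, num_bits: int = NUM_BITS) -> int:
    """Hash a tweet to a bit signature; each bit is the sign of one projection."""
    counts = Counter(text.lower().split())
    sig = 0
    for bit in range(num_bits):
        projection = sum(n * token_weight(tok, bit) for tok, n in counts.items())
        if projection >= 0:
            sig |= 1 << bit
    return sig


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")


def bucket_tweets(tweets, num_bits: int = NUM_BITS):
    """Group tweets by signature; each bucket holds near-duplicate candidates."""
    buckets = defaultdict(list)
    for tweet in tweets:
        buckets[signature(tweet, num_bits)].append(tweet)
    return dict(buckets)
```

Tweets that share a bucket are only candidates for being similar; a fuller system would typically use several independent hash tables and then compare candidates directly, so that each incoming tweet is checked against a small bucket rather than the whole stream.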
In the rest of this section a brief overview of five systems
for monitoring crisis, national security or emergency
situations using Twitter is presented. The systems are
relatively recent efforts and some of them are still being
developed. CrisisTracker [16] organises Twitter messages
into stories, categorises them by type of incident and
displays them on a geographic map (geo-map), although much
of its functionality is supported by crowdsourcing, i.e. human
users annotating content, rather than by an automated algorithm
performing the tasks. It is currently being developed for use in
the crisis management and monitoring community, as part of a
PhD thesis, and an experimental live deployment of the
system was completed in September 2012, in which crisis
mapping volunteers used the system to track incidents in the
Syrian conflict. Crisees [17] is a prototype system for
aggregating social media content for crisis monitoring; its
basic capabilities are the extraction and monitoring of multiple
events from social media, a geo-map visualisation and a
basic lexicon-based emotion (sentiment polarity)
classification. SensePlace2 [7] is being designed for the crisis
management community, and the system essentially allows
searching by place (geo-map), theme and time. Detection of
themes was aided by named entity recognition (NER) on
tweet content. Location mentions in messages were also
extracted to place non-geo-tagged tweets geographically.
Swiftriver [18, 19] is Ushahidi’s semi-automatic linguistic text
analysis sub-system (i.e. a large portion of the processing is
conducted by human operators). It is still in development
(currently in beta), and no relevant evaluation of the system has
been published, other than some initial material and open-sourced
code (see footnote 1). SiLCC is the main module for performing some of the
automated linguistic text analysis; however, the interface itself
is mostly concerned with establishing veracity of messages and
sources, with the possibility for Swiftriver to be integrated with
an Ushahidi instance. Twitcident [20] is developed in co-operation
with the Netherlands’ emergency services to provide
contextual social media information for significant reported
emergency incidents, which may have a public dimension and
hence may be of interest for monitoring purposes, as people on
Twitter may provide information on casualties, risks or
damage. The system’s main focus is not necessarily a geo-map
display (although available in a secondary view) but rather
an event-specific display of Tweets that is searchable by a
faceted user-interface, with further statistics on tweets
matching the search criteria. Table I highlights the main
features of the above-mentioned systems, where the features of
each system are grouped under the following headings: (1)
extraction of messages, (2) geo-location, (3) emotion evaluation,
(4) clustering and organisation of tweet messages and (5) the
user interface.
Footnote 1: Code available on GitHub: https://github.com/ushahidi/silcc;
Swiftriver sites: http://swiftly.org/
TABLE I. FIVE TWITTER MONITORING SYSTEMS (PUBLISHED IN ACADEMIC LITERATURE)
System: Features
CrisisTracker [16]
-EXTRACTION: Manual search-terms and their subsequent monitoring, and automated breaking-news detection
-LOCATION: Geo-tagged Tweets only; consideration is given to users on the ground, i.e. users on location are highlighted by other users (crowdsourced).
-EMOTION: n/a
-ORGANISATION: Automated clustering of messages into stories. Crowdsourced message / entity annotation by humans.
-UI: Geo-map interface, faceted search, colour-coded message / event types, and master-detail view of stories and messages.
Crisees [17]
-EXTRACTION: Manual search-terms entry and subsequent monitoring
-LOCATION: Geo-tagged Tweets only (lat/long tagged)
-EMOTION: Lexicon-based sentiment word matching, (+)ve/(-)ve
-ORGANISATION: Simple list
-UI: Geo-map interface, individual tweet message display colour-coded by emotion, and display of images.
SensePlace2 [7]
-EXTRACTION: Manual search-terms entry and subsequent monitoring
-LOCATION: Geo-tagged (lat/long) Tweets, and location inferred from user profiles or the tweet messages themselves (using GeoNames); the Tweak the Tweet (TtT) standard is mentioned, although it is not clear whether it is implemented.
-EMOTION: n/a
-ORGANISATION: Based on the principle of Time, Theme and Location.
-UI: A structured survey of professionals in crisis response was conducted to guide the UI design.
Swiftriver (Ushahidi) [18, 19]
-EXTRACTION: -
-LOCATION: Gazetteer lists of place-name locations
-EMOTION: n/a
-ORGANISATION: Trustworthy sources are determined by trustworthiness ratings, and context is added by tagging.
-UI: Ushahidi’s UI consists of a map as the main search interface for events, with the option of selecting the type of event.
Twitcident [20]
-EXTRACTION: Search-terms based on an incident profile (e.g. locations, persons) from emergency broadcasting services (incidents requiring public emergency services to take action). Incident profiles (search-terms) are continuously updated to adapt to topic changes.
-LOCATION: Geo-tagged (lat/long) Tweets, location gazetteer lists with NER (Named Entity Recognition), and the stated Twitter profile location
-EMOTION: n/a
-ORGANISATION: Messages are classified into reports about casualties, damages or risks; classification of tone, i.e. question or answer; explicit message filtering is employed
-UI: Faceted search interface
IV. THE EMOTIVE SYSTEM
EMOTIVE (Extracting the Meaning of Terse Information
in a Geo-Visualisation of Emotion) is a project within which
social media monitoring techniques targeted for the national
security domain were developed, as part of a prototype system.
The dominant focus of the work was to develop a novel
ontology engineering and natural language processing (NLP)
approach for fine-grained emotion extraction. A wide spectrum
of explicit emotions is expressed on Twitter; however,
existing approaches do not capture this richness of emotional
expression [24, 25], hence our contribution in this area. Table I
also highlights a gap in the emotion analysis capabilities of
existing systems. Geo-location techniques were also explored
in EMOTIVE, as only 1%-1.6% of tweets tend to contain
explicit geo-coordinate metadata, which makes map-based
visualisations that rely on geo-tags alone impractical. The techniques within the system
were specifically targeted at the United Kingdom, since
numerous instances of localised slang are contained in the
ontology. With respect to the system features discussed so far,
EMOTIVE has tackled the primary features as follows.
Extraction: Manual search-terms entry and subsequent
monitoring, although the system provides keyword suggestions
to the analyst. Location: Geo-tagged tweets, Tweak the Tweet (TtT)
standard recognition, location check-in services and location
gazetteer lookup. Emotion: Fine-grained emotion recognition
from sparse text, specifically anger, confusion, disgust, fear,
happiness, sadness, shame and surprise, together with an emotional
expression strength score for each identified emotion.
Organisation: Tweets are organised based on time, location
and emotion, where semantically related emotions and
synonyms are dealt with. UI: An interactive and dynamic geo-
map based visualisation, overlaid with topic and emotion
information, and employing colour-coded tag-clouds.
Due to space constraints only some EMOTIVE details are
briefly presented below; a full description, with example
ontology terms and detailed evaluation and benchmarking
results, is available in [26].
Keyword suggestions for monitoring were based on the most
frequently encountered hashtags in a set of tweets, and a unigram
language model (trained on a subset of the 1-trillion-word
Google n-gram corpus) segmented multi-word hashtags into their
constituent words for ease of viewing, e.g. #brutalviolentrioting
→ brutal, violent, rioting. Tweets are also indirectly filtered to
only the ones that contain explicit expressions of emotion,
which seems to be relatively effective in filtering out obvious
spam and hijacked hashtag tweets.
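The hashtag segmentation step can be sketched as a dynamic programme that picks the split with the highest unigram log-probability. This is a minimal illustration under stated assumptions: the counts in `UNIGRAM_COUNTS` are invented toy values standing in for a model trained on a large n-gram corpus, and the length-based penalty for unknown words is one simple heuristic rather than the approach used in EMOTIVE. All names are hypothetical.

```python
import math

# Toy unigram counts (invented for illustration only).
UNIGRAM_COUNTS = {
    "brutal": 50, "violent": 80, "rioting": 30, "riot": 60,
    "riots": 40, "in": 1000, "london": 200,
}
TOTAL = sum(UNIGRAM_COUNTS.values())
MAX_WORD_LEN = 20  # longest candidate word considered


def word_logprob(word: str) -> float:
    """Log-probability of a word; unknown words are penalised by length."""
    count = UNIGRAM_COUNTS.get(word)
    if count is None:
        return math.log(1.0 / (TOTAL * 10 ** len(word)))
    return math.log(count / TOTAL)


def segment(hashtag: str) -> list:
    """Split a hashtag into its most probable sequence of words."""
    text = hashtag.lstrip("#").lower()
    n = len(text)
    # best[i] = (score, words) for the best segmentation of text[:i]
    best = [(0.0, [])] + [(float("-inf"), [])] * n
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            word = text[start:end]
            score = best[start][0] + word_logprob(word)
            if score > best[end][0]:
                best[end] = (score, best[start][1] + [word])
    return best[n][1]


print(segment("#brutalviolentrioting"))
```

With these toy counts, the whole word "rioting" scores higher than the split "riot" + "in" + "g", because the unknown fragment "g" carries a heavy penalty.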
Several approaches were explored for geo-locating tweets that are not
geo-tagged. The GeoNames database of cities and towns
in the UK was used as a gazetteer list to look up any place-name
mentions. The TtT standard’s geo-location convention was
found to be less helpful, as it was practically unused in the
Twitter dataset that was analysed (see next paragraph for
information on the dataset used). Geo-inference from
related geo-services such as Foursquare, similar to the work in
[27], was more frequent; however, the share of users of such
services in our dataset was small (less than 1%), and hence the
approach did not yield a significant improvement.
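The gazetteer lookup for place-name mentions can be sketched as a longest-match scan over the tweet's tokens. The three-entry `GAZETTEER` below is a hypothetical stand-in for the GeoNames list of UK cities and towns, with approximate coordinates, and `find_places` is an illustrative name rather than part of EMOTIVE.

```python
import re

# Hypothetical miniature gazetteer: lower-cased place name -> approximate (lat, lon).
GAZETTEER = {
    "london": (51.5074, -0.1278),
    "loughborough": (52.7721, -1.2062),
    "milton keynes": (52.0406, -0.7594),
}
MAX_NAME_TOKENS = max(len(name.split()) for name in GAZETTEER)


def find_places(text: str) -> list:
    """Return (place, coordinates) pairs for gazetteer names found in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so multi-word names win.
        for n in range(min(MAX_NAME_TOKENS, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in GAZETTEER:
                matches.append((candidate, GAZETTEER[candidate]))
                i += n
                break
        else:
            i += 1  # no place name starts at this token
    return matches


print(find_places("Riots near Milton Keynes and London tonight"))
```

A plain lookup of this kind cannot tell a place name from an ordinary word with the same spelling (for example "Reading"), so a practical system would need some disambiguation on top of it.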
The main aspect of EMOTIVE was the development of an
ontology which semantically represents eight basic, fine-grained
emotions. We based our emotions on Ekman’s widely
accepted basic emotions. Others, e.g. [28, 29], have
suggested extending the set of emotions used by Ekman, for
various reasons. In our ontology we introduced confusion and
shame in addition to Ekman’s six emotions, as we found shame
to be a highly prevalent emotion on Twitter, and confusion
seemed to be a relevant emotional indicator of situational
awareness, which as argued in [8] is important during terrorist
incidents. The EMOTIVE ontology contains well over 300
emotional terms. In addition to emotions, it covers negations,
intensifiers, conjunctions and interjections, and contains
information on the perceived strength (also known as activation
level) of individual emotions, on whether individual terms or
phrases are slang or standard English, and on their
associated POS (part-of-speech) tags, where this helps to
resolve ambiguity. More details on our custom POS tagger and
the emotion extraction evaluation and benchmarking are
available in [26]. The study of language containing emotional
expressions, required for the construction of the ontology, was
performed by a PhD-level research associate (RA) in English
language and literature, with training in linguistics and discourse
analysis, over a three-month period. In order to develop the
ontology the RA sifted through around 600 MB of cleaned
tweets from 63 Twitter datasets, each based on a different
UK-specific search term or event (collected using the Twitter Search API). This
manual analysis focused on identifying commonly used
explicit expressions of emotion.
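How emotion terms, intensifiers and negations might interact can be shown with a small lexicon-based sketch. It is an illustration under stated assumptions, not the EMOTIVE implementation: the terms and strength values are invented, the two-token look-back window is an arbitrary simplification, and the ontology described above additionally uses POS tags, conjunctions and interjections, which are omitted here.

```python
import re

# Hypothetical toy lexicon: term -> (emotion, strength). Values are invented.
EMOTION_TERMS = {
    "furious": ("anger", 3), "annoyed": ("anger", 1),
    "scared": ("fear", 2), "terrified": ("fear", 3),
    "gutted": ("sadness", 2),  # UK slang
    "ashamed": ("shame", 2),
    "confused": ("confusion", 2),
    "happy": ("happiness", 2),
}
INTENSIFIERS = {"very": 1.5, "really": 1.5, "so": 1.5, "slightly": 0.5}
NEGATIONS = {"not", "never", "no"}
WINDOW = 2  # how many preceding tokens may modify an emotion term


def detect_emotions(text: str) -> dict:
    """Return a mapping of emotion -> accumulated strength score for one message."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = {}
    for i, token in enumerate(tokens):
        if token not in EMOTION_TERMS:
            continue
        emotion, strength = EMOTION_TERMS[token]
        preceding = tokens[max(0, i - WINDOW):i]
        if any(word in NEGATIONS for word in preceding):
            continue  # a negated expression is simply skipped in this sketch
        score = float(strength)
        for word in preceding:
            score *= INTENSIFIERS.get(word, 1.0)
        scores[emotion] = scores.get(emotion, 0.0) + score
    return scores


print(detect_emotions("I am really scared and so confused"))
```

Skipping negated terms is the simplest possible policy; whether a negated emotion should be dropped, inverted or mapped to another emotion is a design decision that this sketch does not attempt to settle.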
Finally the organisation of tweets was performed mostly at
the UI stage, as is the case with other systems [20] where
tweets are viewed in a faceted interface. Similar and extreme
emotions, and tweets from specific locations are grouped
together in a faceted UI. These semantically enriched tweets
are also displayed as points on an interactive geo-map.
V. CONCLUSION
Several systems that show substantial promise in the
security and crisis monitoring area were discussed and an
attempt at identifying significant features shared by some of
these tools was made. It was shown that existing systems and
prototypes have not focused on emotion analysis. Given this
gap in sentiment analysis capabilities, this paper presented the
EMOTIVE system as a national security and emergency
monitoring tool for analysing emotional public response, to help
automate tedious and inefficient processes in current
monitoring tasks, especially fine-grained emotion monitoring.
REFERENCES
[1] S. Kumar, G. Barbier, M. A. Abbasi, H. Liu, "TweetTracker: An
Analysis Tool for Humanitarian and Disaster Relief," Proceedings of the
Fifth International Conference on Weblogs and Social Media, 2011.
[2] A. Tapia, K. Bajpai, B. Jansen, J. Yen and L. Giles, "Seeking the
trustworthy tweet: Can microblogged data fit the information needs of
disaster response and humanitarian relief organizations," Proceedings of
the 8th International ISCRAM Conference, 2011.
[3] A. Tollinen, J. Jarvinen and H. Karjaluoto, "Social Media Monitoring in
the industrial Business to Business Sector," World Journal 2 (4), pp. 65-
76, 2012.
[4] D. Preotiuc-Pietro, S. Samangooei, T. Cohn, N. Gibbins and M.
Niranjan, "Trendminer: An Architecture for Real Time Analysis of
Social Media Text," Proceedings of the Sixth International AAAI
Conference on Weblogs and Social Media, 2012.
[5] M. Cheong and V. C. S. Lee, "A microblogging-based approach to
terrorism informatics: Exploration and chronicling civilian sentiment
and response to terrorism events via Twitter," Journal of Information
Systems Frontiers – Springer 13 (1), pp. 45-59, 2011.
[6] K. Glass and R. Colbaugh, "Estimating the sentiment of social media
content for security informatics applications," Security Informatics Vol.
1(1), pp. 1-16, 2012.
[7] A. MacEachren, A. Jaiswal, A. Robinson, S. Pezanowski, A. Savelyev,
P. Mitra, "SensePlace2: GeoTwitter analytics support for situational
awareness," Proceedings of the Visual Analytics Science and
Technology (VAST) IEEE Conference, 2011.
[8] O. Oh, M. Agrawal and H. Rao, "Information control and terrorism:
Tracking the Mumbai terrorist attack through twitter," Information
Systems Frontiers Vol. 13, pp. 33-43, 2011.
[9] C. Beaumont, "Mumbai attacks: Twitter and Flickr used to break news,"
The Daily Telegraph, 27th Nov. 2008.
[10] P. Cashmore, "Mashable: Jakarta bombings - Twitter user first on the
scene," Available from http://mashable.com/2009/07/16/jakarta-
bombings-twitter/, 16th Jul. 2009.
[11] T. Sakaki, M. Okazaki and Y. Matsuo, "Earthquake shakes Twitter
users: real-time event detection by social sensors," Proceedings of the
19th international conference on World Wide Web, 2010.
[12] B. O’Connor, R. Balasubramanyan, B. Routledge and N. Smith, "From
Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series,"
Proceedings of the Fourth International AAAI Conference on Weblogs
and Social Media, 2010.
[13] P. Andre, M. Bernstein and K. Luther, "Who gives a tweet?: evaluating
microblog content value," Proceedings of the ACM 2012 conference on
Computer Supported Cooperative Work, 2012.
[14] F. Johansson, J. Brynielsson and M. Quijano, "Estimating Citizen
Alertness in Crises Using Social Media Monitoring and Analysis,"
Proceedings of the European Intelligence and Security Informatics
Conference (EISIC), 2012.
[15] A. Tollinen, J. Jarvinen and H. Karjaluoto, "Social Media Monitoring in
the industrial Business to Business Sector," World Journal, 2 (4), pp. 65-
76, 2012.
[16] J. Rogstadius, V. Kostakos, J. Laredo and M. Vukovic, "A real-time
social media aggregation tool: Reflections from five large-scale events,"
Proceedings of the European Conference on Computer-Supported
Cooperative Work (ECSCW), 2011.
[17] D. Maxwell, S. Raue, L. Azzopardi, C. Johnson and S. Oates, "Crisees:
Real-time monitoring of social media streams to support crisis
management," Advances in Information Retrieval, 2012.
[18] Ushahidi, "Explaining Swift River," Available from
http://blog.ushahidi.com/index.php/2009/04/09/explaining-swift-river/,
Last Accessed 1st October 2012, 2009.
[19] Ushahidi, "About: Introduction and History," Available from
http://www.ushahidi.com/about, Last Accessed 1st August 2012, 2012.
[20] F. Abel, C. Hauff, G. Houben, R. Stronkman and K. Tao, "Semantics +
Filtering + Search = Twitcident Exploring Information in Social Web
Streams," Proceedings of the 23rd ACM International Conference on
Hypertext and Social Media, 2012.
[21] S. Petrović, M. Osborne and V. Lavrenko, "Streaming first story
detection with application to Twitter," Human Language Technologies:
The 2010 Annual Conference of the North American Chapter of the
Association for Computational Linguistics, 2010.
[22] S. Bird, E. Klein and E. Loper, "Natural Language Processing with
Python," O'Reilly Publishers, 2009.
[23] H. Chen, Y. Zhou, E. Reid and C. Larson, "Special issue on
terrorism informatics," Information Systems Frontiers, 2010.
[24] M. Thelwall, K. Buckley and G. Paltoglou, "Sentiment Strength
Detection for the Social Web," Journal of the American Society for
Information Science and Technology 63, pp. 163-173, 2012.
[25] M. Choudhury and S. Counts, "The Nature of Emotional Expression in
Social Media: Measurement, Inference and Utility," Technical Report:
Microsoft, 2012.
[26] M. Sykora, T. W. Jackson, A. O’Brien and S. Elayan, "EMOTIVE
Ontology: Extracting fine-grained emotions from terse, informal
messages," IADIS Intelligent Systems and Agents Conference, 2013.
[27] Y. Ikawa, M. Enoki and M. Tatsubori, "Location inference using
microblog messages," Proceedings of the 21st international conference
companion on World Wide Web, pp. 687-690, 2012.
[28] R. Plutchik, "Emotion: A Psychoevolutionary Synthesis," Longman
Higher Education, 1980.
[29] J. Sabini and M. Silver, "Ekman's basic emotions: Why not love and
jealousy?," Cognition and Emotion 19 (5), pp. 693-712, 2005.
[30] K. Gimpel, N. Schneider, B. O'Connor, D. Das, D. Mills, J. Eisenstein,
"Part-of-speech tagging for twitter: Annotation, features, and
experiments," Technical Report, 2010.
[31] A. Ritter, S. Clark, Mausam and O. Etzioni, "Named Entity Recognition
in Tweets: An Experimental Study," Proceedings of Conference on
Empirical Methods in NLP, 2011.
[5] and [6] point out the importance of gauging the public
the highlighted literature above. However, there is a need for
response to terrorism events from social media, and specifically
automated tools and techniques to monitor and help analyse the
highlight the importance of automatic sentiment detection in
vast social-media content [14]. Automated analysis and
Tweets. There is also a need to geo-locate tweets and organise
monitoring is necessary in order to deal with the enormous
the vast social streams in order to provide the most relevant
amounts of social-media content being generated every day, as about:blank 2/5 23:42 29/7/24
purely human based monitoring is not feasible. As opposed to product, services and business related social media monitoring applications [15], throughout this paper we focus on the needs of monitoring and analysis applications relevant to national security, which are understood to encompass unplanned natural / manmade disasters, terrorist attacks, violent incidents, and scheduled expected events with some national significance.

III. PRIMARY ELEMENTS OF A MONITORING SYSTEM

In order for a system to facilitate monitoring of a certain event or entity of interest, efficient extraction of messages, geo-location, emotion evaluation, clustering and organization of the tweet messages, and an intuitive user-interface are necessary [16, 14]. Specifically in the field of terrorism informatics, [5] suggested a framework for tweet filtering and extraction, followed by sentiment detection, exploration of demographics and finally, reporting and presentation of the enriched tweet datasets. We argue, based on prior work [16, 17, 7, 18, 19, 20], that a system design which integrates and unifies different techniques to facilitate effective monitoring in various situations should hence be composed of at least the following system steps:

1. Keyword / Keyphrase monitoring or first event detection, filtering and extraction
2. Accurate geo-location detection
3. Emotion detection and evaluation
4. Tone of tweet message detection, further semantic enrichment and organisation
5. User-interface visualisation

The idea behind these system steps is that in item 1, sparse text messages are monitored and retrieved, and some spam is filtered out at this early stage. Some approaches require manual input of keywords to monitor, others infer them automatically. For instance in [5] keywords were based on automated named entity recognition of regularly re-checked trending Twitter topics. On the other hand in [21] monitoring was performed on an unfiltered stream of tweets (or tweet firehose), using first story detection. Essentially tweets that are similar are grouped together, using a kind of nearest neighbor or clustering technique; however, to allow scalability for “Big Data”, Locality-Sensitive Hashing was used, and seems to be a very feasible solution to the problem. In steps 2, 3 and 4 further analysis substantially enriches the retrieved text messages by providing context through extracting location details, communicated emotions, and various features from the tone of the messages, which can be used to help organize the tweets. These three steps can also be understood to simply be steps in automated semantic enrichment [22]. The amount of semantic enrichment may vary from system to system, but for instance in the EMOTIVE project (section IV) the dominant focus was step 3, i.e. fine-grained emotion detection, although some geo-location and simple tone detection was also performed in EMOTIVE. The work in [23] provides further support for why this kind of information and the resulting situational awareness is very useful in security incidents monitoring. In item 5 the visualisation and possibly a faceted user-interface allow exploration of the enriched sparse text message data (items 2, 3 and 4). Thanks to item 2, map based visualisations are largely possible, with the possibility of overlaying various dimensions of information over a map interface. All system steps discussed must be adapted to (near) real-time monitoring scenarios, and require the use of efficient algorithms and data-structures to handle big-data, irrespective of parallelization architectures and scalable frameworks that could speed up the techniques in their own right.

In the rest of this section a brief overview of five systems for monitoring crisis and national security or emergency situations using Twitter is presented. The systems are relatively recent efforts and some of them are still being developed. CrisisTracker [16] organises Twitter messages into stories, categorises them by type of incident and displays them on a geographic map (geo-map), although much of its functionality is supported by crowdsourcing, i.e. human users annotating content, rather than an automated algorithm performing the tasks. It is currently being developed for uses in the crisis management and monitoring community, as part of a PhD thesis, and an experimental, live deployment run of the system was completed in September 2012, in which crisis mapping volunteers used the system to track incidents in the Syrian conflict. Crisees [17] is a prototype system for aggregating social media content for crises monitoring, and its basic capabilities are extraction and monitoring of multiple events from social media with a geo-map visualisation, with a basic (lexicon based) emotion (sentiment polarity) classification. SensePlace2 [7] is being designed for the crisis management community, and the system essentially allows searching by place (geo-map), theme and time. Detection of themes was aided by named entity recognition (NER) from tweet content. Location mentions from messages were also extracted to place a non geo-tagged tweet geographically. Swiftriver [18, 19] is Ushahidi’s semi-automatic linguistic text analysis sub-system (i.e. a large portion of the processing is conducted by human operators), but it is still in development (currently in beta) without any relevant published evaluation of the system, other than some initial material and open-sourced code (code available on GitHub – https://github.com/ushahidi/silcc; Swiftriver sites – http://swiftly.org/). SiLCC is the main module for performing some of the automated linguistic text analysis; however, the interface itself is mostly concerned with establishing veracity of messages and sources, with the possibility for Swiftriver to be integrated with an Ushahidi instance. Twitcident [20] is developed in co-operation with the Netherlands' emergency services, to provide contextual social media information for significant reported emergency incidents, which may have a public dimension and hence may be of interest for monitoring purposes, as people on Twitter may provide information on casualties, risks or damages. The system’s main focus is not necessarily a geo-map display (although available in a secondary view) but rather an event-specific display of Tweets that is searchable by a faceted user-interface, with further statistics on tweets matching the search criteria. Table I highlights the main features of the above mentioned systems, where the features of each system are grouped under the following headings; 1-extraction of messages, 2-geo-location, 3-emotion evaluation,
4-clustering and organisation of tweet messages and the 5-user-interface.

TABLE I. FIVE TWITTER MONITORING SYSTEMS (PUBLISHED IN ACADEMIC LITERATURE)

System: CrisisTracker [16]
Features:
-EXTRACTION: Manual search-terms and their subsequent monitoring, and automated breaking-news detection
-LOCATION: Geo-tagged Tweets only; consideration is given to users on the ground – i.e. on location users are highlighted by users (Crowdsourced).
-EMOTION: n/a
-ORGANISATION: Automated clustering of messages into Stories. Crowdsourced message / entity annotation by humans.
-UI: Geo-Map interface, faceted search, colour coded message / event types, and main-detail view of stories-messages.

System: Crisees [17]
Features:
-EXTRACTION: Manual search-terms entry and subsequent monitoring
-LOCATION: Geo-tagged – only lat/long tagged Tweets
-EMOTION: Lexicon based sentiment word matching (+)ve/(-)ve
-ORGANISATION: Simple list
-UI: Geo-Map interface, individual tweet message display – colour coded by emotion, and display of images.

System: SensePlace2 [7]
Features:
-EXTRACTION: Manual search-terms entry and subsequent monitoring
-LOCATION: Geo-tagged (lat/long) Tweets, and inferring location from user profiles or tweet messages themselves (using Geonames), and the TtT standard is mentioned, although not clear whether implemented.
-EMOTION: n/a
-ORGANISATION: Based on the principle of Time, Theme and Location.
-UI: A structured survey of professionals in crisis response was conducted to guide the UI design.

System: Swiftriver (Ushahidi) [18, 19]
Features:
-EXTRACTION: -
-LOCATION: Gazetteer lists of place-name locations
-EMOTION: n/a
-ORGANISATION: Trustworthy sources are determined by Trustworthiness ratings, and context is added by tagging.
-UI: Ushahidi’s UI consists of a map as main search interface of events, possible to select type of events.

System: Twitcident [20]
Features:
-EXTRACTION: Search-terms based on an incident profile (e.g. locations, persons) from emergency broadcasting services (incidents requiring public emergency services to take action). Incident profiles (search-terms) are continuously updated to adapt to topic changes.
-LOCATION: Geo-tagged (lat/long) and location gazetteer lists with NER (Named Entity Recognition) and stated Twitter profile location
-EMOTION: n/a
-ORGANISATION: Messages are classified into reports about casualties, damages or risks; Classification of tone, i.e. Question or Answers; Explicit message filtering is employed
-UI: faceted search interface

IV. THE EMOTIVE SYSTEM

EMOTIVE (Extracting the Meaning of Terse Information in a Geo-Visualisation of Emotion) is a project within which social media monitoring techniques targeted for the national security domain were developed, as part of a prototype system. The dominant focus of the work was to develop a novel ontology engineering and natural language processing (NLP) approach for fine-grained emotion extraction. A wide spectrum of explicit emotions are expressed on Twitter; however, existing approaches do not capture this richness of emotional expression [24, 25], hence our contribution in this area. Table I also highlights a gap in the emotion analysis capabilities of existing systems. Geo-location techniques were also explored in EMOTIVE, as only 1%-1.6% of tweets tend to contain explicit geo-coordinate meta-data, rendering map-based visualisations impractical. The techniques within the system were specifically targeted at the United Kingdom, since numerous instances of localised slang are contained in the ontology. With respect to the system features discussed so far EMOTIVE has tackled the primary features as follows. Extraction: Manual search-terms entry and subsequent monitoring, although the system provides keyword suggestions to the analyst. Location: Geo-tagged, Tweak the Tweet (TtT) standard recognition, location check-in services and location gazetteer lookup. Emotion: Fine-grained emotion recognition from sparse text, specifically; anger, confusion, disgust, fear, happiness, sadness, shame and surprise; and emotional expression strength score rating for each identified emotion. Organisation: Tweets are organised based on time, location and emotion, where semantically related emotions and synonyms are dealt with. UI: An interactive and dynamic geo-map based visualisation, overlaid with topic and emotion information, and employing colour-coded tag-clouds.

Due to space constraints only some EMOTIVE details are briefly presented below, and a full description, with example ontology terms and detailed evaluation and benchmarking results, is available in [26].

Keyword suggestions for monitoring were based on most frequently encountered hashtags in a set of tweets, and a uni-gram language model (trained on a subset of the 1-trillion Google n-gram corpus) segmented multi word hashtags into their constituent words for ease of viewing, e.g. #brutalviolentrioting → brutal, violent, rioting. Tweets are also indirectly filtered to only the ones that contain explicit expressions of emotion, which seems to be relatively effective in filtering out obvious spam and hijacked hashtag tweets.

Several approaches were explored for geo-location of non geo-tagged tweets. The GeoNames database of cities and towns in the UK was used as a gazetteer list to look up any place-name mentions. The TtT standard’s geo-location convention was found to be less helpful as it was practically not used in the Twitter dataset that was analysed (see next paragraph for information on the dataset used). However geo-inference from related geo-services such as Foursquare, similar to work in [27], was more frequent, yet the number of users in our dataset was not considerably large (i.e. less than 1%), and hence the approach did not yield a significant improvement.

The main aspect of EMOTIVE was the development of an ontology which semantically represents eight basic, fine-grained emotions. We have based our emotions on the widely accepted Ekman’s basic emotions. Others, e.g. [28, 29], suggested extending the set of emotions used by Ekman, for various reasons. In our ontology we introduced confusion and shame in addition to Ekman’s six emotions, as we found shame to be a highly prevalent emotion on Twitter, and confusion seemed to be a relevant emotional indicator of situational awareness, which as argued in [8] is important during terrorist incidents. The EMOTIVE ontology contains well over 300 emotional terms. In addition to emotions, it covers negations, intensifiers, conjunctions, interjections, and contains information on the perceived strength (also known as activation level) of individual emotions, whether individual terms or phrases are slang or used in standard English and their associated POS (Parts-of-Speech) tags, where this aids to resolve ambiguity. More details on our custom POS tagger and the emotions extraction evaluation and benchmarking is
available in [26]. The study of language containing emotional expressions, required for the construction of the ontology, was performed by an English language and literature PhD level research associate (RA), with training in linguistics and discourse analysis, during a three month period. In order to develop the ontology the RA sifted through around 600MB of cleaned Tweets on 63 different UK-specific search-terms / events based Twitter datasets (collected using Twitter Search API). This manual analysis focused on identifying commonly used explicit expressions of emotion.

Finally the organisation of tweets was performed mostly at the UI stage, as is the case with other systems [20] where tweets are viewed in a faceted interface. Similar and extreme emotions, and tweets from specific locations are grouped together in a faceted UI. These semantically enriched tweets are also displayed as points on an interactive geo-map.

V. CONCLUSION

Several systems that show substantial promise in the security and crisis monitoring area were discussed and an attempt at identifying significant features shared by some of these tools was made. It was shown that existing systems and prototypes have not focused on emotion analysis. Given the gap in sentiment analysis capabilities, this paper presented the EMOTIVE system as a national security and emergency monitoring tool for emotional public response analysis to help automate tedious and inefficient processes in current monitoring tasks, especially fine-grained emotion monitoring.

REFERENCES

[1] S. Kumar, G. Barbier, M. A. Abbasi, H. Liu, "TweetTracker: An Analysis Tool for Humanitarian and Disaster Relief," Proceedings of the Fifth International Conference on Weblogs and Social, 2011.
[2] A. Tapia, K. Bajpai, B. Jansen, J. Yen and L. Giles, "Seeking the trustworthy tweet: Can microblogged data fit the information needs of disaster response and humanitarian relief organizations," Proceedings of the 8th International ISCRAM Conference, 2011.
[3] A. Tollinen, J. Jarvinen and H. Karjaluoto, "Social Media Monitoring in the industrial Business to Business Sector," World Journal 2 (4), pp. 65-76, 2012.
[4] D. Preotiuc-Pietro, S. Samangooei, T. Cohn, N. Gibbins and M. Niranjan, "Trendminer: An Architecture for Real Time Analysis of Social Media Text," Proceedings of the Sixth International AAAI Conference on Weblogs and Social Media, 2012.
[5] M. Cheong and V. C. S. Lee, "A microblogging-based approach to terrorism informatics: Exploration and chronicling civilian sentiment and response to terrorism events via Twitter," Journal of Information Systems Frontiers – Springer 13 (1), pp. 45-59, 2011.
[6] K. Glass and R. Colbaugh, "Estimating the sentiment of social media content for security informatics applications," Security Informatics Vol. 1(1), pp. 1-16, 2012.
[7] A. MacEachren, A. Jaiswal, A. Robinson, S. Pezanowski, A. Savelyev, P. Mitra, "SensePlace2: GeoTwitter analytics support for situational awareness," Proceedings of the Visual Analytics Science and Technology (VAST) IEEE Conference, 2011.
[8] O. Oh, M. Agrawal and H. Rao, "Information control and terrorism: Tracking the Mumbai terrorist attack through twitter," Information Systems Frontiers Vol. 13, pp. 33-43, 2011.
[9] C. Beaumont, "Mumbai attacks: Twitter and Flickr used to break news," The Daily Telegraph, 27th Nov. 2008.
[10] P. Cashmore, "Mashable: Jakarta bombings - Twitter user first on the scene," Available from http://mashable.com/2009/07/16/jakarta-bombings-twitter/, 16th Jul. 2009.
[11] T. Sakaki, M. Okazaki and Y. Matsuo, "Earthquake shakes Twitter users: real-time event detection by social sensors," Proceedings of the 19th international conference on World Wide Web, 2010.
[12] B. O’Connor, R. Balasubramanyan, B. Routledge and N. Smith, "From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series," Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media, 2010.
[13] P. Andre, M. Bernstein and K. Luther, "Who gives a tweet?: evaluating microblog content value," Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work, 2012.
[14] F. Johansson, J. Brynielsson and M. Quijano, "Estimating Citizen Alertness in Crises Using Social Media Monitoring and Analysis," Proceedings of the European Intelligence and Security Informatics Conference (EISIC), 2012.
[15] A. Tollinen, J. Jarvinen and H. Karjaluoto, "Social Media Monitoring in the industrial Business to Business Sector," World Journal, 2 (4), pp. 65-76, 2012.
[16] J. Rogstadius, V. Kostakos, J. Laredo and M. Vukovic, "A real-time social media aggregation tool: Reflections from five large-scale events," Proceedings of the European Conference on Computer-Supported Cooperative Work (ECSCW), 2011.
[17] D. Maxwell, S. Raue, L. Azzopardi, C. Johnson and S. Oates, "Crisees: Real-time monitoring of social media streams to support crisis management," Advances in Information Retrieval, 2012.
[18] Ushahidi, "Explaining Swift River," Available from http://blog.ushahidi.com/index.php/2009/04/09/explaining-swift-river/, Last Accessed 1st October 2012, 2009.
[19] Ushahidi, "About: Introduction and History," Available from http://www.ushahidi.com/about, Last Accessed 1st August 2012, 2012.
[20] F. Abel, C. Hauff, G. Houben, R. Stronkman and K. Tao, "Semantics + Filtering + Search = Twitcident Exploring Information in Social Web Streams," Proceedings of the 23rd ACM International Conference on Hypertext and Social Media, 2012.
[21] S. Petrović, M. Osborne and V. Lavrenko, "Streaming first story detection with application to Twitter," Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2010.
[22] S. Bird, E. Klein and E. Loper, "Natural Language Processing with Python," O'Reilly Publishers, 2009.
[23] H. Chen, Y. Zhou, E. Reid and C. Larson, "Special issue on terrorism informatics," Information Systems Frontiers, 2010.
[24] M. Thelwall, K. Buckley and G. Paltoglou, "Sentiment Strength Detection for the Social Web," Journal of the American Society for Information Science and Technology 63, pp. 163-173, 2012.
[25] M. Choudhury and S. Counts, "The Nature of Emotional Expression in Social Media: Measurement, Inference and Utility," Technical Report: Microsoft, 2012.
[26] M. Sykora, T. W. Jackson, A. O’Brien and S. Elayan, "EMOTIVE Ontology: Extracting fine-grained emotions from terse, informal messages," IADIS Intelligent Systems and Agents Conference, 2013.
[27] Y. Ikawa, M. Enoki and M. Tatsubori, "Location inference using microblog messages," Proceedings of the 21st international conference companion on World Wide Web, pp. 687-690, 2012.
[28] R. Plutchik, "Emotion: A Psychoevolutionary Synthesis," Longman Higher Education, 1980.
[29] J. Sabini and M. Silver, "Ekman's basic emotions: Why not love and jealousy?," Cognition and Emotion 19 (5), pp. 693-712, 2005.
[30] K. Gimpel, N. Schneider, B. O'Connor, D. Das, D. Mills, J. Eisenstein, "Part-of-speech tagging for twitter: Annotation, features, and experiments," Technical Report, 2010.
[31] A. Ritter, S. Clark, Mausam and O. Etzioni, "Named Entity Recognition in Tweets: An Experimental Study," Proceedings of Conference on Empirical Methods in NLP, 2011.
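
As an illustration of the hashtag segmentation step described in Section IV, where a uni-gram language model splits a multi word hashtag such as #brutalviolentrioting into its constituent words, the following is a minimal sketch and not the EMOTIVE implementation. It assumes a tiny hand-made table of word counts (hypothetical values standing in for counts from a large n-gram corpus), a length-based penalty for unseen strings (also an assumption), and a standard dynamic-programming search for the most probable split. The names UNIGRAM_COUNTS, word_logprob and segment_hashtag are introduced here for illustration only.

```python
import math

# Hypothetical toy counts; a real system would load counts from a large corpus.
UNIGRAM_COUNTS = {
    "brutal": 120, "violent": 300, "rioting": 80,
    "riot": 150, "lent": 40, "in": 4000, "a": 5000,
}
TOTAL = sum(UNIGRAM_COUNTS.values())
MAX_WORD_LEN = 20  # longest candidate word considered (assumption)


def word_logprob(word):
    """Log-probability of a word under the unigram model.

    Unseen strings get a penalty that grows with their length, so that
    long unknown chunks are strongly disfavoured (an assumed smoothing rule).
    """
    count = UNIGRAM_COUNTS.get(word)
    if count is not None:
        return math.log(count / TOTAL)
    return math.log(1.0 / (TOTAL * 10 ** len(word)))


def segment_hashtag(hashtag):
    """Split a hashtag into the most probable sequence of words."""
    text = hashtag.lstrip("#").lower()
    n = len(text)
    # best[i] holds (score, start) for the best segmentation of text[:i],
    # where start is the index at which the last word begins.
    best = [(0.0, 0)] + [(-math.inf, 0)] * n
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            score = best[start][0] + word_logprob(text[start:end])
            if score > best[end][0]:
                best[end] = (score, start)
    # Walk the back-pointers from the end of the string to recover the words.
    words = []
    end = n
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return words[::-1]


if __name__ == "__main__":
    print(segment_hashtag("#brutalviolentrioting"))
```

With the toy counts above, the example hashtag from Section IV is split into brutal, violent, rioting, because that split scores higher than alternatives such as riot + in + g. With real corpus counts the same search applies unchanged; only the count table and the penalty for unseen words would need tuning.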