
Emotion classification of social media posts for estimating people’s reactions to communicated alert messages during crises

Abstract

One of the key factors influencing how people react to a crisis is their digital or non-digital social network, and the information they receive through this network. Publicly available online social media sites make it possible for crisis management organizations to use some of these expressions as input for their decision-making. We describe a methodology for collecting a large number of relevant tweets and annotating them with emotional labels. This methodology has been used for creating a training dataset consisting of manually annotated tweets from the Sandy hurricane. These tweets have been utilized for building machine learning classifiers able to automatically classify new tweets. Results show that a support vector machine achieves the best results with about 60% accuracy on the multi-classification problem. This classifier has been used as a basis for constructing a decision support tool where emotional trends are visualized. To evaluate the tool, it has been successfully integrated with a pan-European alerting system, and demonstrated as part of a crisis management concept during a public event involving relevant stakeholders.

Introduction

During crises, enormous amounts of user generated content, including tweets, blog posts, and forum messages, are created, as documented in a number of recent publications [1]–[6]. Undoubtedly, large portions of this user generated content mainly consist of noise with limited or no use to crisis responders, but some of the available information can also be used for detecting that an emergency event has taken place [1], understanding the scope of a crisis, or seeking out details about a crisis [4]. That is, parts of the data can be used for increasing the tactical situational awareness [7]. Unfortunately, the flood of information that is broadcast makes it infeasible for humans to effectively extract information from it, organize it, make sense of it, and act on it without appropriate computer support [6]. For this reason, several researchers and practitioners are interested in developing systems for social media monitoring and analysis to be used in crises. One example is the American Red Cross Digital Operations Center, opened in March 2012 [8]. Another example is the European Union security research project Alert4All, having as its aim to improve the authorities' management of alert and communication towards the population during crises [9]–[11]. In order to accomplish this, screening of social media is deemed important for becoming aware of how communicated alert messages are perceived by the population [12]. In this paper, we describe our methodology for collecting crisis-related tweets and tagging them with the help of a number of annotators. This has been done for tweets sent during the Sandy hurricane, where the annotators have marked the emotional content as one of the classes positive (e.g., happiness), anger, fear, or other (including non-emotional content as well as emotions not belonging to any of the other classes).
The tweets for which we have obtained a good inter-annotator agreement have been utilized in experiments with supervised learning algorithms for creating classifiers able to classify new tweets as belonging to any of the classes of interest. By comparing the results to those reached when using a rule-based classifier, we show that the used machine learning algorithms have been able to generalize from the training data and can be used for classification of new, previously unseen, crisis tweets. Further, the best classifier has been integrated with, and constitutes an important part of, the Alert4All proof-of-concept alerting system. In the presence of relevant actors representing politics, industry, end users, and research communities, this system was successfully demonstrated as a cohesive whole during a public event. As part of this demonstration, the classification of social media posts was used to visualize emotional trend statistics for the purpose of demonstrating the notion of using social media input for informing crisis management efforts. Overall, the concept was well received, considered novel, and makes it possible for crisis management organizations to use a new type of input for their decision-making.

The rest of this paper is organized as follows. In the next section we deliver an overview of related work. A methodology section then follows, where we describe how crisis-related tweets have been collected, selected with automated processing, and tagged manually by a number of annotators in order to create a training set. We also describe how a separate test set has been constructed. After that, we present experimental results achieved for various classifiers and parameter settings. Details regarding the design and implementation of a decision support tool making use of the developed support vector machine classifier are then elaborated on in a separate section. The results and their implications are then discussed in more detail in a separate section before the paper is concluded in the last section.

Related work

The field of sentiment analysis has attracted much research during the last decade. One reason is probably the increasing amount of opinion-rich text resources made available due to the development of social media, giving researchers and companies access to the opinions of ordinary people [13]. Another important reason for the increased interest in sentiment analysis is the advances that have been made within the fields of natural language processing and machine learning. A survey of various techniques suggested for opinion mining and sentiment analysis is presented in [14]. A seminal work on the application of machine learning for sentiment analysis is the paper by Pang et al. [15], showing that good performance (approximately 80% accuracy for a well-balanced dataset) can be achieved for the problem of classifying movie reviews as either positive or negative.

Although interesting, the classification of movie reviews as positive or negative has limited impact on the security domain. However, the monitoring of social media to spot emerging trends and to assess public opinion is also of importance to intelligence and security analysts, as demonstrated in [16]. Microblogs such as Twitter pose a particular challenge for sentiment analysis techniques since messages are short (the maximum size of a tweet is 140 characters) and may contain sarcasm and slang. The use of machine learning techniques on Twitter data to discriminate between positive and negative tweets is evaluated in [17],[18], suggesting that classification accuracies of 60–80% can be obtained. Social media monitoring techniques for collecting large amounts of tweets during crises and classifying them with machine learning algorithms have become a popular topic within the crisis response and management domain. The use of natural language processing and machine learning techniques to extract situation awareness from Twitter messages is suggested in [4] (automatic extraction of tweets containing information about infrastructure status), [5] (classification of tweets as positive or negative), and [6] (classification of tweets as contributing to situational awareness or not).

The main difference between our work and the papers mentioned above is that most of the previous work focuses on sentiment analysis (classifying crisis tweets as positive or negative), whilst we focus on affect analysis or emotion recognition [19], i.e., classifying crisis tweets as belonging to an emotional class. This problem is even more challenging since it is a multinomial classification problem rather than a binary classification problem. We are not aware of any previous attempts to use machine learning for emotion recognition of crisis-related tweets. The use of affect analysis techniques in the security domain has, however, been proposed before, such as the affect analysis of extremist web forums and blogs presented in [20],[21].

The work presented in this article is the result of a long-term research effort where related studies have been presented along the way. A first visionary paper [10] discusses and presents the idea of using social media monitoring for coming into dialogue with the population. The overall idea is for emergency management organizations to follow what people publish and adjust their information strategies in a way that matches the expectations and needs of the public. A systematic literature review and a parallel interview study were then undertaken [11], where the possibility to use social media analysis for informing crisis communication was deemed promising, and important design issues to take into account were highlighted. Based on this insight, we outlined a more detailed design concept for how a screening tool could potentially be applied for the purpose of increasing situational awareness during emergencies [12]. This paper identifies data acquisition and data analysis to be two important parts of such a tool. Then, in parallel to presenting the initial results with regard to tweet classification [22], crisis management stakeholders were involved in a series of user-centered activities in order to understand the user requirements and further inform the design of a social media screening tool to be used for crisis management [23]. It became clear that within crisis management it is more important to be able to distinguish between negative emotions such as fear and anger than to be able to differentiate between different positive emotions. Also, a further understanding of crisis management work procedures was obtained, which made it clear that a social media screening tool needs to be focused on trend analysis since, in crisis management, appropriate actions are to be taken with the purpose of improving some aspect of the crisis state in order to bring the situation into a better state.

Methodology

In the research project Alert4All we have explored the need for automatically finding out whether a tweet (or other kind of user generated content) is to be classified as containing emotional content [12]. Through a series of user-centered activities involving crisis management stakeholders [23], the classes of interest for command and control have been identified as positive, anger, fear, and other, where the first class contains positive emotions such as happiness, and the last class contains emotions other than the ones already mentioned, as well as neutral or non-subjective content. In the following, we describe the methodology used for collecting crisis-related tweets, selecting a relevant subset of those, and letting individual annotators tag them in order to be used for machine learning purposes.

Collecting tweets

The first step in our methodology was to collect a large set of crisis-related tweets. For this purpose we have used the Python package tweetstream to retrieve tweets related to the Sandy hurricane, hitting large parts of the Caribbean and the Mid-Atlantic and Northeastern United States during October 2012. The tweetstream package fetches tweets from Twitter's streaming API in real-time. It should be noted that the streaming API only gives access to a random sample of the total volume of tweets submitted at any given moment, but still this allowed us to collect approximately six million tweets related to Sandy during October 29 to November 1, using the search terms sandy, hurricane, and #sandy. After automatic removal of non-English tweets, retweets, and duplicated tweets, approximately 2.3 million tweets remained, as exemplified in Table 1. An average tweet in the dataset contained 14.7 words in total and 0.0786 "emotional words" according to the lists of identified keywords as will be described in the next subsection.
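The filtering step described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the field names (`lang`, `text`) mirror the Twitter streaming API payload, and the retweet and duplicate heuristics are our own simplifying assumptions.

```python
def filter_tweets(tweets):
    """Keep English, non-retweet, non-duplicate tweets.

    `tweets` is a list of dicts with assumed keys 'lang' and 'text';
    real streaming API payloads carry many more fields.
    """
    seen = set()
    kept = []
    for t in tweets:
        if t.get("lang") != "en":        # drop non-English tweets
            continue
        text = t.get("text", "").strip()
        if text.startswith("RT @"):      # crude retweet heuristic
            continue
        key = text.lower()
        if key in seen:                  # drop exact duplicates
            continue
        seen.add(key)
        kept.append(t)
    return kept

sample = [
    {"lang": "en", "text": "Hurricane Sandy is scary"},
    {"lang": "es", "text": "huracán Sandy"},
    {"lang": "en", "text": "RT @user: Hurricane Sandy is scary"},
    {"lang": "en", "text": "Hurricane Sandy is scary"},  # duplicate
]
print(len(filter_tweets(sample)))  # -> 1
```

In practice a language-identification model rather than a metadata field, and near-duplicate detection rather than exact matching, would likely be needed at this scale.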

Table 1 Sample tweets obtained in late 2012 during the Sandy hurricane along with the resulting emotion class output from the developed emotion classifier

Annotation process

After an initial manual review of the collected posts, we quickly discovered that a large proportion of the tweets, not unexpectedly, belonged to the category other. Since the objective was to create a classifier able to discriminate between the different classes, we needed a balanced training dataset, or at least a large number of samples for each class. This caused a problem since random sampling of the collected tweets most probably would result in almost only tweets belonging to the class other. Although this in theory could be solved by sampling a large enough set of tweets to annotate, there is a limit to how many tweets can be tagged manually in a reasonable time (after all, this is the key motivation for learning such classifiers in the first place). To overcome this problem, we decided to use manual inspection to identify a small set of keywords which were likely to indicate emotional content belonging to any of the emotion classes positive, fear, or anger. The lists of identified keywords look as follows:

anger: anger, angry, bitch, fuck, infuriated, hate, mad,

fear: fear, scared,

positive: :), :-), =), :D, :-D, =D, joy, happier, positive, relieved.

These lists were automatically extended by finding synonyms using WordNet [24]. Some of the resulting words were then removed from the lists as they were considered poor indicators of emotions during a hurricane. An example of a word that was removed is "stormy", which was more likely to describe hurricane Sandy than to express anger. By using the words in the created lists as search terms, we sampled 1000 tweets which according to our simple rules were likely to correspond to "positive" emotions. The same was done for "anger" and "fear", while a random sampling strategy was used to select the 1000 tweets for "other". In this way we constructed four data files containing 1000 tweets each. The way we selected the tweets may have an impact on the end results since there is a risk that such a biased selection process will lead to a classifier that is only able to learn the rules used to select the tweets in the first place. We were aware of such a potential risk, but could not identify any other way to come up with enough tweets corresponding to the "positive", "anger", and "fear" tags. In order to test the generalizability of the resulting classifiers, we have in the experiments compared the results to a baseline, implemented as a rule-based algorithm based on the keywords used to select the fitting tweets. The experiments are further described in the next section.
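The keyword-based sampling described above can be sketched as follows. This is a hypothetical re-implementation for illustration: the matching logic (first list that fires wins, whitespace tokenization, case-insensitive comparison) is our assumption, not a detail taken from the paper, and the WordNet synonym expansion is omitted here.

```python
# Keyword lists from the paper, lowercased for case-insensitive matching.
ANGER = {"anger", "angry", "bitch", "fuck", "infuriated", "hate", "mad"}
FEAR = {"fear", "scared"}
POSITIVE = {":)", ":-)", "=)", ":d", ":-d", "=d",
            "joy", "happier", "positive", "relieved"}

def keyword_class(text):
    """Return the first emotion class whose keyword list matches, else None."""
    tokens = text.lower().split()
    for label, words in (("anger", ANGER), ("fear", FEAR), ("positive", POSITIVE)):
        if any(tok in words for tok in tokens):
            return label
    return None

def sample_candidates(tweets, label, n=1000):
    """Collect up to n tweets whose keyword match suggests the given label."""
    return [t for t in tweets if keyword_class(t) == label][:n]

print(keyword_class("I hate this storm"))   # -> anger
print(keyword_class("so scared of Sandy"))  # -> fear
print(keyword_class("just rain here"))      # -> None
```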

Once the files containing tweets had been constructed, each file was sent by e-mail to three independent annotators, i.e., all annotators were assigned one file (containing 1000 tweets) each. All annotators were previously familiar with the Alert4All project (either through active work within the project or through acting as advisory board members) and received the instructions which can be found in the appendix. It should be noted that far from all the tweets in a file were tagged as belonging to the corresponding emotion by the annotators. In fact, the majority of the tweets were tagged as other also in the "anger", "fear", and "positive" files. In order to get a feeling for the inter-annotator agreement, we have calculated the percentages of tweets for which a majority of the annotators have classified a tweet in the same way (majority agreement) and where all agree (full agreement), as shown in Table 2. As can be seen, the majority agreement is consistently reasonably high. On the other hand, it is seldom that all three annotators agree on the same classification. For a tweet to become part of the resulting training set, we require that there has been a majority agreement regarding how it should be tagged. Now, ignoring which class a tweet was "supposed" to end up in given the used keywords (i.e., the used categories) and instead looking at the emotion classes tweets actually ended up in after the annotation, we received the distribution shown in Table 3. Since we wanted to have a training dataset with equally many samples for each class, we decided to balance the classes, resulting in 461 tweet samples for each class.
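The two agreement measures used above can be computed as follows. A minimal sketch under the assumption of exactly three annotators per tweet; the paper does not specify its computation code.

```python
from collections import Counter

def agreement(labels):
    """Given one tweet's labels from three annotators, return
    (majority_label_or_None, full_agreement_flag)."""
    label, count = Counter(labels).most_common(1)[0]
    return (label if count >= 2 else None), count == len(labels)

def agreement_rates(annotations):
    """Fraction of tweets with majority agreement and with full agreement."""
    n = len(annotations)
    maj = sum(agreement(a)[0] is not None for a in annotations)
    full = sum(agreement(a)[1] for a in annotations)
    return maj / n, full / n

data = [("fear", "fear", "other"),
        ("anger", "anger", "anger"),
        ("fear", "anger", "other")]
# Majority agreement on 2 of 3 tweets, full agreement on 1 of 3.
print(agreement_rates(data))
```

Only tweets with a defined majority label would then be retained for the training set, with classes balanced afterwards.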

Table 2 Inter-annotator agreement for the various categories
Table 3 Number of annotated tweets per class based on majority agreement

Creating a separate test dataset

While it is popular in the machine learning community to make use of n-fold cross validation to allow for training as well as testing on all the available data, we have decided to create a separate test set in this study. The reason for this is the way the training data has been generated. If the used strategy to select tweets based on keywords would impact the annotated data, and thereby also the learned classifiers, too much, this could result in classifiers that perform well on the annotated data but generalize poorly to "real" data without the bias. Hence, our test data has been generated by letting a new annotator (not part of the first annotation phase) tag tweets from the originally collected Twitter dataset until sufficiently many tweets had been discovered for each emotion. Since it, as a rule of thumb, is common to use 90% of the available data for training and 10% for testing, we continued the tagging until we got 54 tweets in each class (after balancing the set), corresponding to nearly 10% of the total amount of data used for training in the experiment.

Experiments

There are many parameters related to affect analysis that impact the feature set. This section describes the parameters that have been varied during the experiments, and discusses how the parameters affected the achieved experimental results.

Classifiers

We have experimented with two standard machine learning algorithms for classification: Naïve Bayes (NB) and Support Vector Machine (SVM) classification. As available in Weka [25], the multinomial NB classifier [26] was used for the NB experiments, and the sequential minimal optimization algorithm [27] was used for training a linear kernel SVM. Although many additional features such as part-of-speech tags could have been used, we have limited the experiments to a simple bag-of-words representation. Initial experimentation showed that feature presence gave better results than feature frequency, hence only feature presence has been used. Before the training data was used, the tweets were transformed into lower case. Several different parameters have been varied throughout the experiments:

n-gram size: 1 (unigram)/2 (unigram + bigram),

stemming: yes/no,

stop words: yes/no,

minimum number of occurrences: 2/3/4,

information gain (in %): 25/50/75/100,

negation impact (number of words): 0/1/2,

threshold τ: 0.5/0.6/0.7.

If a unigram representation is used, separate words are utilized as features, whereas if bigrams are used, pairs of words are used as features. Stemming refers to the process in which inflected or derived words are reduced to their base form (e.g., fishing → fish). As stop words we have used a list of commonly occurring function words, so if a word in the tweet matches such a stop word it is removed (and is consequently not used as a feature). The minimum number of occurrences refers to how many times a term has to occur in the training data in order to be used as a feature. Information gain refers to a method used for feature selection, where the basic idea is to choose features that reveal the most information about the classes. When, e.g., setting the information gain parameter to 50, the fifty percent "most informative features" are kept, reducing the size of the resulting model. Negation impact refers to the situation where a negation (such as "not") is recognized, and the used algorithm modifies the words following the negation by adding the prefix "NOT_" to them. The specified negation impact determines how many words after a negation shall be affected by the negation (where 0 means that no negation handling is used). Finally, the threshold τ has been used for discriminating between emotional content versus other content, as described below.
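The presence-based bag-of-words extraction with negation marking can be sketched as follows. The experiments were run in Weka, so this Python version is only illustrative; the tokenizer, the stop-word subset, and the negation word list are our assumptions.

```python
import re

STOP_WORDS = {"the", "a", "an", "is", "to", "of"}   # illustrative subset only
NEGATIONS = {"not", "no", "never"}                  # assumed negation cues

def tokenize(text):
    """Lowercase tokens; the pattern also keeps simple emoticon characters."""
    return re.findall(r"[a-z':=)(-]+", text.lower())

def features(text, negation_scope=2, use_stop_words=True):
    """Binary bag-of-words (presence) features with NOT_ prefixing
    applied to up to `negation_scope` words after a negation cue."""
    feats = set()
    negate = 0
    for tok in tokenize(text):
        if tok in NEGATIONS:
            negate = negation_scope
            continue
        if use_stop_words and tok in STOP_WORDS:
            continue
        feats.add(("NOT_" + tok) if negate > 0 else tok)
        if negate > 0:
            negate -= 1
    return feats

# 'scary' falls inside the negation scope and becomes 'NOT_scary'.
print(sorted(features("The storm is not scary at all")))
```

Bigram features, stemming, and information-gain selection would be applied on top of this representation.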

In the learning step we used the tweets tagged as positive, anger, and fear as training data; this resulted in classifiers that learned to discriminate between these three classes. For the actual classification of new tweets we then let the machine learning classifiers estimate the probabilities P(anger | f1,…,fn), P(fear | f1,…,fn), and P(positive | f1,…,fn), where f1,…,fn refer to the used feature vector extracted from the tweet we want to classify. If the estimated probability for the most probable class is greater than a pre-specified threshold τ, we return the label of the most probable class as the output of the classifier. Otherwise other is returned as the output from the classifier. The rationale behind this is that the content of tweets to be classified as other cannot be known in advance (due to the spread of what this class should contain). Instead, we learn what is considered to be representative for anger, fear, and positive, and interpret low posterior probabilities for all three classes as other being the most likely class.
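The thresholding rule described above amounts to a few lines of code. A minimal sketch, assuming the underlying classifier exposes calibrated posterior estimates for the three trained classes:

```python
def classify_with_threshold(probs, tau=0.6):
    """probs: dict mapping 'anger'/'fear'/'positive' to posterior estimates.
    Return the most probable label if it exceeds tau, else 'other'."""
    label = max(probs, key=probs.get)
    return label if probs[label] > tau else "other"

print(classify_with_threshold({"anger": 0.2, "fear": 0.7, "positive": 0.1}))   # -> fear
print(classify_with_threshold({"anger": 0.4, "fear": 0.35, "positive": 0.25})) # -> other
```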

Experimental results

The best results achieved when evaluating the learned classifiers on the used test set are shown in Figure 1, with the used parameter settings shown in Table 4. The results are also compared to two baseline algorithms: 1) a naïve algorithm that picks a class at random (since all the classes are equally likely in a balanced dataset, this corresponds to a simple majority classifier), and 2) a somewhat more complex rule-based classifier constructed from the heuristics (keywords) used when selecting the tweets to be annotated manually in the training data generation phase. The results suggest that both the NB and SVM classifiers outperform the baseline algorithms, and that the SVM (59.7%) performs somewhat better than the NB classifier (56.5%). For a more detailed assessment, see Tables 5 and 6, where the confusion matrices show how the respective classifiers perform. The use of stemming, stop words, minimum number of occurrences, and information gain according to Table 4 has consistently been providing better results, while the best choices of n-gram size, negation impact, and threshold τ have varied more in the experiments.
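The accuracy figures and confusion matrices reported here follow the standard definitions, which can be sketched as follows (an illustrative re-implementation, not the evaluation code used in the study):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions matching the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def confusion_matrix(y_true, y_pred, labels):
    """matrix[true_label][predicted_label] = count."""
    m = {t: {p: 0 for p in labels} for t in labels}
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

labels = ["positive", "anger", "fear", "other"]
y_true = ["fear", "fear", "anger", "other"]
y_pred = ["fear", "other", "anger", "other"]
print(accuracy(y_true, y_pred))                          # -> 0.75
print(confusion_matrix(y_true, y_pred, labels)["fear"])  # one misclassified as 'other'
```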

Figure 1

Achieved accuracy for the various classifiers. Blue color shows the results on the full dataset, red color shows the results when the other category is removed. The rules used within the rule-based classifier assume that all categories are present, hence no results have been obtained on the simplified problem for this classifier.

Table 4 Used parameter settings for the best performing classifiers
Table 5 Confusion matrix for the best performing SVM classifier
Table 6 Confusion matrix for the best performing NB classifier

For comparison, Table 7 contains the confusion matrix for the baseline classifier, i.e., the rule-based classifier which selects the class based on possible emotion words found within a tweet. As can be seen in Table 7, the classifications of emotions (i.e., "anger", "fear", or "positive") are often correct, but a large number of the tweets tend to erroneously fall under the other category. Now looking back at the machine learning confusion matrices according to Tables 5 and 6, we see that these classifiers do not exhibit a similar behavior as the rule-based classifier with regard to the other category, but instead show more evenly distributed errors. Hence, we can see that the machine learning classifiers have indeed learned emotional patterns that cannot be distinguished by simply applying rules based on a pre-defined list of emotion words.

Table 7 Confusion matrix for the rule-based baseline classifier which selects class based on the occurrence of certain words

In addition to evaluating the classifiers' accuracy on the original test set, we have also tested what happens if the task is simplified so that the classifiers only have to distinguish between the emotional classes positive, fear, and anger (i.e., it is assumed that the other class is not relevant). This secondary task can be of interest within a system where a classifier distinguishing between emotional and non-emotional or subjective and non-subjective content has already been applied. As can be seen in Figure 1, the SVM gets it right in three out of four classifications (75.9%) on this task, whereas the accuracy of the NB classifier reaches 69.1%. See Tables 8 and 9 for the corresponding confusion matrices.

Table 8 Confusion matrix for the SVM classifier when the task has been simplified so that the other class is not relevant
Table 9 Confusion matrix for the NB classifier on the simplified task

Design and implementation of a tool for visualizing emotion trends

Based on a series of design workshops [23], the developed emotion classifier has been used as a basis for the design and implementation of a decision support system entitled the "screening of new media" tool, where emotion trends are visualized. For evaluating the tool, it has been integrated with the Alert4All system, which is an implemented prototype of a future pan-European public alerting concept. As shown during the final demonstration of the Alert4All system and through the collocated user-centered activities, the social media analysis component of Alert4All provides additional benefit for command and control personnel in terms of providing immediate feedback regarding the development of a crisis in general and regarding the reception of crisis alerts in particular.

Figure 2 shows the developed tool, which has been implemented using HTML5 and JavaScript components. The central component of the tool is the graph which is shown at the upper right in Figure 2 and on its own in Figure 3. Here, a number of interactive chart components are used in order to visualize how the emotional content in the acquired dataset changes as a function of time. Through interacting with this graph, the user has the possibility to interact with the underlying dataset, and thereby obtain a further understanding of how the emotions expressed on social media vary as time passes.

Figure 2

The figure displays the Alert4All "screening of new media" tool visualizing a fictive scenario which was used during the final demonstration in Munich during autumn 2013. As part of the chart, one can see the different messages that have been sent during the development of the crisis. Also, the content of one of these messages is shown due to the mouse pointer being positioned at this location.

Figure 3

A snippet from the Alert4All "screening of new media" tool displaying the relative probability distribution of emotions within a dataset gathered during the Fukushima disaster. The snippet shows a close-up of the graph component seen at the upper right in Figure 2, but now showing the relative distribution of emotions in a real scenario.

At the bottom of the tool, the user has the possibility to drill down into the underlying dataset and view the actual posts in the database. From a command and control perspective, it is important to remember that these individual messages cannot and should not be used for inference regarding the whole dataset, but should be used solely for generating new hypotheses that need to be tested further by, e.g., experimenting with the filters in order to obtain sound statistical metrics. Also to be noted, the posts are color coded so that it is easy to see which emotion a certain post has been classified as. However, the classification is not always correct, and therefore the user has the possibility to manually reclassify a post and, at a later stage, use the manually classified posts as a basis for improving the classifier.

The GUI provides a number of ways to apply filters to the underlying dataset and thereby choose which social media posts are to be visualized. The different visualizations are always kept consistent with these filters and with all other settings, i.e., the different parts of the graphical user interface provide different means to visualize one and the same dataset. As can be seen in Figure 2, there are three main components for applying the filters: a time-line for filtering the time interval to be used, a tag cloud for filtering based on terms, and the grey box located at the upper left that provides means to filter based on keywords, emotion classes, and data sources.

An important part of the GUI, and a result of the earlier-mentioned design workshops, is the possibility to switch between the absolute probability distribution as depicted in Figure 2 and the relative probability distribution as depicted in Figure 3. Most often, it is important to visualize both the relative graph and the absolute graph, since it is easier to see the trend using the relative graph whilst the absolute graph is still needed in order to visualize, e.g., trends regarding how the total volume of posts varies.
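The absolute and relative views are two presentations of the same per-bin counts. A minimal sketch of that aggregation, assuming each classified post carries a timestamp (here simplified to hours) and an emotion label:

```python
from collections import Counter, defaultdict

def emotion_trends(posts, bin_hours=6):
    """posts: (timestamp_in_hours, label) pairs. Return absolute counts
    per time bin and the corresponding relative shares."""
    absolute = defaultdict(Counter)
    for ts, label in posts:
        absolute[int(ts // bin_hours)][label] += 1
    relative = {b: {lab: n / sum(c.values()) for lab, n in c.items()}
                for b, c in absolute.items()}
    return dict(absolute), relative

posts = [(1, "fear"), (2, "fear"), (3, "positive"), (7, "anger")]
absolute, relative = emotion_trends(posts)
print(absolute[0]["fear"])  # -> 2
print(relative[0]["fear"])  # share of 'fear' in the first bin (2 of 3)
```

The relative view makes shifts in emotional composition visible even when the overall posting volume changes sharply, which is why both graphs are offered.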

Discussion

The obtained results show that the machine learning classifiers perform significantly better than chance and the rule-based algorithm that has been used as a baseline. Especially, the comparison to the rule-based algorithm is of importance, since the difference in accuracy indicates that the NB and SVM algorithms have been able to learn something more than just the keywords used to select the tweets to include in the annotation phase. In other words, even though the use of keywords may bias which tweets to include in the training data, this bias is not large enough to stop the machine learning classifiers from learning useful patterns in the data. In this sense the obtained results are successful. The confusion matrices also indicate that even better accuracy could have been achieved using a simple ensemble combining the output of the rule-based and machine learning-based algorithms.

Though the results are promising, it can be questioned whether the achieved classification accuracy is good enough to be used in real-world social media analysis systems for crisis management. We believe that the results are good enough to be exploited on an aggregate level ("the citizens' anxiety levels are increasing after the last alert message"), but are not necessarily precise enough to be used to correctly assess the emotions in a specific tweet. Nevertheless, this is a first attempt to classify emotions in crisis-related tweets, and by improving the used feature set and combining the machine learning paradigm with more non-domain specific solutions such as the affective lexicon WordNet-Affect [28], better accuracy can most likely be achieved. More training data would probably also improve the accuracy, but the high cost in terms of manual labor needed for the creation of even larger training datasets has to be taken into account. Additionally, the learned classifiers ought to be evaluated on other datasets in order to test the generalizability of the obtained results.

Some of the classification errors were a result of the annotators receiving instructions to classify tweets containing any of the emotions fear, anger, or positive as other whenever the tweets related to a “historical” state or if the expressed emotion related to someone else than the author of the tweet. Such a distinction could be important if the used algorithms should be part of a social media analysis system (since we do not want to take action on emotions that are not present anymore), but no features have been employed to explicitly take care of spatio-temporal requirements in the current experiments. If such features were added (e.g., exploiting part-of-speech tags and removing concepts that contain temporal information), some of the classification errors could probably have been avoided.
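As a rough illustration of such spatio-temporal filtering, the patterns below flag tweets that look “historical” or that attribute the emotion to a third person. The cue lists are hypothetical; a real implementation would rely on part-of-speech tagging and temporal expression recognition rather than these crude regular expressions.

```python
import re

# Hypothetical cue lists; a real system would use POS tags and
# temporal taggers instead of fixed word patterns.
PAST_CUES = re.compile(r"\b(yesterday|last (night|week)|was|were)\b", re.I)
THIRD_PERSON_CUES = re.compile(r"\b(he|she|they|seems?)\b", re.I)

def likely_current_author_emotion(tweet: str) -> bool:
    """Heuristic filter: keep tweets that neither look 'historical' nor
    attribute the emotion to someone other than the author."""
    return not (PAST_CUES.search(tweet) or THIRD_PERSON_CUES.search(tweet))
```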

Although we in this article have focused on crisis management, there are obviously other potential areas within the intelligence and security domain to which the suggested methodology and algorithms can be applied. As an example, it can be of interest to determine what kinds of emotions are expressed toward particular topics or groups in extremist discussion forums (cf. [20],[21]). In the same manner, the methodology can be used to assess the emotions expressed by, e.g., bloggers, in order to try to identify signs of emergent conflicts before they actually take place (cf. [16],[29]). Also, the tools and methods described in this article could be adapted for evaluating the effects of information campaigns or psychological operations during military missions [30].

Conclusions and future work

We have described a methodology for collecting large amounts of crisis-related tweets and tagging relevant tweets using human annotators. The methodology has been used for annotating a large number of tweets sent during the Sandy hurricane. The resulting dataset has been utilized for constructing classifiers able to automatically distinguish between the emotional classes positive, fear, anger, and other. Evaluation results suggest that an SVM classifier performs better than an NB classifier and a simple rule-based system. The classification task is difficult, as suggested by the rather low reported inter-annotator agreement results. Seen in this light, and considering that it is a multi-classification problem, the obtained accuracy for the SVM classifier (59.7%) seems promising. The classifications are not good enough to be trusted on the level of individual postings, but on a more aggregate level the citizens’ emotions and attitudes toward the crisis can be estimated using the suggested algorithms. Results obtained when ignoring the non-specific category other (reaching accuracies over 75% for the SVM) further suggest that combining the learned classifiers with algorithms for subjectivity assessment can be a fruitful way forward.
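The inter-annotator agreement mentioned above is commonly quantified with Cohen’s kappa, which corrects raw agreement for the agreement expected by chance. A small self-contained version for two annotators over the same set of tweets (a sketch, not the exact statistic the study may have reported):

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items:
    observed agreement corrected for chance agreement."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    classes = set(labels_a) | set(labels_b)
    expected = sum((labels_a.count(c) / n) * (labels_b.count(c) / n)
                   for c in classes)
    return (observed - expected) / (1 - expected)
```

Values near 0 indicate chance-level agreement; values near 1 indicate near-perfect agreement.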

As future work we see a need for combining machine learning classifiers learned from crisis domain data with more general affective lexicons. In this way we believe that better classification performance can be achieved than when using the methods separately. Moreover, we plan to extend the used feature set with extracted part-of-speech tags, as such information most likely will help determine whether it is the author of a tweet who is experiencing a certain emotion, or whether it is someone else. Other areas to look into are how to deal with the use of sarcasm and slang in the user generated content.
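One way to realize the proposed combination is to add lexicon-based counts as extra features next to the learned bag-of-words features. The tiny lexicon below is a hypothetical stand-in for a general affective resource such as WordNet-Affect:

```python
# Hypothetical mini-lexicon standing in for WordNet-Affect.
AFFECTIVE_LEXICON = {
    "scared": "fear", "terrified": "fear",
    "furious": "anger", "outraged": "anger",
    "happy": "positive", "relieved": "positive",
}

def lexicon_features(tweet: str) -> dict:
    """Count lexicon hits per emotion class, to be appended to the
    learned bag-of-words feature vector."""
    feats = {"fear": 0, "anger": 0, "positive": 0}
    for token in tweet.lower().split():
        emotion = AFFECTIVE_LEXICON.get(token.strip(".,!?"))
        if emotion:
            feats[emotion] += 1
    return feats
```

Because the lexicon is not crisis-specific, such features may help the classifier generalize beyond the vocabulary of the Sandy training data.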

From a crisis management perspective, it will also be necessary to investigate to what extent the used methodology and the developed classifiers are capable of coping with more generic situations. That is, we hope to have developed classifiers that to at least some significant extent classify based on storm and crisis behavior in general, rather than solely being able to classify Sandy-specific data. Investigating this requires that one retrieves and tags new datasets to test the classifiers on. Doing this for many different crisis types and afterwards applying the same classifiers should make it possible to quantify how capable the developed classifiers are when it comes to classifying tweets from 1) other hurricanes, 2) other types of natural emergencies, and 3) crises in general.
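The proposed generalizability study amounts to a simple evaluation loop: train once on the Sandy data and report accuracy per held-out crisis dataset. The sketch below assumes a trained classifier exposed as a plain function; all names are illustrative.

```python
def accuracy(classify, dataset):
    """Fraction of (tweet, label) pairs the classifier labels correctly."""
    return sum(classify(tweet) == label for tweet, label in dataset) / len(dataset)

def cross_crisis_report(classify, datasets):
    """Accuracy per held-out crisis dataset, e.g. other hurricanes,
    floods, or crises in general."""
    return {name: accuracy(classify, data) for name, data in datasets.items()}
```

Comparing the per-dataset accuracies against the Sandy test accuracy would quantify how much performance is lost when moving away from the training crisis.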

Endnote

a We use class to refer to the class a tweet actually belongs to (given the annotation), and “class” to refer to the class suggested by the used keywords.

Appendix: Instructions given to annotators

You have been presented 1000 tweets of a kind. The tweets were written when hurricane Sandy hit the US in 2012. Hopefully most of the tweets you’ve been given are associated with your emotion. Your task is to go through these tweets, and for each tweet confirm whether the tweet is associated with the emotion you have been given, and if not, associate it with the correct emotion. To help make sure that the tagging is as consistent as possible across all annotators, you will be given a few guidelines to make sure that everyone tags the tweets in a similar way:

“Fear” is the category containing tweets from people who are scared, afraid or worried.

“Anger” contains tweets from people that are upset or angry. It’s not always obvious whether someone is angry or sad, but if you think they seem angry, tag it as “anger”. It is acceptable if the person seems sad as well.

“Positive” contains tweets from people that are happy or at least feel positive.

“Other” represents the tweets that don’t belong to any of the other three categories. Tweets with none of the three emotions, or with mixed emotions where one of them isn’t prevailing, belong to this category.

The emotion should relate to the author of the tweet, not other people mentioned by the author. For example, the tweet “Maggie seems truly concerned about Hurricane Sandy…” should not be tagged as “fear”, since it’s not the author of the tweet that is being concerned. Instead it should be tagged with “other”.

The tagging should be based on the author’s mood when the tweet was written. For example, the tweet “I was really scared yesterday!” should not be tagged as “fear”, since it relates to past events, while we want to know what people are feeling when the tweets are posted. Exceptions can be made for events that happened very recently, for example: “I just fell as sandy scared me”, which can be tagged as “fear”.

Overt sarcasm and irony should be tagged as “Other”. If you can’t determine whether the author is being sarcastic or not, assume that he is not being sarcastic or ironic.

A couple of the tweets might not be in English. Non-English tweets belong to “Other” regardless of content.

A few of the tweets are not related to the hurricane. Treat them in the same way as the rest of the tweets.

If a tweet contains conflicting emotions, and one of them doesn’t clearly dominate the other, it belongs to “Other”.

Some of the tweets will be difficult to tag. Even so, don’t leave a tweet unlabeled; choose the alternative you believe is the most correct.

References

  1. A Zielinski, U Bügel, in Proceedings of the Ninth International Conference on Information Systems for Crisis Response and Management (ISCRAM 2012). Multilingual analysis of Twitter news in support of mass emergency events (Vancouver, Canada, 2012).

    Google Scholar 

  2. S-Y Perng, M Buscher, R Halvorsrud, L Wood, M Stiso, L Ramirez, A Al-Akkad, in Proceedings of the Ninth International Conference on Information Systems for Crisis Response and Management (ISCRAM 2012). Peripheral response: Microblogging during the 22/7/2011 Norway attacks (Vancouver, Canada, 2012).

    Google Scholar 

  3. R Thomson, N Ito, H Suda, F Lin, Y Liu, R Hayasaka, R Isochi, Z Wang, in Proceedings of the Ninth International Conference on Information Systems for Crisis Response and Management (ISCRAM 2012). Trusting tweets: The Fukushima disaster and information source credibility on Twitter (Vancouver, Canada, 2012).

    Google Scholar 

  4. Yin J, Lampert A, Cameron M, Robinson B, Power R: Using social media to enhance emergency situation awareness. IEEE Intell Syst 2012, 27(6):52–59. doi:10.1109/MIS.2012.6

    Article  Google Scholar 

  5. A Nagy, J Stamberger, in Proceedings of the Ninth International Conference on Information Systems for Crisis Response and Management (ISCRAM 2012). Crowd sentiment detection during disasters and crises (Vancouver, Canada, 2012).

    Google Scholar 

  6. S Verma, S Vieweg, WJ Corvey, L Palen, JH Martin, M Palmer, A Schram, KM Anderson, in Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media. Natural language processing to the rescue? Extracting “situational awareness” tweets during mass emergency (Barcelona, Spain, 2011), pp. 385–392.

    Google Scholar 

  7. Endsley MR: Toward a theory of situation awareness in dynamic systems. Hum Factors 1995, 37(1):32–64. 10.1518/001872095779049543

    Article  Google Scholar 

  8. American Red Cross, The American Red Cross and Dell launch first-of-its-kind social media digital operations center for humanitarian relief. Press release 7 March 2012.

  9. C Párraga Niebla, T Weber, P Skoutaridis, P Hirst, J Ramírez, D Rego, G Gil, W Engelbach, J Brynielsson, H Wigro, S Grazzini, C Dosch, in Proceedings of the Eighth International Conference on Information Systems for Crisis Response and Management (ISCRAM 2011). Alert4All: An integrated concept for effective population alerting in crisis situations (Lisbon, Portugal, 2011).

    Google Scholar 

  10. H Artman, J Brynielsson, BJE Johansson, J Trnka, in Proceedings of the Eighth International Conference on Information Systems for Crisis Response and Management (ISCRAM 2011). Dialogical emergency management and strategic awareness in emergency communication (Lisbon, Portugal, 2011).

    Google Scholar 

  11. S Nilsson, J Brynielsson, M Granåsen, C Hellgren, S Lindquist, M Lundin, M Narganes Quijano, J Trnka, in Proceedings of the Ninth International Conference on Information Systems for Crisis Response and Management (ISCRAM 2012). Making use of new media for pan-European crisis communication (Vancouver, Canada, 2012).

    Google Scholar 

  12. F Johansson, J Brynielsson, M Narganes Quijano, in Proceedings of the 2012 European Intelligence and Security Informatics Conference (EISIC 2012). Estimating citizen alertness in crises using social media monitoring and analysis (Odense, Denmark, 2012), pp. 189–196. doi:10.1109/EISIC.2012.23.

    Google Scholar 

  13. Liu B: Sentiment analysis and subjectivity. In Handbook of Natural Language Processing, Chapman & Hall/CRC Machine Learning & Pattern Recognition Series. Edited by: Indurkhya N, Damerau FJ. Taylor & Francis Group, Boca Raton, Florida; 2010:627–666.

    Google Scholar 

  14. Pang B, Lee L: Opinion mining and sentiment analysis. Found Trends Inf Retr 2008, 2(1–2):1–135. doi:10.1561/1500000011

    Article  Google Scholar 

  15. B Pang, L Lee, S Vaithyanathan, in Proceedings of the Seventh Conference on Empirical Methods in Natural Language Processing (EMNLP-02). Thumbs up? Sentiment classification using machine learning techniques (Philadelphia, Pennsylvania, 2002), pp. 79–86. doi:10.3115/1118693.1118704.

    Google Scholar 

  16. K Glass, R Colbaugh, Estimating the sentiment of social media content for security informatics applications. Secur Inform 1(3) (2012).

    Article  Google Scholar 

  17. A Pak, P Paroubek, in Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2010). Twitter as a corpus for sentiment analysis and opinion mining (Valletta, Malta, 2010), pp. 1320–1326.

    Google Scholar 

  18. L Barbosa, J Feng, in Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010). Robust sentiment detection on Twitter from biased and noisy data (Beijing, China, 2010), pp. 36–44.

    Google Scholar 

  19. C Strapparava, R Mihalcea, in Proceedings of the 2008 ACM Symposium on Applied Computing (SAC’08). Learning to identify emotions in text (Fortaleza, Brazil, 2008), pp. 1556–1560. doi:10.1145/1363686.1364052.

    Google Scholar 

  20. Abbasi A, Chen H, Thoms S, Fu T: Affect analysis of web forums and blogs using correlation ensembles. IEEE Trans Knowl Data Eng 2008, 20(9):1168–1180. doi:10.1109/TKDE.2008.51

    Article  Google Scholar 

  21. A Abbasi, H Chen, in Proceedings of the Fifth IEEE International Conference on Intelligence and Security Informatics (ISI 2007). Affect intensity analysis of dark web forums (New Brunswick, New Jersey, 2007), pp. 282–288. doi:10.1109/ISI.2007.379486.

    Google Scholar 

  22. J Brynielsson, F Johansson, A Westling, in Proceedings of the 11th IEEE International Conference on Intelligence and Security Informatics (ISI 2013). Learning to classify emotional content in crisis-related tweets (Seattle, Washington, 2013), pp. 33–38. doi:10.1109/ISI.2013.6578782.

    Google Scholar 

  23. J Brynielsson, F Johansson, S Lindquist, in Proceedings of the 15th International Conference on Human-Computer Interaction. Using video prototyping as a means to involving crisis communication personnel in the design process: Innovating crisis management by creating a social media awareness tool (Las Vegas, Nevada, 2013), pp. 559–568. doi:10.1007/978-3-642-39226-9_61.

    Google Scholar 

  24. Miller GA: WordNet: A lexical database for English. Commun ACM 1995, 38(11):39–41. doi:10.1145/219717.219748

    Article  Google Scholar 

  25. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH: The WEKA data mining software: An update. ACM SIGKDD Explorations Newsl 2009, 11(1):10–18. doi:10.1145/1656274.1656278

    Article  Google Scholar 

  26. A McCallum, K Nigam, in AAAI/ICML-98 Workshop on Learning for Text Categorization. A comparison of event models for naive Bayes text classification (Madison, Wisconsin, 1998), pp. 41–48.

    Google Scholar 

  27. Platt JC: Fast training of support vector machines using sequential minimal optimization. In Advances in Kernel Methods: Support Vector Learning. Chap. 12. Edited by: Schölkopf B, Burges CJC, Smola AJ. MIT Press, Cambridge, Massachusetts; 1999:185–208.

    Google Scholar 

  28. C Strapparava, A Valitutti, in Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004). WordNet-Affect: an affective extension of WordNet (Lisbon, Portugal, 2004), pp. 1083–1086.

    Google Scholar 

  29. F Johansson, J Brynielsson, P Hörling, M Malm, C Mårtenson, S Truvé, M Rosell, in Proceedings of the 2011 European Intelligence and Security Informatics Conference (EISIC 2011). Detecting emergent conflicts through web mining and visualization (Athens, Greece, 2011), pp. 346–353. doi:10.1109/EISIC.2011.21.

    Google Scholar 

  30. J Brynielsson, S Nilsson, M Rosell, Feedback from social media during crisis management (in Swedish). Technical Report FOI-R--3756--SE, Swedish Defence Research Agency, Stockholm, Sweden, December 2013.

    Google Scholar 


Acknowledgments

We would like to thank Rogerta Campground, Luísa Coelho, Patrick Druze, Montserrat Ferrer Julià, Sébastien Grazzini, Paul Hirst, Thomas Ladoire, Håkan Marcusson, Miquel Mendes, María Lisa Moreo, Javier Mulero Chaves, Cristina Párraga Niebla, Joaquín Ramírez, and Leopoldo Santos Santos for their effort during the annotation process.

This work has been supported by the European Union Seventh Framework Programme through the Alert4All research project (contract no 261732), and by the research and development programme of the Swedish Armed Forces.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Joel Brynielsson.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

All authors drafted, read and approved the final manuscript.

Authors’ original submitted files for images

Rights additionally permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article


Cite this article

Brynielsson, J., Johansson, F., Jonsson, C. et al. Emotion classification of social media posts for estimating people’s reactions to communicated alert messages during crises. Secur Inform 3, 7 (2014). https://doi.org/10.1186/s13388-014-0007-3
