Current by laura.tolosi
on Dec 08, 2014 16:46.


h2. Introduction

min_synsets(sentiwordnet_Pos(w)-sentiwordnet_Neg(w)) < -0.5
max_synsets(sentiwordnet_Pos(w)-sentiwordnet_Neg(w)) > 0.5
* Words whose positive and negative scores are very similar in all synsets are considered neutral. We also remove them:
max_synsets(abs(sentiwordnet_Pos(w)-sentiwordnet_Neg(w))) < 0.2
* The final score takes the most polarizing difference across the synsets and maps it from [-1,1] to [0,1]:
*score_SentiWordNet(w) = 0.5 max_synsets(sentiwordnet_Pos(w)-sentiwordnet_Neg(w)) + 0.5*
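
The filtering and scoring rules above can be sketched in Python as follows. This is only an illustration, not the code used to build the lexicon. The function name {{sentiwordnet_score}} and the input layout (one (pos, neg) pair per synset) are hypothetical. The sketch also assumes that the two thresholds -0.5 and 0.5 together identify words with conflicting polarity, which are dropped.

```python
from typing import List, Optional, Tuple


def sentiwordnet_score(synsets: List[Tuple[float, float]]) -> Optional[float]:
    """Return a score in [0, 1], or None if the word is filtered out.

    `synsets` holds one (pos, neg) SentiWordNet pair per synset of the word
    (hypothetical input layout).
    """
    diffs = [pos - neg for pos, neg in synsets]
    if not diffs:
        return None
    # Assumption: a word with both a strongly negative and a strongly
    # positive synset is ambiguous and is removed.
    if min(diffs) < -0.5 and max(diffs) > 0.5:
        return None
    # Neutral words: positive and negative scores are close in every synset.
    if max(abs(d) for d in diffs) < 0.2:
        return None
    # Map the retained difference from [-1, 1] to [0, 1].
    return 0.5 * max(diffs) + 0.5
```

For example, a word with synset scores (0.75, 0.0) and (0.5, 0.25) keeps the difference 0.75 and receives the score 0.875.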

h5. MPQA processing:
The dataset labels each word as positive, negative, or neutral, without probabilities. We assigned:
w positive, *score_MPQA(w) = 1*
w negative, *score_MPQA(w) = 0*
w neutral, *score_MPQA(w) = 0.5*
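
As a minimal sketch, the assignment above is a lookup table. The label strings and the name {{mpqa_score}} are assumptions made for illustration. The actual MPQA file may encode polarity differently.

```python
# Fixed scores for the three MPQA polarity labels (label strings are assumed).
MPQA_SCORES = {"positive": 1.0, "negative": 0.0, "neutral": 0.5}


def mpqa_score(label: str) -> float:
    """Return score_MPQA for a polarity label; raises KeyError if unknown."""
    return MPQA_SCORES[label]
```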

h5. IMDB processing:
We obtain probabilities from counts as follows:

*score_IMDB(w) = P(positive|w) = count(w in positive documents) / count(w in documents)*
P(negative|w) = count(w in negative documents) / count(w in documents)
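
The count-based estimate can be sketched as below. Two points are assumptions, not statements from this page: "count(w in documents)" is read as the number of documents containing w (each document counts once per word), and documents are given as (tokens, label) pairs with the label "pos" or "neg". The name {{imdb_scores}} is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def imdb_scores(docs: Iterable[Tuple[List[str], str]]) -> Dict[str, float]:
    """Estimate P(positive | w) for every word seen in `docs`."""
    total = Counter()     # documents containing w
    positive = Counter()  # positive documents containing w
    for tokens, label in docs:
        # Assumption: each document contributes at most once per word.
        for w in set(tokens):
            total[w] += 1
            if label == "pos":
                positive[w] += 1
    return {w: positive[w] / total[w] for w in total}
```

Under this reading P(negative|w) = 1 - P(positive|w), since every document is either positive or negative.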

h5. Aggregation into one final score:

*score(w) = 0.4 score_SentiWordNet(w) + 0.4 score_MPQA(w) + 0.2 score_IMDB(w)*

The resulting file is [attached|^Lexicon_combined.csv] to this page.
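
The weighted sum in the aggregation formula can be sketched as follows. The name {{combined_score}} is hypothetical, and the sketch assumes that a word has a score in all three lexicons. How words missing from one lexicon are handled is not specified here.

```python
# Weights from the aggregation formula.
WEIGHTS = {"sentiwordnet": 0.4, "mpqa": 0.4, "imdb": 0.2}


def combined_score(sentiwordnet: float, mpqa: float, imdb: float) -> float:
    """Combine the three per-lexicon scores (each in [0, 1]) into one score."""
    return (
        WEIGHTS["sentiwordnet"] * sentiwordnet
        + WEIGHTS["mpqa"] * mpqa
        + WEIGHTS["imdb"] * imdb
    )
```

Because the weights sum to 1 and each input lies in [0,1], the combined score also lies in [0,1].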

h3. Sentiment evaluation algorithms