{toc}
{attachments}
h2. Introduction

h5. SentiWordNet processing:
* words that are strongly negative in one synset and strongly positive in another are ambiguous. We remove them:
min_synsets(sentiwordnet_Pos(w)-sentiwordnet_Neg(w)) < -0.5
max_synsets(sentiwordnet_Pos(w)-sentiwordnet_Neg(w)) > 0.5
* words with very similar positive and negative scores in all synsets are considered neutral. We also remove them:
max_synsets(abs(sentiwordnet_Pos(w)-sentiwordnet_Neg(w))) < 0.2
* the final score is the most polarizing difference across synsets, mapped to [0,1]:
*score_SentiWordNet(w) = 0.5 max_synsets(sentiwordnet_Pos(w)-sentiwordnet_Neg(w)) + 0.5*
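A minimal sketch of this filtering and scoring, using NLTK's SentiWordNet reader (illustrative only; the actual pipeline behind this page is not shown here and may differ, e.g. in part-of-speech handling):
{code:python}
# Requires nltk.download('wordnet') and nltk.download('sentiwordnet').
from nltk.corpus import sentiwordnet as swn

def score_sentiwordnet(word):
    """Return a score in [0, 1], or None if the word is filtered out."""
    diffs = [s.pos_score() - s.neg_score() for s in swn.senti_synsets(word)]
    if not diffs:
        return None
    # Ambiguous words: strongly negative in one synset, strongly
    # positive in another.
    if min(diffs) < -0.5 and max(diffs) > 0.5:
        return None
    # Neutral words: very similar positive and negative scores in all synsets.
    if max(abs(d) for d in diffs) < 0.2:
        return None
    # Most polarizing difference, mapped from [-1, 1] to [0, 1].
    return 0.5 * max(diffs) + 0.5
{code}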
h5. MPQA processing:
The dataset is annotated with positive, negative, and neutral labels, without probabilities. We assigned:
w positive, *score_MPQA(w) = 1*
w negative, *score_MPQA(w) = 0*
w neutral, *score_MPQA(w) = 0.5*
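A one-line mapping suffices here; the label strings below are assumptions about how the MPQA annotations are represented:
{code:python}
# Map MPQA polarity labels to scores. The exact label strings in the
# MPQA files are an assumption for illustration.
MPQA_SCORES = {"positive": 1.0, "negative": 0.0, "neutral": 0.5}

def score_mpqa(polarity):
    return MPQA_SCORES[polarity]
{code}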
h5. IMDB processing:

We obtain probabilities from counts as follows:
*score_IMDB(w) = P(positive|w) = count(w in positive documents) / count(w in documents)*
P(negative|w) = count(w in negative documents) / count(w in documents)
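A sketch of the counting, assuming the reviews are available as (words, label) pairs; this input format is an assumption, not the page's actual data layout:
{code:python}
from collections import Counter

def imdb_scores(documents):
    """documents: iterable of (words, label) pairs, label in {'pos', 'neg'}.
    Returns word -> P(positive | word), estimated from document counts."""
    in_pos = Counter()
    total = Counter()
    for words, label in documents:
        for w in set(words):  # count each word once per document
            total[w] += 1
            if label == "pos":
                in_pos[w] += 1
    return {w: in_pos[w] / total[w] for w in total}
{code}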
h5. Aggregation into one final score:
*score(w) = 0.4 score_SentiWordNet(w) + 0.4 score_MPQA(w) + 0.2 score_IMDB(w)*
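A sketch of the combination, assuming each of the three lexicons has been loaded into a dict from word to score in [0, 1]; how words missing from one of the lexicons are handled is not specified on this page:
{code:python}
def combined_score(w, swn_scores, mpqa_scores, imdb_scores):
    """Weighted combination of the three lexicon scores for word w."""
    return (0.4 * swn_scores[w]
            + 0.4 * mpqa_scores[w]
            + 0.2 * imdb_scores[w])
{code}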
The resulting file is [attached|^Lexicon_combined.csv] to this page.
h3. Sentiment evaluation algorithms
