Week 4: VADER Sentiment Analysis

Week 4 Goals

Use VADER for sentiment analysis
Understand compound scores
Handle negations and emphasis

This week's code: https://drive.google.com/file/d/1RaHTwIE8f73y9tePc3L4D-df_Pp5Ocr7/view?usp=sharing

Introduction

Last week, we built a basic sentiment analyzer using word counting. It worked, but had limitations - it couldn't handle "not good" or understand that "AMAZING!!!" is stronger than "good."

This week, we'll use VADER (Valence Aware Dictionary and sEntiment Reasoner) - a powerful sentiment analysis tool that solves these problems.

What is VADER?

VADER is a pre-built sentiment analysis tool specifically designed for social media and short texts. It understands:

Negations: "not good" is negative
Emphasis: "GOOD" vs "good"
Punctuation: "good!" vs "good"
Degree modifiers: "very good" vs "good"
Context: Multiple rules working together

Why VADER is Better

Our basic analyzer from Week 3:

"This game is not good" → Positive (found "good")
"This is AMAZING!!!" → score of 1 (same as "good")

VADER:

"This game is not good" → Negative (understands "not")
"This is AMAZING!!!" → Higher score (understands emphasis)

1. Setting Up VADER

First, let's import and set up VADER.

Installing and Importing VADER

# NLTK itself is installed with: pip install nltk
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

# Download the VADER lexicon (only needed once per environment)
nltk.download('vader_lexicon')

# Create the VADER analyzer
sia = SentimentIntensityAnalyzer()
print("VADER is ready!")

2. Understanding VADER Scores

VADER returns four scores for any text.

The Four VADER Scores

from nltk.sentiment import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()

text = "This game is good."
scores = sia.polarity_scores(text)

print("Text:", text)
print("Scores:", scores)

Output:

Text: This game is good.
Scores: {'neg': 0.0, 'neu': 0.508, 'pos': 0.492, 'compound': 0.4404}

What Each Score Means:

neg: Proportion of negative sentiment (0.0 to 1.0)
neu: Proportion of neutral sentiment (0.0 to 1.0)
pos: Proportion of positive sentiment (0.0 to 1.0)
compound: Overall sentiment score (-1.0 to +1.0)

Note: neg + neu + pos = 1.0 (they're proportions)

Understanding the Compound Score

The compound score is the most useful - it's a single number from -1 to +1:

+1.0: Extremely positive
0.0: Neutral
-1.0: Extremely negative
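
For a sense of where the compound value comes from, here is a minimal sketch of the final normalization step. It assumes the formula used in NLTK's VADER source, x / sqrt(x² + 15), where x is the summed valence of the words in the text. The function name normalize and the valence of 1.9 for "good" are taken from that implementation and its lexicon, so treat them as assumptions if your version differs.

```python
import math

def normalize(valence_sum, alpha=15):
    """Squash an unbounded valence sum into the range -1 to +1."""
    return valence_sum / math.sqrt(valence_sum ** 2 + alpha)

# "good" is assumed to carry a valence of 1.9 in the VADER lexicon
print(round(normalize(1.9), 4))   # 0.4404, the compound score of "This game is good."
print(round(normalize(0.0), 4))   # 0.0 for text with no sentiment words
```

Because of the square root in the denominator, the result approaches +1 or -1 as the valence sum grows but never reaches it.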

Classification rules:

compound >= 0.05 → Positive
compound <= -0.05 → Negative
-0.05 < compound < 0.05 → Neutral
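
The classification rules above can be wrapped in a small helper. This is a sketch only; the name classify is hypothetical and is not part of NLTK.

```python
def classify(compound):
    """Map a VADER compound score to a sentiment label."""
    if compound >= 0.05:
        return "Positive"
    if compound <= -0.05:
        return "Negative"
    return "Neutral"

# Try a few values, including the boundaries
for value in (0.4404, -0.05, 0.049, 0.0):
    print(value, "->", classify(value))
```

Note that the boundaries themselves (0.05 and -0.05) count as Positive and Negative, not Neutral.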

3. Handling Negations

Let's see how VADER handles negations compared to our basic analyzer.

Negation Examples

from nltk.sentiment import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()

# Test sentences with negations
sentences = [
    "This game is good.",
    "This game is not good.",
    "This game is not bad.",
    "I don't hate this game."
]

for sentence in sentences:
    scores = sia.polarity_scores(sentence)
    compound = scores['compound']
    if compound >= 0.05:
        sentiment = "Positive"
    elif compound <= -0.05:
        sentiment = "Negative"
    else:
        sentiment = "Neutral"
    print(f"Text: {sentence}")
    print(f"Compound: {compound:.3f} → {sentiment}")
    print()

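Expected Output (exact values may vary slightly between NLTK versions):

Text: This game is good.
Compound: 0.440 → Positive

Text: This game is not good.
Compound: -0.341 → Negative

Text: This game is not bad.
Compound: 0.431 → Positive

Text: I don't hate this game.
Compound: 0.458 → Positive

Notice: VADER flips the sentiment when a negation such as "not" or "don't" comes before a sentiment word! The flipped score is also a little weaker than the original: "not good" (-0.341) is less intense than "good" (0.440).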
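
The arithmetic behind the flip can be sketched in a few lines. This assumes, based on NLTK's VADER source, that a negated word's valence is multiplied by -0.74 before normalization, and that "bad" has a lexicon valence of -2.5. The helper names normalize and negate are illustrative, not NLTK functions.

```python
import math

N_SCALAR = -0.74  # assumed negation multiplier from NLTK's VADER source

def normalize(valence_sum, alpha=15):
    """Squash a valence sum into the range -1 to +1."""
    return valence_sum / math.sqrt(valence_sum ** 2 + alpha)

def negate(valence):
    """Flip and dampen a word's valence, as VADER does after a negation."""
    return valence * N_SCALAR

bad = -2.5  # assumed lexicon valence for "bad"
print(round(normalize(bad), 3))          # -0.542: "bad" on its own
print(round(normalize(negate(bad)), 3))  # 0.431: "not bad"
```

Because the multiplier is smaller than 1 in size, a negated word never scores as strongly as its opposite would.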

4. Handling Emphasis and Punctuation

VADER understands that emphasis makes sentiment stronger.

Emphasis Examples

from nltk.sentiment import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()

# Different levels of emphasis
emphasis_tests = [
    "This game is good",
    "This game is GOOD",
    "This game is good!",
    "This game is GOOD!",
    "This game is GOOD!!!",
    "This game is gooood"
]

for text in emphasis_tests:
    compound = sia.polarity_scores(text)['compound']
    print(f"{text:30} → Compound: {compound:.3f}")

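Output (exact values may vary slightly between NLTK versions):

This game is good              → Compound: 0.440
This game is GOOD              → Compound: 0.562
This game is good!             → Compound: 0.493
This game is GOOD!             → Compound: 0.603
This game is GOOD!!!           → Compound: 0.671
This game is gooood            → Compound: 0.000

Key Insights:

ALL CAPS increases sentiment strength (when the rest of the text is not also in caps)
Exclamation marks boost sentiment
Multiple exclamation marks boost even more (the effect is capped at four)
Letter repetition is not handled: VADER looks words up as written, so a stretched spelling like "gooood" that is missing from the lexicon scores 0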
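
How caps and exclamation marks stack up can be sketched with plain arithmetic. The constants below (0.733 for an ALL-CAPS sentiment word, 0.292 per exclamation mark, at most four marks counted) are assumptions taken from NLTK's VADER source, and emphasized_compound is a hypothetical helper that handles a single positive word only.

```python
import math

C_INCR = 0.733      # assumed boost for an ALL-CAPS sentiment word
EXCL_BOOST = 0.292  # assumed boost per exclamation mark
MAX_EXCL = 4        # assumed cap on counted exclamation marks

def normalize(valence_sum, alpha=15):
    """Squash a valence sum into the range -1 to +1."""
    return valence_sum / math.sqrt(valence_sum ** 2 + alpha)

def emphasized_compound(valence, all_caps=False, exclamations=0):
    """Compound score for one positive word with optional emphasis."""
    if all_caps:
        valence += C_INCR
    valence += min(exclamations, MAX_EXCL) * EXCL_BOOST
    return normalize(valence)

good = 1.9  # assumed lexicon valence for "good"
plain = emphasized_compound(good)
caps = emphasized_compound(good, all_caps=True)
caps_excl = emphasized_compound(good, all_caps=True, exclamations=3)
print(plain < caps < caps_excl)  # True: each layer of emphasis raises the score
```

The caps boost applies only when the text mixes upper and lower case; a message typed entirely in capitals gets no extra weight from it.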

5. Handling Degree Modifiers

Words like "very", "extremely", and "somewhat" modify sentiment intensity.

Degree Modifier Examples

from nltk.sentiment import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()

modifiers = [
    "The game is good.",
    "The game is very good.",
    "The game is extremely good.",
    "The game is somewhat good.",
    "The game is barely good."
]

for text in modifiers:
    compound = sia.polarity_scores(text)['compound']
    print(f"{text:35} → {compound:.3f}")

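Output (exact values may vary slightly between NLTK versions):

The game is good.                   → 0.440
The game is very good.              → 0.493
The game is extremely good.         → 0.493
The game is somewhat good.          → 0.383
The game is barely good.            → 0.383

VADER applies the same fixed boost for every intensifier and the same fixed reduction for every dampener, so "very" and "extremely" score the same here, as do "somewhat" and "barely".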
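
The modifier effect can be reproduced with plain arithmetic. This sketch assumes, following NLTK's VADER source, that every intensifier in its booster list shares one constant (+0.293) and every dampener shares another (-0.293) when the modifier sits directly before a positive word. The name modified_compound is hypothetical.

```python
import math

B_INCR = 0.293   # assumed boost for intensifiers such as "very"
B_DECR = -0.293  # assumed reduction for dampeners such as "barely"

def normalize(valence_sum, alpha=15):
    """Squash a valence sum into the range -1 to +1."""
    return valence_sum / math.sqrt(valence_sum ** 2 + alpha)

def modified_compound(valence, scalar):
    """Compound score for a positive word directly preceded by one modifier."""
    return normalize(valence + scalar)

good = 1.9  # assumed lexicon valence for "good"
boosted = modified_compound(good, B_INCR)
dampened = modified_compound(good, B_DECR)
print(dampened < normalize(good) < boosted)  # True
print(round(boosted, 3), round(dampened, 3))
```

For a negative word the sign of the adjustment is reversed, so "very bad" moves further below zero.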

6. Building a VADER Sentiment Analyzer Function

Let's create a complete function using VADER.

Complete VADER Analyzer

from nltk.sentiment import SentimentIntensityAnalyzer

# Create the analyzer once and reuse it; building it on every call reloads the lexicon
sia = SentimentIntensityAnalyzer()

def analyze_sentiment_vader(text):
    """Analyze sentiment using VADER"""
    scores = sia.polarity_scores(text)

    # Get compound score
    compound = scores['compound']

    # Classify based on compound score
    if compound >= 0.05:
        classification = "Positive"
    elif compound <= -0.05:
        classification = "Negative"
    else:
        classification = "Neutral"

    return {
        'compound': compound,
        'classification': classification,
        'positive': scores['pos'],
        'neutral': scores['neu'],
        'negative': scores['neg']
    }

# Test the function
test_reviews = [
    "This game is absolutely AMAZING!!!",
    "Terrible game. Complete waste of money.",
    "The game is okay, nothing special.",
    "Not bad, but not great either."
]

for review in test_reviews:
    result = analyze_sentiment_vader(review)
    print(f"Review: {review}")
    print(f"Result: {result['classification']} (compound: {result['compound']:.3f})")
    print()
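
Once the function returns a dictionary per review, the results are easy to tally. The sketch below is one possible way to do that; summarize is a hypothetical helper, and the sample dictionaries are hand-written stand-ins for analyze_sentiment_vader output, not real VADER scores.

```python
from collections import Counter

def summarize(results):
    """Count classifications and average the compound score."""
    counts = Counter(r['classification'] for r in results)
    average = sum(r['compound'] for r in results) / len(results) if results else 0.0
    return counts, average

# Hand-written stand-ins for analyze_sentiment_vader output (not real VADER scores)
sample_results = [
    {'compound': 0.8, 'classification': 'Positive'},
    {'compound': -0.6, 'classification': 'Negative'},
    {'compound': 0.0, 'classification': 'Neutral'},
    {'compound': 0.4, 'classification': 'Positive'},
]

counts, average = summarize(sample_results)
print(dict(counts))
print(round(average, 2))
```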

7. Comparing Basic vs VADER Analyzers

Let's compare our Week 3 analyzer with VADER on tricky examples.

Side-by-Side Comparison

from nltk.sentiment import SentimentIntensityAnalyzer
from nltk.tokenize import word_tokenize

# word_tokenize needs NLTK's 'punkt' tokenizer data ('punkt_tab' on newer
# NLTK releases); run nltk.download(...) once if it is missing.

# Our basic analyzer from Week 3
def analyze_sentiment_basic(text):
    positive_words = ['good', 'great', 'excellent', 'amazing', 'love']
    negative_words = ['bad', 'terrible', 'awful', 'hate', 'worst']
    tokens = word_tokenize(text.lower())
    pos_count = sum(1 for token in tokens if token in positive_words)
    neg_count = sum(1 for token in tokens if token in negative_words)
    score = pos_count - neg_count
    if score > 0:
        return "Positive"
    elif score < 0:
        return "Negative"
    else:
        return "Neutral"

# VADER analyzer
sia = SentimentIntensityAnalyzer()

def analyze_vader(text):
    compound = sia.polarity_scores(text)['compound']
    if compound >= 0.05:
        return "Positive"
    elif compound <= -0.05:
        return "Negative"
    else:
        return "Neutral"

# Tricky test cases
tricky_reviews = [
    "This game is not good.",
    "The game is AMAZING!!!",
    "I really, really love this game!",
    "It's not terrible.",
    "Very bad game."
]

print("Basic vs VADER Comparison")
print("=" * 60)

for review in tricky_reviews:
    basic = analyze_sentiment_basic(review)
    vader = analyze_vader(review)
    # ✓ means the two analyzers agree, ✗ means they disagree
    match = "✓" if basic == vader else "✗"
    print(f"Review: {review}")
    print(f" Basic: {basic:8} | VADER: {vader:8} {match}")
    print()
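
To see the basic analyzer's blind spot without any downloads, here is a stripped-down version of the same word-counting idea. It is a sketch under one stated assumption: str.split plus punctuation stripping stands in for word_tokenize, so tokenization may differ slightly from the Week 3 code. The name basic_label is hypothetical.

```python
import string

POSITIVE_WORDS = {'good', 'great', 'excellent', 'amazing', 'love'}
NEGATIVE_WORDS = {'bad', 'terrible', 'awful', 'hate', 'worst'}

def basic_label(text):
    """Word-counting classifier; str.split stands in for word_tokenize."""
    tokens = [t.strip(string.punctuation) for t in text.lower().split()]
    pos_count = sum(t in POSITIVE_WORDS for t in tokens)
    neg_count = sum(t in NEGATIVE_WORDS for t in tokens)
    score = pos_count - neg_count
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"

print(basic_label("This game is not good."))  # Positive: "not" is ignored
print(basic_label("It's not terrible."))      # Negative: same blind spot
```

Both labels are the opposite of what a reader would say, because the counter only sees "good" and "terrible" and has no rule for the word in front of them.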

8. Testing VADER on Movie Reviews

Let's see how accurate VADER is on our movie reviews dataset.

VADER Accuracy Test

import nltk
from nltk.corpus import movie_reviews
from nltk.sentiment import SentimentIntensityAnalyzer

# Download the movie reviews corpus (only needed once per environment)
nltk.download('movie_reviews')

sia = SentimentIntensityAnalyzer()

# Test on positive reviews
correct_pos = 0
total_pos = 100

for fileid in movie_reviews.fileids('pos')[:total_pos]:
    text = movie_reviews.raw(fileid)
    compound = sia.polarity_scores(text)['compound']
    if compound >= 0.05:
        correct_pos += 1

# Test on negative reviews
correct_neg = 0
total_neg = 100

for fileid in movie_reviews.fileids('neg')[:total_neg]:
    text = movie_reviews.raw(fileid)
    compound = sia.polarity_scores(text)['compound']
    if compound <= -0.05:
        correct_neg += 1

# Calculate accuracy (a Neutral prediction counts as wrong for both classes)
accuracy_pos = (correct_pos / total_pos) * 100
accuracy_neg = (correct_neg / total_neg) * 100
overall = (correct_pos + correct_neg) / (total_pos + total_neg) * 100

print("VADER Accuracy on Movie Reviews:")
print(f" Positive reviews: {accuracy_pos:.1f}%")
print(f" Negative reviews: {accuracy_neg:.1f}%")
print(f" Overall accuracy: {overall:.1f}%")
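
The same bookkeeping can be written as a reusable function that also reports how many reviews landed in the Neutral band, since those count as misses for either class. This is a sketch; accuracy_report is a hypothetical helper and the sample scores are made up for illustration, not real VADER output.

```python
def accuracy_report(labelled_scores):
    """labelled_scores: list of (true_label, compound) pairs, labels 'pos' or 'neg'."""
    correct = {'pos': 0, 'neg': 0}
    total = {'pos': 0, 'neg': 0}
    neutral = 0
    for label, compound in labelled_scores:
        total[label] += 1
        if -0.05 < compound < 0.05:
            neutral += 1  # counted as wrong for either class
        elif (compound >= 0.05) == (label == 'pos'):
            correct[label] += 1
    return correct, total, neutral

# Made-up scores for illustration only, not real VADER output
sample = [('pos', 0.9), ('pos', -0.2), ('pos', 0.01), ('neg', -0.7), ('neg', 0.6)]

correct, total, neutral = accuracy_report(sample)
print(correct, total, neutral)
```

Feeding it pairs built from movie_reviews and sia.polarity_scores would give the same per-class counts as the loops above, plus the neutral count.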

9. Real-World Example: Analyzing Game Reviews

Let's analyze some realistic game review examples.

Game Review Analysis

from nltk.sentiment import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()

# This example reuses analyze_sentiment_vader() from Section 6,
# so run that code first.

game_reviews = [
    "This game is absolutely incredible! The graphics are STUNNING and gameplay is super smooth. Highly recommend!!!",
    "Buggy mess. Game crashes every 10 minutes. Not worth the money at all.",
    "It's okay. Graphics are decent but gameplay gets repetitive after a while. Not bad but not great.",
    "I didn't think I would like this game, but I was wrong! It's actually pretty fun.",
    "WORST GAME EVER! Total waste of time and money. Extremely disappointing."
]

print("Game Review Sentiment Analysis")
print("=" * 70)

for i, review in enumerate(game_reviews, 1):
    result = analyze_sentiment_vader(review)
    print(f"Review {i}: {review[:60]}...")
    print(f"Sentiment: {result['classification']}")
    print(f"Compound Score: {result['compound']:.3f}")
    print(f"Positive: {result['positive']:.2f} | Neutral: {result['neutral']:.2f} | Negative: {result['negative']:.2f}")
    print()

Practice Exercise

Test VADER on your own reviews and compare with the basic analyzer:

Write 3 reviews with tricky language:
- One with negation ("not bad")
- One with emphasis ("AMAZING!!!")
- One with degree modifiers ("very good")
Analyze them with both methods:
- Basic word counting (Week 3)
- VADER (Week 4)
Compare the results - which is more accurate?

from nltk.sentiment import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()

# Your practice reviews
my_review_1 = "Write your first review here"
my_review_2 = "Write your second review here"
my_review_3 = "Write your third review here"

# Analyze with VADER (uses analyze_sentiment_vader() from Section 6)
print("Review 1:", analyze_sentiment_vader(my_review_1))
print("Review 2:", analyze_sentiment_vader(my_review_2))
print("Review 3:", analyze_sentiment_vader(my_review_3))

Key Takeaways

VADER is much more sophisticated than basic word counting
Compound score (-1 to +1) is the main score to use
Negations are handled automatically ("not good" → negative)
Emphasis through caps and repeated exclamation marks increases strength
Degree modifiers like "very" and "extremely" adjust intensity
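VADER's accuracy depends on the dataset: it was designed for short, informal text, so measure it on your own data (as in Section 8) rather than assuming a fixed figure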

When to Use VADER vs Basic Analyzer

Use VADER when:

You need high accuracy
Text has complex language (negations, emphasis)
Working with social media or reviews
You want a quick, ready-to-use solution

Use Basic Analyzer when:

Learning NLP fundamentals
Need full control over word lists
Working with very specific domain vocabulary
Building custom sentiment rules

Next Week Preview

In Week 5, we'll learn:

Combining multiple reviews into overall sentiment
Visualizing sentiment distributions
Finding sentiment trends over time
Building a complete sentiment analysis report
Preparing for the Chrome extension integration

You now have a powerful sentiment analysis tool! Next, we'll learn how to use it at scale.
