Dictionaries and Sets: Key-Value and Unique Data

Welcome to the powerhouses of Python data structures! Dictionaries are like phone books - find information instantly by name. Sets are like guest lists - unique items only, no duplicates allowed.

Dictionaries: Key-Value Powerhouses

Creating Dictionaries

# Empty dictionary
empty_dict = {}

# Dictionary with initial values
person = {
    "name": "Alice",
    "age": 25,
    "city": "New York"
}

# Using dict() constructor
from_pairs = dict([("a", 1), ("b", 2), ("c", 3)])
from_kwargs = dict(name="Bob", age=30, city="London")

# Dictionary comprehension
squares = {x: x**2 for x in range(5)}  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

Accessing Dictionary Values

person = {"name": "Alice", "age": 25, "city": "New York"}

# Access by key
print(person["name"])  # "Alice"
print(person["age"])   # 25

# Safe access with get() - no KeyError if key missing
print(person.get("salary"))           # None
print(person.get("salary", 0))        # 0 (default value)

# Check if key exists
if "name" in person:
    print("Name found!")

# Get all keys, values, or items
print(person.keys())    # dict_keys(['name', 'age', 'city'])
print(person.values())  # dict_values(['Alice', 25, 'New York'])
print(person.items())   # dict_items([('name', 'Alice'), ('age', 25), ('city', 'New York')])

Modifying Dictionaries

person = {"name": "Alice", "age": 25}

# Add new key-value pair
person["job"] = "Engineer"
print(person)  # {'name': 'Alice', 'age': 25, 'job': 'Engineer'}

# Update existing value
person["age"] = 26
print(person)  # {'name': 'Alice', 'age': 26, 'job': 'Engineer'}

# Update multiple values
person.update({"city": "Boston", "salary": 75000})
print(person)  # {'name': 'Alice', 'age': 26, 'job': 'Engineer', 'city': 'Boston', 'salary': 75000}

# Remove by key
removed_job = person.pop("job")
print(f"Removed job: {removed_job}")  # "Engineer"
print(person)

# Remove the most recently inserted item (LIFO order since Python 3.7)
last_item = person.popitem()
print(f"Removed last: {last_item}")  # ('salary', 75000)

# Clear all items
person.clear()
print(person)  # {}
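
The removal methods above behave differently when a key is missing or the dictionary is empty. The short sketch below illustrates those edge cases; the key "nickname" and the variable `empty` are made up for this example.

```python
person = {"name": "Alice", "age": 25}

# pop() with a default returns the default instead of raising KeyError
nickname = person.pop("nickname", None)
print(nickname)  # None

# del removes a key without returning its value
del person["age"]
print(person)  # {'name': 'Alice'}

# popitem() on an empty dictionary raises KeyError
empty = {}
try:
    empty.popitem()
except KeyError:
    print("Nothing left to pop")
```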

Dictionary Methods

# setdefault() - get value or set default
person = {"name": "Alice"}
age = person.setdefault("age", 25)  # Sets default if key doesn't exist
print(age)  # 25
print(person)  # {'name': 'Alice', 'age': 25}

# Copy dictionary
original = {"a": 1, "b": 2}
copy_dict = original.copy()
copy_dict["c"] = 3
print(original)  # {'a': 1, 'b': 2} (unchanged)
print(copy_dict)  # {'a': 1, 'b': 2, 'c': 3}
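
One caveat worth sketching: copy() makes a shallow copy, so nested mutable values (such as a list stored under a key) are still shared between the original and the copy. The names `team`, `team_copy`, and `team_deep` below are invented for illustration, and `copy.deepcopy` from the standard library is shown as one possible way to get a fully independent copy.

```python
import copy

team = {"name": "core", "members": ["alice", "bob"]}

# Shallow copy: top-level keys are independent, nested objects are shared
team_copy = team.copy()
team_copy["name"] = "platform"          # does not affect the original
team_copy["members"].append("charlie")  # mutates the list both dicts point to

print(team["name"])     # core
print(team["members"])  # ['alice', 'bob', 'charlie']

# Deep copy: nested objects are duplicated as well
team_deep = copy.deepcopy(team)
team_deep["members"].append("diana")
print(team["members"])       # ['alice', 'bob', 'charlie']
print(team_deep["members"])  # ['alice', 'bob', 'charlie', 'diana']
```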

Sets: Unique Collections

Creating Sets

# Empty set ({} creates empty dict!)
empty_set = set()

# Set with initial values
numbers = {1, 2, 3, 4, 5}
fruits = {"apple", "banana", "orange"}

# From list (removes duplicates)
duplicates = [1, 2, 2, 3, 3, 3, 4]
unique_numbers = set(duplicates)
print(unique_numbers)  # {1, 2, 3, 4}

# Set comprehension
even_squares = {x**2 for x in range(10) if x**2 % 2 == 0}
print(even_squares)  # {0, 4, 16, 36, 64}

Set Operations

set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}

# Add elements
set_a.add(6)
print(set_a)  # {1, 2, 3, 4, 5, 6}

# Remove elements
set_a.remove(6)     # Raises KeyError if not found
set_a.discard(10)   # No error if not found

# Set operations
print("Union:", set_a | set_b)              # {1, 2, 3, 4, 5, 6, 7, 8}
print("Intersection:", set_a & set_b)       # {4, 5}
print("Difference (A-B):", set_a - set_b)   # {1, 2, 3}
print("Difference (B-A):", set_b - set_a)   # {6, 7, 8}
print("Symmetric diff:", set_a ^ set_b)     # {1, 2, 3, 6, 7, 8}

# Update operations (modify in place)
set_a.update(set_b)      # Union
print("After update:", set_a)  # {1, 2, 3, 4, 5, 6, 7, 8}

set_a.intersection_update(set_b)  # Keep only intersection: {4, 5, 6, 7, 8}
set_a.difference_update(set_b)    # Remove elements in set_b: set_a is now empty
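
A detail that is easy to miss: the operators (|, &, -, ^) require both operands to be sets, while the equivalent named methods accept any iterable. The following is a small sketch of that difference, using a throwaway `set_a`.

```python
set_a = {1, 2, 3}

# Named methods accept any iterable, such as a list or a range
print(set_a.union([3, 4, 5]))            # {1, 2, 3, 4, 5}
print(set_a.intersection(range(2, 10)))  # {2, 3}

# Operators require both operands to be sets
try:
    set_a | [3, 4, 5]
except TypeError:
    print("The | operator does not accept a list")
```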

Set Methods

numbers = {1, 2, 3, 4, 5}

# Check membership
print(3 in numbers)      # True
print(6 in numbers)      # False

# Check subsets/supersets
subset = {1, 2}
print(subset.issubset(numbers))    # True
print(numbers.issuperset(subset))  # True

# Check disjoint (no common elements)
other = {6, 7, 8}
print(numbers.isdisjoint(other))   # True

# Copy set
numbers_copy = numbers.copy()
numbers_copy.add(6)
print(numbers)       # {1, 2, 3, 4, 5}
print(numbers_copy)  # {1, 2, 3, 4, 5, 6}

Real-World Dictionary Examples

Example 1: Student Grade Book

def create_student(name, grades):
    """Create a student record."""
    return {
        "name": name,
        "grades": grades,
        "average": sum(grades) / len(grades) if grades else 0
    }

def add_grade(student, grade):
    """Add a grade to student record."""
    student["grades"].append(grade)
    student["average"] = sum(student["grades"]) / len(student["grades"])

def get_letter_grade(average):
    """Convert numeric average to letter grade."""
    if average >= 90: return "A"
    elif average >= 80: return "B"
    elif average >= 70: return "C"
    elif average >= 60: return "D"
    else: return "F"

# Create student records
students = {}
students["alice"] = create_student("Alice", [85, 92, 88])
students["bob"] = create_student("Bob", [78, 85, 90])

# Add more grades
add_grade(students["alice"], 95)
add_grade(students["bob"], 82)

# Display results
for student_id, student in students.items():
    letter = get_letter_grade(student["average"])
    print(f"{student['name']}: Average={student['average']:.1f}, Grade={letter}")

Example 2: Word Frequency Counter

def count_words(text):
    """Count word frequencies in text."""
    words = text.lower().split()
    word_count = {}

    for word in words:
        # Remove punctuation
        word = word.strip(".,!?\"'")
        if word:  # Skip empty strings
            word_count[word] = word_count.get(word, 0) + 1

    return word_count

def get_most_common_words(word_count, n=5):
    """Get the n most common words."""
    # Sort by frequency (descending), then by word (ascending)
    sorted_words = sorted(word_count.items(),
                         key=lambda x: (-x[1], x[0]))
    return sorted_words[:n]

# Analyze text
text = """
Python is a powerful programming language. Python is easy to learn and
Python has a simple syntax. Many developers love Python because Python
is versatile and Python can be used for web development, data science,
and automation.
"""

word_frequencies = count_words(text)
most_common = get_most_common_words(word_frequencies, 3)

print("Word frequencies:")
for word, count in most_common:
    print(f"  {word}: {count}")

print(f"\nTotal unique words: {len(word_frequencies)}")
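
The manual counting loop above can also be expressed with collections.Counter from the standard library. This is an alternative sketch, not a replacement for the version shown; the function name `count_words_with_counter` and the `sample` text are made up for this example.

```python
from collections import Counter

def count_words_with_counter(text):
    """Count word frequencies using Counter instead of a manual loop."""
    # Same cleanup as before: lowercase, split, strip surrounding punctuation
    words = (w.strip(".,!?\"'") for w in text.lower().split())
    return Counter(w for w in words if w)

sample = "Python is fun. Python is fast, and Python wins!"
frequencies = count_words_with_counter(sample)

print(frequencies["python"])       # 3
print(frequencies.most_common(2))  # [('python', 3), ('is', 2)]
```

Counter is a dict subclass, so get(), items(), and the other dictionary methods covered earlier still work on the result.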

Example 3: Phone Book Application

def add_contact(phone_book, name, number, email=None):
    """Add a contact to phone book."""
    phone_book[name.lower()] = {
        "name": name,
        "number": number,
        "email": email
    }
    print(f"Added contact: {name}")

def find_contact(phone_book, name):
    """Find a contact by name."""
    return phone_book.get(name.lower())

def search_contacts(phone_book, query):
    """Search contacts by name or number."""
    results = {}
    query = query.lower()

    for key, contact in phone_book.items():
        if (query in contact["name"].lower() or
            query in contact["number"]):
            results[key] = contact

    return results

def delete_contact(phone_book, name):
    """Delete a contact."""
    key = name.lower()
    if key in phone_book:
        deleted = phone_book.pop(key)
        print(f"Deleted contact: {deleted['name']}")
        return True
    else:
        print(f"Contact not found: {name}")
        return False

# Usage
phone_book = {}

add_contact(phone_book, "Alice Johnson", "555-0123", "alice@email.com")
add_contact(phone_book, "Bob Smith", "555-0456")
add_contact(phone_book, "Charlie Brown", "555-0789", "charlie@email.com")

# Find specific contact
alice = find_contact(phone_book, "alice johnson")
if alice:
    print(f"Found: {alice['name']} - {alice['number']}")

# Search contacts
results = search_contacts(phone_book, "555")
print(f"Contacts with '555': {len(results)}")

# Delete contact
delete_contact(phone_book, "bob smith")

Real-World Set Examples

Example 1: Unique Visitor Tracker

def track_visitors():
    """Track unique website visitors."""
    visitors = set()

    def add_visitor(ip_address):
        if ip_address in visitors:
            print(f"Returning visitor: {ip_address}")
        else:
            visitors.add(ip_address)
            print(f"New visitor: {ip_address}")

    def get_visitor_count():
        return len(visitors)

    def get_unique_visitors():
        return visitors.copy()

    # Return interface functions
    return add_visitor, get_visitor_count, get_unique_visitors

# Usage
add_visitor, get_count, get_visitors = track_visitors()

# Simulate visitors
visits = ["192.168.1.1", "192.168.1.2", "192.168.1.1", "192.168.1.3", "192.168.1.2"]
for ip in visits:
    add_visitor(ip)

print(f"Total unique visitors: {get_count()}")
print(f"Visitor IPs: {get_visitors()}")

Example 2: Tag-Based Content Filter

def create_content_filter():
    """Create a content filtering system based on tags."""
    content_items = [
        {"id": 1, "title": "Python Tutorial", "tags": {"python", "tutorial", "programming"}},
        {"id": 2, "title": "Web Development", "tags": {"web", "javascript", "html"}},
        {"id": 3, "title": "Data Science", "tags": {"python", "data", "machine-learning"}},
        {"id": 4, "title": "Mobile Apps", "tags": {"mobile", "ios", "android"}},
        {"id": 5, "title": "Game Development", "tags": {"gaming", "unity", "c#"}}
    ]

    def find_by_tags(required_tags, any_tags=None):
        """Find content that matches tag criteria."""
        if any_tags is None:
            any_tags = set()

        matches = []
        for item in content_items:
            item_tags = item["tags"]

            # Must have all required tags
            has_required = required_tags.issubset(item_tags)

            # Must have at least one of the any_tags (if specified)
            has_any = not any_tags or not any_tags.isdisjoint(item_tags)

            if has_required and has_any:
                matches.append(item)

        return matches

    def get_all_tags():
        """Get all unique tags."""
        all_tags = set()
        for item in content_items:
            all_tags.update(item["tags"])
        return all_tags

    return find_by_tags, get_all_tags

# Usage
find_content, get_tags = create_content_filter()

# Find Python tutorials
python_tutorials = find_content({"python"}, {"tutorial"})
print("Python tutorials:")
for item in python_tutorials:
    print(f"  {item['title']}")

# Find content for beginners (has "tutorial" or "basics" tag)
beginner_content = find_content(set(), {"tutorial", "basics"})
print(f"\nBeginner content: {len(beginner_content)} items")

# Get all available tags
print(f"\nAll tags: {get_tags()}")

Example 3: Friend Recommendation Network

def create_social_network():
    """Create a simple social network for friend recommendations."""
    # User friendships (adjacency sets)
    friendships = {
        "alice": {"bob", "charlie", "diana"},
        "bob": {"alice", "charlie", "eve"},
        "charlie": {"alice", "bob", "diana"},
        "diana": {"alice", "charlie", "frank"},
        "eve": {"bob", "frank"},
        "frank": {"diana", "eve"}
    }

    def get_friends(user):
        """Get direct friends of a user."""
        return friendships.get(user, set()).copy()

    def suggest_friends(user):
        """Suggest friends based on mutual connections."""
        if user not in friendships:
            return set()

        friends = get_friends(user)
        suggestions = set()

        # Find friends of friends
        for friend in friends:
            friend_friends = get_friends(friend)
            # Add friends of friends (excluding self and direct friends)
            suggestions.update(friend_friends - friends - {user})

        # Return the suggestions as an unordered set (no ranking applied)
        return suggestions

    def get_mutual_friends(user1, user2):
        """Find mutual friends between two users."""
        friends1 = get_friends(user1)
        friends2 = get_friends(user2)
        return friends1 & friends2

    def add_friendship(user1, user2):
        """Add friendship between two users."""
        if user1 not in friendships:
            friendships[user1] = set()
        if user2 not in friendships:
            friendships[user2] = set()

        friendships[user1].add(user2)
        friendships[user2].add(user1)
        print(f"Added friendship: {user1} ↔ {user2}")

    return get_friends, suggest_friends, get_mutual_friends, add_friendship

# Usage
get_friends, suggest_friends, get_mutual, add_friendship = create_social_network()

# Check friendships
print(f"Alice's friends: {get_friends('alice')}")
print(f"Bob's friends: {get_friends('bob')}")

# Find mutual friends
mutual = get_mutual("alice", "bob")
print(f"Alice and Bob's mutual friends: {mutual}")

# Get friend suggestions
suggestions = suggest_friends("alice")
print(f"Friend suggestions for Alice: {suggestions}")

# Add new friendship
add_friendship("alice", "eve")
print(f"Alice's friends after adding Eve: {get_friends('alice')}")
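
suggest_friends() returns an unordered set. If a ranking by number of mutual friends is wanted, one possible approach is to count how often each candidate appears among friends of friends. The sketch below is self-contained and repeats the friendship data; `rank_suggestions` is a hypothetical helper, not part of the code above.

```python
from collections import Counter

friendships = {
    "alice": {"bob", "charlie", "diana"},
    "bob": {"alice", "charlie", "eve"},
    "charlie": {"alice", "bob", "diana"},
    "diana": {"alice", "charlie", "frank"},
    "eve": {"bob", "frank"},
    "frank": {"diana", "eve"},
}

def rank_suggestions(user):
    """Return (candidate, mutual_friend_count) pairs, best match first."""
    friends = friendships.get(user, set())
    mutual_counts = Counter()
    for friend in friends:
        for candidate in friendships.get(friend, set()):
            # Skip the user and people they already know
            if candidate != user and candidate not in friends:
                mutual_counts[candidate] += 1
    # Most mutual friends first, then alphabetical for stable output
    return sorted(mutual_counts.items(), key=lambda pair: (-pair[1], pair[0]))

print(rank_suggestions("bob"))  # [('diana', 2), ('frank', 1)]
```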

Dictionary vs Set Performance

When to Use Dictionaries

# ✅ Fast lookups by key
user_data = {"alice": {"age": 25, "city": "NYC"}}
print(user_data["alice"])  # O(1) on average - lookup time does not grow with size

# ✅ Key-value relationships
settings = {"theme": "dark", "language": "en", "notifications": True}

# ✅ Counting frequencies
text = "the cat sat on the mat"
word_counts = {}
for word in text.split():
    word_counts[word] = word_counts.get(word, 0) + 1

When to Use Sets

# ✅ Unique items only
unique_visitors = set()
unique_visitors.add("192.168.1.1")  # Added
unique_visitors.add("192.168.1.1")  # Ignored (duplicate)

# ✅ Fast membership testing
allowed_users = {"alice", "bob", "charlie"}
if "alice" in allowed_users:  # O(1) on average - very fast
    print("Access granted")

# ✅ Set operations
python_devs = {"alice", "bob", "charlie"}
web_devs = {"bob", "diana", "eve"}
fullstack_devs = python_devs & web_devs  # {"bob"}

Common Patterns

Pattern 1: Default Dictionary

from collections import defaultdict

# Regular dict - KeyError if key missing
# word_count = {}
# word_count["hello"] += 1  # KeyError!

# Default dict - provides default value
word_count = defaultdict(int)  # Default value is 0
word_count["hello"] += 1  # Works! "hello": 1

# Other defaults
name_list = defaultdict(list)    # Default empty list
name_list["smith"].append("John")
name_list["smith"].append("Jane")
print(name_list["smith"])  # ["John", "Jane"]

Pattern 2: Dictionary of Lists/Sets

from collections import defaultdict

# Group items by category
items_by_category = defaultdict(list)
items = [
    {"name": "laptop", "category": "electronics"},
    {"name": "book", "category": "education"},
    {"name": "mouse", "category": "electronics"}
]

for item in items:
    items_by_category[item["category"]].append(item["name"])

print(dict(items_by_category))
# {"electronics": ["laptop", "mouse"], "education": ["book"]}
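
The same grouping pattern works with sets when each group should hold unique values only. A minimal sketch, assuming made-up `posts` data of (author, tag) pairs:

```python
from collections import defaultdict

# Group tags by author; a set drops repeated tags automatically
tags_by_author = defaultdict(set)
posts = [
    ("alice", "python"),
    ("bob", "web"),
    ("alice", "data"),
    ("alice", "python"),  # duplicate tag for alice
]

for author, tag in posts:
    tags_by_author[author].add(tag)

print(sorted(tags_by_author["alice"]))  # ['data', 'python']
print(len(tags_by_author["bob"]))       # 1
```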

Pattern 3: Set for Deduplication

# Remove duplicates from list (sets are unordered, so sort for a predictable result)
original = [1, 2, 2, 3, 3, 3, 4, 5, 5]
unique = sorted(set(original))
print(unique)  # [1, 2, 3, 4, 5]

# Find unique words in text
text = "the cat sat on the mat with the rat"
unique_words = set(text.split())
print(unique_words)  # e.g. {'the', 'cat', 'sat', 'on', 'mat', 'with', 'rat'} (order may vary)
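
Converting to a set discards the original order. When first-seen order matters, one common alternative is dict.fromkeys(), which relies on dictionaries preserving insertion order (guaranteed since Python 3.7). The variable names below are illustrative.

```python
original = [3, 1, 3, 2, 1, 5, 2]

# dict keys are unique and keep insertion order, so duplicates drop out in place
unique_in_order = list(dict.fromkeys(original))
print(unique_in_order)  # [3, 1, 2, 5]

words = "the cat sat on the mat with the rat".split()
unique_words_in_order = list(dict.fromkeys(words))
print(unique_words_in_order)  # ['the', 'cat', 'sat', 'on', 'mat', 'with', 'rat']
```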

Practice Exercises

Exercise 1: Dictionary-Based Inventory

Create an inventory system:

Add products (name, price, quantity)
Update stock levels
Calculate total inventory value
Find products by price range
Generate low stock alerts

Exercise 2: Set-Based Tag System

Build a blog tag system:

Add articles with tags
Find articles by tag
Find articles with multiple tags
Get popular tags
Suggest related articles based on shared tags

Exercise 3: Student Performance Tracker

Create a student tracking system:

Store student grades by subject
Calculate GPA
Find top performers
Identify struggling students
Generate progress reports

Exercise 4: URL Shortener

Build a URL shortener:

Shorten URLs (generate short codes)
Expand short URLs
Track click counts
Prevent duplicate URLs
Generate usage statistics

Exercise 5: Movie Recommendation System

Create a movie recommendation system:

Store user ratings
Find similar users
Recommend movies based on similar users
Calculate movie popularity
Generate personalized recommendations

Summary

Dictionaries and sets are powerful data structures:

Dictionaries (key-value pairs):

Create: {} or dict()
Access: dict[key] or dict.get(key, default)
Modify: dict[key] = value, dict.update()
Use for: fast lookups, structured data, counting

Sets (unique collections):

Create: set() or {1, 2, 3} (note that {} creates an empty dict)
Operations: add(), remove(), discard()
Math: |, &, -, ^ (union, intersection, difference, symmetric diff)
Use for: unique items, membership testing, set operations

Performance:

Dict/set lookup, insert, delete: O(1) on average - very fast
List membership test and search: O(n) - slower for large data
Choose based on access patterns
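
To see the difference on your own machine, a rough timing sketch with the standard timeit module is shown below. The collection size and repeat count are arbitrary choices, and the exact numbers will vary by machine, so no expected output is given.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # worst case for the list: the last element

# Time 100 membership checks against each container
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print(f"list membership: {list_time:.5f}s")
print(f"set membership:  {set_time:.5f}s")
```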

Next: Advanced Data Structures - stacks, queues, and more! 📚
