Dictionaries and Sets: Key-Value and Unique Data
Welcome to the powerhouses of Python data structures! Dictionaries are like phone books - find information instantly by name. Sets are like guest lists - unique items only, no duplicates allowed.
Dictionaries: Key-Value Powerhouses
Creating Dictionaries
# Empty dictionary
empty_dict = {}
# Dictionary with initial values
person = {
    "name": "Alice",
    "age": 25,
    "city": "New York"
}
# Using dict() constructor
from_pairs = dict([("a", 1), ("b", 2), ("c", 3)])
from_kwargs = dict(name="Bob", age=30, city="London")
# Dictionary comprehension
squares = {x: x**2 for x in range(5)} # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
Accessing Dictionary Values
person = {"name": "Alice", "age": 25, "city": "New York"}
# Access by key
print(person["name"]) # "Alice"
print(person["age"]) # 25
# Safe access with get() - no KeyError if key missing
print(person.get("salary")) # None
print(person.get("salary", 0)) # 0 (default value)
# Check if key exists
if "name" in person:
    print("Name found!")
# Get all keys, values, or items
print(person.keys()) # dict_keys(['name', 'age', 'city'])
print(person.values()) # dict_values(['Alice', 25, 'New York'])
print(person.items()) # dict_items([('name', 'Alice'), ('age', 25), ('city', 'New York')])
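The difference between bracket access and get() is easiest to see with a missing key. The following is a minimal sketch that reuses the person dictionary from above; the "salary" and "bonus" keys are deliberately absent and are chosen only for illustration.

```python
person = {"name": "Alice", "age": 25, "city": "New York"}

# Bracket access raises KeyError when the key is missing
try:
    salary = person["salary"]
except KeyError as exc:
    salary = None
    print(f"Missing key: {exc}")  # Missing key: 'salary'

# get() returns None (or the supplied default) instead of raising
bonus = person.get("bonus", 0)
print(salary, bonus)  # None 0

# get() never adds the key to the dictionary
print("bonus" in person)  # False
```

Bracket access is a reasonable choice when a missing key indicates a bug; get() fits cases where absence is expected.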
Modifying Dictionaries
person = {"name": "Alice", "age": 25}
# Add new key-value pair
person["job"] = "Engineer"
print(person) # {'name': 'Alice', 'age': 25, 'job': 'Engineer'}
# Update existing value
person["age"] = 26
print(person) # {'name': 'Alice', 'age': 26, 'job': 'Engineer'}
# Update multiple values
person.update({"city": "Boston", "salary": 75000})
print(person) # {'name': 'Alice', 'age': 26, 'job': 'Engineer', 'city': 'Boston', 'salary': 75000}
# Remove by key
removed_job = person.pop("job")
print(f"Removed job: {removed_job}") # "Engineer"
print(person)
# Remove the most recently inserted item (LIFO order since Python 3.7)
last_item = person.popitem()
print(f"Removed last: {last_item}") # ('salary', 75000)
# Clear all items
person.clear()
print(person) # {}
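Removal can fail when a key is missing, so it may help to see the error cases side by side. This sketch assumes a small person dictionary like the one above and shows pop() with a default, del, and popitem() on an empty dictionary.

```python
person = {"name": "Alice", "age": 26, "city": "Boston"}

# pop() with a default never raises, even when the key is missing
job = person.pop("job", "unknown")
print(job)  # unknown

# del removes a key without returning its value;
# like pop() without a default, it raises KeyError if the key is missing
del person["city"]
print(person)  # {'name': 'Alice', 'age': 26}

# popitem() on an empty dictionary raises KeyError
person.clear()
try:
    person.popitem()
except KeyError:
    print("Nothing left to remove")
```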
Dictionary Methods
# setdefault() - get value or set default
person = {"name": "Alice"}
age = person.setdefault("age", 25) # Sets default if key doesn't exist
print(age) # 25
print(person) # {'name': 'Alice', 'age': 25}
# Copy dictionary
original = {"a": 1, "b": 2}
copy_dict = original.copy()
copy_dict["c"] = 3
print(original) # {'a': 1, 'b': 2} (unchanged)
print(copy_dict) # {'a': 1, 'b': 2, 'c': 3}
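Note that copy() makes a shallow copy: the new dictionary is independent, but any mutable values inside it are still shared with the original. A short sketch of the difference, using the standard library's copy.deepcopy() for comparison (the "grades" list is an illustrative value):

```python
import copy

original = {"name": "Alice", "grades": [85, 92]}

shallow = original.copy()        # new dict, same inner list object
deep = copy.deepcopy(original)   # new dict, new inner list

shallow["grades"].append(100)    # also visible through original
deep["grades"].append(0)         # affects only the deep copy

print(original["grades"])  # [85, 92, 100]
print(shallow["grades"])   # [85, 92, 100]
print(deep["grades"])      # [85, 92, 0]
```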
Sets: Unique Collections
Creating Sets
# Empty set ({} creates empty dict!)
empty_set = set()
# Set with initial values
numbers = {1, 2, 3, 4, 5}
fruits = {"apple", "banana", "orange"}
# From list (removes duplicates)
duplicates = [1, 2, 2, 3, 3, 3, 4]
unique_numbers = set(duplicates)
print(unique_numbers) # {1, 2, 3, 4}
# Set comprehension
even_squares = {x**2 for x in range(10) if x**2 % 2 == 0}
print(even_squares) # {0, 4, 16, 36, 64} (sets are unordered, so display order may vary)
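One rule applies to everything above: set elements (and dictionary keys) must be hashable, which in practice means immutable values such as numbers, strings, and tuples. A minimal sketch of what works and what does not; the variable names are illustrative.

```python
# Tuples are hashable, so they can be set elements
points = {(0, 0), (1, 2)}
points.add((1, 2))          # duplicate, ignored
print(len(points))          # 2

# Lists are mutable and therefore unhashable
try:
    bad = {[1, 2], [3, 4]}
except TypeError as exc:
    print(f"TypeError: {exc}")

# The same rule applies to dictionary keys
grid = {(0, 0): "origin"}
print(grid[(0, 0)])         # origin
```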
Set Operations
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}
# Add elements
set_a.add(6)
print(set_a) # {1, 2, 3, 4, 5, 6}
# Remove elements
set_a.remove(6) # Raises KeyError if not found
set_a.discard(10) # No error if not found
# Set operations
print("Union:", set_a | set_b) # {1, 2, 3, 4, 5, 6, 7, 8}
print("Intersection:", set_a & set_b) # {4, 5}
print("Difference (A-B):", set_a - set_b) # {1, 2, 3}
print("Difference (B-A):", set_b - set_a) # {6, 7, 8}
print("Symmetric diff:", set_a ^ set_b) # {1, 2, 3, 6, 7, 8}
# Update operations (modify in place)
set_a.update(set_b) # Union
print("After update:", set_a)
set_a.intersection_update(set_b) # Keep only intersection
set_a.difference_update(set_b) # Remove elements in set_b
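The in-place methods above also have operator forms. The sketch below works on a copy so the starting sets stay unchanged, and it shows one practical difference: the method forms accept any iterable, while the operators require sets on both sides.

```python
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}

result = set_a.copy()
result |= set_b    # same as result.update(set_b)
print(result == {1, 2, 3, 4, 5, 6, 7, 8})  # True

result &= set_b    # same as result.intersection_update(set_b)
print(result == {4, 5, 6, 7, 8})           # True

result -= {4, 5}   # same as result.difference_update({4, 5})
print(result == {6, 7, 8})                 # True

# Method forms accept any iterable, such as a list
print(set_a.union([9, 10]) == {1, 2, 3, 4, 5, 9, 10})  # True

# Operators raise TypeError when the other operand is not a set
try:
    set_a | [9, 10]
except TypeError:
    print("Operators need a set on both sides")
```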
Set Methods
numbers = {1, 2, 3, 4, 5}
# Check membership
print(3 in numbers) # True
print(6 in numbers) # False
# Check subsets/supersets
subset = {1, 2}
print(subset.issubset(numbers)) # True
print(numbers.issuperset(subset)) # True
# Check disjoint (no common elements)
other = {6, 7, 8}
print(numbers.isdisjoint(other)) # True
# Copy set
numbers_copy = numbers.copy()
numbers_copy.add(6)
print(numbers) # {1, 2, 3, 4, 5}
print(numbers_copy) # {1, 2, 3, 4, 5, 6}
Real-World Dictionary Examples
Example 1: Student Grade Book
def create_student(name, grades):
    """Create a student record."""
    return {
        "name": name,
        "grades": grades,
        "average": sum(grades) / len(grades) if grades else 0
    }

def add_grade(student, grade):
    """Add a grade to student record."""
    student["grades"].append(grade)
    student["average"] = sum(student["grades"]) / len(student["grades"])

def get_letter_grade(average):
    """Convert numeric average to letter grade."""
    if average >= 90: return "A"
    elif average >= 80: return "B"
    elif average >= 70: return "C"
    elif average >= 60: return "D"
    else: return "F"

# Create student records
students = {}
students["alice"] = create_student("Alice", [85, 92, 88])
students["bob"] = create_student("Bob", [78, 85, 90])
# Add more grades
add_grade(students["alice"], 95)
add_grade(students["bob"], 82)
# Display results
for student_id, student in students.items():
    letter = get_letter_grade(student["average"])
    print(f"{student['name']}: Average={student['average']:.1f}, Grade={letter}")
Example 2: Word Frequency Counter
def count_words(text):
    """Count word frequencies in text."""
    words = text.lower().split()
    word_count = {}
    for word in words:
        # Remove punctuation
        word = word.strip(".,!?\"'")
        if word: # Skip empty strings
            word_count[word] = word_count.get(word, 0) + 1
    return word_count

def get_most_common_words(word_count, n=5):
    """Get the n most common words."""
    # Sort by frequency (descending), then by word (ascending)
    sorted_words = sorted(word_count.items(),
                          key=lambda x: (-x[1], x[0]))
    return sorted_words[:n]

# Analyze text
text = """
Python is a powerful programming language. Python is easy to learn and
Python has a simple syntax. Many developers love Python because Python
is versatile and Python can be used for web development, data science,
and automation.
"""
word_frequencies = count_words(text)
most_common = get_most_common_words(word_frequencies, 3)
print("Word frequencies:")
for word, count in most_common:
    print(f" {word}: {count}")
print(f"\nTotal unique words: {len(word_frequencies)}")
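The standard library's collections.Counter covers the same counting pattern with less code. Below is a sketch of an equivalent counter; the function name count_words_with_counter and the sample sentence are illustrative, not part of the example above.

```python
from collections import Counter

def count_words_with_counter(text):
    """Count word frequencies using collections.Counter."""
    # Lowercase, split on whitespace, and strip surrounding punctuation
    words = (word.strip(".,!?\"'") for word in text.lower().split())
    return Counter(word for word in words if word)

sample = "Python is easy. Python is fun, and Python is popular!"
counts = count_words_with_counter(sample)

print(counts["python"])        # 3
print(counts["missing"])       # 0 (Counter returns 0 for absent keys)
print(counts.most_common(2))   # [('python', 3), ('is', 3)]
```

One difference worth knowing: most_common() orders ties by first occurrence, whereas get_most_common_words() breaks ties alphabetically.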
Example 3: Phone Book Application
def add_contact(phone_book, name, number, email=None):
    """Add a contact to phone book."""
    phone_book[name.lower()] = {
        "name": name,
        "number": number,
        "email": email
    }
    print(f"Added contact: {name}")

def find_contact(phone_book, name):
    """Find a contact by name."""
    return phone_book.get(name.lower())

def search_contacts(phone_book, query):
    """Search contacts by name or number."""
    results = {}
    query = query.lower()
    for key, contact in phone_book.items():
        if (query in contact["name"].lower() or
                query in contact["number"]):
            results[key] = contact
    return results

def delete_contact(phone_book, name):
    """Delete a contact."""
    key = name.lower()
    if key in phone_book:
        deleted = phone_book.pop(key)
        print(f"Deleted contact: {deleted['name']}")
        return True
    else:
        print(f"Contact not found: {name}")
        return False

# Usage
phone_book = {}
add_contact(phone_book, "Alice Johnson", "555-0123", "alice@email.com")
add_contact(phone_book, "Bob Smith", "555-0456")
add_contact(phone_book, "Charlie Brown", "555-0789", "charlie@email.com")
# Find specific contact
alice = find_contact(phone_book, "alice johnson")
if alice:
    print(f"Found: {alice['name']} - {alice['number']}")
# Search contacts
results = search_contacts(phone_book, "555")
print(f"Contacts with '555': {len(results)}")
# Delete contact
delete_contact(phone_book, "bob smith")
Real-World Set Examples
Example 1: Unique Visitor Tracker
def track_visitors():
    """Track unique website visitors."""
    visitors = set()

    def add_visitor(ip_address):
        if ip_address in visitors:
            print(f"Returning visitor: {ip_address}")
        else:
            visitors.add(ip_address)
            print(f"New visitor: {ip_address}")

    def get_visitor_count():
        return len(visitors)

    def get_unique_visitors():
        return visitors.copy()

    # Return interface functions
    return add_visitor, get_visitor_count, get_unique_visitors

# Usage
add_visitor, get_count, get_visitors = track_visitors()
# Simulate visitors
visits = ["192.168.1.1", "192.168.1.2", "192.168.1.1", "192.168.1.3", "192.168.1.2"]
for ip in visits:
    add_visitor(ip)
print(f"Total unique visitors: {get_count()}")
print(f"Visitor IPs: {get_visitors()}")
Example 2: Tag-Based Content Filter
def create_content_filter():
    """Create a content filtering system based on tags."""
    content_items = [
        {"id": 1, "title": "Python Tutorial", "tags": {"python", "tutorial", "programming"}},
        {"id": 2, "title": "Web Development", "tags": {"web", "javascript", "html"}},
        {"id": 3, "title": "Data Science", "tags": {"python", "data", "machine-learning"}},
        {"id": 4, "title": "Mobile Apps", "tags": {"mobile", "ios", "android"}},
        {"id": 5, "title": "Game Development", "tags": {"gaming", "unity", "c#"}}
    ]

    def find_by_tags(required_tags, any_tags=None):
        """Find content that matches tag criteria."""
        if any_tags is None:
            any_tags = set()
        matches = []
        for item in content_items:
            item_tags = item["tags"]
            # Must have all required tags
            has_required = required_tags.issubset(item_tags)
            # Must have at least one of the any_tags (if specified)
            has_any = not any_tags or not any_tags.isdisjoint(item_tags)
            if has_required and has_any:
                matches.append(item)
        return matches

    def get_all_tags():
        """Get all unique tags."""
        all_tags = set()
        for item in content_items:
            all_tags.update(item["tags"])
        return all_tags

    return find_by_tags, get_all_tags

# Usage
find_content, get_tags = create_content_filter()
# Find Python tutorials
python_tutorials = find_content({"python"}, {"tutorial"})
print("Python tutorials:")
for item in python_tutorials:
    print(f" {item['title']}")
# Find content for beginners (has "tutorial" or "basics" tag)
beginner_content = find_content(set(), {"tutorial", "basics"})
print(f"\nBeginner content: {len(beginner_content)} items")
# Get all available tags
print(f"\nAll tags: {get_tags()}")
Example 3: Social Network Friend Recommendations
def create_social_network():
    """Create a simple social network for friend recommendations."""
    # User friendships (adjacency sets)
    friendships = {
        "alice": {"bob", "charlie", "diana"},
        "bob": {"alice", "charlie", "eve"},
        "charlie": {"alice", "bob", "diana"},
        "diana": {"alice", "charlie", "frank"},
        "eve": {"bob", "frank"},
        "frank": {"diana", "eve"}
    }

    def get_friends(user):
        """Get direct friends of a user."""
        return friendships.get(user, set()).copy()

    def suggest_friends(user):
        """Suggest friends based on mutual connections."""
        if user not in friendships:
            return set()
        friends = get_friends(user)
        suggestions = set()
        # Find friends of friends
        for friend in friends:
            friend_friends = get_friends(friend)
            # Add friends of friends (excluding self and direct friends)
            suggestions.update(friend_friends - friends - {user})
        # Suggestions are returned as an unordered set (no ranking is applied)
        return suggestions

    def get_mutual_friends(user1, user2):
        """Find mutual friends between two users."""
        friends1 = get_friends(user1)
        friends2 = get_friends(user2)
        return friends1 & friends2

    def add_friendship(user1, user2):
        """Add friendship between two users."""
        if user1 not in friendships:
            friendships[user1] = set()
        if user2 not in friendships:
            friendships[user2] = set()
        friendships[user1].add(user2)
        friendships[user2].add(user1)
        print(f"Added friendship: {user1} ↔ {user2}")

    return get_friends, suggest_friends, get_mutual_friends, add_friendship

# Usage
get_friends, suggest_friends, get_mutual, add_friendship = create_social_network()
# Check friendships
print(f"Alice's friends: {get_friends('alice')}")
print(f"Bob's friends: {get_friends('bob')}")
# Find mutual friends
mutual = get_mutual("alice", "bob")
print(f"Alice and Bob's mutual friends: {mutual}")
# Get friend suggestions
suggestions = suggest_friends("alice")
print(f"Friend suggestions for Alice: {suggestions}")
# Add new friendship
add_friendship("alice", "eve")
print(f"Alice's friends after adding Eve: {get_friends('alice')}")
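Suggestions could also be ranked by how many mutual friends each candidate shares with the user. The following is one possible sketch using collections.Counter; the function name rank_suggestions is hypothetical, and the friendship graph is repeated here only so the code runs on its own.

```python
from collections import Counter

# Same sample graph as the example above, repeated for self-containment
friendships = {
    "alice": {"bob", "charlie", "diana"},
    "bob": {"alice", "charlie", "eve"},
    "charlie": {"alice", "bob", "diana"},
    "diana": {"alice", "charlie", "frank"},
    "eve": {"bob", "frank"},
    "frank": {"diana", "eve"},
}

def rank_suggestions(user):
    """Return (candidate, mutual_count) pairs, most mutual friends first."""
    friends = friendships.get(user, set())
    mutual_counts = Counter()
    for friend in friends:
        for candidate in friendships.get(friend, set()):
            # Skip the user and anyone who is already a direct friend
            if candidate != user and candidate not in friends:
                mutual_counts[candidate] += 1
    # Sort by count (descending), then by name (ascending) for stable output
    return sorted(mutual_counts.items(), key=lambda pair: (-pair[1], pair[0]))

print(rank_suggestions("bob"))    # [('diana', 2), ('frank', 1)]
print(rank_suggestions("alice"))  # [('eve', 1), ('frank', 1)]
```

Diana ranks first for Bob because two of his friends (Alice and Charlie) know her, while Frank is reachable only through Eve.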
Dictionary vs Set Performance
When to Use Dictionaries
# ✅ Fast lookups by key
user_data = {"alice": {"age": 25, "city": "NYC"}}
print(user_data["alice"]) # O(1) on average - lookup time does not grow with dictionary size
# ✅ Key-value relationships
settings = {"theme": "dark", "language": "en", "notifications": True}
# ✅ Counting frequencies
text = "the cat sat on the mat" # Sample input
word_counts = {}
for word in text.split():
    word_counts[word] = word_counts.get(word, 0) + 1
When to Use Sets
# ✅ Unique items only
unique_visitors = set()
unique_visitors.add("192.168.1.1") # Added
unique_visitors.add("192.168.1.1") # Ignored (duplicate)
# ✅ Fast membership testing
allowed_users = {"alice", "bob", "charlie"}
if "alice" in allowed_users: # O(1) on average - very fast
    print("Access granted")
# ✅ Set operations
python_devs = {"alice", "bob", "charlie"}
web_devs = {"bob", "diana", "eve"}
fullstack_devs = python_devs & web_devs # {"bob"}
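The performance claim can be checked with the standard library's timeit module. This is a rough sketch rather than a rigorous benchmark: the collection size and repeat count are arbitrary choices, and absolute timings depend on the machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # Worst case for the list: the last element

# Time 100 membership tests against each container
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

The list has to scan its elements one by one, while the set hashes the value and jumps to its slot, so the gap typically widens as n grows.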
Common Patterns
Pattern 1: Default Dictionary
from collections import defaultdict
# Regular dict - KeyError if key missing
# word_count = {}
# word_count["hello"] += 1 # KeyError!
# Default dict - provides default value
word_count = defaultdict(int) # Default value is 0
word_count["hello"] += 1 # Works! "hello": 1
# Other defaults
name_list = defaultdict(list) # Default empty list
name_list["smith"].append("John")
name_list["smith"].append("Jane")
print(name_list["smith"]) # ["John", "Jane"]
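One side effect to be aware of: reading a missing key from a defaultdict with square brackets inserts that key. A short sketch of the behavior, assuming the same word_count pattern as above:

```python
from collections import defaultdict

word_count = defaultdict(int)
word_count["hello"] += 1

# Reading a missing key with [] creates it with the default value
print(word_count["goodbye"])      # 0
print(len(word_count))            # 2

# "in" and get() check without inserting anything
print("python" in word_count)     # False
print(word_count.get("python"))   # None
print(len(word_count))            # 2
```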
Pattern 2: Dictionary of Lists/Sets
from collections import defaultdict
# Group items by category
items_by_category = defaultdict(list)
items = [
    {"name": "laptop", "category": "electronics"},
    {"name": "book", "category": "education"},
    {"name": "mouse", "category": "electronics"}
]
for item in items:
    items_by_category[item["category"]].append(item["name"])
print(dict(items_by_category))
# {'electronics': ['laptop', 'mouse'], 'education': ['book']}
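The same grouping pattern works with sets when each group should hold unique values. A minimal sketch with defaultdict(set); the (author, tag) pairs are made-up sample data.

```python
from collections import defaultdict

# Hypothetical (author, tag) pairs, including one repeat
pairs = [
    ("alice", "python"),
    ("bob", "web"),
    ("alice", "data"),
    ("alice", "python"),  # Duplicate, absorbed by the set
]

tags_by_author = defaultdict(set)
for author, tag in pairs:
    tags_by_author[author].add(tag)

print(tags_by_author["alice"] == {"python", "data"})  # True
print(len(tags_by_author["alice"]))                   # 2
print(tags_by_author["bob"])                          # {'web'}
```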
Pattern 3: Set for Deduplication
# Remove duplicates from list (a set does not guarantee the original order)
original = [1, 2, 2, 3, 3, 3, 4, 5, 5]
unique = list(set(original))
print(sorted(unique)) # [1, 2, 3, 4, 5]
# Find unique words in text
text = "the cat sat on the mat with the rat"
unique_words = set(text.split())
print(unique_words) # {'the', 'cat', 'sat', 'on', 'mat', 'with', 'rat'} (order may vary)
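When the original order matters, a set alone is not enough because sets are unordered. Two common order-preserving approaches are sketched below: dict.fromkeys(), which relies on dictionaries keeping insertion order (Python 3.7+), and an explicit loop with a "seen" set. The sample list is illustrative.

```python
original = [3, 1, 3, 2, 1, 5]

# Dictionary keys are unique and keep insertion order
unique_in_order = list(dict.fromkeys(original))
print(unique_in_order)  # [3, 1, 2, 5]

# Equivalent loop using a set to remember what has been seen
seen = set()
result = []
for item in original:
    if item not in seen:
        seen.add(item)
        result.append(item)
print(result)  # [3, 1, 2, 5]
```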
Practice Exercises
Exercise 1: Dictionary-Based Inventory
Create an inventory system:
- Add products (name, price, quantity)
- Update stock levels
- Calculate total inventory value
- Find products by price range
- Generate low stock alerts
Exercise 2: Set-Based Tag System
Build a blog tag system:
- Add articles with tags
- Find articles by tag
- Find articles with multiple tags
- Get popular tags
- Suggest related articles based on shared tags
Exercise 3: Student Performance Tracker
Create a student tracking system:
- Store student grades by subject
- Calculate GPA
- Find top performers
- Identify struggling students
- Generate progress reports
Exercise 4: URL Shortener
Build a URL shortener:
- Shorten URLs (generate short codes)
- Expand short URLs
- Track click counts
- Prevent duplicate URLs
- Generate usage statistics
Exercise 5: Movie Recommendation System
Create a movie recommendation system:
- Store user ratings
- Find similar users
- Recommend movies based on similar users
- Calculate movie popularity
- Generate personalized recommendations
Summary
Dictionaries and sets are powerful data structures:
Dictionaries (key-value pairs):
- Create: {} or dict()
- Access: dict[key] or dict.get(key, default)
- Modify: dict[key] = value, dict.update()
- Use for: fast lookups, structured data, counting
Sets (unique collections):
- Create: set() or {1, 2, 3} (a bare {} creates an empty dict)
- Operations: add(), remove(), discard()
- Math: |, &, -, ^ (union, intersection, difference, symmetric difference)
- Use for: unique items, membership testing, set operations
Performance:
- Dict/set lookup, insert, and delete: O(1) on average - very fast
- List membership tests and searches: O(n) - slower for large data
- Choose based on access patterns
Next: Advanced Data Structures - stacks, queues, and more! 📚