Magic Filing Cabinets: Hash Tables!
Welcome! You’ve learned searching and dynamic programming. Now Hash Tables - the magic that makes computers find things INSTANTLY!
Why Hash Tables? (The Filing Cabinet Problem)
The Library Card Catalog Problem
Imagine a library with 1 million books. How do you find “Harry Potter” instantly?
Without hash tables: Check every book (slow!) With hash tables: Magic formula gives you exact shelf location!
Hash tables make computers super fast:
- Phone contacts: Find “Mom” instantly
- Web browsers: Remember passwords securely
- Databases: Find customer records in milliseconds
- Compilers: Remember variable names
The Magic Formula: Hash Functions
What is a Hash Function?
A hash function is like a secret code machine that turns any word into a number:
Examples:
- “apple” → 5
- “banana” → 12
- “cherry” → 7
Good hash functions:
- Same word always gives same number
- Different words usually get different numbers
- Numbers spread evenly (not all in drawer 1!)
Simple Hash Function (For Learning)
// Turn word into drawer number
func SimpleHash(word string) int {
total := 0
for _, letter := range word {
total += int(letter) // Add up letter codes
}
return total % 10 // Pick drawer 0-9
}
// Test it:
SimpleHash("cat") // 'c'=99, 'a'=97, 't'=116 → 312 % 10 = 2
SimpleHash("dog") // 'd'=100, 'o'=111, 'g'=103 → 314 % 10 = 4
SimpleHash("act") // Same letters! → 312 % 10 = 2 (same drawer!)
Problem: “cat” and “act” go to same drawer! (Anagrams collide)
Better Hash Function (Polynomial)
// Smarter hash: each position matters more
func BetterHash(word string) int {
hash := 0
for i, letter := range word {
// Later letters count more (31^position)
hash += int(letter) * (31 ^ i) // 31 is a magic prime number
}
return hash % 100 // Bigger cabinet: 100 drawers
}
Now “cat” and “act” get different drawers! 🎉
Building Your First Hash Table
Method 1: Extra Shelves (Separate Chaining)
Problem: Two items want drawer 5! Solution: Make drawer 5 hold multiple items (like extra shelves)
// Each drawer can hold a chain of items
type DrawerItem struct {
Key string
Value interface{}
Next *DrawerItem // Link to next item in chain
}
type HashTable struct {
Drawers []*DrawerItem // Array of drawer chains
Size int // Number of drawers
}
func NewHashTable(numDrawers int) *HashTable {
return &HashTable{
Drawers: make([]*DrawerItem, numDrawers),
Size: numDrawers,
}
}
// Magic formula: turn key into drawer number
func (ht *HashTable) GetDrawerNumber(key string) int {
hash := 0
for _, letter := range key {
hash = (hash*31 + int(letter)) % ht.Size
}
return hash
}
// Store item in hash table
func (ht *HashTable) Store(key string, value interface{}) {
drawerNum := ht.GetDrawerNumber(key)
// Check if key already exists in this drawer
current := ht.Drawers[drawerNum]
for current != nil {
if current.Key == key {
current.Value = value // Update existing
return
}
current = current.Next
}
// Add new item at start of chain
newItem := &DrawerItem{
Key: key,
Value: value,
Next: ht.Drawers[drawerNum], // Link to previous first item
}
ht.Drawers[drawerNum] = newItem
}
// Find item in hash table
func (ht *HashTable) Find(key string) (interface{}, bool) {
drawerNum := ht.GetDrawerNumber(key)
// Search through chain in this drawer
current := ht.Drawers[drawerNum]
for current != nil {
if current.Key == key {
return current.Value, true
}
current = current.Next
}
return nil, false // Not found
}
// Remove item from hash table
func (ht *HashTable) Remove(key string) bool {
drawerNum := ht.GetDrawerNumber(key)
var previous *DrawerItem
current := ht.Drawers[drawerNum]
for current != nil {
if current.Key == key {
if previous == nil {
// Remove first item in chain
ht.Drawers[drawerNum] = current.Next
} else {
// Remove item from middle/end of chain
previous.Next = current.Next
}
return true
}
previous = current
current = current.Next
}
return false // Not found
}
Like: Filing cabinet where drawer 5 has a stack of papers!
Method 2: Next Available Drawer (Linear Probing)
Problem: Drawer 5 is full! Solution: Put it in drawer 6, or 7, or next available…
type SimpleHashTable struct {
Keys []string // Item names
Values []interface{} // Item values
Size int // Total drawers
Count int // Items stored
}
func NewSimpleHashTable(numDrawers int) *SimpleHashTable {
return &SimpleHashTable{
Keys: make([]string, numDrawers),
Values: make([]interface{}, numDrawers),
Size: numDrawers,
Count: 0,
}
}
func (ht *SimpleHashTable) Store(key string, value interface{}) {
// Expand cabinet if getting full
if ht.Count >= ht.Size/2 {
ht.ExpandCabinet(ht.Size * 2)
}
drawerNum := ht.GetDrawerNumber(key)
originalDrawer := drawerNum
// Find empty drawer (might need to skip some)
for ht.Keys[drawerNum] != "" && ht.Keys[drawerNum] != key {
drawerNum = (drawerNum + 1) % ht.Size // Try next drawer
if drawerNum == originalDrawer {
// Went all the way around! Expand and retry
ht.ExpandCabinet(ht.Size * 2)
ht.Store(key, value)
return
}
}
// Found spot!
if ht.Keys[drawerNum] == "" {
ht.Count++ // New item
}
ht.Keys[drawerNum] = key
ht.Values[drawerNum] = value
}
func (ht *SimpleHashTable) Find(key string) (interface{}, bool) {
drawerNum := ht.GetDrawerNumber(key)
originalDrawer := drawerNum
// Search from drawer, might need to check neighbors
for ht.Keys[drawerNum] != "" {
if ht.Keys[drawerNum] == key {
return ht.Values[drawerNum], true
}
drawerNum = (drawerNum + 1) % ht.Size
if drawerNum == originalDrawer {
break // Searched all drawers
}
}
return nil, false
}
func (ht *SimpleHashTable) ExpandCabinet(newSize int) {
oldKeys := ht.Keys
oldValues := ht.Values
// New bigger cabinet
ht.Keys = make([]string, newSize)
ht.Values = make([]interface{}, newSize)
ht.Size = newSize
ht.Count = 0
// Move everything to new cabinet
for i, key := range oldKeys {
if key != "" {
ht.Store(key, oldValues[i])
}
}
}
Like: Filing cabinet where if drawer 5 is full, use drawer 6!
Go’s Magic Built-in Maps
Go gives you hash tables for free! No need to build from scratch.
// Three ways to create maps:
// 1. Start with nothing (will crash if used!)
var phoneBook map[string]string // nil map - don't use yet!
// 2. Create empty map (ready to use)
phoneBook = make(map[string]string)
// 3. Create with starting data
phoneBook = map[string]string{
"mom": "555-0123",
"dad": "555-0456",
"school": "555-0789",
}
// Use the map:
phoneBook["friend"] = "555-1011" // Add entry
number := phoneBook["mom"] // Get number ("555-0123")
delete(phoneBook, "school") // Remove entry
count := len(phoneBook) // How many entries (3)
// Check if entry exists:
if number, exists := phoneBook["mom"]; exists {
fmt.Println("Mom's number:", number)
} else {
fmt.Println("No number for mom")
}
// See all entries:
for name, number := range phoneBook {
fmt.Printf("%s: %s\n", name, number)
}
Like: Magic phone book that finds numbers instantly!
Real-World Magic: Hash Table Applications
1. Word Counter - “How many times does each word appear?”
func CountWords(text string) map[string]int {
words := strings.Fields(text) // Split into words
counts := make(map[string]int)
for _, word := range words {
// Clean up: lowercase, remove punctuation
word = strings.ToLower(strings.Trim(word, ".,!?"))
counts[word]++ // Auto-creates entry if new!
}
return counts
}
// Usage:
text := "The cat sat on the mat. The cat was fat."
wordCounts := CountWords(text)
// Result: {"the": 3, "cat": 2, "sat": 1, "on": 1, "mat": 1, "was": 1, "fat": 1}
Real life: Analyzing books, counting website visits, finding trending topics
2. Two Numbers Add to Target - “Find pair that sums to 10”
func FindTwoSum(numbers []int, target int) []int {
seen := make(map[int]int) // number -> position
for position, number := range numbers {
needed := target - number
if seenPos, found := seen[needed]; found {
return []int{seenPos, position} // Found pair!
}
seen[number] = position // Remember this number
}
return nil // No pair found
}
// Usage:
nums := []int{2, 7, 11, 15}
pair := FindTwoSum(nums, 9)
// Result: [0, 1] because 2 + 7 = 9
Real life: Finding matching transactions, puzzle solving
3. Recently Used Items (LRU Cache)
Imagine a cache that keeps recently used items and removes old ones.
type LRUCache struct {
capacity int
cache map[int]*CacheItem
head *CacheItem // Most recently used
tail *CacheItem // Least recently used
}
type CacheItem struct {
key int
value int
prev *CacheItem
next *CacheItem
}
func (c *LRUCache) Get(key int) int {
if item, exists := c.cache[key]; exists {
c.moveToFront(item) // Mark as recently used
return item.value
}
return -1 // Not in cache
}
func (c *LRUCache) Put(key int, value int) {
if item, exists := c.cache[key]; exists {
item.value = value
c.moveToFront(item)
return
}
// Cache full? Remove least recently used
if len(c.cache) >= c.capacity {
c.removeOldest()
}
// Add new item
item := &CacheItem{key: key, value: value}
c.cache[key] = item
c.addToFront(item)
}
Real life: Web browser caches, database query caches, app memory management
4. Group Words by Letters - “Find anagrams”
func GroupAnagrams(words []string) [][]string {
groups := make(map[string][]string)
for _, word := range words {
// Sort letters to create "signature"
letters := strings.Split(word, "")
sort.Strings(letters)
signature := strings.Join(letters, "")
// All anagrams have same signature!
groups[signature] = append(groups[signature], word)
}
// Convert to result
result := make([][]string, 0, len(groups))
for _, group := range groups {
result = append(result, group)
}
return result
}
// Usage:
words := []string{"eat", "tea", "tan", "ate", "nat", "bat"}
groups := GroupAnagrams(words)
// Result: [["eat","tea","ate"], ["tan","nat"], ["bat"]]
Real life: Finding similar documents, word games, spell checking
5. Subarray Sum Counter - “How many subarrays sum to K?”
func CountSubarraysWithSum(nums []int, target int) int {
count := 0
currentSum := 0
sumCounts := make(map[int]int)
sumCounts[0] = 1 // Empty subarray sums to 0
for _, num := range nums {
currentSum += num
// If currentSum - target was seen before,
// then subarray from that point to now sums to target!
if previousCount, exists := sumCounts[currentSum-target]; exists {
count += previousCount
}
sumCounts[currentSum]++
}
return count
}
// Usage:
nums := []int{1, 1, 1}
count := CountSubarraysWithSum(nums, 2)
// Result: 2 (subarrays [1,1] at positions 0-1 and 1-2)
Real life: Financial analysis, signal processing
What Happens When Drawers Get Full? (Collisions)
Problem: Two items want the same drawer!
“cat” and “dog” both want drawer 5! What now?
Solution 1: Chain Extra Shelves (Separate Chaining)
- Drawer 5 gets a chain of items
- Like hanging multiple folders in one drawer
- Simple, but chains can get long
Solution 2: Find Next Empty Drawer (Linear Probing)
- Drawer 5 full? Try drawer 6, then 7, etc.
- Like “musical chairs” - find any empty seat
- Fast, but can create “traffic jams” of full drawers
Solution 3: Jump Further (Quadratic Probing)
- Drawer 5 full? Try 5+1²=6, then 5+2²=9, then 5+3²=14
- Spreads things out better than linear
- Like jumping 1, 4, 9, 16 steps away
Solution 4: Two Magic Formulas (Double Hashing)
- Have TWO hash functions
- First gives drawer 5, second gives step size 3
- So try: 5, then 5+3=8, then 8+3=11, etc.
- Most balanced, but complex
Growing Your Filing Cabinet (Rehashing)
Problem: Cabinet is 80% full - too slow!
Solution: Move to bigger cabinet!
func (ht *HashTable) ShouldExpand() bool {
return float64(ht.Count) / float64(ht.Size) > 0.75 // Over 75% full
}
func (ht *HashTable) ExpandCabinet() {
newSize := ht.Size * 2 // Double the size
newCabinet := make([]*DrawerItem, newSize)
// Move everything to new cabinet (rehash all items!)
for drawerNum := 0; drawerNum < ht.Size; drawerNum++ {
current := ht.Drawers[drawerNum]
for current != nil {
// Recalculate drawer for new cabinet
newDrawer := ht.GetDrawerNumber(current.Key) % newSize
// Add to new cabinet
newItem := &DrawerItem{
Key: current.Key,
Value: current.Value,
Next: newCabinet[newDrawer],
}
newCabinet[newDrawer] = newItem
current = current.Next
}
}
ht.Drawers = newCabinet
ht.Size = newSize
}
Like: Moving to a bigger house when family grows!
Speed Report Card
| Operation | Usually | Worst Case (Bad Luck) |
|---|---|---|
| Store | Instant (O(1)) | Slow (O(n)) |
| Find | Instant (O(1)) | Slow (O(n)) |
| Remove | Instant (O(1)) | Slow (O(n)) |
Worst case happens with: Bad hash function, or cabinet too full
Practice Problems - Master the Magic!
- Duplicate Detector: Check if array has duplicates
- Shopping List: Find items in both shopping lists
- Word Patterns: Check if two words follow same pattern (“abba” matches “dog cat cat dog”)
- First Unique: Find first letter that appears only once
- Top Words: Find most frequent words in text
🎉 Congratulations! You’re a Hash Table Wizard!
Hash tables are everywhere in programming:
- Dictionaries: Word → definition
- Databases: ID → customer data
- Compilers: Variable name → memory location
- Web servers: URL → webpage
- Your phone: Contact name → phone number
Key Takeaways:
- Hash functions turn keys into drawer numbers
- Collisions happen - handle them gracefully
- Keep cabinet size reasonable (expand when needed)
- Go’s maps make it easy!
Next: Advanced Topics - the grand finale! 🚀