Sets - Duplicate Detection, Consecutive Sequences & More

∅

What Is a Set?

Set = a collection of unique elements with O(1) avg lookup Add: set.add(5) → true (new element) Add dup: set.add(5) → false (already exists) Check: 5 in set → true (O(1) membership) Remove: set.remove(5) → true (was present) How it works: HashSet is HashMap<E, PRESENT> internally Element → hashCode() → bucket → O(1) lookup Same hashing mechanics as HashMap — see HashMaps page

A Set guarantees no duplicates — adding an existing element is a no-op
Use when you care about membership not key-value pairs
Elements must implement hashCode()/equals() correctly (data classes do this automatically)
add() returning false = instant duplicate detection

∅

Kotlin Set Types, Creation & Mutation

Type Ordered? Backing Set<E> — read-only view HashSet no hash table LinkedHashSet insertion hash + linked list TreeSet sorted red-black tree ⚠ setOf() & mutableSetOf() → LinkedHashSet (insertion order preserved)

// ── Creation ── setOf(1, 2, 3) // read-only mutableSetOf(1, 2, 3) // mutable hashSetOf(1, 2, 3) // unordered sortedSetOf(3, 1, 2) // {1, 2, 3} listOf(1,2,2,3).toSet() // dedup → {1,2,3} buildSet { add(1); add(2) } // ── Query — O(1) avg for HashSet ── 3 in set // contains set.size set.isEmpty() // ── Mutate ── set.add(4) // returns true if new set.remove(2) // returns true if removed set.removeIf { it < 0 }; set.clear()

∩

Set Algebra — Union, Intersect, Difference

A = {1, 2, 3, 4} B = {3, 4, 5, 6} union: {1, 2, 3, 4, 5, 6} a union b / a + b intersect: {3, 4} a intersect b subtract: {1, 2} a subtract b / a - b sym diff: {1, 2, 5, 6} (a - b) + (b - a)

// ── Immutable (returns new Set) ── a union b // or: a + b a intersect b a subtract b // or: a - b // ── Checks ── a.containsAll(b) // is b a subset of a? (a intersect b).isEmpty() // disjoint? // ── Mutable (in-place) ── set.addAll(other) // union set.retainAll(other) // intersect set.removeAll(other) // subtract set += element; set -= element

↕

TreeSet — Sorted Sets

// ── All operations O(log n) ── val ts = sortedSetOf(3, 1, 4, 1, 5) // {1,3,4,5} ts.first() // 1 ts.last() // 5 ts.higher(3) // 4 (strictly >) ts.lower(3) // 1 (strictly <) ts.ceiling(2) // 3 (>=) ts.floor(2) // 1 (<=) ts.subSet(2, 5) // {3, 4} (from incl, to excl) ts.headSet(4) // {1, 3} ts.tailSet(4) // {4, 5} ts.pollFirst() // removes & returns 1 ts.pollLast() // removes & returns 5 list.toSortedSet() // convert any collection // Custom comparator sortedSetOf(compareByDescending { it }, 3, 1, 5)

See Also: HashMaps Page & Arrays Page

Many patterns use Maps and Sets together: Sliding Window + Set → uniqueness constraints (see HashMaps page) | Frequency Counting → uses Map, but eachCount().keys yields a Set | Two Pointers → see Arrays page

⊘

Duplicate Detection

O(n) O(n)

Contains Duplicate: set.add(x) returns false if x already in set → one pass with early return One-liner check: nums.toSet().size != nums.size → has duplicates Duplicate within distance K: Sliding window HashSet of size K [1 2 3] 4 5 6 window size = 3 1 [2 3 4] 5 6 slide: remove 1, add 4 if add() returns false → duplicate within K

When to Use

contains duplicateWhy: HashSet.add() returns false if the element was already present. One pass with early return. Or one-liner: nums.toSet().size != nums.size. duplicate within distance KWhy: Maintain a sliding window HashSet of size K. As you iterate, add the current element. If add() returns false → duplicate found within K. Remove elements that fall outside the window. find all duplicatesWhy: Use two sets: "seen" and "duplicates". For each element, if seen.add() returns false, add to duplicates set. O(n) time, O(n) space. happy numberWhy: Sum of squared digits eventually loops or reaches 1. Use a HashSet to detect the cycle — if sum was seen before, it's not happy. Set membership is the cycle detection mechanism. O(1) avg per check. intersection of two arraysWhy: Put array1 into a HashSet, then iterate array2 — if element is in the set, it's in the intersection. Remove from set to handle duplicates. O(n+m) time. Set algebra in action.

set.add(x) returning false = instant duplicate detection — the most common set idiom
toSet().size vs original size — quick dedup check without early return
Sliding window set maintains a fixed-size window for distance-constrained duplicates

▶ Visualize

→

Longest Consecutive Sequence

O(n) O(n)

Find the longest run of consecutive integers: nums = [100, 4, 200, 1, 3, 2] Step 1: put all in HashSet → {100, 4, 200, 1, 3, 2} Step 2: only start counting from sequence starts: if (num - 1) NOT in set → it's a start 100: 99 not in set → start → 100 → length 1 4: 3 in set → skip 200: 199 not in set → start → 200 → length 1 1: 0 not in set → start → 1→2→3→4 → length 4 answer = 4

When to Use

longest consecutive sequenceWhy: Put all numbers in a HashSet. For each number, check if num-1 is NOT in the set (= start of sequence). Then extend forward: num+1, num+2, ... Each element visited at most twice → O(n). count distinct sequencesWhy: Same start-detection trick. Count how many sequence starts exist, or track all sequence lengths. The key insight is skipping non-starts to avoid O(n²). missing rangesWhy: Put all numbers in a HashSet, then walk the expected range [lo, hi]. When a number is missing, extend the gap. Set gives O(1) membership checks for each expected value. array nestingWhy: Follow index chains: i → nums[i] → nums[nums[i]] → ... until a cycle. Use a visited HashSet to skip already-explored indices. Each index visited at most once → O(n). Track longest cycle.

Key insight: only start counting from sequence beginnings — avoids redundant work
Each element is visited at most twice (once in scan, once in chain extension) → O(n)
Sorting would also work in O(n log n) — the HashSet approach is O(n)

▶ Visualize

①

First Unique / Non-Repeating Element

O(n) O(k)

Find the first element that appears exactly once: s = "leetcode" LinkedHashMap preserves insertion order: {l=1, e=3, t=1, c=1, o=1, d=1} iterate → first entry with value 1 → 'l'

// ── First unique character ── val freq = linkedMapOf<Char, Int>() for (ch in s) freq[ch] = (freq[ch] ?: 0) + 1 val first = freq.entries.first { it.value == 1 }.key // ── Streaming variant (with removal) ── val seen = mutableSetOf<Char>() val unique = linkedSetOf<Char>() // order-preserving for (ch in s) { if (!seen.add(ch)) unique.remove(ch) else unique.add(ch) } // unique.first() = first non-repeating

When to Use

first unique characterWhy: LinkedHashMap preserves insertion order. Build frequency map in one pass, then iterate entries — first with count 1 is the answer. O(n) time, O(k) space. Classic interview problem. single numberWhy: Every element appears twice except one. XOR approach is O(1) space, but set approach works too: add if unseen, remove if seen. The one remaining element is the single number. first non-repeating in streamWhy: Use a LinkedHashSet of candidates + a "seen" HashSet. On each element: if already seen, remove from candidates; otherwise add to both. candidates.first() is always the answer. O(1) per element.

LinkedHashMap = HashMap + insertion order — iterate to find first entry with count 1
LinkedHashSet streaming variant — maintains unique candidates in order as you scan
Both approaches: O(n) time, O(k) space where k = distinct elements

▶ Visualize

⚠

Set Pitfalls

// ⚠ Mutable element in set → lost entry! val item = mutableListOf(1, 2) set.add(item) item.add(3) // hashCode changes! item in set // false — unreachable // RIGHT: use immutable elements (data class, String) // ⚠ toSet() vs distinct() list.toSet() // → Set (O(1) lookup) list.distinct() // → List (indexed access) // Choose based on what you need next // ⚠ read-only ≠ immutable val ro = setOf(1, 2) (ro as MutableSet).add(3) // compiles! 💥 // RIGHT: defensive copy with .toSet() // ⚠ ConcurrentModificationException for (x in set) set.remove(x) // 💥 // RIGHT: set.removeIf { ... } // ⚠ Set equality ignores order setOf(1, 2) == setOf(2, 1) // true // If order matters, use LinkedHashSet

Never mutate elements after inserting into a set — entry becomes permanently unreachable
toSet() returns Set (O(1) lookup); distinct() returns List (indexed access) — choose based on use
removeIf { } is always safe; manual iteration + mutation is not

⚡

Set Operations — Complexity

Operation	HashSet	LinkedHashSet	TreeSet
contains / in	`O(1)` avg	`O(1)` avg	`O(log n)`
add	`O(1)` amortized	`O(1)` amortized	`O(log n)`
remove	`O(1)` avg	`O(1)` avg	`O(log n)`
first / last	—	—	`O(log n)`
higher / lower / ceiling / floor	—	—	`O(log n)`
union / intersect / subtract	`O(n+m)`	`O(n+m)`	`O(n+m)`
Iteration	`O(n + cap)`	`O(n)`	`O(n)`
Memory per element	~44 bytes	~60 bytes	~56 bytes

HashSet is literally HashMap<E, PRESENT> — same performance characteristics as HashMap
LinkedHashSet = same complexity as HashSet + insertion order + ~16 bytes extra per entry
TreeSet worst case is always O(log n) — no degeneration like hash collisions

✨

Which Pattern?

If You See...	Try This
"contains duplicate"	HashSet membership
"duplicate within distance k"	Sliding Window HashSet
"find all duplicates"	Two sets: seen + duplicates
"first unique" / "non-repeating"	LinkedHashSet / LinkedHashMap
"longest consecutive sequence"	HashSet + start-of-sequence
"longest substring" + "unique"	Sliding Window + Set (HashMaps)
"intersection / union of lists"	Set algebra operators
"range queries on elements"	TreeSet navigation
"deduplicate a list"	`toSet()` or `distinct()`

Sets — The Complete Visual Reference