DSA Interview Prep
Arrays & Strings
The most fundamental data structure. Elements are stored in contiguous memory and accessed by index in O(1). Strings are essentially character arrays. Master this first — it underpins 40%+ of interview questions.
Key Properties
- Index-based access: O(1)
- Insert/Delete at end: O(1) amortized
- Insert/Delete at middle: O(n)
- Search (unsorted): O(n)
- Search (sorted): O(log n)
Common Patterns
- Two Pointers — left/right scan
- Sliding Window — subarray problems
- Prefix Sum — range queries
- Kadane's Algorithm — max subarray
- Dutch Flag — 3-way partition
```python
# Sliding Window — O(n) time | O(1) space
def max_sum_subarray(arr, k):
    window_sum = sum(arr[:k])
    max_sum = window_sum
    for i in range(k, len(arr)):
        window_sum += arr[i] - arr[i - k]
        max_sum = max(max_sum, window_sum)
    return max_sum

# Two Pointers — check if pair sums to target (sorted array)
def two_sum_sorted(arr, target):
    l, r = 0, len(arr) - 1
    while l < r:
        s = arr[l] + arr[r]
        if s == target:
            return l, r
        elif s < target:
            l += 1
        else:
            r -= 1
    return -1, -1
```
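Kadane's Algorithm from the pattern list is short enough to memorize whole; a minimal sketch:

```python
# Kadane's Algorithm: maximum subarray sum, O(n) time, O(1) space
def max_subarray(nums):
    best = curr = nums[0]
    for n in nums[1:]:
        curr = max(n, curr + n)   # extend the current run or restart at n
        best = max(best, curr)
    return best
```

For `[-2, 1, -3, 4, -1, 2, 1, -5, 4]` this returns 6 (the subarray `[4, -1, 2, 1]`); note it handles all-negative input correctly by starting from `nums[0]` rather than 0.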
Linked Lists
Nodes connected via pointers. No contiguous memory required. Excellent for frequent insert/delete at head. Crucial concept for understanding pointers, memory, and recursive thinking.
Complexities
- Access by index: O(n)
- Insert/Delete at head: O(1)
- Insert/Delete at tail: O(n) singly / O(1) doubly
- Search: O(n)
- Space: O(n)
Must-Know Techniques
- Fast & Slow Pointers — cycle detection (Floyd's)
- Reverse in-place — 3 pointer technique
- Find middle — slow/fast = middle
- Merge two sorted — O(n+m)
- Dummy head node — simplifies edge cases
```python
class Node:
    def __init__(self, val):
        self.val = val
        self.next = None

# Reverse in-place — O(n) time | O(1) space, iterative (preferred in interviews)
def reverse(head):
    prev, curr = None, head
    while curr:
        nxt = curr.next       # save next
        curr.next = prev      # reverse pointer
        prev = curr           # advance prev
        curr = nxt            # advance curr
    return prev               # new head

# Floyd's Cycle Detection — O(n) time, O(1) space
def has_cycle(head):
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow == fast:
            return True
    return False
```
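The merge-two-sorted and dummy-head techniques from the list combine naturally; a sketch (the `Node` class repeats the shape used above so the snippet stands alone):

```python
class Node:                       # same shape as the Node class above
    def __init__(self, val):
        self.val = val
        self.next = None

# Merge two sorted lists: O(n+m) time, using a dummy head
def merge_sorted(a, b):
    dummy = tail = Node(0)        # dummy head avoids special-casing the head
    while a and b:
        if a.val <= b.val:
            tail.next, a = a, a.next
        else:
            tail.next, b = b, b.next
        tail = tail.next
    tail.next = a or b            # attach whichever list remains
    return dummy.next
```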
Stack & Queue
Abstract data types built on arrays or linked lists. Stack = LIFO (Last-In First-Out). Queue = FIFO (First-In First-Out). Both appear constantly in parsing, BFS/DFS, and expression evaluation problems.
Stack — LIFO
- push/pop/peek: O(1)
- Use cases: function calls, undo, valid parentheses, monotonic stack
- In Python: list with append()/pop()
Queue — FIFO
- enqueue/dequeue: O(1)
- Use cases: BFS, task scheduling, sliding window max
- In Python: collections.deque with append()/popleft()
```python
# Valid Parentheses — O(n) time | O(n) space
def is_valid(s: str) -> bool:
    stack = []
    mapping = {')': '(', '}': '{', ']': '['}
    for ch in s:
        if ch in mapping:
            top = stack.pop() if stack else '#'
            if mapping[ch] != top:
                return False
        else:
            stack.append(ch)
    return not stack

# Monotonic Stack — Next Greater Element
def next_greater(nums):
    res = [-1] * len(nums)
    stack = []                # indices still waiting for a greater value
    for i, n in enumerate(nums):
        while stack and nums[stack[-1]] < n:
            res[stack.pop()] = n
        stack.append(i)
    return res
```
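The sliding window max use case above is the canonical monotonic deque problem; a sketch, assuming the usual LeetCode-style signature:

```python
from collections import deque

# Sliding Window Maximum: monotonic deque, O(n) time
def max_sliding_window(nums, k):
    dq, res = deque(), []         # dq holds indices; their values stay decreasing
    for i, n in enumerate(nums):
        while dq and nums[dq[-1]] <= n:
            dq.pop()              # smaller values can never be a future max
        dq.append(i)
        if dq[0] <= i - k:
            dq.popleft()          # front index fell out of the window
        if i >= k - 1:
            res.append(nums[dq[0]])
    return res
```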
Trees & Binary Search Tree
Hierarchical data structure. Binary Tree: each node ≤ 2 children. BST: left < node < right, giving O(log n) search. Trees are everywhere in interviews — traversals, path problems, serialization.
BST Operations
- Search: avg O(log n)
- Insert: avg O(log n)
- Delete: avg O(log n)
- Worst (skewed): O(n)
Traversals
- In-order: Left → Root → Right (sorted in BST)
- Pre-order: Root → Left → Right
- Post-order: Left → Right → Root
- Level-order: BFS with queue
Key Patterns
- DFS (recursive) for path problems
- BFS (queue) for level problems
- Height / diameter / LCA
- Validate BST with bounds
```python
class TreeNode:
    def __init__(self, v):
        self.val = v
        self.left = self.right = None

# In-order traversal (yields sorted output for a BST)
def inorder(root):
    if not root:
        return []
    return inorder(root.left) + [root.val] + inorder(root.right)

# Validate BST — O(n) time, O(h) space
def is_valid_bst(root, lo=float('-inf'), hi=float('inf')):
    if not root:
        return True
    if not (lo < root.val < hi):
        return False
    return (is_valid_bst(root.left, lo, root.val) and
            is_valid_bst(root.right, root.val, hi))

# Max Depth — O(n)
def max_depth(root):
    return 0 if not root else 1 + max(max_depth(root.left), max_depth(root.right))
```
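Level-order from the traversal list, sketched with a queue (the `TreeNode` class repeats the shape used above so the snippet stands alone):

```python
from collections import deque

class TreeNode:                   # same shape as the TreeNode above
    def __init__(self, v):
        self.val = v
        self.left = self.right = None

# Level-order traversal: BFS with a queue, O(n) time
def level_order(root):
    if not root:
        return []
    levels, q = [], deque([root])
    while q:
        level = []
        for _ in range(len(q)):   # drain exactly one level per pass
            node = q.popleft()
            level.append(node.val)
            if node.left:
                q.append(node.left)
            if node.right:
                q.append(node.right)
        levels.append(level)
    return levels
```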
Graphs
Nodes (vertices) connected by edges. Core of network problems — social graphs, maps, dependencies. BFS finds shortest path in unweighted graphs. DFS explores all paths, detects cycles, topological sort.
Representations
- Adjacency List: dict[node] = [neighbors] — sparse graphs
- Adjacency Matrix: grid[i][j] = 1 — dense graphs
- Edge List: list of (u, v) pairs
Algorithms & Complexity
- BFS: O(V+E) — shortest path (unweighted)
- DFS: O(V+E) — cycle detection, topo sort
- Dijkstra: O((V+E) log V) — weighted shortest path
- Union-Find: O(α(n)) ≈ O(1) — connected components
```python
from collections import deque

# BFS — shortest path in unweighted graph
def bfs(graph, start):
    visited = {start}
    q = deque([start])
    while q:
        node = q.popleft()
        for nbr in graph[node]:
            if nbr not in visited:
                visited.add(nbr)
                q.append(nbr)

# DFS — iterative (avoid recursion limit)
def dfs(graph, start):
    visited = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        stack.extend(graph[node])

# Number of Islands — classic DFS on grid
def num_islands(grid):
    count = 0

    def sink(r, c):
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == '1'):
            return
        grid[r][c] = '0'          # mark visited
        for dr, dc in [(0, 1), (0, -1), (1, 0), (-1, 0)]:
            sink(r + dr, c + dc)

    for r in range(len(grid)):
        for c in range(len(grid[0])):
            if grid[r][c] == '1':
                sink(r, c)
                count += 1
    return count
```
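Union-Find appears in the algorithm list but not in the snippet above; a minimal sketch with path compression and union by size:

```python
# Union-Find: near-O(1) amortized per op (inverse Ackermann bound)
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path compression
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False          # already connected
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra       # attach the smaller tree under the larger
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True
```

Counting successful `union` calls gives connected components directly: start with n and subtract one per merge.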
Hashing & Hash Maps
Hash maps give O(1) average-case lookup, insert, and delete. Fundamental for counting frequencies, two-sum, grouping anagrams. Python's dict and Counter are your best friends in interviews.
Complexities
- Insert / Delete / Lookup: O(1) avg, O(n) worst
- Collision resolution: chaining or open addressing
- Resize typically triggers when load factor α exceeds ~0.75
Common Patterns
- Frequency Count: Counter(arr)
- Two Sum: store seen values in set/dict
- Group Anagrams: key = sorted word
- LRU Cache: OrderedDict or doubly linked list + dict
```python
from collections import Counter, defaultdict

# Two Sum — O(n) time, O(n) space
def two_sum(nums, target):
    seen = {}
    for i, n in enumerate(nums):
        complement = target - n
        if complement in seen:
            return [seen[complement], i]
        seen[n] = i

# Group Anagrams — O(nk log k) time
def group_anagrams(strs):
    groups = defaultdict(list)
    for s in strs:
        groups[tuple(sorted(s))].append(s)
    return list(groups.values())

# Longest Consecutive Sequence — O(n) time with set
def longest_consecutive(nums):
    num_set = set(nums)
    longest = 0
    for n in num_set:
        if n - 1 not in num_set:      # n starts a sequence
            length = 1
            while n + length in num_set:
                length += 1
            longest = max(longest, length)
    return longest
```
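The LRU Cache pattern from the list, sketched with OrderedDict (the doubly linked list + dict variant is what you would hand-roll if the interviewer bans it):

```python
from collections import OrderedDict

# LRU Cache via OrderedDict: O(1) get/put
class LRUCache:
    def __init__(self, capacity):
        self.cap = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return -1
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.cap:
            self.data.popitem(last=False)  # evict least recently used
```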
Sorting Algorithms
Know the trade-offs cold. Merge Sort is O(n log n) guaranteed. Quick Sort averages O(n log n) but degrades to O(n²) with bad pivots. Heap Sort is O(n log n) in-place. In Python interviews, use sorted() (Timsort) but be ready to implement from scratch.
| Algorithm | Best | Average | Worst | Space | Stable |
|---|---|---|---|---|---|
| Bubble Sort | O(n) | O(n²) | O(n²) | O(1) | Yes |
| Selection Sort | O(n²) | O(n²) | O(n²) | O(1) | No |
| Insertion Sort | O(n) | O(n²) | O(n²) | O(1) | Yes |
| Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes |
| Quick Sort | O(n log n) | O(n log n) | O(n²) | O(log n) | No |
| Heap Sort | O(n log n) | O(n log n) | O(n log n) | O(1) | No |
```python
# Merge Sort — O(n log n) guaranteed, O(n) space
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    L, R = merge_sort(arr[:mid]), merge_sort(arr[mid:])
    i = j = 0
    res = []
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            res.append(L[i])
            i += 1
        else:
            res.append(R[j])
            j += 1
    return res + L[i:] + R[j:]

# Quick Sort — O(n log n) avg, O(n²) worst (bad pivot)
def quick_sort(arr, lo=0, hi=None):
    if hi is None:
        hi = len(arr) - 1
    if lo >= hi:
        return
    pivot = arr[hi]               # Lomuto partition: last element as pivot
    i = lo - 1
    for j in range(lo, hi):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[hi] = arr[hi], arr[i + 1]
    p = i + 1
    quick_sort(arr, lo, p - 1)
    quick_sort(arr, p + 1, hi)
```
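Heap Sort from the table can be sketched with heapq, though note this version returns a new list (O(n) extra space), unlike the classic in-place array formulation:

```python
import heapq

# Heap Sort via heapq: O(n log n) time
def heap_sort(arr):
    heap = list(arr)              # copy so the input is untouched
    heapq.heapify(heap)           # O(n) bottom-up build
    return [heapq.heappop(heap) for _ in range(len(heap))]
```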
Searching Algorithms
Binary Search is one of the highest-value patterns in interviews. It applies not just to sorted arrays but to any monotonic condition: first bad version, minimum speed, capacity to ship packages. Template: lo, hi, mid; shrink the range by half each step.
Linear Search
- Time: O(n) — Space: O(1)
- Works on unsorted data
- No preconditions required
Binary Search
- Time: O(log n) — Space: O(1)
- Requires sorted input (or a monotonic answer space)
- Always ask: can I binary search on the answer?
```python
# Standard — exact match
def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

# Search on answer space — minimum valid answer
def search_answer(lo, hi, feasible):
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid              # mid works; try smaller
        else:
            lo = mid + 1          # mid fails; answer is larger
    return lo                     # minimum satisfying value
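As a concrete binary-search-on-the-answer example, the capacity-to-ship-packages problem mentioned above; the function name and signature here are my own sketch:

```python
# Minimum ship capacity to deliver all weights within `days` days
# Binary search on capacity: feasibility is monotonic in capacity
def ship_within_days(weights, days):
    def feasible(cap):
        used, load = 1, 0
        for w in weights:
            if load + w > cap:    # current day is full; start a new day
                used, load = used + 1, 0
            load += w
        return used <= days

    lo, hi = max(weights), sum(weights)  # smallest and largest sane capacities
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo
```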
Dynamic Programming
DP = Recursion + Memoization (or Tabulation). Identify overlapping subproblems and optimal substructure. The 4-step framework: define state → recurrence relation → base case → direction of computation.
DP Framework
1. State: what does dp[i] represent?
2. Transition: dp[i] = f(dp[i-1], ...)
3. Base case: dp[0] = ?
4. Answer: dp[n] or max(dp)
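Applied to Fibonacci, the framework reads: state dp[i] = i-th Fibonacci number; transition dp[i] = dp[i-1] + dp[i-2]; base cases dp[0] = 0, dp[1] = 1; computed left to right. A sketch with the table compressed to two variables:

```python
# Fibonacci via the 4-step DP framework, O(n) time, O(1) space
def fib(n):
    if n < 2:
        return n                  # base cases dp[0] = 0, dp[1] = 1
    prev, curr = 0, 1             # only the last two states are needed
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr
```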
Classic Problems
- Fibonacci — O(n)
- 0/1 Knapsack — O(nW)
- Longest Common Subsequence — O(nm)
- Coin Change — O(n × amount)
- Longest Increasing Subsequence — O(n log n)
```python
# Coin Change — min coins to make amount. O(n×amount)
def coin_change(coins, amount):
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a:
                dp[a] = min(dp[a], dp[a - c] + 1)
    return dp[amount] if dp[amount] != float('inf') else -1

# Longest Common Subsequence — O(nm) time and space
def lcs(s, t):
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]
```
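The O(n log n) bound for Longest Increasing Subsequence in the list comes from binary searching a tails array instead of filling a quadratic table; a sketch:

```python
import bisect

# Longest Increasing Subsequence: patience-sorting idea, O(n log n)
def lis_length(nums):
    tails = []                    # tails[i] = smallest tail of an increasing
    for n in nums:                # subsequence of length i + 1
        i = bisect.bisect_left(tails, n)
        if i == len(tails):
            tails.append(n)       # n extends the longest subsequence
        else:
            tails[i] = n          # n gives a smaller tail for length i + 1
    return len(tails)
```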
Big-O Complexity
Always state time AND space after writing a solution. Drop constants and lower-order terms. When asked "can you do better?" — usually yes via a hash map, sorting, or binary search.
Order (best → worst)
- O(1) Constant
- O(log n) Logarithmic
- O(n) Linear
- O(n log n) Linearithmic
- O(n²) Quadratic
- O(2ⁿ) Exponential
Common Examples
- Array index → O(1)
- Binary search → O(log n)
- Hash map ops → O(1)
- Single loop → O(n)
- Merge Sort → O(n log n)
- Nested loops → O(n²)
Space Complexity
- In-place sort → O(1)
- Hash map → O(n)
- Recursion stack → O(h)
- DP 2D table → O(nm)
- Always ask: optimize space?
| Data Structure | Access | Search | Insert | Delete |
|---|---|---|---|---|
| Array | O(1) | O(n) | O(n) | O(n) |
| Linked List | O(n) | O(n) | O(1) | O(1) |
| Hash Table | O(1) | O(1) | O(1) | O(1) |
| BST (balanced) | O(log n) | O(log n) | O(log n) | O(log n) |
| Heap | O(1) peek | O(n) | O(log n) | O(log n) |
DSA Pattern Cheat Sheet
Recognizing which pattern to apply is 80% of the interview. Memorize this mapping.
| Problem Pattern | Best Tool | Time |
|---|---|---|
| Find pair summing to target | Hash Map (Two Sum) | O(n) |
| Max subarray sum | Kadane's Algorithm | O(n) |
| Subarray / substring window | Sliding Window | O(n) |
| Sorted array search | Binary Search | O(log n) |
| Cycle in linked list | Floyd's Fast & Slow Pointers | O(n) |
| Valid parentheses / expression | Stack | O(n) |
| Shortest path (unweighted) | BFS + Queue | O(V+E) |
| All paths / cycle detection | DFS + Stack/Recursion | O(V+E) |
| Top-K elements | Min-Heap of size K | O(n log k) |
| Weighted shortest path | Dijkstra's Algorithm | O((V+E) log V) |
| Overlapping subproblems | Dynamic Programming | varies |
| Permutations / combinations | Backtracking (DFS) | O(n!) |
| Merge intervals | Sort + Greedy Sweep | O(n log n) |
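The Top-K row from the table, sketched with a size-k min-heap: the heap's root is always the smallest of the k best seen so far, so each element costs O(log k), not O(log n).

```python
import heapq

# Top-K largest elements with a min-heap of size k: O(n log k)
def top_k(nums, k):
    heap = []
    for n in nums:
        heapq.heappush(heap, n)
        if len(heap) > k:
            heapq.heappop(heap)   # drop the smallest; heap keeps the k largest
    return sorted(heap, reverse=True)
```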
Interview Strategy
Technical skill alone is not enough. Interviewers evaluate communication, structure, and how you handle uncertainty.
Clarify requirements before coding. Ask about constraints, edge cases (empty, negatives, duplicates), and expected output format. Never assume.
State the brute force first. Mention the naive O(n²) approach, then explain how you optimize. This shows structured thinking and buys time.
Think out loud. Narrate your reasoning continuously. Interviewers want to see process, not just the final answer.
Recognize the pattern. Two pointers? Sliding window? BFS? Map the problem to a known pattern — practice recognition, not blind memorization.
State time and space complexity after every solution. Show you understand trade-offs between time, space, and code complexity.
Test against edge cases. Walk through empty input, single element, all same values, negative numbers before declaring done.
Know Python built-ins. heapq, deque, Counter, defaultdict, bisect — these save real time in interviews.
Time-box yourself. In a real interview you have ~35 min per problem. If stuck after 5 minutes, ask for a hint — silence helps no one.