├── README.md
├── pom.xml
├── src
│ ├── main
│ │ ├── java
│ │ │ ├── graphs
│ │ │ │ ├── BinarySearchTree.java ------------------- O(height)
│ │ │ │ ├── BreadthFirstSearch.java ----------------- O(V + E)
│ │ │ │ ├── DepthFirstSearch.java ------------------- O(V + E)
│ │ │ │ ├── DepthFirstSearchNonRecursive.java ------- O(V + E)
│ │ │ │ ├── DijkstraShortestPath.java --------------- O(E log V)
│ │ │ │ ├── DirectedGraphOrder.java ----------------- O(V + E)
│ │ │ │ └── TopologicalSort.java -------------------- O(V + E)
│ │ │ ├── numerical
│ │ │ │ ├── BitwiseOperations.java ------------------ O(1)
│ │ │ │ ├── DecimalToBinary.java -------------------- O(1)
│ │ │ │ ├── ExponentiationBySquaring.java ----------- O(log n)
│ │ │ │ ├── GreatestCommonDivisor.java -------------- O((log min(a, b))^2)
│ │ │ │ ├── Primality.java ------------------------ O(n^(1/2))
│ │ │ │ ├── PrimeSieve.java ------------------------- O(n log(log n))
│ │ │ │ └── RomanToArabic.java ---------------------- O(1)
│ │ │ ├── optimization
│ │ │ │ ├── DynamicProgrammingCheckerboard.java ----- O(n^2)
│ │ │ │ ├── DynamicProgrammingKnapsack.java --------- O(N * W)
│ │ │ │ ├── GameOfLife.java ------------------------- O(n^2)
│ │ │ │ ├── MaintainingMedian.java ------------------ O(log n)
│ │ │ │ └── MarkovChainPageRank.java ---------------- O(n^2)
│ │ │ ├── searching
│ │ │ │ ├── BinarySearch.java ----------------------- O(log n)
│ │ │ │ └── SelectKthSmallest.java ------------------ O(n)
│ │ │ ├── sorting
│ │ │ │ ├── HeapSort.java --------------------------- O(n log n)
│ │ │ │ ├── KnuthShuffle.java ----------------------- O(n)
│ │ │ │ ├── MergeSort.java -------------------------- O(n log n)
│ │ │ │ └── QuickSort.java -------------------------- O(n log n)
│ │ │ ├── statistics
│ │ │ │ ├── ProbabilityDistributionsContinuous.java - O(1)
│ │ │ │ └── ProbabilityDistributionsDiscrete.java --- O(n)
│ │ │ ├── strings
│ │ │ │ ├── LZWCompression.java --------------------- O(n)
│ │ │ │ ├── Palindrome.java ------------------------- O(n)
│ │ │ │ └── ROT13Cipher.java ------------------------ O(n)
│ │ │ └── utils
│ │ │ ├── DirectedGraph.java
│ │ │ ├── Graph.java
│ │ │ ├── VertexScore.java
│ │ │ ├── WeightedDirectedGraph.java
│ │ │ └── WeightedEdge.java
│ │ └── resources
│ │ └── log4j.properties
│ └── test
│ └── java
│ ├── graphs
│ │ ├── BinarySearchTreeTest.java
│ │ ├── BreadthFirstSearchTest.java
│ │ ├── DepthFirstSearchNonRecursiveTest.java
│ │ ├── DepthFirstSearchTest.java
│ │ ├── DijkstraShortestPathTest.java
│ │ ├── DirectedGraphOrderTest.java
│ │ └── TopologicalSortTest.java
│ ├── numerical
│ │ ├── BitwiseOperationsTest.java
│ │ ├── DecimalToBinaryTest.java
│ │ ├── ExponentiationBySquaringTest.java
│ │ ├── GreatestCommonDivisorTest.java
│ │ ├── PrimalityTest.java
│ │ ├── PrimeSieveTest.java
│ │ └── RomanToArabicTest.java
│ ├── optimization
│ │ ├── DynamicProgrammingCheckerboardTest.java
│ │ ├── DynamicProgrammingKnapsackTest.java
│ │ ├── GameOfLifeTest.java
│ │ ├── MaintainingMedianTest.java
│ │ └── MarkovChainPageRankTest.java
│ ├── searching
│ │ ├── BinarySearchTest.java
│ │ └── SelectKthSmallestTest.java
│ ├── sorting
│ │ ├── HeapSortTest.java
│ │ ├── KnuthShuffleTest.java
│ │ ├── MergeSortTest.java
│ │ └── QuickSortTest.java
│ ├── statistics
│ │ ├── ProbabilityDistributionsContinuousTest.java
│ │ └── ProbabilityDistributionsDiscreteTest.java
│ └── strings
│ ├── LZWCompressionTest.java
│ ├── PalindromeTest.java
│ └── ROT13CipherTest.java
└── target
Description | Order of Growth | Method | Example |
---|---|---|---|
constant | 1 | statement | add 2 numbers; hash table access |
logarithmic | log n | divide in half | binary search |
linear | n | loop | find the maximum |
linearithmic | n log n | divide and conquer | mergesort |
quadratic | n^2 | double loop | check all pairs; bubble sort |
cubic | n^3 | triple loop | check all triples |
exponential | 2^n | exhaustive search | check all subsets |
factorial | n! | brute-force | traveling salesman problem solved by brute force |
- A Sorting Lower Bound: every "comparison-based" sorting algorithm has worst-case running time of Ω(n log n).
- Stable sort preserves the relative order of equal keys in the array.
- Quicksort is the fastest general-purpose sort, but it's not stable.
- Mergesort might be best if stability is important and space is available.
- Java's system sort method `Arrays.sort()` in the `java.util` library is overloaded:
  - for primitive types: quicksort (with 3-way partitioning) - for speed and low memory usage
  - for reference types: mergesort, switching to insertion sort below a small size threshold - for stability and guaranteed performance
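The stability of the reference-type overload is easy to see with a comparator sort, where equal keys (equal string lengths here) keep their relative input order. A minimal sketch (the helper class and method names are illustrative, not part of this repo):

```java
import java.util.*;

// Illustrative sketch of the two Arrays.sort() overloads: the primitive
// overload sorts int[] in place; the object overload is stable, so strings
// of equal length keep their relative order.
class ArraysSortSketch {
    static String[] byLength(String[] words) {
        String[] out = words.clone();
        Arrays.sort(out, Comparator.comparingInt(String::length)); // stable merge sort
        return out;
    }
}
```

With input `{"bb", "aa", "c"}`, the stable sort keeps `"bb"` ahead of `"aa"` since their lengths are equal.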
Algorithm | Time Complexity | Space Complexity | Stable | Method |
---|---|---|---|---|
Quicksort | O(n log n) | O(log n) | No | Partitioning |
Mergesort | O(n log n) | O(n) | Yes | Merging |
Heapsort | O(n log n) | O(1) | No | Selection |
Insertion Sort | O(n^2) | O(1) | Yes | Insertion |
Selection Sort | O(n^2) | O(1) | No | Selection |
Bubble Sort | O(n^2) | O(1) | Yes | Exchanging |
Shell Sort | O(n (log n)^2) | O(1) | No | Insertion |
Bucket Sort | O(n + k) | O(n) | Yes | (Non-comparison sort) |
Radix Sort | O(nk) | O(n + k) | Yes | (Non-comparison sort) |
- Breadth-First Search (BFS):
  - O(V+E) time complexity using a `queue` (FIFO)
  - explore nodes in "layers"
  - compute shortest paths
  - compute connected components of an undirected graph
- Depth-First Search (DFS):
  - O(V+E) time complexity using a `stack` (LIFO), or via recursion
  - explore aggressively like a maze, backtrack only when necessary
  - compute topological ordering of a directed acyclic graph
  - compute connected components in directed graphs
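A minimal BFS over an adjacency-list graph, showing the FIFO queue and the layer-by-layer distance computation (the representation and names are an illustrative sketch, not the repo's Graph/BreadthFirstSearch API):

```java
import java.util.*;

// Illustrative BFS sketch: dist[v] = number of edges on a shortest path
// from s to v, or -1 if v is unreachable.
class BfsSketch {
    static int[] shortestDistances(List<List<Integer>> adj, int s) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1);                     // -1 marks "not yet visited"
        Deque<Integer> queue = new ArrayDeque<>(); // FIFO queue drives the layer order
        dist[s] = 0;
        queue.add(s);                              // add() appends at the tail
        while (!queue.isEmpty()) {
            int u = queue.poll();                  // poll() removes from the head (FIFO)
            for (int v : adj.get(u)) {
                if (dist[v] == -1) {               // first visit = shortest layer
                    dist[v] = dist[u] + 1;
                    queue.add(v);
                }
            }
        }
        return dist;
    }
}
```

Replacing `queue.add(v)` with `queue.push(v)` (head insertion, i.e. LIFO) turns the same loop into a non-recursive DFS.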
- Depth-First Orders in a directed acyclic graph (DAG):
  - Preorder: root, left, right (put the vertex on a queue before the recursive calls)
  - Postorder: left, right, root (put the vertex on a queue after the recursive calls)
  - Reverse postorder: right, left, root
  - Inorder: left, root, right
  - Reverse preorder: root, right, left
  - Reverse inorder: right, root, left
  - Applications:
    - Preorder: copying a binary tree; prefix expression (Polish notation)
    - Inorder: returning values in order, according to the comparator of a binary search tree
    - Postorder: deleting nodes or an entire binary tree; postfix representation (reverse Polish notation)
    - Reverse postorder (the postorder sequence read backwards): topological sort
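For a binary tree, the three basic orders can be written directly from their definitions; each "reverse" variant just swaps the left/right recursive calls. A sketch with a hypothetical node type (not the repo's BinarySearchTree):

```java
import java.util.List;

// Illustrative traversal-order sketch on a tiny binary tree node.
class TraversalSketch {
    static class Node {
        int key; Node left, right;
        Node(int key, Node left, Node right) { this.key = key; this.left = left; this.right = right; }
    }

    static void preorder(Node n, List<Integer> out) {   // root, left, right
        if (n == null) return;
        out.add(n.key); preorder(n.left, out); preorder(n.right, out);
    }
    static void inorder(Node n, List<Integer> out) {    // left, root, right
        if (n == null) return;
        inorder(n.left, out); out.add(n.key); inorder(n.right, out);
    }
    static void postorder(Node n, List<Integer> out) {  // left, right, root
        if (n == null) return;
        postorder(n.left, out); postorder(n.right, out); out.add(n.key);
    }
}
```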
- Topological Sort:
  - Given a directed graph, put the vertices in order such that every directed edge points from a vertex earlier in the order to a vertex later in the order.
  - A directed graph has a topological order if and only if it is a directed acyclic graph (DAG).
  - DFS reverse postorder in a DAG is a topological sort.
  - With DFS, topological sort time complexity is O(V+E).
  - Application: sequence tasks while respecting all precedence constraints.
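DFS reverse postorder as a topological sort can be sketched like this, for a DAG given as adjacency lists (illustrative names, not the repo's TopologicalSort; the input is assumed acyclic):

```java
import java.util.*;

// Illustrative topological-sort sketch: DFS, then read vertices in
// reverse order of their finishing times.
class TopoSortSketch {
    static List<Integer> topologicalOrder(List<List<Integer>> adj) {
        int n = adj.size();
        boolean[] seen = new boolean[n];
        Deque<Integer> reversePost = new ArrayDeque<>(); // push on finish => reverse postorder
        for (int v = 0; v < n; v++)
            if (!seen[v]) dfs(adj, v, seen, reversePost);
        return new ArrayList<>(reversePost);             // head of the deque first
    }

    private static void dfs(List<List<Integer>> adj, int u, boolean[] seen, Deque<Integer> reversePost) {
        seen[u] = true;
        for (int v : adj.get(u))
            if (!seen[v]) dfs(adj, v, seen, reversePost);
        reversePost.push(u);                             // postorder position, pushed => reversed
    }
}
```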
- Strongly Connected Components:
  - the strongly connected components (SCCs) of a directed graph G are the equivalence classes of the relation: u<->v <==> there exists a path u->v and a path v->u in G
  - Kosaraju's Two-Pass Algorithm computes SCCs with 2 DFS passes, in O(V+E) time.
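Kosaraju's two passes can be sketched as follows: pass 1 runs DFS on the reversed graph to compute a finishing order, pass 2 runs DFS on the original graph in reverse finishing order, and each DFS tree of pass 2 is one SCC (compact iterative sketch with illustrative names, not the repo's code):

```java
import java.util.*;

// Illustrative Kosaraju sketch: comp[v] = SCC label of vertex v.
class KosarajuSketch {
    static int[] stronglyConnectedComponents(List<List<Integer>> adj) {
        int n = adj.size();
        List<List<Integer>> rev = new ArrayList<>();     // build the reversed graph
        for (int v = 0; v < n; v++) rev.add(new ArrayList<>());
        for (int u = 0; u < n; u++)
            for (int v : adj.get(u)) rev.get(v).add(u);

        // Pass 1: iterative DFS on rev, recording vertices by finishing time.
        boolean[] seen = new boolean[n];
        Deque<Integer> order = new ArrayDeque<>();       // last finished ends up on top
        for (int s = 0; s < n; s++) {
            if (seen[s]) continue;
            Deque<int[]> stack = new ArrayDeque<>();     // {vertex, next-neighbor index}
            stack.push(new int[]{s, 0});
            seen[s] = true;
            while (!stack.isEmpty()) {
                int[] top = stack.peek();
                if (top[1] < rev.get(top[0]).size()) {
                    int w = rev.get(top[0]).get(top[1]++);
                    if (!seen[w]) { seen[w] = true; stack.push(new int[]{w, 0}); }
                } else {
                    order.push(stack.pop()[0]);          // vertex finished
                }
            }
        }

        // Pass 2: DFS on the original graph in reverse finishing order.
        int[] comp = new int[n];
        Arrays.fill(comp, -1);
        int label = 0;
        for (int s : order) {
            if (comp[s] != -1) continue;
            Deque<Integer> stack = new ArrayDeque<>();
            stack.push(s);
            comp[s] = label;
            while (!stack.isEmpty()) {
                int u = stack.pop();
                for (int v : adj.get(u))
                    if (comp[v] == -1) { comp[v] = label; stack.push(v); }
            }
            label++;                                     // one DFS tree = one SCC
        }
        return comp;
    }
}
```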
- Dijkstra's Shortest Path algorithm:
  - computes shortest paths in an edge-weighted directed graph with non-negative weights
  - canonical use of the heap data structure: a fast way to do repeated minimum computations
  - time complexity: O(E log V)
  - space complexity: O(V)
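A sketch of Dijkstra using `java.util.PriorityQueue` with lazy deletion of stale heap entries (the representation, where `adj.get(u)` holds `{v, weight}` pairs, is an illustrative assumption, not the repo's DijkstraShortestPath API):

```java
import java.util.*;

// Illustrative Dijkstra sketch with a binary heap of {dist, vertex} entries.
class DijkstraSketch {
    static long[] shortestPaths(List<List<int[]>> adj, int s) {
        int n = adj.size();
        long[] dist = new long[n];
        Arrays.fill(dist, Long.MAX_VALUE);
        dist[s] = 0;
        PriorityQueue<long[]> pq =
            new PriorityQueue<>(Comparator.comparingLong((long[] e) -> e[0]));
        pq.add(new long[]{0, s});
        while (!pq.isEmpty()) {
            long[] top = pq.poll();            // repeated minimum computation
            long d = top[0];
            int u = (int) top[1];
            if (d > dist[u]) continue;         // stale entry, skip (lazy deletion)
            for (int[] e : adj.get(u)) {
                int v = e[0], w = e[1];        // non-negative weight assumed
                if (dist[u] + w < dist[v]) {
                    dist[v] = dist[u] + w;
                    pq.add(new long[]{dist[v], v});
                }
            }
        }
        return dist;                           // Long.MAX_VALUE = unreachable
    }
}
```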
- Canonical use of Heap: 💰 fast way to do repeated minimum computations - perform Insert and Extract-Min in O(log n) time
- Supported Operations:
  - Insert: O(log n)
  - Extract-Min: O(log n) - remove an object with a minimum key value, ties broken arbitrarily
  - Peek: O(1)
  - Heapify: O(n) - batch insert n elements
  - Delete: O(log n)
- Conceptually: a perfectly balanced binary tree of height = log n
- Heap Property: at every node, key <= children's keys
- Implementation: array (index starts at 1)
  - children of i = 2i, 2i+1
  - parent of i = ⌊i/2⌋
  - Extract-Min: move the last leaf to the root, then bubble down
  - Insert: bubble up from the new last leaf
- Applications: sorting, Dijkstra's Shortest Path algorithm, event managers, median maintenance
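The 1-indexed array layout and the two bubbling operations can be sketched as follows (minimal fixed-capacity sketch with no bounds checks; illustrative, not the repo's heap):

```java
// Illustrative array-backed min-heap using 1-based indexing, matching the
// parent/children arithmetic above. Fixed capacity; extractMin assumes non-empty.
class MinHeapSketch {
    private final int[] a;
    private int size = 0;

    MinHeapSketch(int capacity) { a = new int[capacity + 1]; } // index 0 unused

    void insert(int key) {                 // O(log n): bubble up from the new last leaf
        a[++size] = key;
        int i = size;
        while (i > 1 && a[i / 2] > a[i]) { swap(i, i / 2); i /= 2; }
    }

    int extractMin() {                     // O(log n): move last leaf to root, bubble down
        int min = a[1];
        a[1] = a[size--];
        int i = 1;
        while (2 * i <= size) {
            int c = 2 * i;                         // left child
            if (c < size && a[c + 1] < a[c]) c++;  // pick the smaller child
            if (a[i] <= a[c]) break;               // heap property restored
            swap(i, c);
            i = c;
        }
        return min;
    }

    private void swap(int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }
}
```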
- Raison d’etre: 💰 like sorted array + fast (logarithmic) inserts and deletes
- Search Tree Property: each node's key > all keys in the node's left subtree, and < all keys in the node's right subtree
- Binary: each node has at most 2 children
- The height of a BST: log2(n) (best case, balanced) <= height <= n (worst case, a chain)
- Insert a new key k into a tree:
  - search for k (unsuccessfully)
  - rewire the final null pointer to point to a new node with key k
- Search and Insert worst-case running time: Θ(height)
- Delete a key k:
  - easy case: k's node has no children
  - medium case: k's node has 1 child
  - difficult case: k's node has 2 children
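Search and insert as described above, on a hypothetical node type (illustrative sketch, not the repo's BinarySearchTree; both run in O(height)):

```java
// Illustrative BST sketch: the Search Tree Property steers every step.
class BstSketch {
    static class Node { int key; Node left, right; Node(int k) { key = k; } }

    static boolean contains(Node root, int key) {
        Node n = root;
        while (n != null) {
            if (key == n.key) return true;
            n = (key < n.key) ? n.left : n.right;  // smaller keys left, larger right
        }
        return false;                              // unsuccessful search ends at a null link
    }

    static Node insert(Node root, int key) {       // rewires the final null link
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else if (key > root.key) root.right = insert(root.right, key);
        return root;                               // duplicate keys ignored in this sketch
    }
}
```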
- Balanced Search Tree:
  - Idea: ensure that the height is always the best possible: O(log n)
  - then Search / Insert / Delete / Min / Max / Predecessor / Successor all run in O(log n) time
  - Examples: red-black trees, AVL trees, splay trees, B-trees
- Red-Black Tree:
  - Red-Black Tree Invariants:
    - each node is red or black
    - the root is black
    - no 2 reds in a row (a red node has only black children)
    - every root-null path (as in an unsuccessful search) has the same number of black nodes
  - Height Guarantee: every red-black tree with n nodes has a height of O(log n).
  - Time complexity of Search, Insert, Delete in a red-black tree: O(log n).
Operations / Running Times | Balanced Search Tree | Sorted Array |
---|---|---|
Search | Θ(log n) | Θ(log n) |
Select (given order statistic i) | O(log n) | O(1) |
Min/Max | O(log n) | O(1) |
Predecessor/Successor (given pointer to a key) | O(log n) | O(1) |
Rank (# of keys <= a given value) | O(log n) | O(log n) |
Output in Sorted Order | O(n) | O(n) |
Insert | O(log n) | - (would take Θ(n) time) |
Delete | O(log n) | - (would take Θ(n) time) |
Also supported by heaps: Select, Insert, Delete
Also supported by hash tables: Search, Insert, Delete
- Purpose: maintain a (possibly evolving) set of values indexed by key
- Clue to use Hash Table: repeated lookups (fast constant time)
- 💰 Amazing Guarantee: Insert, Delete, Search - all operations in O(1) time, using a key, subject to 2 caveats:
  - the hash function is properly implemented (provides a uniform distribution of hash values)
  - no worst-case bound guarantee for pathological data
- Resolving collisions:
  - Solution 1: separate chaining (keep a linked list in each bucket)
  - Solution 2: open addressing (only one object per bucket): linear probing, double hashing
- Sample applications: de-duplication, the 2-sum problem, symbol tables in compilers, game tree exploration.
- The load of a hash table: alpha = (# of objects in the hash table) / (# of buckets in the hash table)
  - load alpha = O(1) is a necessary condition for operations to run in constant time
  - for good hash table performance, the load needs to be controlled
- Pathological Data can paralyze real-world systems by exploiting badly designed open source hash functions. Solutions:
- cryptographic hash functions (infeasible to reverse engineer)
- randomization ("universal family of hash functions")
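The 2-sum application from the list above, sketched with a `HashSet`: one pass of inserts and O(1)-expected lookups (illustrative names, not the repo's code; this is the distinct-elements variant):

```java
import java.util.*;

// Illustrative 2-sum sketch: does some pair x + y in nums equal target?
class TwoSumSketch {
    static boolean hasPairSummingTo(int[] nums, int target) {
        Set<Integer> seen = new HashSet<>();           // repeated lookups => hash table
        for (int x : nums) {
            if (seen.contains(target - x)) return true; // O(1) expected lookup
            seen.add(x);                                // only earlier elements are in the set
        }
        return false;
    }
}
```

Adding to the set only after the lookup guarantees each reported pair uses two distinct positions.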
- Divide & conquer:
  - Example: Mergesort
  - Master Method; straightforward inductive correctness proof
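The divide & conquer pattern, instantiated as a top-down mergesort sketch: split, recurse, merge (illustrative, not the repo's MergeSort):

```java
import java.util.Arrays;

// Illustrative mergesort sketch: returns a sorted copy of the input.
class MergeSortSketch {
    static int[] sort(int[] a) {
        if (a.length <= 1) return a.clone();                   // base case
        int mid = a.length / 2;
        int[] left = sort(Arrays.copyOfRange(a, 0, mid));      // divide
        int[] right = sort(Arrays.copyOfRange(a, mid, a.length));
        int[] out = new int[a.length];                         // conquer: merge
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length)
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++]; // <= keeps it stable
        while (i < left.length) out[k++] = left[i++];
        while (j < right.length) out[k++] = right[j++];
        return out;
    }
}
```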
- Randomized algorithms:
  - Example: Quicksort
- Greedy algorithms:
  - iteratively make "myopic" locally optimal decisions, and hope everything works out at the end
  - Examples:
    - Dijkstra's shortest path (process each destination once, irrevocably)
    - optimal caching (Farthest-in-Future or Least Recently Used, given the locality of reference)
  - 💣 Danger: most greedy algorithms are NOT correct - for example, Dijkstra with negative edge weights.
- Dynamic programming:
  - main difference from greedy algorithms: after every stage, dynamic programming makes decisions based on all the decisions made in the previous stage, and may reconsider the previous stage's choices
  - Example: Sequence alignment
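A compact dynamic-programming sketch using the 0/1 knapsack from the optimization package above, done bottom-up in the O(N * W) time listed in the tree (illustrative, not the repo's DynamicProgrammingKnapsack):

```java
// Illustrative 0/1 knapsack sketch: best[w] = max total value achievable
// within weight budget w, filled item by item.
class KnapsackSketch {
    static int maxValue(int[] value, int[] weight, int capacity) {
        int[] best = new int[capacity + 1];
        for (int i = 0; i < value.length; i++)
            for (int w = capacity; w >= weight[i]; w--)   // reverse scan => each item used once
                best[w] = Math.max(best[w], best[w - weight[i]] + value[i]);
        return best[capacity];
    }
}
```

Each cell is computed from previously filled cells, so earlier "decisions" are systematically reconsidered rather than fixed greedily.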