moodcheerful/java-algorithms
🔥 👍 ⁉️ Common CS Algorithms ⁉️ 👍 🔥
implemented in Java and tested in JUnit

🌲 Repo's tree view and ⌛ Algorithms' time complexities:
├── README.md
├── pom.xml
├── src
│   ├── main
│   │   ├── java
│   │   │   ├── graphs
│   │   │   │   ├── BinarySearchTree.java ------------------- O(height)
│   │   │   │   ├── BreadthFirstSearch.java ----------------- O(V + E)
│   │   │   │   ├── DepthFirstSearch.java ------------------- O(V + E)
│   │   │   │   ├── DepthFirstSearchNonRecursive.java ------- O(V + E)
│   │   │   │   ├── DijkstraShortestPath.java --------------- O(E log V)
│   │   │   │   ├── DirectedGraphOrder.java ----------------- O(V + E)
│   │   │   │   └── TopologicalSort.java -------------------- O(V + E)
│   │   │   ├── numerical
│   │   │   │   ├── BitwiseOperations.java ------------------ O(1)
│   │   │   │   ├── DecimalToBinary.java -------------------- O(1)
│   │   │   │   ├── ExponentiationBySquaring.java ----------- O(log n)
│   │   │   │   ├── GreatestCommonDivisor.java -------------- O((log min(a, b))^2)
│   │   │   │   ├── Primality.java -------------------------- O(n^1/2)
│   │   │   │   ├── PrimeSieve.java ------------------------- O(n log(log n))
│   │   │   │   └── RomanToArabic.java ---------------------- O(1)
│   │   │   ├── optimization
│   │   │   │   ├── DynamicProgrammingCheckerboard.java ----- O(n^2)
│   │   │   │   ├── DynamicProgrammingKnapsack.java --------- O(N * W)
│   │   │   │   ├── GameOfLife.java ------------------------- O(n^2)
│   │   │   │   ├── MaintainingMedian.java ------------------ O(log n)
│   │   │   │   └── MarkovChainPageRank.java ---------------- O(n^2)
│   │   │   ├── searching
│   │   │   │   ├── BinarySearch.java ----------------------- O(log n)
│   │   │   │   └── SelectKthSmallest.java ------------------ O(n)
│   │   │   ├── sorting
│   │   │   │   ├── HeapSort.java --------------------------- O(n log n)
│   │   │   │   ├── KnuthShuffle.java ----------------------- O(n)
│   │   │   │   ├── MergeSort.java -------------------------- O(n log n)
│   │   │   │   └── QuickSort.java -------------------------- O(n log n)
│   │   │   ├── statistics
│   │   │   │   ├── ProbabilityDistributionsContinuous.java - O(1)
│   │   │   │   └── ProbabilityDistributionsDiscrete.java --- O(n)
│   │   │   ├── strings
│   │   │   │   ├── LZWCompression.java --------------------- O(n)
│   │   │   │   ├── Palindrome.java ------------------------- O(n)
│   │   │   │   └── ROT13Cipher.java ------------------------ O(n)
│   │   │   └── utils
│   │   │       ├── DirectedGraph.java
│   │   │       ├── Graph.java
│   │   │       ├── VertexScore.java
│   │   │       ├── WeightedDirectedGraph.java
│   │   │       └── WeightedEdge.java
│   │   └── resources
│   │       └── log4j.properties
│   └── test
│       └── java
│           ├── graphs
│           │   ├── BinarySearchTreeTest.java
│           │   ├── BreadthFirstSearchTest.java
│           │   ├── DepthFirstSearchNonRecursiveTest.java
│           │   ├── DepthFirstSearchTest.java
│           │   ├── DijkstraShortestPathTest.java
│           │   ├── DirectedGraphOrderTest.java
│           │   └── TopologicalSortTest.java
│           ├── numerical
│           │   ├── BitwiseOperationsTest.java
│           │   ├── DecimalToBinaryTest.java
│           │   ├── ExponentiationBySquaringTest.java
│           │   ├── GreatestCommonDivisorTest.java
│           │   ├── PrimalityTest.java
│           │   ├── PrimeSieveTest.java
│           │   └── RomanToArabicTest.java
│           ├── optimization
│           │   ├── DynamicProgrammingCheckerboardTest.java
│           │   ├── DynamicProgrammingKnapsackTest.java
│           │   ├── GameOfLifeTest.java
│           │   ├── MaintainingMedianTest.java
│           │   └── MarkovChainPageRankTest.java
│           ├── searching
│           │   ├── BinarySearchTest.java
│           │   └── SelectKthSmallestTest.java
│           ├── sorting
│           │   ├── HeapSortTest.java
│           │   ├── KnuthShuffleTest.java
│           │   ├── MergeSortTest.java
│           │   └── QuickSortTest.java
│           ├── statistics
│           │   ├── ProbabilityDistributionsContinuousTest.java
│           │   └── ProbabilityDistributionsDiscreteTest.java
│           └── strings
│               ├── LZWCompressionTest.java
│               ├── PalindromeTest.java
│               └── ROT13CipherTest.java
└── target

⤴️ Asymptotic Order of Growth (Big O notation):

| Description | Order of Growth | Method | Example |
| --- | --- | --- | --- |
| constant | 1 | statement | add 2 numbers; hash table access |
| logarithmic | log n | divide in half | binary search |
| linear | n | loop | find the maximum |
| linearithmic | n log n | divide and conquer | mergesort |
| quadratic | n² | double loop | check all pairs; bubble sort |
| cubic | n³ | triple loop | check all triples |
| exponential | 2ⁿ | exhaustive search | check all subsets |
| factorial | n! | brute force | traveling salesman problem |

🔄 Array Sorting Algorithms:

  • A Sorting Lower Bound: every "comparison-based" sorting algorithm has worst-case running time of Ω(n log n).
  • Stable sort preserves the relative order of equal keys in the array.
  • Quicksort is the fastest general-purpose sort, but it's not stable.
  • Mergesort might be best if stability is important and space is available.
  • Java's system sort method Arrays.sort() in the java.util library is overloaded:
    • for primitive types: dual-pivot quicksort (since Java 7) - for speed and memory usage
    • for reference types: TimSort, a stable mergesort variant that switches to insertion sort below a small size threshold - for stability and guaranteed O(n log n) performance
| Algorithm | Time Complexity | Space Complexity | Stable | Method |
| --- | --- | --- | --- | --- |
| Quicksort | O(n log n) | O(log n) | No | Partitioning |
| Mergesort | O(n log n) | O(n) | Yes | Merging |
| Heapsort | O(n log n) | O(1) | No | Selection |
| Insertion Sort | O(n²) | O(1) | Yes | Insertion |
| Selection Sort | O(n²) | O(1) | No | Selection |
| Bubble Sort | O(n²) | O(1) | Yes | Exchanging |
| Shell Sort | O(n (log n)²) | O(1) | No | Insertion |
| Bucket Sort | O(n + k) | O(n) | Yes | (Non-comparison sort) |
| Radix Sort | O(nk) | O(n + k) | Yes | (Non-comparison sort) |
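To see stability concretely, here is a small illustrative sketch (the class and data are made up for this example, not taken from the repo): Java's reference-type `Arrays.sort` is a stable mergesort, so sorting name-ordered (name, grade) pairs by grade keeps entries with equal grades in their original name order.

```java
import java.util.Arrays;
import java.util.Comparator;

public class StableSortDemo {
    // Sort (name, grade) pairs by grade. Because Arrays.sort on reference
    // types is stable, entries with equal grades keep their prior order.
    static String[][] sortByGrade(String[][] students) {
        Arrays.sort(students, Comparator.comparing((String[] s) -> s[1]));
        return students;
    }

    public static void main(String[] args) {
        String[][] students = {
            {"Alice", "2"}, {"Bob", "1"}, {"Carol", "2"}, {"Dave", "3"}
        };
        // Alice stays ahead of Carol, since both have grade 2.
        for (String[] s : sortByGrade(students))
            System.out.println(s[0] + " " + s[1]);
    }
}
```

A primitive-type sort could not make this guarantee, which is one reason the two overloads of `Arrays.sort` use different algorithms.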

🌐 Graph Algorithms:

  • Breadth-First Search (BFS):

    • O(V+E) time complexity using a queue (FIFO)
    • explore nodes in "layers"
    • compute shortest paths
    • compute connected components of an undirected graph
  • Depth-First Search (DFS):

    • O(V+E) time complexity using a stack (LIFO) (or via recursion)
    • explore aggressively like a maze, backtrack only when necessary
    • compute topological ordering of a directed acyclic graph
    • compute strongly connected components of a directed graph
  • Depth-First Orders in a directed acyclic graph (DAG):

    • Preorder: root, left, right (put the vertex on a queue before the recursive calls)
    • Postorder: left, right, root (put the vertex on a queue after the recursive calls)
    • Reverse postorder: root, right, left
    • Inorder: left, root, right
    • Reverse preorder: right, left, root
    • Reverse inorder: right, root, left
    • Applications:
      • Preorder: copying a binary tree; prefix expression (Polish notation)
      • Inorder: returning values in order, according to the comparator of a binary search tree
      • Postorder: deleting nodes or an entire binary tree; postfix representation (reverse Polish notation)
      • Reverse postorder: topological sort
  • Topological Sort:

    • Given a directed graph, put the vertices in order such that all its directed edges point from a vertex earlier in the order to a vertex later in the order.
    • A directed graph has a topological order if and only if it is a directed acyclic graph (DAG).
    • DFS reverse postorder in a DAG is a topological sort.
    • With DFS, topological sort time complexity is O(V+E).
    • Application: sequence tasks while respecting all precedence constraints.
  • Strongly Connected Components:

    • strongly connected components (SCCs) of a directed graph G are the equivalence classes of the relation:
      u<->v <==> there exists a path u->v and a path v->u in G
    • Kosaraju's Two-Pass Algorithm computes all SCCs with two DFS passes, in O(V+E) time.
  • Dijkstra's Shortest Path algorithm:

    • computes the shortest path in an edge-weighted directed graph with non-negative weights
    • canonical use of Heap data structure: fast way to do repeated minimum computations
    • time complexity: O(E log V)
    • space complexity: O(V)
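Dijkstra's algorithm as described above can be sketched with `java.util.PriorityQueue`; this is a minimal illustration (method and variable names here are hypothetical, and the repo's `DijkstraShortestPath.java` may differ):

```java
import java.util.*;

public class DijkstraSketch {
    // Shortest distances from source in a directed graph with non-negative
    // edge weights; adj.get(u) holds the outgoing edges of u as {v, weight}.
    static int[] shortestPaths(List<List<int[]>> adj, int source) {
        int n = adj.size();
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[source] = 0;
        // Min-heap keyed on tentative distance: the "repeated minimum
        // computation" that gives the O(E log V) bound.
        PriorityQueue<int[]> pq =
            new PriorityQueue<>(Comparator.comparingInt((int[] a) -> a[1]));
        pq.add(new int[]{source, 0});
        while (!pq.isEmpty()) {
            int[] top = pq.poll();
            int u = top[0];
            if (top[1] > dist[u]) continue;      // stale heap entry, skip
            for (int[] e : adj.get(u)) {
                int v = e[0], w = e[1];
                if (dist[u] + w < dist[v]) {     // relax edge u -> v
                    dist[v] = dist[u] + w;
                    pq.add(new int[]{v, dist[v]});
                }
            }
        }
        return dist;
    }
}
```

Instead of a decrease-key operation, this sketch inserts duplicate heap entries and skips the stale ones, a common trade-off when using `PriorityQueue`, which has no decrease-key.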

💎 Heap (Priority Queue):

  • Canonical use of Heap: 💰 fast way to do repeated minimum computations - perform Insert, Extract-Min in O(log n) time
  • Supported Operations:
    • Insert: O(log n)
    • Extract-Min: O(log n) - remove an object with a minimum key value, ties broken arbitrarily
    • Peek: O(1)
      Also:
    • Heapify: O(n) - batch insert n elements
    • Delete: O(log n)
  • Conceptually: a perfectly balanced binary tree of height = log n
  • Heap Property: at every node, key <= children’s keys
  • Implementation: array (index starts at 1)
    • children of i = 2i, 2i+1
    • parent of i = ⌊i/2⌋
    • Extract-Min: swap the root with the last leaf, remove the last leaf, then bubble the new root down
    • Insert: append after the last leaf, then bubble up
  • Applications: sorting, Dijkstra's Shortest Path algorithm, event manager, median maintenance
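One of the applications listed above, median maintenance, can be sketched with two heaps: a max-heap for the lower half of the stream and a min-heap for the upper half. This is an illustrative sketch with hypothetical names; the repo's `MaintainingMedian.java` may be organized differently.

```java
import java.util.*;

public class MedianSketch {
    // Max-heap holds the lower half, min-heap the upper half;
    // after each insert the sizes differ by at most 1.
    private final PriorityQueue<Integer> low =
        new PriorityQueue<>(Collections.reverseOrder());
    private final PriorityQueue<Integer> high = new PriorityQueue<>();

    void add(int x) {                           // O(log n) per insert
        if (low.isEmpty() || x <= low.peek()) low.add(x);
        else high.add(x);
        // Rebalance so neither half outgrows the other by more than 1.
        if (low.size() > high.size() + 1) high.add(low.poll());
        else if (high.size() > low.size() + 1) low.add(high.poll());
    }

    double median() {                           // O(1) peek at the tops
        if (low.size() > high.size()) return low.peek();
        if (high.size() > low.size()) return high.peek();
        return (low.peek() + high.peek()) / 2.0;
    }
}
```

Each element is inserted once and moved between heaps at most a constant number of times, so maintaining the median over n elements costs O(n log n) total.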

🌲 Balanced Binary Search Tree:

  • Raison d'être: 💰 like a sorted array, plus fast (logarithmic) inserts and deletes
  • Search Tree Property: each node's key > all keys in the node's left subtree, and < all keys in the node's right subtree
  • Binary: each node has at most 2 children
  • The height of a BST: log2(n) (best case, balanced) <= height <= n (worst case, a chain)
  • Insert a new key k into a tree:
    • search for k (unsuccessfully)
    • rewire final NULL pointer to point to new node with key k
  • Search and Insert worst-case running time: Θ(height)
  • Delete a key k:
    1. easy case: k’s node has no children
    2. medium case: k’s node has 1 child
    3. difficult case: k’s node has 2 children
  • Balanced Search Tree:
    • Idea: ensure that height is always best possible: O(log n)
    • then Search / Insert / Delete / Min / Max / Predecessor / Successor will run in O(log n) time
    • Examples: red-black trees, AVL trees, splay trees, B-trees
  • Red-Black Tree:
    • Red-Black Tree Invariants:
      1. Each node red or black
      2. Root is black
      3. No 2 reds in a row (red node => only black children)
      4. Every root-null path (like in an unsuccessful search) has same number of black nodes
    • Height Guarantee: every red-black tree with n nodes has a height of O(log n).
    • Time complexity of Search, Insert, Delete in a red-black tree: O(log n).
| Operations / Running Times | Balanced Search Tree | Sorted Array |
| --- | --- | --- |
| Search | Θ(log n) | Θ(log n) |
| Select (given order statistic i) | O(log n) | O(1) |
| Min/Max | O(log n) | O(1) |
| Predecessor/Successor (given pointer to a key) | O(log n) | O(1) |
| Rank (# of keys ≤ a given value) | O(log n) | O(log n) |
| Output in Sorted Order | O(n) | O(n) |
| Insert | O(log n) | not supported (would take Θ(n) time) |
| Delete | O(log n) | not supported (would take Θ(n) time) |

Also supported by heaps: Min (Extract-Min), Insert, Delete
Also supported by hash tables: Search, Insert, Delete
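The insert and search operations described above can be sketched as a minimal (unbalanced) BST; this is illustrative only, with hypothetical names, and is not the repo's `BinarySearchTree.java`:

```java
public class BstSketch {
    static final class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Insert: search for key, and rewire the final null link
    // to point to a new node carrying the key.
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else if (key > root.key) root.right = insert(root.right, key);
        return root;                    // duplicate keys are ignored
    }

    // Search runs in Theta(height): log n if balanced, n for a chain.
    static boolean contains(Node root, int key) {
        if (root == null) return false;
        if (key < root.key) return contains(root.left, key);
        if (key > root.key) return contains(root.right, key);
        return true;
    }
}
```

Without the rebalancing of a red-black or AVL tree, inserting keys in sorted order degenerates this structure into a chain, which is exactly the worst case noted above.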


📊 Hash Tables

  • Purpose: maintain a (possibly evolving) set of values indexed by key
  • Clue to use Hash Table: repeated lookups (fast constant time)
  • 💰 Amazing Guarantee:
    Insert, Delete, Search - all operations in O(1) time, using a key, subject to 2 caveats:
    1. Hash function is properly implemented (provides a uniform distribution of hash values)
    2. Data is non-pathological (there is no worst-case guarantee against adversarial inputs)
  • Resolving collisions:
    • Solution 1: separate chaining (keep linked list in each bucket)
    • Solution 2: open addressing (only one object per bucket): linear probing, double hashing
  • Sample applications: de-duplication, the 2-sum problem, symbol tables in compilers, game tree exploration.
  • The load of a hash table: α = (# of objects in the hash table) / (# of buckets)
    • load α = O(1) is a necessary condition for operations to run in constant time
    • for good hash table performance, the load must be controlled
  • Pathological Data can paralyze real-world systems by exploiting badly designed open source hash functions. Solutions:
    1. cryptographic hash functions (infeasible to reverse engineer)
    2. randomization ("universal family of hash functions")
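The 2-sum application mentioned above can be sketched with a `HashSet`; names here are hypothetical, and this is a minimal illustration of why repeated O(1) expected-time lookups beat checking all pairs:

```java
import java.util.HashSet;
import java.util.Set;

public class TwoSumSketch {
    // 2-sum via a hash set: O(n) expected time,
    // versus O(n^2) for checking every pair.
    static boolean hasPairWithSum(int[] a, int target) {
        Set<Integer> seen = new HashSet<>();
        for (int x : a) {
            // O(1) expected lookup: have we seen the complement?
            if (seen.contains(target - x)) return true;
            seen.add(x);
        }
        return false;
    }
}
```

Adding each element only after checking its complement ensures the same array element is never paired with itself.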

📚 Algorithm Design Paradigms:

  • Divide & conquer:
    • Example: Mergesort
    • Master Method, straightforward inductive correctness proof
  • Randomized algorithms:
    • Example: Quicksort
  • Greedy algorithms:
    • Iteratively make "myopic" locally optimal decisions, hoping everything works out in the end.
    • Examples:
      • Dijkstra’s shortest path (process each destination once, irrevocably)
      • Optimal caching (Farthest-in-Future or Least Recently Used algorithm, given the locality of reference)
    • 💣 Danger: Most greedy algorithms are NOT correct: for example, Dijkstra with negative edge weights.
  • Dynamic programming:
    • Main difference from greedy algorithms: After every stage, dynamic programming makes decisions based on all the decisions made in the previous stage, and may reconsider the previous stage's choices.
    • Example: Sequence alignment
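The knapsack entry in the tree view above (O(N * W)) is a classic instance of this paradigm; here is an illustrative 0/1 knapsack sketch with hypothetical names, not the repo's `DynamicProgrammingKnapsack.java`:

```java
public class KnapsackSketch {
    // 0/1 knapsack in O(N * W) time and O(W) space:
    // best[w] = max total value achievable with capacity w.
    static int maxValue(int[] weight, int[] value, int capacity) {
        int[] best = new int[capacity + 1];
        for (int i = 0; i < weight.length; i++)
            // Iterate capacities downward so each item is used at most once.
            for (int w = capacity; w >= weight[i]; w--)
                best[w] = Math.max(best[w], best[w - weight[i]] + value[i]);
        return best[capacity];
    }
}
```

Unlike a greedy pass, each subproblem `best[w]` is revisited for every item, which is exactly the "reconsider previous choices" behavior that distinguishes dynamic programming.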
