CS Algorithms Real-World Applications

Jun 23, 2020
6 min read

Updated: Jul 17, 2020

In computer science an algorithm is a sequence of instructions/steps with a given number of input(s) to provide an output(s) in order to solve a problem or a calculation.

Algorithms are applied to, or on top of, Data Structures, which are a way to store, organize, and retrieve data. Or a Data Structure is used within an Algorithm.

Data structure is a data organization, management, and storage format that enables efficient access and modification. More precisely, a data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data.

Common Data Structures:

Arrays
Lists
Dictionary
Stacks
Queues
Deques
Trees
Hashes
Graphs

By type-

Linear Data Structures:

Arrays:

Dynamic Arrays

Lists:

Array List

Linked-List

Double-Linked List

Skip-List

Trees:

Binary Search Trees

Heaps

Binary Heaps & Heapsort- for Priority Queues

Complex Trees: Cartesian Trees, B-Trees, Red-Black Tree, AVL Tree

Hashes:

Hash Tables

Graphs:

Adjacency List

Adjacency Matrix

Directed acyclic Graph

Common Algorithms:

Sorting:

Quick Sort

Merge Sort

Bubble Sort

Insertion Sort

Shell Sort

Selection Sort

Tree Sort

Bucket Sort

Radix Sort

Counting Sort

Cubesort

Heapsort

Search:

Sequential

Binary Search

Binary Tree Search

Binary Heap Search

Hashing (used to search hash tables)

Breadth-First Search (used to search Graphs)

Depth-First Search (used to search Graphs)

Common Difficult Algorithms:

Dijkstra

CLASSIFICATIONS OF ALGORITHMS:

By implementation

1. A recursive algorithm is one that invokes (makes reference to) itself repeatedly until a certain condition (also known as termination condition) matches.

examples: "8 queens", "BST walk", and "recursive descent parsing"

2. Logical.

3. Distributed, Series, Parallel.

By runtime complexity

Algorithms can be classified by the amount of time they need to complete in a computer memory compared to their input size:

Constant time: O(1) if the time needed by the algorithm is the same, regardless of the input size. E.g. an access to an array element.
Logarithmic time: O(log n) if the time is a logarithmic function of the input size. E.g. binary search algorithm.
Linear time: O(n) if the time is proportional to the input size. E.g. the traverse of a list.
Polynomial time: O(n^2 or n^3) if the time is a power of the input size. E.g. the bubble sort algorithm has quadratic time complexity.
Exponential time: O(2^n) if the time is an exponential function of the input size. E.g. Brute-force search.
Factorial time: O(n!) if the time is a factorial function of the input size. E.g. you have n! number of lists

By design

Brute-force or exhaustive search - This is the naive method of trying every possible solution to see which is best

A divide and conquer algorithm repeatedly reduces an instance of a problem to one or more smaller instances of the same problem (usually recursively) until the instances are small enough to solve easily. One such example of divide and conquer is merge sorting. Sorting can be done on each segment of data after dividing data into segments and sorting of entire data can be obtained in the conquer phase by merging the segments. A simpler variant of divide and conquer is called a decrease and conquer algorithm, that solves an identical subproblem and uses the solution of this subproblem to solve the bigger problem. Divide and conquer divides the problem into multiple subproblems and so the conquer stage is more complex than decrease and conquer algorithms. An example of a decrease and conquer algorithm is the binary search algorithm.

Search and enumeration - Many problems (such as playing chess) can be modeled as problems on graphs. A graph exploration algorithm specifies rules for moving around a graph and is useful for such problems. This category also includes search algorithms,branch and bound enumeration and backtracking.

Randomized algorithm - Such algorithms make some choices randomly (or pseudo-randomly). They can be very useful in finding approximate solutions for problems where finding exact solutions can be impractical (see heuristic method below).

By Optimization Type Problems

Linear programming

When searching for optimal solutions to a linear function bound to linear equality and inequality constraints, the constraints of the problem can be used directly in producing the optimal solutions. There are algorithms that can solve any problem in this category, such as the popular simplex algorithm. Problems that can be solved with linear programming include the maximum flow problem for directed graphs. If a problem additionally requires that one or more of the unknowns must be an integer then it is classified in integer programming. A linear programming algorithm can solve such a problem if it can be proved that all restrictions for integer values are superficial, i.e., the solutions satisfy these restrictions anyway. In the general case, a specialized algorithm or an algorithm that finds approximate solutions is used, depending on the difficulty of the problem.

Dynamic programming

When a problem shows optimal substructures—meaning the optimal solution to a problem can be constructed from optimal solutions to subproblems—and over lapping subproblems, meaning the same subproblems are used to solve many different problem instances, a quicker approach called dynamic programming avoids recomputing solutions that have already been computed. For example, Floyd–Warshall algorithm, the shortest path to a goal from a vertex in a weighted graph can be found by using the shortest path to the goal from all adjacent vertices.

Dynamic programming and memoization go together. The main difference between dynamic programming and divide and conquer is that subproblems are more or less independent in divide and conquer, whereas subproblems overlap in dynamic programming. The difference between dynamic programming and straightforward recursion is in caching or memoization of recursive calls. When subproblems are independent and there is no repetition, memoization does not help; hence dynamic programming is not a solution for all complex problems. By using memoization or maintaining a table of subproblems already solved, dynamic programming reduces the exponential nature of many problems to polynomial complexity.

Dynamic Programming: Matrix-Chain Multiplication, Longest Common Subsequence

The greedy method

A greedy algorithm is similar to a dynamic programming algorithm in that it works by examining substructures, in this case not of the problem but of a given solution. Such algorithms start with some solution, which may be given or have been constructed in some way, and improve it by making small modifications. For some problems they can find the optimal solution while for others they stop at local optima, that is, at solutions that cannot be improved by the algorithm but are not optimum. The most popular use of greedy algorithms is for finding the minimal spanning tree where finding the optimal solution is possible with this method. Huffman Tree, Kruskal, Prim, Sollin are greedy algorithms that can solve this optimization problem.

Greedy Algorithms: Activity-Selection Problem, Huffman Codes

Top Algorithms and Data Structures for Competitive Programming

Graph algorithms

Breadth First Search (BFS)
Depth First Search (DFS)
Shortest Path from source to all vertices **Dijkstra**
Shortest Path from every vertex to every other vertex **Floyd Warshall**
Minimum Spanning tree **Prim**
Minimum Spanning tree **Kruskal**
Topological Sort

Dynamic Programming

Longest Common Subsequence
Matrix-Chain Multiplication
Longest Increasing Subsequence
Edit Distance
Minimum Partition
Ways to Cover a Distance
Longest Path In Matrix
Subset Sum Problem
Optimal Strategy for a Game
0-1 Knapsack Problem
Assembly Line Scheduling

Searching And Sorting

Binary Search
Quick Sort
Merge Sort
Order Statistics
KMP algorithm
Rabin karp
Z’s algorithm
Aho Corasick String Matching
Counting Sort
Manacher’s algorithm: Part 1, Part 2 and Part 3

Data Structures

Binary Indexed Tree or Fenwick tree
Segment Tree (RMQ, Range Sum and Lazy Propagation)
K-D tree (See insert, minimum and delete)
Union Find Disjoint Set (Cycle Detection and By Rank and Path Compression)
Tries
Suffix array (this, this and this)
Sparse table
Suffix automata
Suffix automata II
LCA and RMQ

Number theory and Other Mathematical- Prime Numbers and Prime Factorization

Geometrical and Network Flow Algorithms

Natural Language Processing

N-grams

Levenshtein distance

Markov Chains

Naive Bayes

Latent Dirichlet allocation (LDA)

Neural Networks Training Algorithms

Real World Applications

Real World Applications of Sorting:

Merge Sort: Databases use an external merge sort to sort sets of data that are too large to be loaded entirely into memory. The driving factor in this sort is the reduction in the number of disk I/Os.
Bubble Sort: Bubble sort is used in programming TV remote to sort channels on the basis of longer viewing time.
Heap Sort: Heap sort is used in reading barcodes on plastic cards. The service allows to communicate with the database to constantly run checks to ensure that they were all still online and had to constantly report statistics on which readers were performing the worst, which ones got the most/least user activity, etc.
Quick Sort: Sports scores are organised by quick sort on the basis of win-loss ratio.
Radix Sort: eBay allows you to sort listings by the current Bid amount leveraging radix sort.
Selection Sort: K12 education portal allows sorting list of pupils alphabetically through selection sort.

real-world application of LIS is Patience Diff, a diffing algorithm by Bram Cohen (the creator of BitTorrent) which is used in the Bazaar version control system. The regular diff algorithm involves computing the LCS (Longest Common Subsequence) between two documents.