数据结构与算法笔记

PREFACE


Reference Book:

  • Data Structures and Algorithm Analysis in C++ 4th Edition
  • Data Structures and Algorithm Analysis in Java 3rd Edition

📓 This note would present code in multiple languages (C++, Java and Python) where they are tested well to ensure their running. Make sure you have commanded at least one of them first, or it will be hard for you to read.

Copyright: WRITTEN BY RYKER ZHU in Shanghai under CC BY-NC-SA CC BY-NC-SA

Unless otherwise noted, all graphics and images in this post were drawn by the author and are protected by copyright under the Creative Commons agreement. Any commercial use of the works is strictly prohibited. By copying or adapting this article, you are agreeing to and accepting the restrictions of the terms and conditions.

If you find any errors, please contact me via email and I will make corrections as soon as possible.

Tips: You can click on the table of contents on the right sidebar of the web page to navigate.

Overview

C++ Details

Lvalues, Rvalues and References

As the textbook defines,

Lvalue: expression that identifies a non-temporary object;
Rvalue: expression that identifies a temporary object or or is a value not associated with any object.

📓 If a variable has a name, then regardless of whether it is modifiable, it is a lvalue.

Click Here to know more about lvalues and rvalues

Example

1
2
666 = variable;         // ERROR !!!
(variable + 1) = 1234; // ERROR !!!

Neither the constant 666 nor the expression variable + 1 are lvalues since they are temporary results of expressions, and they reside in some temporary register for the duration of computation.

1
2
const int variable = 1234;  // It's a lvalue,
variable = 4321; // but it cannot be assigned!

Not all lvalues can be assigned to but except for this special case, initially, lvalues meant "values suitable for left-hand-side of assignment".

1
2
3
std::string firstName = "Ryker";    // Lvalue
std::string lastName = "Zhu"; // Lvalue
std::string fullName = firstName + lastName; // Converted to Rvalue

All lvalues that aren't arrays, functions or of incomplete types can be converted to rvalues.

1
2
3
4
5
6
7
8
9
10
11
void doSomething(int& lvalue_ref) {
}

int main() {
int a = 10;
doSomething(a); // WORKS
doSomething(10); // ERROR

const int& b = 10;
doSomething(b); // WORKS
}

Rvalues cannot be converted into a lvalue reference but can be const lvalue reference. (That is why pass-by-constant is so common in C++)

1
2
3
4
5
6
7
void doSomething(cosnt int& value) {
std::cout << "[lvalue] " << value << std::endl;
}

void doSomething(int&& value) {
std::cout << "[rvalue] " << value << std::endl;
}

C++11 standard supports rvalue reference and it is useful for detecting whether an object is temporary or persistent and especially with move semantics.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
class string {
public:
string(char* str) {
m_size = strlen(str) + 1;
m_data = new char[m_size]; // Deep Copy
memcpy(m_data, str, m_size);
printf("Created!\n");
}

~string() {
printf("Destroyed!\n");
delete m_data;
}

string(const string& another) {
m_size = another.m_size;
m_data = new char[m_size]; // Deep Copy
memcpy(m_data, another.m_data, m_size);
printf("Copied!\n");
}

string(string&& another) {
// Move Semantics
// Just like stealing the data from another object
m_data = another.m_data; // Move the pointer
m_size = another.m_size;

another.m_size = 0;
another.m_data = nullptr;
printf("Moved!\n");
}

void print() {
for(uint32_t i = 0; i < m_size; ++i)
printf("%c", m_data[i]);
}

private:
char* m_data;
int m_size;
};

class entity {
public:
entity(const string& str) : m_string(str) {
}

entity(string&& str) : m_string(std::move(str)) {
// std::move works same as just simply
// cast it into rvalue reference.
}

void print() {
m_string.print();
}

private:
string m_string;
};

int main() {
string lvalue_string("Hello!\n");
entity copied_entity(lvalue_string);
copied_entity.print();

entity moved_entity("Bonjour!\n");
moved_entity.print();
}

Generic Classes and Templates

Both Java and C++ provides "template" but their mechanisms differ from each other. Java employs Type Erasure (since all classes written in Java extends from java.lang.Object) while C++ template member functions would only be created when they are called.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
template<typename Type>
class MemoryCell {
public:
explicit MemoryCell(const Type &initial = Type()) :
m_data(initial) {}

void write(const Type& data) {
m_data = data;
}

const Type& read() {
return m_data;
}

private:
Type m_data;
}

Same class written in Java:

1
2
3
4
5
public class MemoryCell<Type> {
public Type read() {return m_data;}
public void write(Type data) {m_data = data;}
private Type m_data;
}

Algorithm Analysis

Mathematical Analysis

Some contents are just directly copied from another post: Note of Introduction to Algorithms

Running Time Calculation

General Rules

  • for loops

    The running time of a for loop is at most the running time of the statements inside the for loop (including tests) times the number of iterations.

  • Nested loops

    The total running time of a statement inside a group of nested loops is the running time of the statement multiplied by the product of the sizes of all the loops.

  • Consecutive statements

    These just add (which means that the maximum is the one that counts).

  • if / else

    The running time of an if/else statement is never more than the running time of the test plus the larger one of the running times of two braches.

Logarithms in the Running Time

General rules to distinguish between \(O(n)\) and \(O(\log n)\)

  • An algorithm is \(O\left(\log{n}\right)\) if it takes constant \(O\left(1\right)\) time to cut the problem size by a fraction (which is usually \(\dfrac12\)).
  • An algorithm is \(O\left(n\right)\) if constant time is required to merely reduce the problem by a constant amount (such as to make the problem smaller by 1).
Algorithms Running Time
Binary Search Logarithm
Euclidean Algorithm

Lists, Stacks and Queues

flowchart LR;
logical(Logical Structure) --- linear(Linear Structure) & nonlinear(Non-linear Structure)
linear --- List & Stack & Queue & String & Array
nonlinear --- Tree & Graph

click Tree href "https://devexzh.github.io/2023/Note_Of_Data_Structures_And_Algorithms/#trees" _blank

click Graph href "https://devexzh.github.io/2023/Note_Of_Data_Structures_And_Algorithms/#graph-algorithms" _blank

click Stack href "https://devexzh.github.io/2023/Note_Of_Data_Structures_And_Algorithms/#stacks" _blank

click Queue href "https://devexzh.github.io/2023/Note_Of_Data_Structures_And_Algorithms/#queues" _blank

Stacks

A stack is a list with the restriction that insertions and deletions can be performed in only one position, namely, the end of the list, called the top.

Implementations

  1. Array or vector based

    Features: easy, convenient but easy to cause errors.

    • Stack Underflow — An error called when an item is popped from a stack, but the stack is empty.
    • Stack Overflow — An error called when an item is pushed onto a stack, but the stack is full.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    #define STACK_MAX_SIZE 128
    template<typename ElementType>
    class Stack {
    private:
    ElementType *base, *top;
    int stacksize;

    public:
    Stack() : base(new ElementType[STACK_MAX_SIZE]),
    top(base),
    stacksize(STACK_MAX_SIZE) {
    if(!base) throw "Stack Overflow";
    }

    ~Stack() noexcept {
    delete[] base;
    }

    void push(const ElementType& value) {
    if(top - base == stacksize)
    throw "Full Stack";
    *top = value;
    ++top;
    }

    ElementType& pop() {
    if(top == base)
    throw "Empty Stack";
    ElementType* e = top;
    --top;
    return *e;
    }

    bool isEmpty() const noexcept {
    return top == base;
    }

    int size() const noexcept {
    return top - base;
    }
    };
  2. Linked-list based

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    template<typename ElementType>
    class Stack {
    private:
    class StackNode {
    friend Stack;
    private:
    ElementType data;
    StackNode* next;
    } *top;

    public:
    Stack() noexcept : next(nullptr) {}
    ~Stack() noexcept {
    StackNode *p = top;
    while(p->next) {
    delete p;
    p = p->next;
    }
    }

    void push(const ElementType& value) noexcept {
    StackNode* p = new StackNode;
    p->data = value;
    p->next = top;
    top = p;
    }

    ElementType& pop() {
    if(!top) throw "Stack Underflow";
    ElementType&& e = top->data;
    StackNode* p = top;
    top = top->next;
    delete p;
    return e;
    }

    bool isEmpty() const noexcept {
    return !top;
    }

    int size() const noexcept {
    if(!top) return 0;
    StackNode* p = top;
    int count = 1;
    while(p->next) {
    p = p->next;
    ++count;
    }
    return count;
    }
    };

Standard Libraries Support

C++ Java
Standard Library #include <stack>
std::stack<T>
package java.util;
Stack<E> extends Vector<E>
Check if is empty bool empty() const boolean empty()
Number of elements size_type size() const int size()
(inherited from Vector<E>)
Top element access T& top() E peek()
Push void push(const T&) E push(E)
Pop void pop() E pop()

Queues

Like stacks, queues are lists. With a queue, however, insertion is done at one end whereas deletion is performed at the other end.

Implementations

  1. Array or vector based

    When array is employed to serve as the base container of queue, it would sometimes encounter the problem of overflow (the pointer rear points out of range, as is depicted as the fourth figure above). A better way to solve it is to consider the array as a circular one (i.e. circular array), as is shown below.

    There still exists a problem: how to deal with rear == front since both empty \(Q_e\) and full queue \(Q_f\) satisfy the condition, as is shown above. There are three methods to wrestle with it:

    1. Add an extra boolean field to distinguish whether it is full or empty;
    2. Add an extra integer field to keep the record of number of elements;
    3. Reserve an index intentionally.


    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    #define MAX_QUEUE_SIZE 128
    template<typename ElementType>
    class Queue {
    private:
    ElementType *base;
    int front, rear;

    public:
    Queue() : font(0), rear(0),
    base(new ElementType[MAX_QUEUE_SIZE]) {}

    void enqueue(const ElementType& value) {
    if((rear + 1) % MAX_QUEUE_SIZE == front)
    throw "Full Stack, cannot enqueue";
    base[rear] = value;
    // Simply increment rear might cause overflow
    rear = (rear + 1 == MAX_QUEUE_SIZE ? 0 : rear + 1);
    }

    ElementType& dequeue() {
    if(front == rear)
    throw "Empty Queue, Nothing to dequeue";
    ElementType&& e = base[front];
    front = (front + 1 == MAX_QUEUE_SIZE ? 0 : front + 1);
    return e;
    }

    int size() const noexcept {
    return (rear - front + MAX_QUEUE_SIZE) % MAX_QUEUE_SIZE;
    }

    bool isEmpty() const noexcept {
    return front == rear;
    }
    };
  2. Linked-list based

Trees

A tree is a collection of \(N\) nodes.

  • \(N=0\), it is empty;

  • \(N\gt 0\), it is consists of

    1. a distinguished node \(r\), called the root;
    2. zero or more nonempty (sub)trees \(T_1,\,T_2,\,\cdots,\,T_k\), each of whose roots are connected by a directed edge from r.

Properties

  • Depth: The length of the unique path from the root to a node \(n_i\). (\(\text{Level}=\text{depth}+1\))

  • Height: The length of the longest path from a node \(n_i\) to a leaf.

    • The height of a tree is equal to the height of the root.

Binary Trees

Properties

  1. A binary tree has \(\textcolor{orange}{\text{at most}}\) \(2^d\) nodes with depth \(d\);
  2. A binary tree has \(\textcolor{orange}{\text{at most}}\) \(2^{h-1}-1\) nodes with height \(h\);
  3. For all binary tree \(T\), if it has \(n_0\) leaves and \(n_2\) nodes with degree of \(2\), then \(n_0=n_2+1\).

Full Binary Trees 🆚 Complete Binary Trees

  • A full binary tree (sometimes proper binary tree) is a tree in which every node other than the leaves has two children.
  • A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible.

Implementation (C++)

  1. Sequential storage

    1
    2
    template<typename ElementType>
    using BinaryTree<ElementType> = std::vector<ElementType>;
  2. Linked-list storage

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    template<typename ElementType>
    class BinaryTree {
    public:
    BinaryTree() : left_child(nullptr), right_child(nullptr) {}
    ~BinaryTree() {
    delete left_child;
    delete right_child;
    left_child = right_child = nullptr;
    }

    private:
    BinaryTree* left_child, right_child;
    ElementType data;
    };

    or add an extra field parent, a pointer to its parent

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    template<typename ElementType>
    class BinaryTree {
    public:
    BinaryTree() : left_child(nullptr), right_child(nullptr), parent(nullptr) {}
    ~BinaryTree() {
    delete left_child;
    delete right_child;
    left_child = right_child = nullptr;
    }

    protected:
    BinaryTree* left_child, right_child, parent;
    ElementType data;
    };

Tree Traversals

Preorder Traversal Inorder Traversal Postorder Traversal
If the tree is empty, then simply return.
  1. Visit the root;
  2. Traverse the left subtree;
  3. Traverse the right subtree.
  1. Traverse the left subtree;
  2. Visit the root;
  3. Traverse the right subtree.
  1. Traverse the left subtree;
  2. Traverse the right subtree;
  3. Visit the root;


Implementation (C++)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
template<typename ElementType>
void PreorderTraversal(BinaryTree<ElementType> *tree) {
if(!tree) return;
visit(tree); // Visit the root node
PreorderTraversal(tree->left_child);
PreorderTraversal(tree->right_child);
}

template<typename ElementType>
void InorderTraversal(BinaryTree<ElementType> *tree) {
if(!tree) return;
InorderTraversal(tree->left_child);
visit(tree); // Visit the root node
InorderTraversal(tree->right_child);
}

template<typename ElementType>
void PostorderTraversal(BinaryTree<ElementType> *tree) {
if(!tree) return;
PostorderTraversal(tree->left_child);
PostorderTraversal(tree->right_child);
visit(tree); // Visit the root node
}

Applications

Mathematical notations

Preorder Traversal: +a×bcd÷ef
Also called Polish notation (PN)
Inorder Traversal: a+b×cd-e÷f
Postorder Traversal: abcd-×+ef÷-
Also called Reverse Polish notation (RPN)

Construct a binary tree

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
template<typename ElementType>
void BinaryTree::createBinaryTree(BinaryTree<ElementType>* tree) {
char current_char;
std::cin >> current_char;
if(current_char == '#')
tree = nullptr;
else {
tree = new BinaryTree<ElementType>();
tree->data = current_char;
createBinaryTree(tree->left_child);
createBinaryTree(tree->right_child);
}
}

BinaryTree::BinaryTree() {
char current_char;
std::cin >> current_char;
if(current_char != '#') {
data = current_char;
createBinaryTree(left_child);
createBinaryTree(right_child);
}
}

Copy a binary tree

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
template<typename ElementType>
void BinaryTree::copyBinaryTree(BinaryTree<ElementType> *original,
BinaryTree<ElementType> *target) {
if(!original) return;
else {
target = new BinaryTree<ElementType>();
target->data = original->data;
copyBinaryTree(original->left_child, target->left_child);
copyBinaryTree(original->right_child, target->right_child);
}
}

template<typename ElementType>
BinaryTree::BinaryTree(BinaryTree<ElementType>* another) {
copyBinaryTree(this, another);
}

Calculate the depth

1
2
3
4
5
6
7
8
9
10
unsigned int BinaryTree::depth() const {
return depth(this);
}

template<typename ElementType>
unsigned int BinaryTree::depth(const BinaryTree<ElementType> *tree) const {
if(!tree) return 0;
int&& m = depth(tree->left_child), n = depth(tree->right_child);
return (m > n ? m + 1 : n + 1);
}

Calculate the number of nodes

1
2
3
4
5
6
7
8
9
10
unsigned int BinaryTree::nodeCount() const {
return nodeCount(this);
}

template<typename ElementType>
unsigned int BinaryTree::nodeCount(const BinaryTree<ElementType> *tree) const {
if(!tree) return 0;
return nodeCount(tree->left_child) +
nodeCount(tree->right_child) + 1;
}

Calculate the number of leaves

1
2
3
4
5
6
7
8
9
10
11
unsigned int BinaryTree::leafCount() const {
return leafCount(this);
}

template<typename ElementType>
unsigned int BinaryTree::leafCount(const BinaryTree<ElementType> *tree) const {
if(!tree) return 0;
if(!tree->left_child && !tree->right_child) return 1;
return leafCount(tree->left_child) +
leafCount(tree->right_child);
}

Binary Search Trees

The property that makes a binary tree into a binary search tree is that for every node \(X\),

  • the values of all the items in its left subtree are smaller than the item in \(X\);
  • and the values of all the items in its right subtree are larger than the item in \(X\).

Implementation (C++)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
template<typename ElementType>
class BinarySearchTree : public BinaryTree<ElementType> {
using Node = BinarySearchTree<ElementType>;

Node* search(const ElementType& key) {
if(data == key) return this;
if(key < data) return search(left_child, key);
else return search(right_child, key);
}

Node* search(Node* node, const ElementType& key) {
if(!node || node->data == key) return node;
if(key < node->data) return search(node->left_child, key);
else return search(node->right_child, key);
}
};

Treaps

AVL Trees

An AVL (Adelson-Velskii and Landis) tree is a binary search tree with a balance condition.

An AVL tree is identical to a binary search tree, except that for every node in the tree, the height of the left and right subtrees \(\mathrm{H}_L,\,\mathrm{H}_R\) can differ by at most 1, i.e. \(\left|\mathrm{H}_L - \mathrm{H}_R\right|\leq1\).

Adjustments for unbalanced AVL trees

Red-Black Trees

Red–black trees are isomorphic to 2-3 trees, meaning that they are equivalent data structures.

To understand red-black trees, hence, we should first know what 2-3 trees are:

By artificially making the right key in the 3-nodes of the 2-3 tree point to the left key and setting the color of the left key to red, we get a transformed 2-3 tree; then we just need to make the red node the left child of the black node, and we get a red-black tree.

Splay Trees

The basic idea of the splay tree is that after a node is accessed, it is pushed to the root by a series of AVL tree rotations.

Splay tree is another kind of binary search tree, with the property that the most recently searched or added content will be moved to the root of the tree, so that when the same content is searched for next time, the speed can be improved, because the tree search starts from the root, and the closer to the root, the faster it will be found.

The act of moving the most recently searched item to the root is called splay. It is important to note that after splaying, the tree may not necessarily be a balanced tree, but may become a skewed tree at worst.

Top-Down Splay Trees

Suffix Trees

k-d Trees

Pairing Heaps

Heaps

Sorting

Hashing

Hash Table is a data structure that maps keys into their location of storage where the mapping is called a hash function \(\mathrm{Location}=\mathrm{Hash}\left(key\right)\).

Terminologies:

  • Collision: two keys hash to the same value, i.e. \({key}_1\neq{key}_2\) but \(\mathrm{Hash}\left({key}_1\right)=\mathrm{Hash}\left({key}_2\right)\);
  • Those colliding keys are called synonyms.

Hash Function

Principles of designing a "good" hash function:

  1. As simple as possible to increase the speed of processing of mapping;
  2. Addresses should be distributed as evenly as possible to minimize wasted space.

Several hash functions:

  • Division Modulo Method

    \[\mathrm{Hash}\left(key\right)=key\;\mathrm{mod}\;p\]

    The hash function divides the value \(key\) by \(p\) and then uses the remainder obtained. It is best suited that \(p\) is less than or equal to the size of the hash table and is a prime number as that can make sure the keys are more uniformly distributed. The hash function is dependent upon the remainder of a division.

  • Mid Square Method

  • Folding Method

Methods to solve collisions

Open Addressing

Basic Idea: If there exists collision, try the next alternative cells until an empty cell is found.

Linear Probing

Collision Function: \(f\left(i\right)=i\)

Quadratic Probing

Collision Function: \(f\left(i\right)=\left(-1\right)^{i-1}i^2\)

Separate Chaining

The Disjoint Sets Class

The equivalence class of an element \(a\in{S}\) is the subset of \(S\) that contains all the elements that are related to \(a\).

  • An equivalence relation is a relation \(R\) that satisfies three properties: reflexive, symmetric, transitive, denoted by \(\sim\).
  • To decide if \(a\sim b\), we need only to check whether \(a\) and \(b\) are in the same equivalence class.

Implementation (C++)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
#include <vector>
#include <numeric> // import std::iota since C++11

typedef class DisjointSetUnion {
public:
explicit DisjointSetUnion(size_t size) noexcept
: parents(size) {
// Fill the vector with consecutive numbers
// i.e. {0, 1, 2, ...}
std::iota(parents.begin(), parents.end(), 0);
}

size_t find(size_t node) const noexcept {
// Only the root node would point to itself
return parents[node] == node ? node
// Compress the path to accelerate the query
#ifdef OPTIMIZATION_ENABLED
: parents[node] = find(parents[node]);
#else
// If ignore the compression, the branch can
// be written as find(parents[node])
: find(parents[node]);
#endif
}

void unionSets(size_t node1, size_t node2) noexcept {
// To union two sets, we can simply make the root of one node
// point to another's root
#ifndef OPTIMIZATION_ENABLED
parents[find(node1)] = find(node2);
#else
// Attach the smaller tree to the root of the larger tree
// aka Union by rank
node1 = find(node1), node2 = find(node2);
if(node1 == node2) return;
if(size[node1] < size[node2]) std::swap(node1, node2);
parents[node2] = node1;
size[node1] += size[node2];
#endif
}

private:
std::vector<size_t> parents;
} DSU;

Graph Algorithms

A graph \(G=\left(V,\,E\right)\) consists of a non-empty set of vertices \(V\), and a set of edges \(E\). Each edge is a pair \(\left(v,\,w\right)\), where \(v,\,w\in{V}\).

  • If the pair is ordered, then the graph is directed (also referred to as digraphs).
  • Sometimes an edge has a third component, known as either a weight or a cost.

Path: a sequence of vertices \(v_1,\,v_2,\,\cdots,\,v_N\) such that \(\left(v_i,\,v_{i+1}\right)\in E\) for \(1\leq i\lt N\)

  • Length: the number of edges on the path, which is equal to \(N−1\).
  • Loop : an edge \(\left(v,\,v\right)\) from a vertex to itself.

Representation

Adjacency Matrix

Undirected Graph Adjacency Matrix
\(\begin{array}{c}\begin{array}{c}\\v_1\\v_2\\v_3\\v_4\\v_5\\\end{array}\begin{array}{c}\begin{matrix}v_1&v_2&v_3&v_4&v_5\end{matrix}\\\begin{pmatrix}\,0\,&\,1\,&\,0\,&\,1\,&\,0\,\\1&0&1&0&1\\0&1&0&1&1\\1&0&1&0&0\\0&1&1&0&0\end{pmatrix}\end{array}\end{array}\)

Edges Without Weight

  • For each edge \((v_i,\,v_j)\), set \(A[v_i][v_j]\) to true (i.e. 1);
  • Otherwise the entry in the array is false (i.e. 0).

Edges With Weight

  • For edges with weight associated, set \(A[v_i][v_j]\) equal to the weight;
  • Use either a very large (i.e. \(+\infty\)) or a very small weight (or simply \(0\)) as a sentinel to indicate nonexistent edges.

Features

  1. The adjacency matrix of undirected graph is symmetric;
  2. The degree of vertex \(v_i\) is the summation of the i-th row (column).
Directed Graph Adjacency Matrix
\(\begin{array}{c}\begin{array}{c}\\v_1\\v_2\\v_3\\v_4\\\end{array}\begin{array}{c}\begin{matrix}v_1&v_2&v_3&v_4\end{matrix}\\\begin{pmatrix}\,0\,&\,1\,&\,1\,&\,0\,\\0&0&0&0\\0&0&0&1\\1&0&0&0\end{pmatrix}\end{array}\end{array}\)
  • The i-th row: arc ended with \(v_i\) (outdegree edge)
  • The j-th column: arc started with \(v_j\) (indegree edge)

Pros versus Cons

  • Pros

    1. Easy to check whether there exists an edge between two vertices;
    2. Easy to calculate degrees of any vertex.
  • Cons

    1. High Space Requirement \(\Theta\left(\left|V\right|^2\right)\) (where \(\left|V\right|\) stands for the cardinality of the set of vertices);
    2. Only suitable when the graph is dense: \(\left|E\right|=\Theta\left(\left|V\right|^2\right)\) (otherwise it is called sparse and most of these entries would contain zero.);
    3. Hard to insert and remove vertices;
    4. Time-costly to calculate how many edges in the graph \(O\left(\left|V\right|^2\right)\).

Implementation (C++)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
#include <iostream>
#include <climits>
#define MVN 100 // Maximum Vertices Number

template<typename VertexType, typename ArcType>
class AdjacencyMatrix {
public:
AdjacencyMatrix() {
std::cin >> vertices_count >> arcs_count;
for(int i = 0; i < vertices_count; ++i)
std::cin >> vertices[i];
for(int i = 0; i < vertices_count; ++i)
for(int j = 0; j < vertices_count; ++j)
matrix[i][j] = INT_MAX; // INT_MAX stands for infinity
VertexType v1, v2;
int weight, i, j;
for(int k = 0; k < arcs_count; ++k) {
std::cin >> v1 >> v2 >> weight;
i = locate(v1);
j = locate(v2);
matrix[j][i] = matrix[i][j] = weight;
}
}

int locate(const VertexType &vertex) const {
for(int i = 0; i < vertices_count; ++i)
if(vertex == vertices[i]) return i;
return -1;
}

private:
VertexType vertices[MVN];
ArcType matrix[MVN][MVN];
int vertices_count, arcs_count;
};

Adjacency List

Graph Corresponding Adjacency List

Features

  1. Adjacency lists are not unique;
  2. Undirected graphs double the space usage (if it has \(e\) edges, then it has to store \(2e\) nodes) and are thus suitable for storing sparse graphs;
  3. Less space is needed: \(O\left(\left|E\right|+\left|V\right|\right)\)

Implementation (C++)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
#include <vector>
#include <iostream>
class AdjacencyList;

// Nodes in blue in the diagram above
class ArcNode {
friend AdjacencyList;
private:
int vertex_index; // adjacency vertex index
int weight; // Arc weight
ArcNode* next_arc; // pointer to the next ArcNode
};

// Nodes in red in the diagram above
template<typename VertexType>
class VertexNode {
friend AdjacencyList;
private:
VertexType data; // store the data of the vertex
ArcNode* first_arc; // pointer to the first adjacency vertex
};

// Graph stored as adjacency list
template<typename VertexType>
class AdjacencyList {
public:
AdjacencyList() {
std::cin >> vertices_count >> arc_count;
for(int i = 0; i < vertices_count; ++i) {
std::cin >> vertices[i].data;
vertices[i].first_arc = nullptr;
}
VertexType v1, v2;
int i, j;
for(int k = 0; k < arc_count; ++k) {
std::cin >> v1 >> v2;
i = locate(v1);
j = locate(v2);
ArcNode* p1 = new ArcNode();
p1->vertex_index = j;
p1->next_arc = vertices[i].first_arc;
vertices[i].first_arc = p1;

ArcNode* p2 = new ArcNode();
p2->vertex_index = i;
p2->next_arc = vertices[j].first_arc;
vertices[j].first_arc = p2;
}
}

int locate(const VertexType &vertex) const {
for(int i = 0; i < vertices_count; ++i)
if(vertex == vertices[i]) return i;
return -1;
}

private:
std::vector<VertexNode<VertexType>> vertices;
int vertices_count, arc_count;
};

Improvements

flowchart LR;
adjlist(Adjacency List)---udg(Undirected Graph) & dg(Directed Graph)
udg -->|Hard to calculate the degrees of vertices| ol(Orthogonal List)
dg -->|Each edge doubles the storage| am(Adjacency Multilist)

Iteration

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
#define VERTEX_NUM

bool visited[VERTEX_NUM] = {false};

void DFS(const AdjacencyMatrix &graph, int vertex_index) {
// Visit the vertex at the given index
visited[vertex_index] = true;
// Check the row of the given vertex
for(int i = 0; i < graph.vertexCount(); ++i) {
if(!graph.at(vertex_index, i) && !visited[i])
DFS(graph, i);
// i is the adjacency vertex of the given vertex
// if i has not been visited yet, call DFS itself
}
}

Time complexity: \(O(n^2)\) since rows of every vertex would be iterated.

Applications

Minimum Spanning Tree (MST)

A spanning tree is a sub-graph of an undirected connected graph, which includes all the vertices (the reason why it is called spanning) of the graph with a minimum possible number of edges. If a vertex is missed or the graph is not acyclic, then it is not a spanning tree.

Minimum Spanning Tree (MST) of an undirected graph is a spanning tree with the lowest total cost among all the spanning trees.

Properties:

Let \(N=\left(V,\,E\right)\) denote a connected network (i.e. graph with weights), and \(U\) denote a non-empty subset of the set of vertices \(V\). There must exist a MST that includes the edge \(\left(u,\,v\right)\) where \(u\in U,\;v\in V-U\).

In other words, during the process of constructing spanning trees, the vertices:

  • Either in the vertex set \(U\) that has already been in the spanning tree (i.e. visited);
  • Or in the vertex set \(V-U\) that has not been in the spanning tree yet (i.e. not visited).
Prim's Algorithm Kruskal's Algorithm
Flow
Chart
Key
Point
Select vertices Select edges
Time
Complexity
O(|V|2) O(|E|log|E|)

Shortest-Path Algorithms

flowchart LR;
  spa(Shortest-Path) --- apsp(All-pairs shortest-path) --> f(Floyd's)
  spa --- spsp(Single pair shortest-path) --> d(Dijkstra's)

Dijkstra's Algorithm

Dijkstra (/ˈdikstrɑ/ or /ˈdɛikstrɑ/)

  • Version 1: Brutal Force Implemented
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    const int Infinity = 0x3f3f3f3f; // Represent ∞
    typedef unsigned int index_t; // Index type

    template<typename VertexType, typename ArcType>
    void Dijkstra(const AdjacencyMatrix<VertexType, ArcType> &graph
    const index_t& source) {
    /**
    Preparation for searching
    */
    unsigned vertices = graph.vertexCount();
    // The array that holds the shortest distance from source to i
    int* distances = new int[vertices];
    // Indicate whether the vertex i has been visited
    bool* visited = new bool[vertices];
    // Initialize all the elements in the array with default values
    memset(distances, 0x3f, sizeof(distances));
    memset(visited, false, sizeof(visited));
    // Distance of source vertex from itself is always 0
    distances[source] = 0;

    /**
    Utility function
    */
    // Utility function to find the minimum distance value
    auto mimimumDistance = [&distances, &visited, vertices] -> index_t () {
    unsigned min = Infinity;
    index_t min_index = 0;
    for(unsigned i = 0; i < vertices; ++i) {
    if(!visited[i] && distances[i] < min) {
    min = distances[i];
    min_index = i;
    }
    }
    return min_index;
    };

    /**
    Core body of Dijkstra's Algorithm
    */
    for(index_t i = 0; i < vertices - 1; ++i) {
    // Select the minimum distance vertex
    index_t vertex_index = mimimumDistance();
    // Mark the vertex at given index as visited
    visited[vertex_index] = true;

    // Update the distance value of the adjacent vertices
    // of the selected vertex
    for(index_t j = 0; j < vertices; ++j)
    if(
    // The vertex has not been visited yet
    !distances[j] &&
    // There exists an arc from the selected to j
    // since 0 represents a non-existent edge
    graph.at(vertex_index, j) &&
    // Infinity stands for a unreachable vertex
    distances[vertex_index] != Infinity &&
    // The total cost from source to j through the
    // selected vertex is smaller than the current
    // cost of vertex at j
    distances[vertex_index] +
    graph[vertex_index][j] < distances[j]
    )
    distances[j] = distances[vertex_index] +
    graph[vertex_index][j];
    }
    }

Floyd's Algorithm

Directed Acycline Graph

A directed acyclic graph (DAG) is often used to describe the process of a project or system; a project can be divided into a number of sub-projects, and once these sub-projects are completed, the entire project can be completed.

According to whether vertices or arcs are used to represent a DAG, it can be categorized into:

  • AOV nets (Activity On Vertex)
  • AOE Nets (Activity On Edge)

If there exists a directed path from \(i\) to \(j\), then \(i\) is the predecessor of \(j\) and \(j\) is the successor of \(i\). If \((i,\,j)\) is a directed edge of the net, then \(i\) is the immediate predecessor of \(j\) and \(j\) is the immediate successor of \(i\).

Topological Sorting

Under the premise that there are no loops in the AOV net, we arrange all the activities into a linear sequence such that if an arc \((i,j)\) exists in the AOV network, then \(i\) must come before \(j\) in the sequence, and linear sequences with this property are known as topologically ordered sequences, and the corresponding algorithm for topologically ordered sequences is known as topological sorting.

Algorithm

  1. Select a vertex without any predecessors and then print it out;
  2. Remove the selected vertex and arcs that end with it;
  3. Unless all vertices are printed or there is no vertex that do not have a predecessor, repeat the process 1 and 2.

Note that the topological sorted sequences are not unique!

Application of topological sorting

Construct a topologically ordered sequence of its vertices for a directed graph, and if all the vertices in the net are in its topologically ordered sequence, then the AOV network must be free of loops.

Algorithm Design Techniques

Amortized Analysis


数据结构与算法笔记
https://devexzh.github.io/2023/Note_Of_Data_Structures_And_Algorithms/
Author
Ryker Zhu
Posted on
July 3, 2023
Licensed under