并查集(disjoint set)¶

并查集是一种数据结构，它支持：
- 查找（Find）：确定某个元素属于哪个子集
- 合并（Union）：合并两个子集

等价关系¶

自反性(reflexive)：\(\forall a, a \sim a\)
对称性(symmetric)：\(\forall a, b, a \sim b \Rightarrow b \sim a\)
传递性(transitive)：\(\forall a, b, c, a \sim b, b \sim c \Rightarrow a \sim c\)

储存¶

用树来表示集合，树的根节点是集合的代表元

用数组来表示树：

    typedef struct {
        ElementType Data;// the value of the element
        int Parent;// the index of the parent node
    }SetType;

根节点的Parent为-1

也可简化表示：

    typedef int ElementType;
    typedef int SetName;
    typedef ElementType SetType[MaxSize]

S[root] = -1；

S[i] = j 表示i的父节点是j

操作¶

查找元素X所在集合: 从下往上找到根节点

    int Find( SetType S[], ElementType X ) {
        int i;
        for( i = 0; i < MaxSize && S[i].Data != X; i++ );
        // find the index of X
        if( i >= MaxSize ) return -1;// X is not found
        for( ; S[i].Parent >= 0; i = S[i].Parent );
        // find the root of the tree
        return i;
    }

若采取简化表达：

    SetName Find( SetType S, ElementType X ) {
        for( ; S[X] > 0; X = S[X] );
        return X;
    }

合并两个集合:

找到两个元素所在集合的根节点；

若根节点不同，则将其中一个根节点的Parent设置为另一个根节点的数组下标

    void Union( SetType S[], ElementType X1, ElementType X2 ) {
        int Root1, Root2;
        Root1 = Find( S, X1 );
        Root2 = Find( S, X2 );
        if( Root1 != Root2 ) S[Root2].Parent = Root1;
    }

若采取简化表示：

    void Union( SetType S, SetName Root1, SetName Root2 ) {
        S[Root2] = Root1;
    }

可以想见，这样的合并操作多次进行会使树的高度增加，从而影响查找操作的效率。为了降低树的高度，可以采用：

按秩合并：
- 按大小：将节点数较少的树合并到节点数较多的树上
  - 将S[root]修改为-|T|，其中T是树的节点数
  - 按大小合并所得树的高度最大为\(\lfloor \log_2 N \rfloor + 1\)
- 按高度：将高度较小的树合并到高度较大的树上
  - 将S[root]修改为-|T|，其中T是树的高度
  - 只有等高时，高度才会增加 1
```
    void Union( SetType S, SetName Root1, SetName Root2 ) {
        if( S[Root1] > S[Root2] ) S[Root1] = Root2;
        else {
            if( S[Root1] == S[Root2] ) S[Root2]--;
            S[Root1] = Root2;
        }
    } // 始终注意S[Root1]和S[Root2]是负值
```
    对于N个Union操作和M个Find操作，按秩合并的时间复杂度为\(O(N + M \log N)\)

路径压缩：在查找根节点的过程中，将沿途的所有节点的Parent都设置为根节点

    SetName Find( SetType S, ElementType X ) {
        if( S[X] < 0 ) return X; // X is the root
        else return S[X] = Find( S, S[X] ); 
        // first, find the root; then, link all the nodes in this path to the root
    }

好处是多次查找时除了第一次外，后续查找的时间复杂度为\(O(1)\)

以及Find是尾递归，可以被优化为迭代。

最后更新: 2024年10月2日 14:45:20
创建日期: 2024年4月9日 15:03:21