力扣 1519——子树中标签相同的节点数

本题主要在于对树这种数据结构的考察，以及深度优先遍历的使用，优化时可以采取空间换时间的策略。

原题

给你一棵树（即，一个连通的无环无向图），这棵树由编号从 0 到 n - 1 的 n 个节点组成，且恰好有 n - 1 条 edges 。树的根节点为节点 0 ，树上的每一个节点都有一个标签，也就是字符串 labels 中的一个小写字符（编号为 i 的节点的标签就是 labels[i] ）

边数组 edges 以 edges[i] = [ai, bi] 的形式给出，该格式表示节点 ai 和 bi 之间存在一条边。

返回一个大小为 n 的数组，其中 ans[i] 表示第 i 个节点的子树中与节点 i 标签相同的节点数。

树 T 中的子树是由 T 中的某个节点及其所有后代节点组成的树。

示例 1：

输入：n = 7, edges = [[0,1],[0,2],[1,4],[1,5],[2,3],[2,6]], labels = "abaedcd"
输出：[2,1,1,1,1,1,1]
解释：节点 0 的标签为 'a' ，以 'a' 为根节点的子树中，节点 2 的标签也是 'a' ，因此答案为 2 。注意树中的每个节点都是这棵子树的一部分。
节点 1 的标签为 'b' ，节点 1 的子树包含节点 1、4 和 5，但是节点 4、5 的标签与节点 1 不同，故而答案为 1（即，该节点本身）。

示例 2：

输入：n = 4, edges = [[0,1],[1,2],[0,3]], labels = "bbbb"
输出：[4,2,1,1]
解释：节点 2 的子树中只有节点 2 ，所以答案为 1 。
节点 3 的子树中只有节点 3 ，所以答案为 1 。
节点 1 的子树中包含节点 1 和 2 ，标签都是 'b' ，因此答案为 2 。
节点 0 的子树中包含节点 0、1、2 和 3，标签都是 'b'，因此答案为 4 。

示例 3 ：

输入：n = 5, edges = [[0,1],[0,2],[1,3],[0,4]], labels = "aabab"
输出：[3,2,1,1,1]

示例 4：

输入：n = 6, edges = [[0,1],[0,2],[1,3],[3,4],[4,5]], labels = "cbabaa"
输出：[1,2,1,1,2,1]

示例 5：

输入：n = 7, edges = [[0,1],[1,2],[2,3],[3,4],[4,5],[5,6]], labels = "aaabaaa"
输出：[6,5,4,1,3,2,1]

提示：

1 <= n <= 10^5
edges.length == n - 1
edges[i].length == 2
0 <= ai, bi < n
ai != bi
labels.length == n
labels 仅由小写英文字母组成

原题 url：https://leetcode-cn.com/problems/number-of-nodes-in-the-sub-tree-with-the-same-label

解题

首次尝试

这道题是要让我们计算：在子树中，和当前节点字符相同的节点个数。

那么我们就必然需要构建树中各个节点的关系，那么就需要记录父子节点的关系，因为是普通的树，一个节点的子节点可能有多个，因此我用LinkedList<Integer>[] tree这样一个数组进行存储，其中tree[i]代表节点 i 的所有子节点。

至于求相同节点的个数，我想着可以从根节点 0 开始逐个遍历，先获取其第一层子节点，再根据第一层子节点逐个获取，可以采用广度优先遍历的形式。

让我们看看代码：

class Solution {

    public int[] countSubTrees(int n, int[][] edges, String labels) {
        // 构造树
        LinkedList<Integer>[] tree = new LinkedList[n];
        for (int[] edge : edges) {
            // edge[0]的子节点
            LinkedList<Integer> child = tree[edge[0]];
            if (child == null) {
                child = new LinkedList<>();
                tree[edge[0]] = child;
            }
            // 增加子节点
            child.add(edge[1]);
        }

        // 结果
        int[] result = new int[n];
        // 遍历并计算
        for (int i = 0; i < n; i++) {
            // 需要遍历的字符
            char cur = labels.charAt(i);
            // 该节点的子树中与该字符相同的节点数
            int curCount = 0;
            // 广度优先遍历
            LinkedList<Integer> searchList = new LinkedList<>();
            searchList.add(i);
            while(!searchList.isEmpty()) {
                int index = searchList.removeFirst();
                if (cur == labels.charAt(index)) {
                    curCount++;
                }
                // 找出该节点的子树
                if (tree[index] == null) {
                    continue;
                }
                searchList.addAll(tree[index]);
            }

            result[i] = curCount;
        }

        return result;
    }
}

提交之后，发现有错误。错误的情况是：

输入：
4
[[0,2],[0,3],[1,2]]
"aeed"

输出：
[1,2,1,1]

预期：
[1,1,2,1]

根据这样输入，我构造出的树是：

1   0
  / 
  2   3

但根据预期结果反推出来的树是：

那么输入中最后给出的[1,2]就不是从父节点指向子节点，也就是输入中给出的边关联的节点顺序，是任意的。

那我们的树究竟该如何构造呢？

双向记录构造树

既然我们在构造树的时候，无法直接得出父子关系，那么就将对应两个节点同时记录另一个节点。

根据题目中给出的条件：树的根节点为节点 0。这样我们在遍历的时候，就从 0 开始，只要 0 关联的节点，一定是 0 的子节点。将这些节点进行标记，这样再递归访问接下来的节点时，如果是标记过的，则说明是父节点，这样就可以明确父子节点关系了。

至于遍历的时候，因为这次我们是不知道父子节点关系的，所以无法直接采用广度优先遍历，换成深度优先遍历。

让我们看看代码：

class Solution {

    // 总节点数
    int n;
    // 树
    Map<Integer, LinkedList<Integer>> tree;
    // 字符串
    String labels;
    // 最终结果
    int[] result;

    public int[] countSubTrees(int n, int[][] edges, String labels) {
        this.n = n;
        this.labels = labels;
        result = new int[n];

        LinkedList<Integer> list;
        // 双向构造树的关系
        tree = new HashMap<>(n / 4 * 3 + 1);
        for (int[] edge : edges) {
            // 添加映射关系
            list = tree.computeIfAbsent(edge[0], k -> new LinkedList<>());
            list.add(edge[1]);
            list = tree.computeIfAbsent(edge[1], k -> new LinkedList<>());
            list.add(edge[0]);
        }


        // 深度优先搜索
        dfs(0);

        return result;
    }

    public int[] dfs(int index) {
        // 当前子树中，所有字符的个数
        int[] charArray = new int[26];

        // 开始计算，标志该节点已经计算过
        result[index] = 1;
        // 获得其关联的节点
        List<Integer> nodes = tree.get(index);
        // 遍历
        for (int node : nodes) {
            // 如果该节点已经访问过
            if (result[node] > 0) {
                continue;
            }

            // 递归遍历子节点
            int[] array = dfs(node);
            for (int i = 0; i < 26; i++) {
                charArray[i] += array[i];
            }
        }
        // 将当前节点的值计算一下
        charArray[labels.charAt(index) - 'a'] += 1;
        result[index] = charArray[labels.charAt(index) - 'a'];
        return charArray;
    }
}

提交OK，执行用时136ms，超过36.71%，内存消耗104.5MB，超过91.38%。

时间复杂度上，应该是要研究dfs方法中的两个for循环，外层肯定是每个节点都遍历一遍，内层还需要遍历26个英文字母，也就是O(n)。

空间复杂度上，最大的应该就是存储节点映射关系的tree了，里面实际上就是 2n 个节点（因为每条边对应的两个节点都会互相存一次对方），因此也就是O(n)。

虽然过了，但执行速度很慢，可以进一步优化。

用空间换时间

针对我上面的解法，其中tree我是用的Map，虽然其get方法理论上是O(n)，但毕竟涉及 hash，可以优化成数组。

至于每次取节点对应的字符所用的charAt方法，具体其实是：

    public char charAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[index];
    }

每次都会检查一次 index，其实这完全是可以省略的，因此可以提前构造好每个位置对应的值，也用一个数组存储。

让我们看看新的代码：

class Solution {

    // 总节点数
    int n;
    // 树
    LinkedList<Integer>[] tree;
    // 每个节点的值（用数字表示）
    int[] nodeValueArray;
    // 最终结果
    int[] result;

    public int[] countSubTrees(int n, int[][] edges, String labels) {
        this.n = n;
        nodeValueArray = new int[n];
        result = new int[n];

        // 双向构造树的关系
        tree = new LinkedList[n];
        for (int i = 0; i < n; i++) {
            tree[i] = new LinkedList<>();
        }
        for (int[] edge : edges) {
            // 添加映射关系
            tree[edge[0]].add(edge[1]);
            tree[edge[1]].add(edge[0]);
        }

        // 生成节点的值
        for (int i = 0; i < n; i++) {
            nodeValueArray[i] = labels.charAt(i) - 'a';
        }

        // 深度优先搜索
        dfs(0);

        return result;
    }

    public int[] dfs(int index) {
        // 当前子树中，所有字符的个数
        int[] charArray = new int[26];

        // 开始计算，标志该节点已经计算过
        result[index] = 1;
        // 获得其关联的节点
        List<Integer> nodes = tree[index];
        // 遍历
        for (int node : nodes) {
            // 如果该节点已经访问过
            if (result[node] > 0) {
                continue;
            }

            // 递归遍历子节点
            int[] array = dfs(node);
            for (int i = 0; i < 26; i++) {
                charArray[i] += array[i];
            }
        }
        // 将当前节点的值计算一下
        charArray[nodeValueArray[index]] += 1;
        result[index] = charArray[nodeValueArray[index]];
        return charArray;
    }
}

提交之后，执行用时是96ms，内存消耗是402.2MB。看来优化的效果并不明显。

研究一下目前最优解法

这个解法真的是巧妙，执行用时20ms，超过了100%，内存消耗76.3MB，超过了100%。

我在代码中增加了注释，方便大家理解。但这样的写法，研究一下是能够看懂，但让我想估计是永远不可能想出来，可以让大家也一起学习和借鉴：

public class Solution {

  static class Next {

    Next next;
    Node node;

    Next(Next next, Node node) {
      this.next = next;
      this.node = node;
    }
  }

  static class Node {

    /**
     * 当前节点的index
     */
    final int index;

    /**
     * 当前节点对应的字符值（减去'a'）
     */
    final int ci;

    /**
     * 所有关联的节点
     */
    Next children;

    /**
     * 该节点的父节点
     */
    Node parent;

    /**
     * 子树中和该节点含有相同字符的节点总个数
     */
    int result;

    /**
     * 是否还在队列中，可以理解为是否已访问过
     */
    boolean inQueue;

    public Node(int index, int ci) {
      this.index = index;
      this.ci = ci;
      this.result = 1;
    }

    /**
     * 从后往前，找到当前节点没有访问过的第一个子节点
     */
    Node popChild() {
      for (; ; ) {
        // 当前节点的所有关联节点
        Next n = this.children;
        // 如果没有，说明子节点都遍历完了
        if (n == null) {
          return null;
        }
        // 从后往前移除关联节点
        this.children = n.next;
        // 返回第一个没有访问过的节点
        if (!n.node.inQueue) {
          return n.node;
        }
      }
    }

    /**
     * 访问了该节点
     */
    Node enqueue(Node[] cnodes) {
      // 该节点标记为访问过
      this.inQueue = true;
      // 记录该节点的父节点
      this.parent = cnodes[ci];
      // 那么现在该字符值对应的最高节点，就是当前节点。
      // 这样如果之后也遇到相同字符的子节点，就可以为子节点赋值其父节点，也就是上面一行是有效的
      cnodes[ci] = this;
      return this;
    }

    /**
     * 退出该节点
     */
    void dequeue(Node[] cnodes, int[] res) {
      // 之后会访问该节点的兄弟节点，因此父节点需要重新设置
      cnodes[ci] = this.parent;
      // 设置当前节点的值
      res[index] = this.result;
      // 父节点也可以进行累加
      if (this.parent != null) {
        this.parent.result += this.result;
      }
    }

    void link(Node x) {
      // this节点和x节点，互相绑定
      this.children = new Next(this.children, x);
      x.children = new Next(x.children, this);
    }
  }

  public int[] countSubTrees(int n, int[][] edges, String labels) {
    // 构造树
    Node[] nodes = new Node[n];
    // 每个节点对应的字符
    for (int i = 0; i < n; i++) {
      nodes[i] = new Node(i, labels.charAt(i) - 'a');
    }
    // 通过边的关系，将节点互相绑定
    for (int[] es : edges) {
      nodes[es[0]].link(nodes[es[1]]);
    }

    // 最终的结果
    int[] res = new int[n];
    // 当前访问的节点下标
    int sz = 0;
    // 26个小写英文字母对应的节点数组
    Node[] cnodes = new Node[26];
    // 下面三行可以合并成这一行：
    // Node node = nodes[sz++] = nodes[0].enqueue(cnodes);
    nodes[sz] = nodes[0].enqueue(cnodes);
    // 当前访问的节点
    Node node = nodes[sz];
    // 因为当前节点已经访问过，自然下标需要+1
    sz++;

    for (; ; ) {
      // 从后往前，找到当前节点没有访问过的第一个子节点
      Node child = node.popChild();
      // 如果已经全部访问过了
      if (child == null) {
        // 开始计算
        node.dequeue(cnodes, res);
        if (--sz == 0) {
          break;
        }
        // 回溯到父节点
        node = nodes[sz - 1];
      } else {
        // 保证了相邻节点一定是父子节点
        node = nodes[sz++] = child.enqueue(cnodes);
      }
    }
    return res;
  }
}

总结

以上就是这道题目我的解答过程了，不知道大家是否理解了。本题主要在于对树这种数据结构的考察，以及深度优先遍历的使用，优化时可以采取空间换时间的策略。

有兴趣的话可以访问我的博客或者关注我的公众号、头条号，说不定会有意外的惊喜。

https://death00.github.io/

公众号：健程之道