Leetcode 211: Add and Search Word - Data structure design

时间:2022-06-20
本文章向大家介绍Leetcode 211: Add and Search Word - Data structure design,主要内容包括其使用实例、应用技巧、基本知识点总结和需要注意事项,具有一定的参考价值,需要的朋友可以参考一下。

原文链接

https://blog.baozitraining.org/2019/03/leetcode-solution-211-add-and-search.html

Problem Statement

Design a data structure that supports the following two operations:

void addWord(word)
bool search(word)

search(word) can search a literal word or a regular expression string containing only letters a-z or .. A .means it can represent any one letter.

Example:

addWord("bad")
addWord("dad")
addWord("mad")
search("pad") -> false
search("bad") -> true
search(".ad") -> true
search("b..") -> true

Problem link

Video Tutorial

You can find the detailed video tutorial here

https://youtu.be/qi2ohSEyyDw

Thought Process

It's a string pattern searching problem, and because it matches the word strictly from the beginning to the end (note, even the partial matching is considered False, e.g., word "abcd", search("abc") should return False), the Trie (prefix tree) data structure comes very handy. It's a common simple data structure that often comes in coding interviews, be sure you can write it bug free. Once we are familiar with Trie, addWord simply builds the Trie and search just looks for the exact matching. One tricky part of this problem is ".", which can match any one and only one letter. This requires a recursive search of all the sub-nodes in the trie when encountering a "." Even not asked in this leetcode question, but there are some very good follow up question

  • If we support "*" matching, Similar to Leetcode wildcard matching, how would you solve it? Solution here
  • How do you remove a word from the trie?

Solutions

class TrieNode {
    public Character letter = null;
    // I just like to indicate whether a node is end or not, being explicit and clear, also it's good to know
    // if a word is actually a word or a substring of a long word
    public boolean isEnd = false;
    // This can be optimized by a 256 char array.
    public Map<Character, TrieNode> children = new HashMap<>();
}

public class AddAndSearchWord {
    private TrieNode root;
    public AddAndSearchWord() {
        this.root = new TrieNode();
    }

    // Adds a word into the data structure.
    public void addWord(String word) {
        if (word == null || word.length() == 0) {
            return;
        }

        this.insertHelper(this.root, word, 0);
    }

    // this can be done iteratively too
    private void insertHelper(TrieNode node, String word, int index) {
        if (index == word.length()) {
            node.isEnd = true;
            return;
        }

        char c = word.charAt(index);
        if (node.children.containsKey(c)) {
            insertHelper(node.children.get(c), word, index + 1);
            return;
        }

        TrieNode t = new TrieNode();
        t.letter = c;
        node.children.put(c, t);
        insertHelper(t, word, index + 1);
    }

    // Returns if the word is in the trie. A word could
    // contain the dot character '.' to represent any one and only one letter.
    public boolean search(String word) {
        if (word == null) {
            return false;
        }
        return this.searchHelper(this.root, word, 0);
    }

    // DFS to search the word, worst case is to search the entire trie space if all dots
    private boolean searchHelper(TrieNode node, String word, int index) {
        if (node == null) {
            return false;
        }
        // This is strict word matching, not including the substring, e.g., add("abcde"), search("abc") false, seasrch("abcde") true
        if (index == word.length()) {
            return node.isEnd;
        }

        char c = word.charAt(index);
        if (c != '.') {
            if (!node.children.containsKey(c)) {
                return false;
            }

            return searchHelper(node.children.get(c), word, index + 1);
        } else {
            for (Map.Entry<Character, TrieNode> entry : node.children.entrySet()) {
                // Don't need to go through a to z, just go through neighbours is enough, DFS
                if (searchHelper(entry.getValue(), word, index + 1)) {
                    return true;
                }
            }
            return false;
        }
    }
}

Time Complexity:

  • addWord: O(N), N is the length of the word
  • search: O(M), where M is the entire Trie's space (i.e., the total number of Trie nodes). Think about a worst case of N "............" dots. where N is the length of the word that is larger than the depth of the Trie (larger than the longest word seen so far). More specifically, it's N * (Nodes on Trie's each level) = N * (M / lgM) assuming trie's height is lgM, for worst case, N equals lgM, so N *(M / lgM) = lgM * (M / lgM) = M. Note a trie could be compressed (e.g., the single nodes are merged back to the upper level) but this analysis still holds true

Space Complexity:

  • addWord: O(N), creating N more nodes in the trie, N is the length of the word
  • search: No additional space needed

References

  • GeeksForGeeks Trie
  • Trie wildcard matching solution