Working with the Trie data structure: Part 3

A Trie is a tree like data structure used for storing words in a way which facilitates fast look-ups. They prove very useful for handling dictionaries. In Part 1 we defined the trie data structure and started outlining the code for it. In this articles we are going to take a look at the basic operation we can do on a trie: Insert, Query and Remove. And in Part 2 we take a look at how we would implement basic operations on a trie.

We recommend that you go trough the first two parts before this one if you are new to tries. Now, let’s go trough a few of the tricks we can do with a trie.

7. Prefix Counter

Having the isFinal member helps, but it doesn’t offer enough functionalities. If we were to count the number of unique words in our dictionary, how would you do it ? A short answer : a new TrieNode member calledprefixCount. prefixCounter

class TrieNode{

    var children: [Character:TrieNode] = [:]
    var isFinal: Bool = false
    var prefixCount: Int = 0
    
    init() {

    }
	//...

According to the previous graphical representation, the insertand remove functions suffer changes.

When inserting a word, a simple rule would be to increment prefixCount for each node of the way, including the root. prefixCountInsert

And the implementation doesn’t change much:

    func insert(_ word: String){
        insert(characters: Array(word.characters))
    }

    func insert(characters: [Character]) {
      var node = root
      node.prefixCount += 1
      for character in characters {
          node = node.getOrCreateChildFor(character)
          node.prefixCount += 1
      }
      node.isFinal = true //
    }

The query function doesn’t need modifications, but that’s not the case for remove.

Having the sequence of nodes for a word given by getNodeSequence, we need to decrement the prefixCount value for each node. After we do that, we also do the check prefixCount == 0. If it’s true, it means that there are no words in our dictionary that reach that node. It’s safe to remove the entire sequence of nodes from now on. Step by step: prefixCountRemove

    func remove(_ word: String){
        remove(characters: Array(word.characters))
    }
    func remove(characters: [Character]) {
      var charactersNodes:[(Character,TrieNode)] = getNodeSequence(characters: characters)
      if charactersNodes.count != characters.count + 1 {
        return
      }

      for i in 0..<charactersNodes.count {
        charactersNodes[i].1.prefixCount -= 1
        if charactersNodes[i].1.prefixCount == 0 {
          if i != 0 {
            charactersNodes[i - 1].1.children.removeValue(forKey: charactersNodes[i].0)
            return // we completely deleted the reference to an entire sequence of nodes
                    // swift will delete those nodes from memory
          }
        }
      }
      charactersNodes[charactersNodes.count - 1].1.isFinal = false
    }

8. Unique Words

Mentioned in the previous chapter, the implementation consists in returning the prefixCount member of the rootnode. How simple is that ? :)

    func uniqueWords() -> Int {
      return root.prefixCount
    }

9. Words Same Prefix

Given a prefix, the function returns how many words in our dictionary have it in common. Take the Trie from chapter 7. If we were to call the function for the prefixes “t”, “to”, the function would return 2 because both “to” and “tour” share those prefixes. Same logic goes for “r”, “ro”.

Every word is a prefix of it’s own; therefore, calling the function with one of the 4 words in our dictionary( roll, round, to, tour) will return 1. 1 is also returned when called for the prefix “roun” for example.

Here’s how we do it:

We iterate the word character by character. If the corresponding TrieNode doesn’t exist, we return 0; otherwise we return the value of prefixCount from the last TrieNode.

    func wordsSamePrefix(_ word: String) -> Int {
        return wordsSamePrefix(characters: Array(word.characters))
    }

    func wordsSamePrefix(characters: [Character]) -> Int {
      var node : TrieNode? = root
      for character in characters {
        node = node?.children[character]
        if node == nil {
          return 0
        }
      }
      return node!.prefixCount
    }

Notice how similiar the function is to query. They both use the last TrieNode of the sequence. In order to avoid repeating code, we might just create a function to return that node, if it exists.

    func getNodeFor(_ characters: [Character]) -> TrieNode? 
        var node: TrieNode? = root
        for character in characters {
            node = node?.children[character]  
        }
        return node
    }

And the query and wordsSamePrefix functions will look cleaner:

    func query(_ word: String) -> Bool {
        return query2(characters: Array(word.characters))
    }

    func query2(characters: [Character]) -> Bool {
      let node = getNodeFor(characters)
      if node == nil {
        return false
      }
      return node!.isFinal
    }
    
    func wordsSamePrefix(_ word: String) -> Int {
        return wordsSamePrefix(characters: Array(word.characters))
    }

    func wordsSamePrefix(characters: [Character]) -> Int {
      let node = getNodeFor(characters)
      if node == nil {
        return 0
      }
      return node!.prefixCount
    }

10. Remove Words Same Prefix

The function will remove all words having some prefix. The only diference from this and the remove function is that we decrease the prefixCountmember of all TrieNodes with the prefixCount of the last character TrieNode.

To make the example clear, consider adding to our previous Trie the word tea aswell.

prefixCount2

Now, let’s say I want to remove all the words with the prefix “to”. That means a total of 2 words: to, tour.

When we get to the TrieNode for the letter ‘o’, we delete the reference from ‘t’ to it. For root, prefixCount = 5and for the letter ‘t’, prefixCount = 3 which, should in fact be 3 and 1 respectively. It means that for these TrieNodes the prefixCount member should be decreased by 2, the value from the letter ‘o’.

prefixCountRemovePrefixTo

1 Star2 Stars3 Stars4 Stars5 Stars (2 votes, average: 5.00 out of 5)
Loading...

  1 comment for “Working with the Trie data structure: Part 3

  1. June 16, 2017 at 8:39 am

    I just wanted to thank you for explaining all these rules and exceptions. It is pretty clear for me to comprehend the issue and its solution.

Leave a Reply

Your email address will not be published. Required fields are marked *

Subscribe
We send about one email per week with our latest tutorials and updates
Never display this again :)