Working with the Trie data structure: Part 2

A Trie is a tree like data structure used for storing words in a way which facilitates fast look-ups. They prove very useful for handling dictionaries. In Part 1 we defined the trie data structure and started outlining the code for it. In this articles we are going to take a look at the basic operation we can do on a trie: Insert, Query and Remove.

4. Insert

Inserting means creating new TrieNodes and that functionality should be provided by the TrieNode class. Therefore, we need to add more functions to it .

    //...
    func createChildFor(_ character: Character) -> TrieNode {
        let node = TrieNode()

        children[character] = node

        return node
    }

    func getOrCreateChildFor(_ character: Character) -> TrieNode {
        if let child = children[character] {
            return child
        } else {
            return createChildFor(character)
        }
    }

They are used for finding or creating the TrieNode corresponding to character and then return it.

Now we can proceed to adding our first few words to the Trie.

Given a String, we iterate it character by character. Starting from the root, we use getOrCreateChildFor(character) and switch to the newly created node. Below is the step-by-step representation of adding the words “roll” and “round”.

insertIsFinalRollRound

Notice that in the process of inserting “roll” we create nodes for all characters in the word, while for adding “round” we don’t create additional nodes for the already existing prefix “ro”. We only insert the nodes for “und”.

Inserting the words “to” and “tour” will follow the same procedure. Inserting “to” means creating a child with letter “t” from the root node and, from it, another for the letter “o”. Now, when inserting “tour”, “to” is part of the Trie, so we only need to insert “ur”.

insertIsFinalToTour

Here’s a straightforward implementation using the helper functions from the TrieNode class.

func insert(_ word: String){
        insert(characters: Array(word.characters))
    }

    func insert(characters: [Character]) {
      var node = root
      for character in characters {
          node = node.getOrCreateChildFor(character)
      }
      node.isFinal = true //
    }

5. Query

The function should return true if a given word is present in the Trie, false otherwise.

Note: our Trie contains only unique words

The process is very similar to the Insert function. Instead of creating a node at each step, we only check for its existence. When we get to the last character of the word, the corresponding TrieNode must have isFinal set to true, in order for the query to return true.

    func query(_ word: String) -> Bool {
        return query(characters: Array(word.characters))
    }

    func query(characters: [Character]) -> Bool {
        var node : TrieNode? = root
        for character in characters {
          node = node?.children[character]
          if node == nil {
            return false
          }
        }
        return node!.isFinal
    }

This line of code is something called Optional Chaining and it lets us write cleaner code.

node = node?.children[character]

When it encounters the first nil in the expression it automatically returns nil.

6. Remove

Remove works the same way query does, but we change isFinal at the last TrieNode.

    func remove(_ word: String){
        remove(characters: Array(word.characters))
    }
    func remove(characters: [Character]) {
      var node : TrieNode? = root
      for character in characters {
        node = node?.children[character]
        if node == nil {
          return
        }
      }
      node!.isFinal = false
    }

What do you notice with this implementation ? If we delete the word “round”, this is how the Trie will look like: TrieDeleteRound

We are still left with the actual sequence “round” in the Trie, but the next queries will return false because isFinalwould be set to false, when called for the word “round”. The following code :

var myTrie = Trie()
myTrie.insert("roll")
myTrie.insert("round")
print(myTrie.query("round"))
myTrie.remove("round")
print(myTrie.query("round"))

will print: true false

If we were to remove the actual sequence, we would have to delete the TrieNode references from the childrenmember, the key – value pairs, for each node in the Trie.

It would be useful to have the path in TrieNodes for a given word, don’t you think ?

    func getNodeSequence(characters: [Character]) -> [(Character, TrieNode)] {
      var node : TrieNode? = root
      var nodes:[(Character,TrieNode)] = []

      var tup = (Character("@"),node!)  // just a random character
                                    // it's not going to be used
      nodes.append(tup)

      for character in characters {
        node = node?.children[character]
        if node == nil {
          return nodes
        }
        var tup = (character, node!)
        nodes.append(tup)
      }

      return nodes
    }

The idea is to iterate backwards the sequence returned by the previous function. At each step we check for the node to have isFinalset to true or to have at least one child. If none of these requirements is met, the node reference is deleted from the Trie.

Because of deleting the word “round”, we have our first TrieNode pointing to the “d” letter. None of the previous mentioned requirements are met, so the reference to this node is going to be deleted from the TrieNode.childrencorresponding to the letter “n”.

Here’s a step by step explanation for the new remove function:

TrieDelete2Round

Also for the the word “roll”:

TrieDelete2Roll

with the corresponding implementation:

    func remove2(characters: [Character]) {
      var nodes:[(Character,TrieNode)] = getNodeSequence(characters: characters)

      if nodes.count != characters.count + 1 {
        return
      }

      nodes[nodes.count - 1].1.isFinal = false
      for i in (1..<nodes.count).reversed() { 
        if nodes[i].1.isFinal {
          break
        }
        if nodes[i].1.children.count >= 1{
          break
        }
        nodes[i - 1].1.children.removeValue(forKey: nodes[i].0) 
        // we delete the reference from the parent node
        // and because that was the only reference to that Trieode in our program
        // the TrieNode will be erased from memoery
      }

    }

That was it for now! In Part 3 we are going to take a look at a few of the tricks we can do with a trie.

Thank you for reading! If you enjoyed this article please take a moment and share it with your friends.

1 Star2 Stars3 Stars4 Stars5 Stars (6 votes, average: 4.33 out of 5)
Loading...

  2 comments for “Working with the Trie data structure: Part 2

  1. Alex
    May 18, 2017 at 2:53 am

    In the TrieNode class in the insert(characters: [Character]) method when you assign var to the root ( var node = root), what is root? We only have 2 variables in this class which are `children` and `isFinal`.

    • Robert Cazoni
      May 19, 2017 at 7:58 pm

      in the previous part of this article we can find the declaration of the Trie class:
      class Trie {
      var root = TrieNode()


      }

Leave a Reply

Your email address will not be published. Required fields are marked *

Subscribe
We send about one email per week with our latest tutorials and updates
Never display this again :)