# Random Names using Markov Chains

### Introduction

In this tutorial we’ll look at generating plausible random names from a list of pre-existing names. We’ll also build a fun little app that lets you generate your Elf / Ninja name.

Just generating random characters won’t give us reasonable results. We’ll use a list of preexisting names in order to generate new ones. We’ll take advantage of the distribution of letters in our preexisting names.

For each name we look at each letter from that name.
For each letter we keep track of every letter that directly follows it.
Now we have a list of mappings from letters to all possible letters that directly follow them.

To generate a new name we use the following approach:
1. We start with some letter `l1`. Our new name consists solely of `l1`.
2. Starting from the letter (`l1`) we randomly select a letter that follows it `(l2`).
3. We add the letter `l2` to our name
4. We repeat the the above algorithm starting at `Step 2` with the letter `l2`.

Once we reach a predetermined maximum number of letters or we reach a letter that has no successor we stop.

Lets say we have a list of names: “Mark, Michael and Mathew”. For each letter let’s look at the letters that follow it:

`m : [a, i, a]`
`a : [r, e, t]`
`r : [k]`
`k : []`
`i : [c]`
`c : [h]`
`h : [a, e]`
`e : [l,w]`
`l : []`
`t : [h]`
`w : []`

Now starting from the letter `m` lets generate a new name.

1. for `m` we pick the successor `a`
2. for `a` we pick the successor `e`
3. for `e` we pick the successor `l`
4. l has no possible successor so we stop there

We end up generating the name `Mael`, some other possibilities starting from `M` would be: `Maew`, `Michew`, `Michark`

Some possible outputs when starting from `h` would be `Hel`, `Hark`, `Hael`, `Hathathel` and others.

Note: in practice the list of words that we use to generate new words (`the seed`) would be much larger than 3, typically at least 100 entries.

This process is knowed as a Markov Chain i.e. a process that has some states between which it transitions. Transitions occur randomly and a transition only depends on the current state, it doesn’t need any info on past states.

### A Swift Implementation

NOTE: The implementation I’m going to show below is sub-optimal. i.e. It will be slow and take up a lot of unnecessary memory for long lists of names. I’m trying to emphasize clarity and simplicity instead of speed.

We’ll implement all our logic in a class `MarkovGenerator`. The most important data that we’ll keep is a list of `words` ( the words we use to generate new words) and `transitionTable` a dictionary that maps from letters to successors of letters. For each letter that is a key in the dictionary the corresponding value is an array of strings that contains the letters successors, the array may contain duplicates.

```class MarkovGenerator {
private let words: [String]
private var transitionTable: [String : [String]] = [:]
}```

We’ll create 2 initializers. One that takes a list of words and a convenience initializer that reads a list of words from a file. Our `init` methods will call the `buildTransitionTable` function that will take care of establishing the successors for each letter in each word.

```class MarkovGenerator {
...
init(words: [String]) {
self.words = words

buildTransitionTable()
}

convenience init(fileName: String) {
let path = NSBundle.mainBundle().pathForResource(fileName, ofType: "txt")!

let names = String(contentsOfFile: path, encoding: NSUTF8StringEncoding, error: nil)!.componentsSeparatedByString("\n").map { \$0.lowercaseString}

self.init(words: names)
}
}```

In the `buildTransitionTable` function we go through all our words and call the `addWordToTransitionTable` on each word.

In the `addWordToTransitionTable` function we go through all the characters in our word. For each character we consider it’s successor (`nextChar`), in case we reach the last letter we set it’s successor to a special character(`"\$"`).

We use the current character as a key into the transition table and add it’s successor to the array stored in the dictionary, updating or creating the array as needed.

```class MarkovGenerator {
...
private func buildTransitionTable() {
for word in words {
}

}

let wordArray = Array(word)

for index in 0..<wordArray.count {

let char = String(wordArray[index])
var nextChar = "\$"
if index < wordArray.count - 1 {
nextChar = String(wordArray[index + 1])
}

var transitionsArray = transitionTable[char]
if (transitionsArray == nil) {
transitionsArray = []
}

transitionsArray?.append(nextChar)

transitionTable[char] = transitionsArray

}
}
...
}```

### Generating Words

First we’ll need to add some helper functions that we’ll use to generate words.

The `generateNextCharacter` function takes a character and generates a new character by randomly selecting an entry from the array in the transition table corresponding to the input character.

```class MarkovGenerator {
...
private func generateNextCharacter(character: String) -> String {
let transitionArray = transitionTable[character]!

let p = Int(arc4random()) % transitionArray.count

return transitionArray[p]
}
...
}```

We’ll also need a function that generates a random character that appears in some of our input words. This is equivalent to selecting a random key of the transition table.

```class MarkovGenerator {
...
private func randomCharacter() -> String {
return transitionTable.keys.array[Int(arc4random()) % transitionTable.keys.array.count]
}
...
}```

We’ll want to somehow constrain the length of the generated words, to do this we’ll add 2 more properites`minWordLength` and `maxWordLength`.

```class MarkovGenerator {
...
var minWordLength = 4
var maxWordLength = 10
...
}```

Now finally we’re ready to add a method that generates a word. We’ll add a method `generateWord(startString: String = "") -> String`, the optional parameter `startString` indicates some string that should be used to start generating the new string.

We’ll keep track of the generated word in a variable `result`. We’ll also use a variable `currentChar` that keeps track of the current character we’re processing. If the provided `startingString` is empty we initialize `currentChar` to a random character, otherwise we initialize it to the last character in the `startingString`.

Next we repeat the following process a maximum of `maxWordLength` times:
We transition to the next character using the `generateNextCharacter` function.
If the generated character denotes the end of a word(`"\$"`) and we have generated at least `minWordLength` characters we return the generated string.
If the generated character denotes the end of a word(`"\$"`) and we have not generated enough character we randomly generate a new character until it’s not the end of a word.
We append the generated character to our result.

Whenever the result is returned we’ll want to capitalize it. Names should start with an uppercase letter after all!

```class MarkovGenerator {
...
func generateWord(startString: String = "") -> String {

var startStringArray = Array(startString)

var currentChar = randomCharacter()
var result = currentChar

if startStringArray.count > 0 {
result = startString
currentChar = String(startStringArray.last!)
}

for i in 1...maxWordLength {
currentChar = generateNextCharacter(currentChar)

if currentChar == "\$" {
if i > minWordLength {
return result.capitalizedString
} else {
currentChar = randomCharacter()
}
}

result += currentChar
}

return result.capitalizedString
}
...
}```

Example Usage of the class:

```let markovGenerator = MarkovGenerator(words: ["Mark","Michael","Mathew"])

markovGenerator.generateWord() // Maew```

Now let’s use the generator to build an iOS app!

### Setting up the iOS app

You can download a starter project that includes the `MarkovGenerator` class and 2 files with lists of names (“ElfNames.txt” and “NinjaNames.txt”).

The interface for the starter project is empty. We’ll add the following components that will make up the app:
A `label` with text “Generate your elf / ninja name” A `segmented control` with 2 segments “Elf” and “Ninja” A `label`with text “Name:” A `textfield` to the right of the name label A `button` with title “Generate” An empty `label` where the result will be displayed. Make sure to set the `label`‘s `Alignment` to `Center` and `Lines` to `2`

The view should look like this after you have everything set up.

### Connecting Outlets

We’ll have to connect 2 outlets:
The textfield as `nameTextField`
The bottom label as `resultLabel`

Also we’ll connect 2 actions:
The `Value Changed` action for our segmented control as `segmentedControlChanged`.
The `Touch Up Inside` action for our button as `didTapGenerateButton`.

Your ViewController should look like this after you’ve set everything up:

```class ViewController: UIViewController {

@IBOutlet weak var resultLabel: UILabel!
@IBOutlet weak var nameTextField: UITextField!

@IBAction func segmentedControlChanged(sender: UISegmentedControl) {
}

@IBAction func didTapGenerateButton(sender: AnyObject) {
}

...
}```

### Implementing the Logic

We’ll keep 2 separate instances of `MarkovGenerator`, one for Elf names and one for Ninja names. We’ll also keep a 3rd property that will keep the currently active `MarkovGenerator`.

```class ViewController {
...
var markovGenerator: MarkovGenerator!
var elfGenerator: MarkovGenerator!
var ninjaGenerator: MarkovGenerator!
}```

In the `viewDidLoad` method we’ll intialize our `MarkovGenerators`.

```class ViewController {
...

self.elfGenerator = MarkovGenerator(file: "ElfNames")

self.ninjaGenerator = MarkovGenerator(file: "NinjaNames")

self.markovGenerator = self.elfGenerator
}
...
}```

When the segmented control changes value we’ll want to update the active markov generator that we use:

```class ViewController {
...
@IBAction func segmentedControlChanged(sender: UISegmentedControl) {
if sender.selectedSegmentIndex == 0 {
self.markovGenerator = self.elfGenerator
} else {
self.markovGenerator = self.ninjaGenerator
}
}
...
}```

When the button is tapped we’ll want to generate a new text and place it in our `resultLabel`. We’ll also want to add some flavor text based on the current active generator. (“Your Elf name is:” or “Your Ninja name is”).

Now, we don’t want to pass the full name the user inputted to the markov generator, let’s just pass the first 2 letters so that the generated word starts like the inputted name. We’ll use an `String` extension to get the first 2 characters of the name.

```extension String {
subscript (r: Range<Int>) -> String {
get {
let endIndex = advance(startIndex, r.endIndex - r.startIndex)

return self[Range(start: startIndex, end: endIndex)]
}
}
}```
```class ViewController {
...
@IBAction func didTapGenerateButton(sender: AnyObject) {

let generatedType = self.markovGenerator === self.elfGenerator ? "Elf" : "Ninja"

var name = ""

let nameLength = count(self.nameTextField.text)
if nameLength > 0 {
name = self.nameTextField.text[0..<min(2,nameLength)].lowercaseString
}

let generatedName = self.markovGenerator.generateWord(startString: name)

self.resultLabel.text = "Your \(generatedType) name is:\n\(generatedName)"
}
...
}```

And… we’re done

### Extensions

1. We have only considered individual letters for our markov chains, a useful extension is to consider groups of `n`letters (pairs, triplets, …). So for some group of `n` letters we know all the letters that follow after that group. This tends to give results that are closer to the original text.
2. We can use a very simillar approach to generate random text, given some source text. Instead of considering individual letters, we consider words and all the words that can succeed them.
(4 votes, average: 3.00 out of 5)