Introduction
In this tutorial we’ll look at generating plausible random names from a list of pre-existing names. We’ll also build a fun little app that lets you generate your Elf / Ninja name.
Just generating random characters won’t give us reasonable results. We’ll use a list of preexisting names in order to generate new ones. We’ll take advantage of the distribution of letters in our preexisting names.
For each name we look at each letter from that name.
For each letter we keep track of every letter that directly follows it.
Now we have a list of mappings from letters to all possible letters that directly follow them.
To generate a new name we use the following approach:
1. We start with some letter l1
. Our new name consists solely of l1
.
2. Starting from the letter (l1
) we randomly select a letter that follows it (l2
).
3. We add the letter l2
to our name
4. We repeat the the above algorithm starting at Step 2
with the letter l2
.
Once we reach a predetermined maximum number of letters or we reach a letter that has no successor we stop.
Lets say we have a list of names: “Mark, Michael and Mathew”. For each letter let’s look at the letters that follow it:
m : [a, i, a]
a : [r, e, t]
r : [k]
k : []
i : [c]
c : [h]
h : [a, e]
e : [l,w]
l : []
t : [h]
w : []
Now starting from the letter m
lets generate a new name.
- for
m
we pick the successora
- for
a
we pick the successore
- for
e
we pick the successorl
- l has no possible successor so we stop there
We end up generating the name Mael
, some other possibilities starting from M
would be: Maew
, Michew
, Michark
…
Some possible outputs when starting from h
would be Hel
, Hark
, Hael
, Hathathel
and others.
Note: in practice the list of words that we use to generate new words (the seed
) would be much larger than 3, typically at least 100 entries.
This process is knowed as a Markov Chain i.e. a process that has some states between which it transitions. Transitions occur randomly and a transition only depends on the current state, it doesn’t need any info on past states.
A Swift Implementation
NOTE: The implementation I’m going to show below is sub-optimal. i.e. It will be slow and take up a lot of unnecessary memory for long lists of names. I’m trying to emphasize clarity and simplicity instead of speed.
We’ll implement all our logic in a class MarkovGenerator
. The most important data that we’ll keep is a list of words
( the words we use to generate new words) and transitionTable
a dictionary that maps from letters to successors of letters. For each letter that is a key in the dictionary the corresponding value is an array of strings that contains the letters successors, the array may contain duplicates.
class MarkovGenerator {
private let words: [String]
private var transitionTable: [String : [String]] = [:]
}
We’ll create 2 initializers. One that takes a list of words and a convenience initializer that reads a list of words from a file. Our init
methods will call the buildTransitionTable
function that will take care of establishing the successors for each letter in each word.
class MarkovGenerator {
...
init(words: [String]) {
self.words = words
buildTransitionTable()
}
convenience init(fileName: String) {
let path = NSBundle.mainBundle().pathForResource(fileName, ofType: "txt")!
let names = String(contentsOfFile: path, encoding: NSUTF8StringEncoding, error: nil)!.componentsSeparatedByString("\n").map { $0.lowercaseString}
self.init(words: names)
}
}
In the buildTransitionTable
function we go through all our words and call the addWordToTransitionTable
on each word.
In the addWordToTransitionTable
function we go through all the characters in our word. For each character we consider it’s successor (nextChar
), in case we reach the last letter we set it’s successor to a special character("$"
).
We use the current character as a key into the transition table and add it’s successor to the array stored in the dictionary, updating or creating the array as needed.
class MarkovGenerator {
...
private func buildTransitionTable() {
for word in words {
addWordToTransitionTable(word)
}
}
private func addWordToTransitionTable(word: String) {
let wordArray = Array(word)
for index in 0..<wordArray.count {
let char = String(wordArray[index])
var nextChar = "$"
if index < wordArray.count - 1 {
nextChar = String(wordArray[index + 1])
}
var transitionsArray = transitionTable[char]
if (transitionsArray == nil) {
transitionsArray = []
}
transitionsArray?.append(nextChar)
transitionTable[char] = transitionsArray
}
}
...
}
Generating Words
First we’ll need to add some helper functions that we’ll use to generate words.
The generateNextCharacter
function takes a character and generates a new character by randomly selecting an entry from the array in the transition table corresponding to the input character.
class MarkovGenerator {
...
private func generateNextCharacter(character: String) -> String {
let transitionArray = transitionTable[character]!
let p = Int(arc4random()) % transitionArray.count
return transitionArray[p]
}
...
}
We’ll also need a function that generates a random character that appears in some of our input words. This is equivalent to selecting a random key of the transition table.
class MarkovGenerator {
...
private func randomCharacter() -> String {
return transitionTable.keys.array[Int(arc4random()) % transitionTable.keys.array.count]
}
...
}
We’ll want to somehow constrain the length of the generated words, to do this we’ll add 2 more properitesminWordLength
and maxWordLength
.
class MarkovGenerator {
...
var minWordLength = 4
var maxWordLength = 10
...
}
Now finally we’re ready to add a method that generates a word. We’ll add a method generateWord(startString: String = "") -> String
, the optional parameter startString
indicates some string that should be used to start generating the new string.
We’ll keep track of the generated word in a variable result
. We’ll also use a variable currentChar
that keeps track of the current character we’re processing. If the provided startingString
is empty we initialize currentChar
to a random character, otherwise we initialize it to the last character in the startingString
.
Next we repeat the following process a maximum of maxWordLength
times:
We transition to the next character using the generateNextCharacter
function.
If the generated character denotes the end of a word("$"
) and we have generated at least minWordLength
characters we return the generated string.
If the generated character denotes the end of a word("$"
) and we have not generated enough character we randomly generate a new character until it’s not the end of a word.
We append the generated character to our result.
Whenever the result is returned we’ll want to capitalize it. Names should start with an uppercase letter after all!
class MarkovGenerator {
...
func generateWord(startString: String = "") -> String {
var startStringArray = Array(startString)
var currentChar = randomCharacter()
var result = currentChar
if startStringArray.count > 0 {
result = startString
currentChar = String(startStringArray.last!)
}
for i in 1...maxWordLength {
currentChar = generateNextCharacter(currentChar)
if currentChar == "$" {
if i > minWordLength {
return result.capitalizedString
} else {
currentChar = randomCharacter()
}
}
result += currentChar
}
return result.capitalizedString
}
...
}
Example Usage of the class:
let markovGenerator = MarkovGenerator(words: ["Mark","Michael","Mathew"])
markovGenerator.generateWord() // Maew
Now let’s use the generator to build an iOS app!
Setting up the iOS app
You can download a starter project that includes the MarkovGenerator
class and 2 files with lists of names (“ElfNames.txt” and “NinjaNames.txt”).
The interface for the starter project is empty. We’ll add the following components that will make up the app:
A label
with text “Generate your elf / ninja name” A segmented control
with 2 segments “Elf” and “Ninja” A label
with text “Name:” A textfield
to the right of the name label A button
with title “Generate” An empty label
where the result will be displayed. Make sure to set the label
‘s Alignment
to Center
and Lines
to 2
The view should look like this after you have everything set up.
Connecting Outlets
We’ll have to connect 2 outlets:
The textfield as nameTextField
The bottom label as resultLabel
Also we’ll connect 2 actions:
The Value Changed
action for our segmented control as segmentedControlChanged
.
The Touch Up Inside
action for our button as didTapGenerateButton
.
Your ViewController should look like this after you’ve set everything up:
class ViewController: UIViewController {
@IBOutlet weak var resultLabel: UILabel!
@IBOutlet weak var nameTextField: UITextField!
@IBAction func segmentedControlChanged(sender: UISegmentedControl) {
}
@IBAction func didTapGenerateButton(sender: AnyObject) {
}
...
}
Implementing the Logic
We’ll keep 2 separate instances of MarkovGenerator
, one for Elf names and one for Ninja names. We’ll also keep a 3rd property that will keep the currently active MarkovGenerator
.
class ViewController {
...
var markovGenerator: MarkovGenerator!
var elfGenerator: MarkovGenerator!
var ninjaGenerator: MarkovGenerator!
}
In the viewDidLoad
method we’ll intialize our MarkovGenerators
.
class ViewController {
...
override func viewDidLoad() {
super.viewDidLoad()
self.elfGenerator = MarkovGenerator(file: "ElfNames")
self.ninjaGenerator = MarkovGenerator(file: "NinjaNames")
self.markovGenerator = self.elfGenerator
}
...
}
When the segmented control changes value we’ll want to update the active markov generator that we use:
class ViewController {
...
@IBAction func segmentedControlChanged(sender: UISegmentedControl) {
if sender.selectedSegmentIndex == 0 {
self.markovGenerator = self.elfGenerator
} else {
self.markovGenerator = self.ninjaGenerator
}
}
...
}
When the button is tapped we’ll want to generate a new text and place it in our resultLabel
. We’ll also want to add some flavor text based on the current active generator. (“Your Elf name is:” or “Your Ninja name is”).
Now, we don’t want to pass the full name the user inputted to the markov generator, let’s just pass the first 2 letters so that the generated word starts like the inputted name. We’ll use an String
extension to get the first 2 characters of the name.
extension String {
subscript (r: Range<Int>) -> String {
get {
let startIndex = advance(self.startIndex, r.startIndex)
let endIndex = advance(startIndex, r.endIndex - r.startIndex)
return self[Range(start: startIndex, end: endIndex)]
}
}
}
class ViewController {
...
@IBAction func didTapGenerateButton(sender: AnyObject) {
let generatedType = self.markovGenerator === self.elfGenerator ? "Elf" : "Ninja"
var name = ""
let nameLength = count(self.nameTextField.text)
if nameLength > 0 {
name = self.nameTextField.text[0..<min(2,nameLength)].lowercaseString
}
let generatedName = self.markovGenerator.generateWord(startString: name)
self.resultLabel.text = "Your \(generatedType) name is:\n\(generatedName)"
}
...
}
And… we’re done
Extensions
- We have only considered individual letters for our markov chains, a useful extension is to consider groups of
n
letters (pairs, triplets, …). So for some group ofn
letters we know all the letters that follow after that group. This tends to give results that are closer to the original text. - We can use a very simillar approach to generate random text, given some source text. Instead of considering individual letters, we consider words and all the words that can succeed them.