Try clicking your favorite names as they are generated below, they will be added to this list and folded back into the generator, so you can influence the generation of future names. Seize your destiny!
To the left you see a continuously updated list of random demon names. In this panel we have a literate program version of the Javascript code that does the work. This code is based on a Racket implementation of the same algorithm I published a while back. In accord with the not-particularly-functional nature of Javascript, this version foregoes functional purity.
Our goal is to generate names which are plausible Demon names. Our approach is to take a corpus of names (lifted from Wikipedia)
var testCorpus =
["Abraxas", "Abbadon",... "Overlord", "Zuul"];
and to extract therefrom information about what letters start a demon name, what letters follow what other letters at each position in a name, and what letters tend to end a demon name. With that collected information, we will then write a simulator which generates new names.
We will organize our name generator as a class, so we can abstract away certain features and parameterize the generation by a corpus:
function NameGenerator(corpus){
this.transitions = [[]];
this.corpus = [];
this.initialize(corpus);
return this;
}
We pass in the corpus we want to use to initialize the model, but we don't initialize the corpus member variable. Our model will support updating the corpus "online", and so we will let the `initialize` method populate the member corpus. This lets us do some sanitation on spurious names in the corpus too.
The method `initialize` is the most logical thing to implement next:
NameGenerator
.prototype
.initialize = function(corpus){
corpus
.forEach((function(string){
this.integrateString(string);
}).bind(this));
return this;
}
Meant to be called after the constructor sets up the member variables, this method simply adds each string in the corpus to the model by calling the method `integrateString`:
NameGenerator
.prototype
.integrateString = function(string){
var strLen = string.length,
initial = this.transitions[0],
lastLetter = string[0],
currentLetter, currentTable, i;
if(strLen>0){
initial.push(string[0]);
forEachCharacter(string.substring(1),(function(currentLetter,index){
addTableElement(this.getNthTable(index+1),
lastLetter,
currentLetter);
lastLetter = currentLetter;
}).bind(this));
addTableElement(this.getNthTable(strLen),lastLetter,"end");
this.corpus.push(string);
}
return this;
}
`integrateString` does the hard work. We want to track, for each position in a name, the probability that we transition to another letter. The first character of a string is a special case, because it is not conditioned on any previous character.
The member variable `transitions` maintains a list of
tables holding transition information. The first element of
that array is just an array of characters. When it comes
time to generate a new name, we will select one member of
that list of characters to be the initial character of our
name. If we add the first character of every string to this
array, then that selection will reflect the underlying
frequency distribution of initial characters. This is the
reason we simply initial.push(string[0]);
before we iterate over the subsequent characters of the
string.
Then, for each character after the first character, we grab the appropriately indexed table (offset by one, here, because we chopped off the first character before iterating) and add an element to that table indicating that the previous letter leads to the current letter. Each table beyond the first is a Javascript object with entries for all letters which occur at the previous position in the name. Each entry holds an array of characters seen after that character. During name generation, we will look up the previous character in the appropriately indexed table, and choose randomly the next character from the associated array. This technique is fast, but wastes memory.
Finally, after we finish iterating over the string, we indicate that the last character transitioned to a special item denoted by "end", which signals that the name is finished. During generation, if we choose an "end" from a transition table, we terminate the name generation.
Here are the utility methods and functions upon which the previous code depends. They are mostly self explanatory, ensuring that entries made into tables which don't yet exist first create and then populate the appropriate table.
NameGenerator
.prototype
.getNthTable = function(n){
var table = this.transitions[n] || {};
this.transitions[n] = table;
return table;
}
function forEachCharacter(s,f){
Array.prototype.forEach.call(s,f);
}
function addTableElement(table, lastLetter, currentLetter){
var array = table[lastLetter] || [];
array.push(currentLetter);
table[lastLetter] = array;
}
In many ways, generation of a novel name is the opposite of adding a name to the corpus. We choose an initial letter from the first transition table, and then we choose each subsequent letter by consulting the appropriate transition table based on the previously chosen letter. If we encounter "end" we terminate the string and return the name.
NameGenerator
.prototype
.generateOne = function(){
var name = [selectOne(this.transitions[0])],
lastLetter = name[0],
done = false,
i = 1,
currentLetter;
while(!done){
currentLetter = selectOne(this.getNthTable(i)[lastLetter]);
if(currentLetter==="end"){
done=true;
} else {
name.push(currentLetter);
lastLetter = currentLetter;
i = i + 1;
}
}
return name.join("");
}
Because each of our tables stores the characters we've encountered directly, we can simply use `selectOne` to get a random next character. Incidentally, this is the implementation of `selectOne`
function selectOne(a){
return a[Math.floor(Math.random()*a.length)];
}
We can implement a method to generate N names trivially:
NameGenerator
.prototype
.generateN = function(n){
var results = [];
for(var i = 0; i < n; i = i + 1){
results.push(this.generateOne());
}
return results;
}
See the actual source code here. Try it out with your own datasets! I have to admit that this implementation is significantly faster than the one I wrote in Racket. The difference may have something to do with the excellent v8 engine that I am testing on, but it probably has more to do with the fact that the Racket implementation used a memory-saving representation of the transition tables which required iteration through a list to select a new letter. I expect that even a purely functional implementation that relied on more efficient data structures would be significantly faster in Racket.