On Inform 7, Natural Language Programming and the Principle of Least Surprise

I’ve been pecking away at Inform 7 lately on account of its recently acquired Gnome front end. For those not in the know, Inform (and Inform 7) is a text adventure authoring language. I’ve always been interested in game programming but never had the time (or more likely the persistence of mind) to develop one of any sophistication myself. Usually in these cases one lowers the bar, and as far as interactive media goes, you can’t get much lower, complexity wise, than text adventures.

Writing a game in Inform amounts to describing the world and its rules in terms of a programming language provided by Inform. The system then collects the rules and descriptions and creates a game out of them. Time was, programming in Inform looked like this:

Constant Story "Hello World";
Constant Headline "^An Interactive Example^";
Include "Parser";
Include "VerbLib";
[ Initialise;
  location = Living_Room;
  "Hello World"; ];
Object Kitchen "Kitchen";
Object Front_Door "Front Door";
Object Living_Room "Living Room"
  with
      description "A comfortably furnished living room.",
      n_to Kitchen,
      s_to Front_Door,
  has light;

Which is recognizably a programming language, if a bit strange and domain specific. These days, writing Inform looks like this (from my little project):

"Frustrate" by "Vincent Toups"
Ticks is a number which varies.
Ticks is zero.
When play begins:  
Now ticks is 1.

The Observation Room is a room. "The observation room is cold and
surreal. Stars dot the floor underneath thick, leaded glass, cutting
across it with a barely perceptible tilt. This room seems to have been
adapted for storage, and is filled with all sorts of sub-stellar
detritus, sharp in the chill and out of place against the slowly
rotating sky. Even in the cold, the place smells of dust, old wood
finish, and mildew. [If ticks is less than two] As the sky cuts its
way across the milky way, the whole room seems to tilt.  You feel
dizzy.[else if ticks is less than four]The plane of the galaxy is
sinking out of range and the portal is filling with the void of
space. It feels like drowning.[else if ticks is greater than 7]The
galactic plane is filling the floor with a powdering of
stars.[else]The observation floor looks out across the void of space.
You avert your eyes from the floor.[end if]"

Every turn: Now ticks is ticks plus one.
Every turn: if ticks is 10:
decrease ticks by 10.

As you can see, the new Inform adopts a “natural language” approach to programming. As the Inform 7 website puts it

[The] Source language [is] modelled closely on a subset of English, and usually readable as such.

Also reproduced in the Inform 7 manual is the following quote from luminary Donald Knuth:

Programming is best regarded as the process of creating works of literature, which are meant to be read… so we ought to address them to people, not to machines. (Donald Knuth, “Literate Programming”, 1981)

Which better than anything else illustrates the desired goal of the new system: Humans are not machines! Machines should accommodate our modes of expression rather than forcing us to accommodate theirs! If it weren’t for the unnaturalness of programming languages, the logic goes, many more people would program. The creation of interactive fiction is meant to be inclusive, so why not teach the machine to understand natural language?

This is a laudable goal. I really think the future is going to have a lot more programmers in it, and a primary task of language architects is to design programming languages which “regular” people find intuitive and useful. For successes in that arena see Python, or Smalltalk or even Basic. Perhaps these languages are not the pinnacle of intuitive programming environments but whatever that ultimate language is, I doubt seriously it will look much like Inform 7.

This is unfortunate, because reading Inform 7 is very pleasant, and the language is even charming from time to time. Unfortunately, it’s very difficult to program in[1], and I say that as something of a programming language aficionado. It’s true that creating the basic skeleton of a text adventure is very easy, but even slightly non-trivial extensions to the language are difficult to intuitively get right. For instance, the game I am working on takes place on a gigantic, hollowed out natural satellite, spinning to provide artificial gravity. The game begins in a sort of observation bubble, where the floor is transparent and the stars are visible outside. Sometimes this observation window should be pointing into the plane of the Milky Way, but other times it should be pointing towards the void of space because the station’s axis of rotation is parallel to the plane of the galaxy. The description of the room should reflect these different possibilities.

Inform 7 is turn based, so it seems like it should be simple enough to create this sort of time-dependent behavior by keeping track of turns, but it was frustrating to figure out how to “tell” the Inform compiler what I wanted.

First I tried joint conditionals:

  When the player is in the Observation Room and
the turn is even, say: "The stars fill the floor."

But this resulted in an error message. Maybe the system doesn’t know about “evenness,” so I tried:

  When the player is in the Observation Room and
the turn is greater than 3, say "The stars fill the floor."

(Figuring I could add more complex logic later).

Eventually I figured out the right syntax, which involved creating a variable and having a rule set its value each turn and a separate rule reset the value with the periodicity of the rotation of the ship, but the process was very frustrating. In Python the whole game might look, with the proper abstractions, like:


while not game.over():
    game.describe_location(player.position)
    if (player.position == 'The Observation Room' and
         game.turn() % 10):
        print "The stars fill the floor."

Which is not perhaps as “englishy” as the final working Inform code (posted near the beginning of this article) but is much more concise and obvious.

But that isn’t the reason the Python version is less frustrating to write. The reason is the Principle of Least Surprise, which states, roughly, that once you know the system, the least surprising way of doing things will work. The problem with Inform 7 is that “the system” appears to the observer to be “written English (perhaps more carefully constructed than usual)”. This produces in the coder a whole slew of assumptions about what sorts of statements will do what kinds of things, and as a consequence you try a lot of things which, according to your mental model, inexplicably don’t work.

It took me an hour to figure out how to make what amounts to a special kind of clock, and I had the benefit of knowing that underneath all that “natural English” was a (more or less) regular old (Prolog-flavored) programming environment. I can’t imagine the frustration a non-programmer would feel when they first decided to do something not directly supported or explained in the standard library or documentation.

That isn’t the only problem, either. Natural English is a domain specific language for communicating between intelligent things. It assumes that the recipient of the stream of tokens can easily resolve ambiguities, invert accidental negatives (pay attention, people do this all the time in speech) and tell the difference between important information and information it’s acceptable to leave ambiguous. Not only are computers presently incapable of this level of deduction/induction, but generally speaking we don’t want that behavior anyway: we are programming to get a computer to perform a very narrowly defined set of behaviors. The implication that Inform 7 will “understand you” in this context is doubly frustrating. And you don’t want it to “understand”; you want it to do exactly what you say.

A lot of this could be ameliorated by a good piece of reference documentation, spelling out in exact detail the programmatic environment’s behavior. Unfortunately, the bundled documentation is a big tutorial which does a poor job of delineating between constructs of the language itself and elements of its standard library. It all seems somewhat magical in the tutorial, in other words, and the intrepid reader, wishing to generalize on the rules of the system, is often confounded.

Nevertheless, I will probably keep using it. The environment is clean and pleasant, and the language, when you begin to feel out the classical language under the hood, is ok. And you can’t beat the built in features for text based games. I doubt that Inform 7, though, will seriously take off. Too many undeliverable promises.

[1] This may make it the only “Read Only” programming language I can think of.

A Critique of The Programming Language J

I’ve spent around a year now fiddling with and eventually doing real
data analytic work in The Programming Language J. J is one of
those languages which produces a special enthusiasm from its users and
in this way it is similar to other unusual programming languages like
Forth or Lisp. My peculiar interest in the language was due to no
longer having access to a Matlab license, wanting an array oriented
language to do analysis in, and an attraction to brevity and the point
free programming style, two aspects of programming which J emphasizes.

Sorry, Ken.

I’ve been moderately happy with it, but after about a year of light
work in the language and then a month of work-in-earnest (writing
interfaces to gnuplot and hive and doing Bayesian inference and
spectral clustering) I now feel I am in a good position to offer a
friendly critique of the language.

First, The Good

J is terse to nearly the point of obscurity. While terseness is not a
particularly valuable property in a general purpose programming
language (that is, one meant for Software Engineering), there is a
case to be made for it in a data analytical language. Much of my work
involves interactive exploration of the structure of data and for that sort
of workflow, being able to quickly try a few different ways of
chopping, slicing or reducing some big pile of data is pretty
handy. That you can also just copy and paste these snippets into some
analysis pipeline in a file somewhere is also nice. In other words,
terseness allows an agile sort of development style.

Much of this terseness is enabled by built in support for tacit
programming. What this means is that certain expressions in J are
interpreted at function level. That is, they denote, given a set of
verbs in a particular arrangement, a new verb, without ever explicitly
mentioning values.

For example, we might want a function which adds up all the maximum
values selected from the rows of an array. In J:

+/@:(>./"1)

J takes considerable experience to read, particularly in Tacit
style. The above denotes, from RIGHT to LEFT: for each row ("1)
reduce (/) that row using the maximum operation >. and then (@:)
reduce (/) the result using addition (+). In English, this means:
find the max of each row and sum the results.

Note that the meaning of this expression is itself a verb, that is
something which operates on data. We may capture that meaning:

sumMax =: +/@:(>./"1)

Or use it directly:

+/@:(>./"1) ? (10 10 $ 10)

Tacit programming is enabled by a few syntactic rules (the so-called
hooks and forks) and by a bunch of function level operators called
adverbs and conjunctions. (For instance, @: is a conjunction roughly
denoting function composition, while the expression +/ % # is a fork,
denoting the average operation. The forkness is that it is three
expressions denoting verbs separated by spaces.)

The details obscure the value: it’s nice to program at function level
and it is nice to have a terse denotation of common operations.

J has one other really nice trick up its sleeve called verb
rank. Rank itself is not an unusual idea in data analytic languages:
it just refers to the length of the shape of the matrix; that is, its
dimensionality.

We might want to say a bit about J’s basic evaluation strategy before
explaining rank, since it makes the origin of the idea more clear. All
verbs in J take either one argument (on the right) or two (one on the
left and one on the right). Single argument verbs are called monads,
two argument verbs are called dyads, and we call the corresponding
invocations monadic or dyadic. Most of J’s built-in operators can be
used both monadically and dyadically, and often the two meanings are unrelated.

NB. monadic and dyadic invocations of <
4 < 3 NB. evaluates to 0
<3 NB. evaluates to 3, but in a box.

Given that the arguments (usually called x and y respectively) are
often matrices it is natural to think of a verb as some sort of matrix
operator, in which case it has, like any matrix operation, an expected
dimensionality on its two sides. This is sort of what verb rank is
like in J: the verb itself carries along some information about how
its logic operates on its operands. For instance, the built-in verb
-: (called match) compares two things structurally. Naturally, it
applies to its operands as a whole. But we might want to compare two
lists of objects via match, resulting in a list of results. We can
do that by modifying the rank of -:

x -:"(1 1) y

The expression -:"(1 1) denotes a version of match which applies to
the elements of x and y, each treated as a list. Rank in J is roughly
analogous to the use of repmat, permute and reshape in Matlab: we can
use rank annotations to quickly describe how verbs operate on their
operands in hopes of pushing looping down into the C engine, where
it can be executed quickly.
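If it helps to see the shape of the idea outside of J, here is a rough JavaScript sketch of what x -:"(1 1) y is doing. The helper and the structural comparison are made up purely for illustration; they are not part of J or of any library mentioned here.

// Illustrative only: apply a whole-value comparison to corresponding
// rank-1 cells (rows) of x and y, yielding one result per row.
function matchRows(x, y){
  return x.map(function(row, i){
    return JSON.stringify(row) === JSON.stringify(y[i]) ? 1 : 0;
  });
}

matchRows([[1,2],[3,4]], [[1,2],[9,9]]); // -> [1, 0]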

To recap: array orientation, terseness, tacit programming and rank are
the really nice parts of the language.

The Bad and the Ugly

As a programming environment J can be productive and efficient, but it
is not without flaws. Most of these have to do with irregularities in
the syntax and semantics which make the language confusing without
offering additional power. These unusual design choices are
particularly apparent when J is compared to more modern programming
languages.

Fixed Verb Arities

As indicated above, J verbs, the nearest cousin to functions or
procedures from other programming languages, have arity 1 or
arity 2. A single symbol may denote expressions of both arities, in
which case context determines which function body is executed.

There are two issues here, at least. The first is that we often want
functions of more than two arguments. In J the approach is to pass
boxed arrays to the verb. There is some syntactic sugar to support
this strategy:

multiArgVerb =: monad define
'arg1 arg2 arg3' =. y
NB. do stuff
)

If a string appears as the left operand of the =. operator, then
simple destructuring occurs. Boxed items are unboxed by this
operation, so we typically see invocations like:

multiArgVerb('a string';10;'another string')

But note that the expression on the right (starting with the open
parenthesis) just denotes a boxed array.

This solution is fine, but it does short-circuit J’s notion of verb
rank: we may specify the rank with which the function operates on
its left or right operand as a whole, but not on the individual
“arguments” of a boxed array. But nothing about the concept of rank
demands that it be restricted to one or two argument functions: rank
entirely relates to how arguments are extracted from array valued
primitive arguments and dealt to the verb body. This idea can be
generalized to functions of arbitrary argument count.

Apart from this, there is the minor gripe that denoting such single
use boxed arrays with ; feels clumsy. Call that the Lisper’s bias:
the best separator is the space character.[1]

A second, related problem is that you can’t have a
zero argument function either. This isn’t the only language where
this happens (Standard ML and OCaml also have this tradition, though I
think it is weird there too). The problem in J is that it would feel
natural to have such functions and to be able to mention them.

Consider the following definitions:

o1 =: 1&-
o2 =: -&1

(o1 (0 1 2 3 4)); (o2 (0 1 2 3 4))
┌────────────┬──────────┐
│1 0 _1 _2 _3│_1 0 1 2 3│
└────────────┴──────────┘

So far so good. Apparently using the & conjunction (called “bond”)
we can partially apply a two-argument verb on either the left or the
right. It is natural to ask what would happen if we bonded twice.

(o1&1)
o1&1

Ok, so it produces a verb.

 3 3 $ ''
  ;'o1'
  ;'o2'
  ;'right'
  ;((o1&1 (0 1 2 3 4))
  ; (o2&1 (0 1 2 3 4))
  ;'left'
  ; (1&o1 (0 1 2 3 4))
  ; (1&o2 (0 1 2 3 4)))

┌─────┬────────────┬────────────┐
│     │o1          │o2          │
├─────┼────────────┼────────────┤
│right│1 0 1 0 1   │1 0 _1 _2 _3│
├─────┼────────────┼────────────┤
│left │1 0 _1 _2 _3│_1 0 1 2 3  │
└─────┴────────────┴────────────┘

I would describe these results as goofy, if not entirely impossible to
understand (though I challenge the reader to do so). However, none of
them really seem right, in my opinion.

I would argue that one of two possibilities would make some sense.

  1. (1&-)&1 -> 0 (e.g., 1-1)
  2. (1&-)&1 -> 0"_ (that is, the constant function returning 0)

That many of these combinations evaluate to o1 or o2 is doubly
confusing because it ignores a value AND because we can denote
constant functions (via the rank conjunction), as in the expression
0"_.

Generalizations

What this is all about is that J doesn’t handle the idea of a
function very well. Instead of having a single, unified abstraction
representing operations on things, it has a variety of different ideas
that are function-like (verbs, conjunctions, adverbs, hooks, forks,
gerunds) which in a way puts it ahead of a lot of old-timey languages
like Java 7 without first-class functions, but ultimately this
handful of disparate techniques fails to achieve the conceptual unity
of first-class functions with lexical scope.

Furthermore, I suggest that nothing whatsoever would be lost (except
J’s interesting historical development) by collapsing these ideas
into the more typical idea of closure capturing functions.

Other Warts

Weird Block Syntax

Getting top-level[2] semantics right is hard in any
language. Scheme is famously ambiguous on the subject, but at
least for most practical purposes it is comprehensible. Top-level has
the same syntax and semantics as any other body of code in Scheme
(with some restrictions about where define can be evaluated), but in
J neither is the same.

We may write block strings in J like so:

blockString =: 0 : 0 
Everything in here is a block string.       
)

When the evaluator reads 0 : 0 it switches to sucking up characters
into a string until it encounters a line with a ) as its first
character. The construct 3 : 0 does the same except the resulting string is
turned into a verb.

plus =: 3 : 0
    x+y
)

However, we can’t nest this syntax, so we can’t define non-tacit
functions inside non-tacit functions. That is, this is illegal:

plus =: 3 : 0
  plusHelper =. 3 : 0
    x+y
  )
  x plusHelper y
)

This forces the programmer to do a lot of lambda lifting
manually, and it also forces them to bump into the restrictions on
function arity and their poor interaction with rank behavior: if
we wish to capture parts of the private environment, we are forced to
pass those parts of the environment in as an argument, forcing us either to
give up rank behavior or to jump up a level to verb
modifiers.

Scope

Of course, you can define local functions if you do it tacitly:

plus =: 3 : 0
    plusHelper =. +
    x plusHelper y   
)

But even if you are defining a conjunction or an adverb, from which you
are able to “return” a verb, you can’t capture any local functions –
they disappear as soon as execution leaves the conjunction or adverb
scope.

That is because J is dynamically scoped, so any capture has to be
handled manually, using things like adverbs, conjunctions, or the good
old fashioned fix f., which inserts values from the current scope
directly into the representation of a function. Essentially all modern
languages use lexical scope, which is basically a rule which says: the
value of a variable is exactly what it looks like from reading the
program. Dynamic scope says: the value of the variable is whatever
its most recent binding is.

Recapitulation!

The straight dope, so to speak, is that J is great for a lot of
reasons (terseness, rank) but also has a lot of irregular language
features (adverbs, conjunctions, hooks, forks, etc) which could all be
folded down into regular old functions without harming the
benefits of the language, simplifying it enormously.

If you don’t believe that regular old first-class functions with
lexical scope can get us where we need to go, check out my
tacit-programming libraries in R and Javascript. I
even wrote a complete, if ridiculously slow, implementation of J’s
rank feature, literate-style, here.


Footnotes

[1] It bears noting that ; in an expression like (a;b;c)
is not a syntactic element, but a semantic one. That is, it is the
verb called “link” which has the effect of linking its arguments into
a boxed list. It is evaluated like this:

(a;(b;c))

(a;b;c) is nice looking but a little strange: In an expression
(x;y) the effect depends on whether y is boxed already or not: x is always boxed regardless, but y is boxed only if it wasn’t boxed before.

[2] Top level? Top-level is the context where everything
“happens,” if anything happens at all. Tricky questions about top-level
include: can functions refer to functions which are not yet defined,
if you read a program from top to bottom? What about values? Can you
redefine functions, and if so, how do the semantics work? Do functions
which call the redefined function change their behavior, or do they
continue to refer to the old version? What if the calling interface
changes? Can you check types if you imagine that functions might be
redefined at any time? If your language has classes, what about
instances created before a change in the class definition? Believe it or
not, Common Lisp tries to let you do this – and it’s confusing!

On the opposite end of the spectrum are really static languages like
Haskell, wherein type enforcement and purity ensure that the top-level
is only meaningful as a monolith, for the most part.


Prime Industrial Space

In their enthusiasm, they built roads
(huge, wide things, six lanes or more, sidewalks)
for which they had no buildings, through forest,
or through meadow. But to drive them, empty
and spacious, is a kind of luxury.
You have passed from, at fifty miles per hour,
reality; prime industrial space.

And into what? At an empty crossroad,
kids have knocked down the “road closed” barriers,
the black circles of their tires crossing,
crisscrossing, looping across that square of
white concrete. With the barriers down
you can drive right through to where the road ends,
leave your car, and walk through dirt, to the woods.

Quick, Probably Naive Thoughts about Turing Machines and Random Numbers

Here is a fact which is still blowing my mind, albeit quietly, from the horizon.

Turing Machines, the formalism which we use to describe computation, do not, strictly speaking, cover computational processes which have access to random values. When we wish to reason about such machines people typically imagine a Turing Machine with two tapes, one which takes on the typical role and another which contains an infinite string of random numbers which the machine can peel off one at a time.


I know what you are all thinking: can’t I just write a random number generator and put it someplace on my Turing machine’s tape, and use that? Sure, but those numbers aren’t really random, particularly in the sense that a dedicated attacker with access to the output of your Turing machine can, in principle, detect the difference between your machine and one with bona fide random numbers. And, in fact, it is still an open question whether there exists a random number generator, using only polynomial time and space, whose output no polynomial time and space algorithm can distinguish from a genuinely random process.

All that is really an aside. What is truly, profoundly surprising to me is this: a machine which has access to random numbers seems to be more powerful than one without random numbers. In what sense? There are algorithms which are not practical on a normal Turing machine which become eminently practical on a Turing machine with a random tape as long as we are able to accept a vanishingly small probability that the result is wrong. These are algorithms about which we can even do delta/epsilon style reasoning: that is, we can make the probability of error as small as we like by the expedient of repeating the computation with new random numbers and (for instance) using the results as votes to determine the “correct answer.” This expedient does not really modify the big O complexity of algorithms.
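To make the voting expedient concrete, here is a tiny sketch in JavaScript; all of the names are made up for illustration. The assumption is that decide returns the correct yes/no answer with probability strictly greater than one half.

// Repeat a randomized decision procedure and take the majority answer.
// If each trial is right with probability > 1/2, the error probability
// shrinks as the number of trials grows.
function majorityVote(decide, input, trials){
  var yes = 0;
  for(var i = 0; i < trials; i = i + 1){
    if(decide(input, Math.random)){ yes = yes + 1; }
  }
  return yes > trials / 2;
}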

Buridan’s Ass is a paradox in which a hungry donkey sits between two identical bales of hay and dies of hunger, unable to choose which to eat on account of their equal size. There is a strange sort of analogy here: if the Ass has a source of random numbers he can pick one randomly and survive. It is almost as if deterministic, finitist mathematics, in its crystalline precision, encounters and wastes energy on lots of tiny Ass’ Dilemmas which put certain sorts of results practically out of reach, but if we fuzz it up with random numbers, suddenly it is liberated to find much more truth than it was before. At least that is my paltry intuitive understanding.

In defense of “The Thin Line Aesthetic”

I was lucky to be one of the guest artists at the Code+Art Student Visualization contest at NCSU library recently, where parts of my generative art work Clocks were displayed. In preparation for the show, I wrote specific pieces for the space, which uses a large screen made out of Christie Micro-tiles. These are modular screens which can be used to construct large and even irregular displays. While the Micro-tiles gave me the largest amount of screen real-estate I’d ever worked with, they posed their own challenges. In particular, the luminance tends to vary between tiles in such a way that art works with predominantly white backgrounds can be distracting, since each tile making up the whole screen will appear at a different brightness. The effect is less noticeable when darker colors are supplied.

Thin Lines!

During discussions about these challenges, it was suggested that using dark backgrounds might help get us away from “the thin line aesthetic” so predominant in generative art. I agree, thin lines, typically dark on white, are extremely common in generative artworks. But here I will say a few words in their defense.

Clock 10 is a thin-line clock. I’ve had occasion to think very carefully about why this clock works as a generative art piece, and it is typically the clock I talk about when talking about the project as a whole because it has a relatively simple, but non-trivial, account. Briefly, Clock 10’s charged particles want to distribute themselves evenly over the face of the clock (since they are all positively charged, and hence repel one another). The clock hands persistently frustrate this tendency by moving particles from the second hand to the hour or minute hand. As such, the particles are constantly seeking, but never attaining, their low energy equilibrium state. Critically, there is not just one such equilibrium state: there exists a family of states related by symmetry transforms (continuous and discrete rotations).

What this boils down to is that Clock 10 traces out the symmetries of the ground states. This is why, if you let the clock run for a half hour, you see concentric rings appear: these rings are the places that particles would like to be modulo rotations, were they allowed to find those states without interference.


A few aesthetic choices support the relationship of Clock 10 to this interpretation. The color of the trace left by each particle is adjusted in such a way that it only darkens as the particle settles down, so that paths near equilibrium are dark while those far from it are white. And, of course, we use thin lines, which allow lots of information about those trajectories to appear on the face of the clock.

I am a barbarian, so far as artistic pedigree is concerned, but if Clock 10 lives anywhere in the landscape of the practice of generative art, it is in the school of Complexism. Complexism suggests one role of generative art is to explore complex systems. In the sense that Clock 10 is an aesthetically pleasing visualization of the ground states of a certain physical system, it meets this criterion. And thin lines help it operate in this way because they allow us to see a lot of different trajectories in a small amount of space. They give the clock non-trivial texture: tendencies of the motion can be apprehended at a large scale while details of the motion are still discernible.

I tried a variety of other ways of visualizing the trajectories, but none were particularly satisfying because they obscured the fine-scale variations in a way which significantly reduced the information content of the visualization. Part of the impact of generative art is that it imitates nature, to an extent, in that it can compound over and over again many fine motions. The accumulation of so many effects is part of the immediate perception of a work, and undermining it undermines one of the fundamental advantages of using computers, systems capable simultaneously of great precision and great, repetitive patience.

So use thin lines! Or, if you are seeking alternative aesthetic choices, try to find ones which capture the same benefits, packing lots of precise detail into the image in such a way that larger trends are also made visible.



Notes on `Quantum Computing Since Democritus, Chapter 1`

For a long time, I’ve been interested in the sorts of questions exemplified by the following example:

Suppose we are Isaac Newton or  Gottfried Leibniz. We have at our disposal two sources of inspiration: data, collected by intrepid philatelists like Tycho Brahe and something like theory, in the form of artifacts like Kepler’s Laws, Galileo’s pre-Newtonian laws of motion (for it was he who first suggested that objects in motion retain that motion unless acted upon), and a smattering of Aristotelian and post-Aristotelian intuitions about motion (for instance, John Philoponus’ notion that, in addition to the rules of motion described by Aristotle, one object could impart on another a transient impetus). You also have tables and towers and balls you can roll on them or drop from them. You can perform your own experiments.

The question, then, is how do you synthesize something like Newton’s Laws. Jokes about Newton’s extra-scientific interests aside, this is alchemy indeed, and an alchemy to which the training most physicists receive (or at least the training I received) does not address itself.

Newton’s Laws are generally dropped on the first year physics student (perhaps after working with statics for a while) fully formed:

First law: When viewed in an inertial reference frame, an object either remains at rest or continues to move at a constant velocity, unless acted upon by an external force.
Second law: The vector sum of the external forces F on an object is equal to the mass m of that object multiplied by the acceleration vector a of the object: F = ma.
Third law: When one body exerts a force on a second body, the second body simultaneously exerts a force equal in magnitude and opposite in direction on the first body.

(this formulation borrowed from Wikipedia)

The laws are stated here in terms of a lot of subsidiary ideas: inertial reference frames, forces, mass. Neglecting the reference to mathematical structures (vector sums), this is a lot to digest: and it is hard to imagine Newton just pulling these laws from thin air.  It took the species about 2000 years to figure it out (if you measure from Zeno to Newton, since Newton’s work is in some sense a practical rejoinder to the paradoxes of that pre-Socratic philosopher), so it cannot be, as some of my colleagues have suggested, so easy to figure out.

A doctorate in physics takes (including the typical four year undergraduate degree in math, physics or engineering) about ten years. Most of what is learned in such a program is pragmatic theory: how to take a problem statement or something even more vague, identify the correct theoretical approach from a dictionary of possibilities, and then to “turn the crank.” It is unusual (or it was unusual for me) for a teacher to spend time posing more philosophical questions. Why, for instance, does a specific expression called the “Action,” when minimized over all possible paths of a particle, find a physical path? I’ve had a lot of physicist friends dismiss my curiosity about this subject, but I’m not the only one interested (e.g., the introductory chapter of Lanczos’ “The Variational Principles of Mechanics”).

What I am getting to here, believe it or not, is that I think physicists are over-prepared to work problems and under-prepared to do the synthetic work of building new theoretical approaches to existing unsolved problems. I enjoy the freedom of having fallen from the Ivory Tower, and I aim to enjoy that freedom in 2016 by revisiting my education from a perspective which allows me to stop and ask “why” more frequently and with more intensity.

Enter Scott Aaronson’s “Quantum Computing Since Democritus,” a book whose title immediately piqued my interest, combining, as it does, the name of a pre-Socratic philosopher (the questions of which form the basis, in my opinion, for so much modern physics) with the most modern and pragmatic of contemporary subjects in physics. Aaronson’s project seems to accomplish exactly what I want as an armchair physicist: stopping to think about what our theories really mean.

To keep myself honest, I’ll be periodically writing about the chapters of this book – I’m a bit rusty mathematically and so writing about the work will encourage me to get concrete where needed.

Atoms and the Void

Atoms and the Void is a short chapter which basically asks us to think a bit about what quantum mechanics means. Aaronson describes Quantum Mechanics in the following way:

Here’s the thing: for any isolated region of the universe that you want to consider, quantum mechanics describes the evolution in time of the state of that region, which we represent as a linear combination – a superposition – of all the possible configurations of elementary particles in that region. So, this is a bizarre picture of reality, where a given particle is not here, not there, but in a sort of weighted sum over all the places it could be. But it works. As we all know, it does pretty well at describing the “atoms and the void” that Democritus talked about.

The needs of an introductory chapter, I guess, prevent him from describing how peculiar this description is: for one thing, there is never an isolated region of the universe (or at least, not one we are interested in, I hope obviously). But he goes on to meditate on this anyway by asking us to think about how we interpret measurement where quantum mechanics is concerned. He dichotomizes interpretations of quantum mechanics by where they fall on the question of putting oneself in coherent superposition.

Happily, he doesn’t try to claim that any particular set of experiments can definitely disambiguate different interpretations of quantum mechanics. Instead he suggests that by thinking specifically of Quantum Computing, which he implies gets most directly at some of the issues raised by debates over interpretation, we might learn something interesting.

This tantalizes us to move to chapter 2.

Aping J’s Verb Rank in Puff

This blog post will sketch out some thoughts relating to Puff, a function level programming language I am embedding in Javascript, and J’s notion of operator rank.

Rank as it pertains to nouns in J is fairly easy to understand: it is just the number of dimensions of an array. Scalars, like 10, have rank 0, the empty vector (denotable as 0 $ 0) has rank 1, a simple vector (0 0 0) has rank 1, a 2d matrix rank 2, etc.

But J also has rank for verbs. Consider the verb +.

(1 2 3) + (4 5 6)
-> 5 7 9

(For J tyros: + is called a verb in J and furthermore we use it in its dyadic sense, which is to say we pass it arguments on the left and the right.)

Informally we understand from this that in J + operates element-wise on both its left and right operands. This means its left and right rank are both zero and it operates, then, on the rank zero elements of its arguments: the individual scalar values.

But there is more to the story. As a side note, we can denote multi-dimensional arrays in J like so:

]example =. 2 3 $ 1 2 3 4 5 6 
1 2 3
4 5 6

(For the curious, that is “change the shape ($) of the array 1 2 3 4 5 6 so that it is 2 3”)

J has a box operator which is handy for demonstrating rank issues. It is denoted < and wraps any value into a box, which is a special scalar value type which holds something else.

<1
┌─┐
│1│
└─┘

Operators in J have rank and the rank of < is infinite. This means that it always operates on its argument as a whole.

<(1 2 3 4 5 6)
┌───────────┐
│1 2 3 4 5 6│
└───────────┘

But the smart thing about J is that you can modify verbs with adverbs and conjunctions, one of which returns a new verb with a different rank. See if you can guess what all this means:

<"0(1 2 3 4 5 6)
┌─┬─┬─┬─┬─┬─┐
│1│2│3│4│5│6│
└─┴─┴─┴─┴─┴─┘

The array denotation 1 2 3 4 5 6 is the same as before, but now we have written <"0 instead of <. " is a conjunction which takes the verb on its left and produces a new verb whose rank is given by the value on its right. The result of <"0, then, is a verb with the same meaning as < except that it has rank 0. Verbs with rank 0 operate on the smallest possible cells of the array, so that

<"0(3 2 $ 1 2 3 4 5 6)
┌─┬─┐
│1│2│
├─┼─┤
│3│4│
├─┼─┤
│5│6│
└─┴─┘

each element of the input is boxed regardless of the incoming array’s shape or rank.

If we use a different rank:

<"1(3 2 $ 1 2 3 4 5 6)
┌───┬───┬───┐
│1 2│3 4│5 6│
└───┴───┴───┘

We get a different result. One-ranked verbs operate on the 1-cells (that is, the cells of rank 1) of the incoming array, in this case the arrays 1 2, 3 4, and 5 6.

The rules work for dyadic verbs too – each argument of the verb has a rank (a right rank and a left rank) which determines how the underlying logic represented by the verb is used to join elements from the right and left arguments.

By modifying verb rank you can custom tailor your iteration through argument arrays and avoid most explicit looping.

Puff

Puff mostly apes the function level semantics of J, but we can analogize verb rank too. Consider the Puff function map which, given a single function argument, means:

var plus1 = _p(plus,1);
map(plus1)([1,2,3]) -> [2,3,4]

plus1 above would have, in J, an infinite rank: it always applies to its whole argument. When we say map(plus1) we have a new function which applies to the (N-1)-cells of its argument (in this case integers). In other words, map creates a new function which peels off one layer of its input and applies the original function, collecting the outputs.

What, then, is

var mm_plus1 = map(map(plus1))

?

(NB, we can denote this in Puff via rep(map,2,plus1))

Here is a hint:

mm_plus1([[1,2,3],[4,5,6]]) -> [[2,3,4],[5,6,7]]

Now we have a function operating on the N-2 cells of the input. Rank in J typically operates bottom up: we start at rank 0 operating on the 0 cells, and increasing rank operates on larger and larger chunks of the input array. In contrast, iterative application of map in Puff moves us from the larger chunks to smaller and smaller chunks, until a number of applications equal to the array rank has us operating on individual items.

J being J we can mimic this behavior using negative rank.

<"_2(3 2 $ 1 2 3 4 5 6)
┌─┬─┐
│1│2│
├─┼─┤
│3│4│
├─┼─┤
│5│6│
└─┴─┘

(_2 denotes the number -2 in J for possibly obscure reasons to do with making the parser simpler.)

Given that 3 2 $ 1 2 3 4 5 6 has rank 2, the verb <"_2 must operate on the 2-2=0 cells.

The J approach of, by default, thinking about rank from 0-cells up works well for that language because matrices in J are regular and they keep track of their rank. If we represent matrices as nested arrays in Javascript (this is not the only option, but it is the most idiomatic) then the real rank of a matrix cannot be known without a full traversal, which is prohibitive.

I might, one day, integrate a multidimensional matrix implementation into Puff and then enable rank modifying functions to work on that representation, but for now I want to focus on the successive use of map to simulate ranking down a function from infinite rank.

Consider Rank

Consider the following definition:

function rankedCall(f,n,a){
    if(n<0){
        // a negative rank n means: rank f down by applying map -n times
        return rep(map, -n, f)(a);
    } else {
        throw new Error("Positive ranks not yet supported.");
    }
}

var rank = _c(rankedCall);

Such that:

rank(plusOne,-1)([1,2,3]) -> [2,3,4]

Cute. This gets us somewhere. But what really makes rank useful is that each argument carries its own rank and the system resolves the looping for you. In J, operators have at most two arguments to which rank applies; simulating more arguments with lists of boxes bypasses ranked dispatch.

Dealing with multiple argument functions is tricky. Let’s start with two.

Consider:

// Puff provides a plus function
plus(1,3) -> 4
// but it doesn't work on arrays
plus([1,2,3],[4,5,6]) -> '1,2,34,5,6'

That last line is because Javascript idiotically interprets [1,2,3]+[4,5,6] to mean [1,2,3].toString()+[4,5,6].toString().

For these one dimensional arrays, we can get what we want with map which applies a function f of arity n to the items of n arrays.

map(plus,[1,2,3],[4,5,6]) -> [5,7,9]

(NB. In Puff we can also have said map(plus)([1,2,3],[4,5,6]))

What if we have [1,2,3] and [[4],[5],[6]], that is, the second argument is rank two?

Put aside questions of efficiency for a moment and consider the function:

function nextCellIndex(a, indexes){
    var indexes = indexes.map(id); // copy the array
    var delta = indexes.length-1;
    var subIndex = indexes.slice(0,delta);
    var indexedArray = index.apply(null, [a].concat(subIndex));
    var done = indexes[delta]+1 < indexedArray.length;
    while(!done){
      delta = delta -1;
      if(delta<0){
          return null;
      } else {
          indexedArray = index.apply(null, [a].concat(indexes.slice(0,delta)));
          done = indexes[delta]+1 < indexedArray.length;

      }
    }
    indexes[delta] = indexes[delta]+1;
    for(var i = delta+1; i<indexes.length; i = i + 1){
      indexes[i] = 0;
    }
    return indexes;
}

This function takes an array and an array of indexes and finds the next valid index into that array by incrementing the innermost index, checking whether that is in bounds, stopping if it is, or incrementing the next innermost and so on. If there is no valid next index, then null is returned.

If we want what J would call the -2 cells of an array a, we iteratively apply this function to a two element index vector.

var a = [[1],[2,3],[4]]
var indexes = repeatAccumulate(_p(nextCellIndex,a),3,[0,0])

Evaluating to:

indexes
[ [ 0, 0 ], [ 1, 0 ], [ 1, 1 ], [ 2, 0 ] ]

that is, the indexes of the -2 cells. We can get these by, for instance,

index.apply(null, [a].concat(indexes[0]))

Note that a is not a regular matrix (the second item of a has a different length than the first and third) – it has no obvious rank, but we can talk about its n-cells if we talk about them from outside in. We can write a function to give us these cells:

function cells(n, a){
    if(n<0){
      var nn = -n;
      var out = [];
      var indexes = initArray(nn,0);
      while(indexes){
          out.push(index.apply(null, [a].concat(indexes)));
          indexes = nextCellIndex(a,indexes)
      }
      return out;
    } else if (n===0){
      return a;
    } else {
      throw new Error("Positive cells not yet supported.");
    }
}

We can then just fall back onto map with the appropriate applications of cells:

map(plus,[1,2,3],cells(-2,[[1,2],[3]]))
-> [ 2, 4, 6 ]

Conceptually we’ve done well for ourselves: we’ve reproduced J‘s ability to change the way that functions join elements of arrays of arbitrary dimension. On top of that, by virtue of the arity of map, which can apply a function of any arity to any number of arrays, we have extended this idea to operators of any number of arguments (J is limited to monadic and dyadic verbs.)
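For instance, assuming the variadic map described above, a three argument function can be mapped across three arrays in lockstep (the function here is invented for the example):

// map applies a function of arity n to the items of n arrays
function add3(a, b, c){ return a + b + c; }

map(add3, [1,2,3], [4,5,6], [7,8,9]); // -> [12, 15, 18]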

In addition, Puff allows us to write the cells-based expression above in a point free fashion:

var ex = f(2,au(map, al(plus), n0, r(n1,_p(cells, 2))));
ex([1,2,3],[[1,2],[3]])
-> [2, 4, 6]

(NB. al returns a function which always returns the argument to al, short for always, n0 returns the first element of a thing, n1 the second, etc. f (short for lambda) returns a function which first collects its arguments into an array and then passes them through its subsequent arguments as if via r (rCompose). Finally, au (short for augment) takes a set of functions and returns a new function which transforms its inputs via functions 1..n and applies function 0 to that list.)

Positive Ranks

Using negative ranks is much more in-line with idiomatic javascript, since there are no native multidimensional arrays. We can produce a simple implementation of positive ranks if we make a few simple assumptions about usage. First consider:

function guessRank(a){
    var rank = 0;
    var done = false;
    while(!done){
        if(typeof a['length'] === 'number'){
          rank = rank + 1;
          a = a[0];
        } else {
          done = true;
        }
    }
    return rank;
}

Working like:

:Puff:> guessRank(10)
0
:Puff:> guessRank([1,2,3])
1
:Puff:> guessRank([[1,2],[2,3],[3,4]])
2

The assumption we are making here is that the rank of sub-elements is homogeneous (and hence, the first element is a sufficient indicator). Now that we can guess the rank of an array, we can fill in the positive rank branch of our cells function:

function cells(n, a){
    if(n<0){
      var nn = -n;
      var out = [];
      var indexes = initArray(nn,0);
      while(indexes){
          out.push(index.apply(null, [a].concat(indexes)));
          indexes = nextCellIndex(a,indexes)
      }
      return out;
    } else {
      var rank = n-guessRank(a);
      return cells(rank, a);
    }
}

Now we can finally write our implementation of J’s conjunction ". Our version of " will be called rank and will take a function and a list of ranks and return a new function with the appropriate rank behavior.

function rank(f){
    var ranks = Array.prototype.slice.call(arguments, 1, arguments.length);
    return function(){
        var args = Array.prototype.slice.call(arguments, 0, arguments.length);
        // pull the appropriately ranked cells out of each argument,
        // then map f across those cell lists in lockstep
        return map.apply(null, [f].concat(map(cells, ranks, args)));
    }
}

We can now say:

rank(plus,0,0)([1,2,3],[[4],[5],[6]])

And get [5,7,9]. Just like J. Of course, as we’ve written the code here we won’t be anywhere near the efficiency of J – in particular we iterate over each argument array separately, where we could combine all those loops into just one. But performance isn’t everything and we can always optimize the Puff implementation as needed. Rewriting the appropriate sequence functions (map, mapcat, crossMap) to handle lazy versions of the sequences and introducing a lazy cells operator would be the most elegant solution. I’m sure I’ll get there eventually.

In the meantime, I hope I’ve at least helped the reader understand J’s rank concept in greater depth and also shown off some of the nice ways Puff can simulate J style while staying entirely in Javascript.


Compose is Better than Dot (or: Function Level Programming In Javascript)

That great minds think alike is a reflection of the fact that certain ideas have an appeal that is, if not innate, then compelling in context. Hence the use of dot notation in the creation of domain specific languages in Javascript: dot is a handy way of succinctly representing a computation with a context. This is closely related to monads, and many javascript libraries relying heavily on dot for syntactic sugar are very closely related to one monad or another (eg _ to the sequence monad, promises to a sort of continuation monad, etc).

(NB. Much of the code here is inspired by a pointfree/function level programming library I am building for Javascript called puff, available here.)

What I aim to prove today is that function composition subsumes the functionality of dot in this regard and that we can embed in Javascript a powerful, succinct function-level programming language in the vein of APL, J, or K/Q based primarily on function composition.

First of all, the traditional definition of function composition:

/** compose functions
 *  return a new function which applies the last argument to compose
 *  to the values passed in and then applies each previous function
 *  to the previous result.  The final result is returned.
 */
function compose(){
    var fs = Array.prototype.slice.call(arguments, 0, arguments.length);
    return function(){
    var inArgs = Array.prototype.slice.call(arguments, 0, arguments.length);
    var i = fs.length-1;
    var res = fs[i].apply(null, inArgs);
    i = i - 1;
    for(; i>=0; i = i - 1){
        res = fs[i](res);
    }
    return res;
    }
}

This version of compose allows the programmer to pass in an unlimited number of functions and returns a new function which takes any number of arguments and then threads the result through all the previous functions, transforming it each time, starting with the last function and ending with the first.

This order of composition is inspired by the fact that the function in a traditional application expression precedes the argument (on the left), transforming:

f(h(g(o)))

“naturally” to

compose(f,h,g)(o)

or, if succinctness is of interest:

var c = compose;
c(f,h,g)(o)

In this way we drop a few parentheses and hopefully express more clearly our intent.

Of course we might not wish to apply our composition immediately: we can produce useful functions via composition, after all.

var plusOne = function(x){ return x + 1 };
var timesTen = function(x){ return x*10 };

var plusOneTimesTen = c(timesTen, plusOne);

We can now apply plusOneTimesTen to as many values as we wish. Handy. However, now our intuition about naming and the order of the arguments to c are at odds. Hence, we introduce:

function rCompose(){
   return compose.apply(null,
     Array.prototype.slice.call(arguments, 0, arguments.length).reverse());
}
var r = rCompose;

So that the above looks a bit nicer:

var plusOne = function(x){ return x + 1 };
var timesTen = function(x){ return x*10 };

var plusOneTimesTen = r(plusOne, timesTen);

This reverse composition operator is similar in many respects to dot in javascript except we have abstracted away this, to which each method invocation is implicitly addressed in a dot chain. In addition, instead of simple method names, each element in our r list can be any Javascript expression which evaluates to a function. This means that we can denote any sequence of operations this way without worrying whether or not they have been associated with any particular Javascript object.
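As a small illustration, compare a dot chain against the same pipeline written with r. The helper functions here are hand rolled for the example rather than taken from puff:

// hand-rolled helpers, not attached to any particular object
function trim(s){ return s.trim(); }
function upper(s){ return s.toUpperCase(); }
function exclaim(s){ return s + "!"; }

// dot style: each step must be a method of the value flowing through
"  hello  ".trim().toUpperCase() + "!";   // -> "HELLO!"

// function level style: r threads the value through free functions
var shout = r(trim, upper, exclaim);
shout("  hello  ");                       // -> "HELLO!"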

With the right set of primitives and combinators, r forms the basis of a powerful, succinct function level programming language in which we can build, among other things, expressive domain specific languages. Or with which we can rapidly denote complex operations in a very small number of characters.

Well, first of all we want the bevy of basic functions:

plus, minus, div, times, split, join, toString, index, call,
apply, array, etc.

These behave as you might expect, more or less (eg, some plausible implementations to give you the spirit of the idea):

function plus(){
   var out = arguments[0];
   Array.prototype.slice.call(arguments, 1, arguments.length)
    .forEach(function(x){
        out = out + x;
     });
   return out;
}

function index(o, i){
  return o[i];
}

function call(f){
  return f.apply(null, Array.prototype.slice.call(arguments, 1, arguments.length));
}

You get the idea.

Astute readers may realize that our composition function seems to only work with functions of a single argument. Remedying this will be a matter of some interest. The simplest approach is to provide for partial application:

/** partially fix arguments to f (on the right)
 *
 */
function partialRight(f /*... fixedArgs */){
    var fixedArgs = Array.prototype.slice.call(arguments, 1, arguments.length);
    var out = function(/*... unfixedArgs */){
    var unfixedArgs = Array.prototype.slice.call(arguments, 0, arguments.length);
    return f.apply(null,unfixedArgs.concat(fixedArgs));
    }
    out.toString = function(){
    return "partialRight("+f.toString()+","+fixedArgs.join(",")+")";
    }
    return out;
}

/** partially fix arguments to f (on the left)
 *
 */
function partialLeft(f /*... fixedArgs */){
    var fixedArgs = Array.prototype.slice.call(arguments, 1, arguments.length);
    var out = function(/*... unfixedArgs */){
    var unfixedArgs = Array.prototype.slice.call(arguments, 0, arguments.length);
    return f.apply(null,fixedArgs.concat(unfixedArgs));
    }
    out.toString = function(){
    return "partialLeft("+f.toString()+","+fixedArgs.join(",")+")";
    }
    return out;
}

These functions (they might be adverbs in J) take a function and a set of values and return a new function, “fixing” the arguments to the left or right of the argument list, depending on whether we’ve used partialLeft or partialRight.

It’s handy to introduce the following bindings:

var p_ = partialRight;
var _p = partialLeft;

I hope these are relatively mnemonic (Javascript unfortunately isn’t an ideal environment for very short, expressive names).

We can get a surprising amount of mileage out of these ingredients already. For instance, a function to remove break tags from a string and replace them with newlines (sort of contrived):

var remBreaks = r(p_(split,'<br>'),p_(join,'\n'));

compared to

function remBreaks(s){
   return s.split('<br>').join('\n');
}

(NB. If split and join are written as curried functions, as they are in puff, the above is a little shorter:

var remBreaks = r(split('<br>'),join('\n'));

Providing a meaningful default currying (which args should be applied first) is a little tricky, though.)
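For the curious, here is a minimal sketch of one such currying; this is just one plausible approach, not necessarily what puff actually does. If split is called with only a separator it returns a function waiting for the string; otherwise it splits immediately:

function split(s, sep){
  if(typeof sep === "undefined"){
    var realSep = s;   // only the separator was supplied
    return function(str){ return str.split(realSep); };
  }
  return s.split(sep); // both arguments supplied
}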

Going Further

The above example demonstrates that we can do some handy work with r as long as it involves simply transforming a single value through a set of functions. What may not be obvious here is that a determined programmer can denote any computation whatsoever this way, even if multiple values might need to be kept around and transformed.

Consider the function which I will call augment or au:

/** given a function f and a additional functions gs
 *  return a new function h which applies each g
 *  to its single argument and then applies f to the 
 *  resulting list of values 
 */
function augment(f /*... gs*/){
    var gs = Array.prototype.slice.call(arguments, 1, arguments.length);
    var out = function(a){
    return f.apply(null, gs.map(function(g){
        return g(a);
    }));
    }
    out.toString = function(){
    return "augment("+f.toString()+","+gs.map(toString).join(", ")+")";
    }
    return out;
}

And a nice default currying of index:

function index(o, ix){
   if(typeof ix === "undefined"){
     var realIx = o;
     return function(o){
       return o[realIx];
     }
   } else {
     return o[ix];
   }
}
var ix = index;

Now:

var fullName = augment(join(' '), ix('first'), ix('last'));

Such that:

fullName({first:'Ronald',last:'Reagan'}) -> "Ronald Reagan"

What have we just accomplished? We’ve demonstrated that we can take an arbitrary function of any arity and automatically transform it into a function which reads its input arguments from a single object. The sort of single object which we might be passing from function to function via r.

Putting it Together

To go further we need to add just one more utility function: cleave

function cleave(v /*... fs*/){
    var fs = Array.prototype.slice.call(arguments, 1, arguments.length);
    return fs.map(function(f){
    return f(v);
    });
}

and the shortcut:

function cl_(/*... fs*/){
  var fs = Array.prototype.slice.call(arguments, 0, arguments.length);
  return function(v){
    // cleave the incoming value across the functions captured above
    return cleave.apply(null, [v].concat(fs));
  }
}

(This is just cleave curried on the right, eg in puff: c_(cleave).)
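For instance, a minimal use of cleave and cl_ as defined above, with values chosen purely for illustration:

function inc(x){ return x + 1; }
function dbl(x){ return x * 2; }

cleave(10, inc, dbl);   // -> [11, 20]
cl_(inc, dbl)(10);      // -> [11, 20], the same thing with the value supplied last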

// replaceMiddleName name newMiddleName -> transformedName
var replaceMiddleName = r(args(2),
                          cl_(r(first, split(' ')), second),
                          cl_(r(first,first),second,r(first, third)),
                          au(join(' '), first, second, third));
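If I’m reading the pipeline right (args(n) collects the n arguments into a list, and first, second and third index into it), this should behave like:

replaceMiddleName("Ronald Wilson Reagan", "Maximus");
// -> "Ronald Maximus Reagan"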

Let’s go nuts with some of the extra utility functions in puff:

var replaceMiddleName = f(2,
                          cl_(r(n0,split(' ')),n1),
                          cl_(n00, n1, n02),
                          join(' '));

Puff lets you do all of this insanity quite easily. There are browser and Node.js builds. In Node, you use it by saying:

require('puff').pollute(global);

Which pollutes the global namespace with the puff functions. Since terseness is the rule, I can’t imagine why you’d do otherwise, but you can also say:

var puff = require('puff');
puff.r(...)

In the browser, add:

<script src="./build/puff-browser.js"></script>

to your head. This will define a global puff object, which you can then use directly, or say:

puff.pollute(window);

Quick Thoughts about Interactive Fiction

I’ve recently started a podcast called Text Adventure Purgatory, wherein several friends and I play and talk about Text Adventures/Interactive Fiction. Doing so has crystallized, in my mind, a few thoughts that had been in mere fluid suspension in the back of my head about games and fun in general.

“A Theory of Fun for Game Design,” by Raph Koster, asserts the following basic premise: fun is learning. This predicts that if a game offers you a system which you can learn, then you will have fun playing it up until you have exhausted either the system or your capacity to continue learning about it. It’s silly to suggest that this theory covers everything that is fun or everything we might want to assert is a game (this kind of idealism is counterproductive in any context, if you ask me), but it is, I would argue, a useful one.

What is learning, anyway? I think neuroscience and contemporary machine learning techniques (which are inspired by and inspire neuroscience) can provide us with a useful model of the process: learning is an optimization problem which attempts to map inputs onto “desired” outputs or outcomes. For example: the pixels (and their history) on a screen are mapped by our brains into a series of button presses which result in Mario reaching the end of the screen, where he touches the flag pole. Better than just describing the process, we now have a reasonable idea of how it happens, and of how to imitate it in software.

There are lots of techniques for the latter, but they basically boil down to optimizing an objective function (the mapping from input to output) by exploring the input space and finding and following trends in the output space. That is, start with a naive model, take some characteristic input data, apply the model to it, measure the outcome, make small changes to the model to improve the outcome (there are lots of strategies for this step), and repeat until the model behaves well enough for your purposes. In the brain this happens by adjusting synaptic weights (and other physiological properties) of the neurons in question. In computerized learning systems it occurs by modifying the numerical parameters of the model.
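To make that loop concrete (a toy sketch in JavaScript, nothing to do with puff or any particular learning library; all the names are made up):

// one-dimensional hill climbing on a made-up objective:
// find a parameter w so that model(w, x) tracks target(x) on some sample inputs
function learn(model, target, inputs, steps, stepSize){
    var w = 0; // naive initial parameter
    var error = function(w){
        return inputs.reduce(function(sum, x){
            var diff = model(w, x) - target(x);
            return sum + diff * diff;
        }, 0);
    };
    for(var i = 0; i < steps; i++){
        // try small changes and keep whichever candidate improves the error
        var candidates = [w - stepSize, w, w + stepSize];
        w = candidates.reduce(function(best, c){
            return error(c) < error(best) ? c : best;
        });
    }
    return w;
}

learn(function(w, x){ return w * x; }, // model: scale the input by w
      function(x){ return 3 * x; },    // target behaviour: triple the input
      [1, 2, 3], 100, 0.1);            // -> approximately 3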

Now we are ready for the point of this reflection: text adventures and interactive fiction provide too sparse a set of inputs and outputs to meaningfully train a system for playing. They (generally, I’m sure exceptions exist or attempt to exist) don’t provide a rich enough state space for learning, and hence they aren’t fun in the way that “A Theory of Fun” proposes we interpret that word.

What do I mean by “too sparse?” I mean, for one thing, that for any state in the game I can specify some non-perverse measure of similarity, and a value for that measure, with the following property: there will be no neighboring states included within that boundary. This is in contrast to games which involve simulated motion in space, which is, for the purposes of our discussion, continuous (that computers actually only simulate discrete spaces is not really material here: they are discrete spaces of sufficient granularity that our brains perceive them to be continuous).

For instance, there is a state in Deadline, the Infocom game we played for several episodes of TAP, wherein the player character has discovered several pieces of broken china in the rose garden near the balcony of the library in which a murder has taken place. We arrive at this state only and exactly when a particular sequence of events (amounting, in isolation, to a few turns in the right order) has occurred. There is nothing to refine about the process of reaching this state: either you perform the sequence of actions that produce this outcome or you do not.

A bit of reflection reveals how much this contrasts with more typical videogames: in Super Mario Brothers, for instance, there are effectively an infinite number of ways to touch (to pick a single instance) the final flagpole in each level. As we vary the exact moment we press the jump button, where we jump from, how long we hold it, and how long we have run beforehand, we refine the final state of interest and can find a solution which maximizes our height on the pole. There is a continuum of input states and output states (and a clear way of measuring our success) which allows those learning circuits (to use a drastically oversimplifying colloquialism) to grab onto something.

When playing a text adventure, in contrast, we essentially have nothing to do but explore, often by brute force, the state space the game gives us, branch by branch, until we find the final state. This is not usually fun, and using the context clues embedded in the text rarely helps: they are either obtuse, in which case we are back to the first strategy, or obvious, in which case there isn’t much to do but follow their instructions and traverse the graph. This problem is exacerbated by the fact that text adventures present themselves to us as text, creating the illusion of a rich, detailed world where, computationally, the exact opposite is true: everything reduces to a set of nodes connected by edges. Labeling more than one of those edges as “ending” the game helps a little: we can repeat the experience and land at different ending nodes by virtue of knowledge obtained on previous playthroughs, but we are still jumping from discrete state to discrete state, connected by discrete edges of low cardinality.

This isn’t a dig at interactive fiction: it is a way of explaining why it doesn’t “play like” other kinds of videogames, despite sharing a medium (computers). Novels, for instance, are even more restricted than interactive fiction: they proceed only and exactly in one way and come to life only and exactly as we read them.

Maybe these reflections tell us what we already know: that interactive fiction is more literature than game and that we should look elsewhere than traditional videogame experiences for an interpretive strategy which will allow us to discuss interactive fiction meaningfully.