
Compose is Better than Dot (or: Function Level Programming In Javascript)

That great minds think alike is a reflection of the fact that certain ideas have an appeal that is, if not innate, then compelling in context. Hence the use of dot notation in the creation of domain specific languages in Javascript: dot is a handy way of succinctly representing a computation with a context. This is closely related to monads, and many Javascript libraries relying heavily on dot for syntactic sugar are very closely related to one monad or another (e.g., _ to the sequence monad, promises to a sort of continuation monad, etc.).

(NB. Much of the code here is inspired by puff, a pointfree/function-level programming library I am building for Javascript, available here.)

What I aim to prove today is that function composition subsumes the functionality of dot in this regard, and that we can embed in Javascript a powerful, succinct function-level programming language, in the vein of APL, J, or K/Q, based primarily on composition.

First of all, the traditional definition of function composition:

/** compose functions
 *  return a new function which applies the last argument to compose
 *  to the values passed in and then applies each previous function
 *  to the previous result.  The final result is returned.
 */
function compose(){
    var fs = Array.prototype.slice.call(arguments);
    return function(){
        var inArgs = Array.prototype.slice.call(arguments);
        var i = fs.length - 1;
        var res = fs[i].apply(null, inArgs);
        for(i = i - 1; i >= 0; i = i - 1){
            res = fs[i](res);
        }
        return res;
    };
}

This version of compose allows the programmer to pass in an unlimited number of functions and returns a new function which takes any number of arguments and then threads the result through all the previous functions, transforming it each time, starting with the last function and ending with the first.

This order of composition is inspired by the fact that the function in a traditional application expression precedes the argument (on the left), transforming:

f(h(g(o)))

“naturally” to

compose(f,h,g)(o)

or, if succinctness is of interest:

var c = compose;
c(f,h,g)(o)

In this way we drop a few parentheses and hopefully express more clearly our intent.
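Concretely (a throwaway example, with f, h and g chosen arbitrarily):

var f = function(x){ return x + 1; };
var h = function(x){ return x * 2; };
var g = function(x){ return x - 3; };

f(h(g(10)))    // -> 15
c(f, h, g)(10) // -> 15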

Of course we might not wish to apply our composition immediately: we can produce useful functions via composition, after all.

var plusOne = function(x){ return x + 1; };
var timesTen = function(x){ return x * 10; };

var plusOneTimesTen = c(timesTen, plusOne);

We can now apply plusOneTimesTen to as many values as we wish. Handy. However, now our intuition about naming and the order of the arguments to c are at odds. Hence, we introduce:

function rCompose(){
    return compose.apply(null, Array.prototype.slice.call(arguments).reverse());
}
var r = rCompose;

So that the above looks a bit nicer:

var plusOne = function(x){ return x + 1; };
var timesTen = function(x){ return x * 10; };

var plusOneTimesTen = r(plusOne, timesTen);

This reverse composition operator is similar in many respects to dot in javascript except we have abstracted away this, to which each method invocation is implicitly addressed in a dot chain. In addition, instead of simple method names, each element in our r list can be any Javascript expression which evaluates to a function. This means that we can denote any sequence of operations this way without worrying whether or not they have been associated with any particular Javascript object.
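For instance, compare a dot chain with its r equivalent (the inline functions below are ordinary Javascript expressions, not part of puff):

// with dot, each step is a method invoked on the implicit this:
'  hello  '.trim().toUpperCase(); // -> "HELLO"

// with r, each step is any expression which evaluates to a function:
var shout = r(function(s){ return s.trim(); },
              function(s){ return s.toUpperCase(); });
shout('  hello  '); // -> "HELLO"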

With the right set of primitives and combinators, r forms the basis of a powerful, succinct function level programming language in which we can build, among other things, expressive domain specific languages. Or with which we can rapidly denote complex operations in a very small number of characters.

Well, first of all we want a bevy of basic functions:

plus, minus, div, times, split, join, toString, index, call,
apply, array, etc.

These behave as you might expect, more or less (e.g., here are some plausible implementations to give you the spirit of the idea):

function plus(){
   var out = arguments[0];
   Array.prototype.slice.call(arguments, 1, arguments.length)
    .forEach(function(x){
        out = out + x;
     });
   return out;
}

function index(o, i){
  return o[i];
}

function call(f){
  // note: slice from 1 so that f itself is not passed as an argument
  return f.apply(null, Array.prototype.slice.call(arguments, 1, arguments.length));
}

You get the idea.
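A few sample invocations:

plus(1, 2, 3)       // -> 6
index({a: 10}, 'a') // -> 10
call(plus, 1, 2, 3) // -> 6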

Astute readers may realize that our composition function seems to only work with functions of a single argument. Remedying this will be a matter of some interest. The simplest approach is to provide for partial application:

/** partially fix arguments to f (on the right)
 *
 */
function partialRight(f /*... fixedArgs */){
    var fixedArgs = Array.prototype.slice.call(arguments, 1);
    var out = function(/*... unfixedArgs */){
        var unfixedArgs = Array.prototype.slice.call(arguments);
        return f.apply(null, unfixedArgs.concat(fixedArgs));
    };
    out.toString = function(){
        return "partialRight(" + f.toString() + "," + fixedArgs.join(",") + ")";
    };
    return out;
}

/** partially fix arguments to f (on the left)
 *
 */
function partialLeft(f /*... fixedArgs */){
    var fixedArgs = Array.prototype.slice.call(arguments, 1);
    var out = function(/*... unfixedArgs */){
        var unfixedArgs = Array.prototype.slice.call(arguments);
        return f.apply(null, fixedArgs.concat(unfixedArgs));
    };
    out.toString = function(){
        return "partialLeft(" + f.toString() + "," + fixedArgs.join(",") + ")";
    };
    return out;
}

These functions (they might be adverbs in J) take a function and a set of values and return a new function, “fixing” the arguments to the left or right of the argument list, depending on whether we’ve used partialLeft or partialRight.

It’s handy to introduce the following bindings:

var p_ = partialRight;
var _p = partialLeft;

I hope these are relatively mnemonic (Javascript unfortunately isn’t an ideal environment for very short, expressive names).
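To see the difference between the two, consider a non-commutative primitive like div (assuming, per the list above, that div(a, b) returns a/b):

var halve = p_(div, 2);    // fixes 2 on the right: halve(10) -> div(10, 2) -> 5
var tenOver = _p(div, 10); // fixes 10 on the left: tenOver(2) -> div(10, 2) -> 5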

We can get a surprising amount of mileage out of these ingredients already. For instance, a function to remove break tags from a string and replace them with newlines (sort of contrived):

var remBreaks = r(p_(split,'<br>'),p_(join,'\n'));

compared to

function remBreaks(s){
   return s.split('<br>').join('\n');
}

(NB. If split and join are written as curried functions, as they are in puff, the above is a little shorter:

var remBreaks = r(split('<br>'),join('\n'));

Providing a meaningful default currying (which args should be applied first) is a little tricky, though.)
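To make the idea concrete, here is one plausible shape for such a curried split (a sketch only, not puff’s actual implementation):

function split(s, delim){
    // called with one argument, assume it is the delimiter and
    // return a function awaiting the string; otherwise split directly
    if(typeof delim === "undefined"){
        var d = s;
        return function(str){ return str.split(d); };
    }
    return s.split(delim);
}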

Going Further

The above example demonstrates that we can do some handy work with r as long as it involves simply transforming a single value through a set of functions. What may not be obvious here is that a determined programmer can denote any computation whatsoever this way, even if multiple values might need to be kept around and transformed.

Consider the function which I will call augment or au:

/** given a function f and additional functions gs
 *  return a new function h which applies each g
 *  to its single argument and then applies f to the
 *  resulting list of values
 */
function augment(f /*... gs*/){
    var gs = Array.prototype.slice.call(arguments, 1);
    var out = function(a){
        return f.apply(null, gs.map(function(g){
            return g(a);
        }));
    };
    out.toString = function(){
        return "augment(" + f.toString() + "," + gs.map(function(g){
            return g.toString();
        }).join(", ") + ")";
    };
    return out;
}
var au = augment;

And a nice default currying of index:

function index(o, ix){
   if(typeof ix === "undefined"){
     var realIx = o;
     return function(o){
       return o[realIx];
     }
   } else {
     return o[ix];
   }
}
var ix = index;

Now, using the curried join from puff:

var fullName = augment(join(' '), ix('first'), ix('last'));

Such that:

fullName({first:'Ronald',last:'Reagan'}) -> "Ronald Reagan"

What have we just accomplished? We’ve demonstrated that we can take an arbitrary function of any arity and automatically transform it into a function which reads its input arguments from a single object. The sort of single object which we might be passing from function to function via r.
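For instance, combining au, r and ix (again with puff’s curried join; a contrived example):

var initials = au(join('.'),
                  r(ix('first'), ix(0)),
                  r(ix('last'), ix(0)));

initials({first:'Ronald', last:'Reagan'}) // -> "R.R"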

Putting it Together

To go further we need to add just one more utility function: cleave

function cleave(v /*... fs*/){
    var fs = Array.prototype.slice.call(arguments, 1);
    return fs.map(function(f){
        return f(v);
    });
}
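E.g., reusing plusOne and timesTen from earlier:

cleave(10, plusOne, timesTen) // -> [11, 100]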

and the shortcut:

function cl_(/*... fs*/){
  // fix the functions now; await the value to be cleaved
  var fs = Array.prototype.slice.call(arguments);
  return function(v){
    return cleave.apply(null, [v].concat(fs));
  };
}

(This is just cleave curried on the right, e.g., in puff: c_(cleave).)

Now we can put these pieces together into a function which replaces the middle name in a space-delimited full name:

// replaceMiddleName(name, newMiddleName) -> transformedName
var replaceMiddleName = r(args(2),
                          cl_(r(first, split(' ')), second),
                          cl_(r(first,first),second,r(first, third)),
                          au(join(' '), first, second, third));
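Assuming puff’s args, first, second and third do what their names suggest (collect the argument list into an array, and select elements 0, 1 and 2 respectively), this behaves like so:

replaceMiddleName('Ronald Wilson Reagan', 'X') // -> "Ronald X Reagan"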

Let’s go nuts with some of the extra utility functions in puff:

var replaceMiddleName = f(2,
                          cl_(r(n0,split(' ')),n1),
                          cl_(n00, n1, n02),
                          join(' '));

Puff lets you do all of this insanity quite easily. There are browser and nodejs builds. You use it by saying, in node:

require('puff').pollute(global);

Which pollutes the global namespace with the puff functions. Since terseness is the rule I can’t imagine why you’d do otherwise, but you can also say:

var puff = require('puff');
puff.r(...)

In the browser, add:

<script src="./build/puff-browser.js"></script>

to your head. This will define a global puff object, which you can then use directly or say:

puff.pollute(window);

Optional Keyword Arguments in J

J is great! It is a wonderful little language for data analysis tasks. However, to programmers used to working in modern dynamic languages, it sometimes feels a little restrictive. In particular, a ubiquitous feature in the post-Python (let’s call it) era is the hash-map. Even in older languages, like Common Lisp, the association list, allowing arbitrary mappings between keys and values, is a very common idiom.

J exposes no native functionality that exactly meets this use case, one common application of which is named, optional arguments to functions.

In developing an interactive gnuplot interface for J, I wanted to pass optional keyword arguments to functions so that plots can be easily customized. So I developed a simple representation of key/val pairs which is efficient enough for small collections of named values.

Consider:

nil =: (0$0)

rankOne =: 1: -: (#@:$)
rankTwo =: 2: -: (#@:$)
talliesEven =: 0: -: (2: | #)
twoColumns =: 2: -: 1&{@:$

opts =: monad define
  keysandvals =. y
  assert. rankOne keysandvals
  assert. talliesEven keysandvals
  ii =. 2 | (i. #y)
  keys =. (-. ii) # y
  vals =. ii # y
  keys (>@:[ ; ])"(0 0) vals
)

The monadic verb opts takes a flat boxed array of even length and returns a rank two boxed array whose first column holds the keys and whose second column holds the values:

opts 'key1';10;'key2';11
+----+--+
|key1|10|
+----+--+
|key2|11|
+----+--+

Lookup is simple: find the key’s index in the first column and index the second column with it. Return nil if the key isn’t found.

getopt =: dyad define
  options =. x
  key =. y
  assert. rankTwo options
  assert. twoColumns options
  if. 0 -: #options do.
   nil
  else. 
    ii =. ((i.&1)@:(((key&-:@:>)"0)@:(((0&{)@:|:)))) options
    if. ii < #options do.
      (>@:(ii&{)@:(1&{)@:|:) options
    else.
      nil
    end.
  end.
)

E.g.:

(opts 'key1';10;'key2';11) getopt 'key1'
-> 10

We can now define a handy conjunction to allow the specification of a default value:

dft =: conjunction define 
:
  r =. x u y
  if. r -: nil do.
   n
  else.
   r
  end.
)

Which we use this way:

(opts 'key1';10;'key2';11) getopt dft '___' 'key1'
-> 10
(opts 'key1';10;'key2';11) getopt dft '___' 'key3'
-> '___'

This allows us to pass optional arguments to verbs and specify default values relatively easily, as in this example from my gnuplot library:

histogram =: verb define 
  (ensureGnuPlot'') histogram y
:
  data =. y
  if. boxedP x do.
   options =. x
   gph =. options getopt dft (ensureGnuPlot'') 'gnuplot'
  else.
   gph =. x
   options =. (opts '')
  end.
  mn =. <./ data
  mx =. >./ data 
  bw =. options getopt dft (0.1&* mx-mn) 'binwidth'
  pttl =. options getopt dft '' 'plot-title'
  'binwidth=%f\n' gph gpfmt <bw
  gph send 'set boxwidth binwidth'
  gph send 'bin(x,width)=width*floor(x/width) + binwidth/2.0' 
  s =. 'plot "%s" using (bin($1,binwidth)):(1.0) smooth freq title "%s" with boxes'
  s gph gpfmt ((asFile (asVector data));pttl)
)

Where, in the dyadic case, we detect whether x is boxed, and if so, treat it as a list of options. We extract each option by name, providing a reasonable default.

This seems like a common problem, so I am wondering if anyone else in the J community has solved it before in a different way?

Elaborations on “You Aren’t Gonna Need It”

The Cunningham & Cunningham Wiki is a wonderful place to get lost in, and it is so (chaotically) packed with useful programming lore that you are bound to come out of a dive a bit more enlightened about what it is programmers actually do.

One of my favorite pages is You Aren’t Gonna Need It from which I pull the following quotation:

Always implement things when you actually need them, never when you just foresee that you need them.

The justification for this is pretty straightforward: adding things you don’t need takes time and energy, plus generates more code, which means more potential bugs and cognitive load for future development. Since your job is to deliver software that actually does something which you presumably have at least a provisional understanding of, speculative development is an obvious waste of time.

To this basic justification I add only the following small elaboration: If you don’t need it now, you probably don’t understand it anyway. Anything you implement speculatively is very likely to be wrong as well as useless.

Why Software Is Hard

There are a lot of reasons software engineering is hard. Probably the primary reason it is hard is that we do not yet have a complete understanding of why it is so hard in the first place. Richard P. Gabriel, a software philosopher and progenitor of the Worse is Better meme, observes, probably correctly, that one reason for this fundamental ignorance is that software engineering is a comparative tyro among engineering disciplines: it is both extremely young (beginning only in 1945, approximately) and subject to radical change. A developer in 1945 would find the contemporary development environment utterly alien, and vice versa, a state of affairs not afflicting, for instance, stone masons, who have, arguably, thousands of years of more or less dependable tradition to inform their work.

With characteristic insight, Dr Gabriel also observes that software engineering is difficult because it isn’t physically constrained[1], which means that humans, the product of 3.5 billion years of evolution in a physical environment, but not a computational one, have very little experience upon which to rely as they build objects in the space of computable functions.

Suffer me a brief, digressive analogy: Hyper Rogue III is a game which takes place in a two-dimensional hyperbolic plane. One implication of such a space is that it is very unlikely that a wanderer will ever return to a particular position unless she almost exactly follows her own trail of bread crumbs. Exploring the space of computable functions is similarly dangerous to wanderers, except more so: we are not well equipped to even identify the hills, valleys, walls and contours of this space.

Hence: the wanderer in the landscape of programming is very likely to lose his way. He has neither physical constraints nor well developed intuition to guide him. And despite the existence of tools like git, which promise us the ability to backtrack and refactor with some confidence, software inevitably ossifies as it grows.

Hence my elaboration: you don’t build things you don’t need because, if you don’t need them, you probably don’t really understand what they should be. If you don’t understand what they should be, you’ll probably wander off base as you build them, and you probably won’t even notice that you have wandered off until something concrete impinges upon that code. When that happens, you might find that refactoring to solve the problem you have now is prohibitively difficult, because that feature you thought you would need has subtly impinged on other parts of your code.

Explicit, concrete requirements are the closest thing you have to the physical constraints which make stone masonry a more reliable discipline than software engineering, and nothing but rote experience will ever give you the physical intuition that stone masons can rely on.

So don’t wander off: You Aren’t Gonna Need It anyway. At least until you do.


[1]: Well, the relationship between the physical constraints on software and the software is, at any rate, not entirely trivial or transparent.


Hey, what is up with you and Lisp?

Yeah. What is up with me and Lisp? Maybe you are a Lisp programmer and you’re interested in why I choose to spend so much time in parentheses. Or maybe you’ve never programmed in Lisp and you’re interested in why someone might choose to.

Well, these days, the question of why a person might choose Lisp is somewhat difficult to definitively answer. Time was, Lisp could make the claim, along with cousins like Smalltalk, to be among the most forward-thinking of programming languages, but contemporary life is littered with languages just as clever, dynamic, ergonomic, or mathematically sound. So while the question is, in many ways, more difficult to answer than ever, it is also more salient than it has ever been[1].

(Maybe it’s not the) Metaprogramming

It bears emphasis that Lisp isn’t a single language, but a family of dialects with widely varying semantics and significantly varying syntaxes. Yet the dialects are commonly perceived as close, and this misperceived closeness is a somewhat unusual aspect of the Lisp family, or rather of the culture around the culture around Lisp[1]. For instance, Lua and Javascript are much more closely related than, say, Clojure and Common Lisp, yet we do not instantly group the former pair into a language family, say of `table based dynamic programming languages`.

However, the one thing Lisp dialects tend to have in common is the ease with which one can metaprogram, and I think this is a major component of the Lisp experience; it at least seems to be related to their distinctive, fully parenthesized, s-expression based syntaxes. And I would like to advance that it is in this area of syntactic simplicity that Lisp’s true appeal lies, at least to this author.

So what, right? The observation that metaprogramming is the reason to use and/or like Lisp is hardly new. Sure. But what is true is that metaprogramming is now hardly the exclusive domain of Lisp. Contemporary programming languages all support aspects of metaprogramming in one form or another. In Ruby, for instance, blocks give you 70% of what you want out of metaprogramming for 10% of the price you pay in Lisp[2], Python uses decorators to expose a limited set of metaprogrammatic techniques, C++ allows psychopaths[3] to template metaprogram, and language-level support for monads in Haskell constitutes a kind of metaprogramming in that it exposes an interface to the nature of binding in certain contexts[4]. And even if these language features don’t strike you as rich enough to compete with Lisp’s `just mess with the code before the compiler gets it` approach, the point stands: most languages have these sorts of facilities these days.

So what is it that you can’t get from other languages that you can get in Lisp?

What you get

At the outset, I want to say that I’ve had the luck to work in Common Lisp for the past two years and that, after an initial period of adjustment, I’ve come to see the language as extraordinarily well designed, if lacking, somewhat, in modern sensibilities. So if you choose to work in Common Lisp, you get a very pragmatic, extensively well-thought-out, extremely powerful base language to work with. As a matter of comparison, Common Lisp is more cleverly and extensibly object-oriented than Ruby, with semantics carefully thought out so that most implementations can achieve significantly better performance. So if you have the expertise and the temperament, you can rely on a good Common Lisp implementation to get you where you need to go.

But that isn’t really the point of this little essay. I’m not primarily interested in the best language for getting a project to completion. If that is your interest, lots of factors will influence your decision that don’t have anything to do with the fact that Lisp might feel like a really well broken in pair of jeans.

Ultimately, I think Lisp’s real strength is the unpretentiousness with which it exposes the underlying nature of the programming system. While the Lisp community might be accused of pretension from time to time, I think the language is decidedly straightforward: in most languages you go through the, in my opinion, silly exercise of denoting a program tree using some large set of syntactic conventions, and then the language runtime converts that into a syntax tree, which is then operated upon to produce executable instructions of some kind. In Lisp, you are able to pretend, at least[5], that you are denoting that intermediate tree directly.

This is not profound. On the contrary, it is essentially pragmatic[6], and it has the appeal of the essentially pragmatic.

One source of feelings of alienation, in my opinion, is being expected or forced to act in ways which one knows are meaningless. It might be argued that your average Algol-syntaxed language asks you to do exactly that which is alienating: denote what you know is simply a tree as something which requires a complex transformation to arrive there.

The benefits of this simplicity extend beyond vague feelings of empowerment. They enable, for instance, one to engage in code transformations without the conceptual overhead of source-to-source transformation. Of course, the runtime is performing a source-to-tree transformation, but it feels trivial, and your own code which operates on the parse tree can, most of the time, be written as though it is operating on the thing you typed, rather than the thing which is produced by the runtime. Once you have absorbed backquote, unquote, and unquote-splice, denoting program transformations requires very little conceptual overhead[7].

But the other benefit is the fact that your toolchain benefits in kind from the simplicity of the language. Tools like paredit or the excellent Dr Racket[8] further close the gap between typing individual characters and denoting and manipulating, directly, the code tree.

The Anticlimax

So this isn’t exactly a resounding endorsement of the language, but that is exactly my point. I don’t think it is credible, in the contemporary programming language ecosystem, to claim that Lisp is, as an existential matter, significantly better than a lot of languages out there. In many respects, languages like Haskell are much better considered intellectually, and a language like Javascript is much better positioned to actually deliver a product to someone.

But I do think that Lisp’s simplicity in the arena of “source to parse tree to executable code” gives it a distinctly pleasant, distinctly empowering “feel,” and that probably accounts for a significant number of its deep adherents.

And that is the deal with me and Lisp.

Footnotes

[1]: That is to say, the similarity is not itself a property of the languages, nor a property of the immediate Lisp community, who, being somewhat close to the languages in question, tend not to confuse them, but a property of the community of programmers `at large`.

[2]: Of course Smalltalk did blocks earlier and better.

[3]: No shortage of these, somehow.

[4]: A given monad can be thought of as specifying alternative semantics for the act of binding a value to a variable and handling the result of an expression evaluated in the context of that binding. A surprisingly large space of `mini-languages` can be reached by controlling exactly what these two operations constitute.

[5]: I say pretend because it turns out that your average Lisp reader is really doing a lot of work, and that the data structure you denote and its denotation are, as a consequence, not as close to one another as you might imagine. The key is that they are close enough that you can avoid thinking too hard about it most of the time.

[6]: Literally. The original intent was to denote Lisp programs using m-expressions, but it was an extra complication that never really caught on, since most programmers found it simpler to just use parenthesized notation.

[7]: Or so it seems. One shortcoming of Lisp, it might be argued, is that it makes thinking about program transformation seem easier than it actually is. Correct program transformation, in which the resulting code does what you expect it to do in all scenarios, is a pretty hard problem, one which is not even generally conceived of in its entirety by those writing macros, as evidenced by the frequency with which one hears the false claim that `gensym` is sufficient to avoid unexpected behavior.

[8]: To be fair, Racket goes way beyond Lisp in general in this area.