Monthly Archives: November 2015

Aping J’s Verb Rank in Puff

This blog post will sketch out some thoughts relating to Puff, a function level programming language I am embedding in Javascript and J‘s notion of operator rank.

Rank as it pertains to nouns in J is fairly easy to understand: it is just the number of dimensions of an array. Scalars, like 10, have rank 0, the empty value (denotable as 0 $ 0) has rank 0, a simple vector (0 0 0) has rank 1, a 2d matrix rank 2, etc.

But J also has rank for verbs. Consider the verb +.

(1 2 3) + (4 5 6)
-> 5 7 9

(For J tyros: + is called a verb in J and furthermore we use it in its dyadic sense, which is to say we pass it arguments on the left and the right.)

Informally we understand from this that in J + operates element-wise on both its left and right operands. This means its left and right rank are both zero and it operates, then, on the rank zero elements of its arguments: the individual scalar values.

But there is more to the story. As a side note, We can denote multi-dimensional arrays in J like so:

]example =. 2 3 $ 1 2 3 4 5 6 
1 2 3
4 5 6

(For the curious, that is “change the shape ($) for array 1 2 3 4 5 6 so that it is 2 3)

J has a box operator which is handy for demonstrating rank issues. It is denoted < and wraps any value into a box, which is a special scalar value type which holds something else.

<1
┌─┐
│1│
└─┘

Operators in J have rank and the rank of < is infinite. This means that it always operates on its argument as a whole.

<(1 2 3 4 5 6)
┌───────────┐
│1 2 3 4 5 6│
└───────────┘

But the smart thing about J is that you can modify verbs with adverbs one of which returns a new verb with a different rank. See if you can guess what all this means:

<"0(1 2 3 4 5 6)
┌─┬─┬─┬─┬─┬─┐
│1│2│3│4│5│6│
└─┴─┴─┴─┴─┴─┘

The array denotation 1 2 3 4 5 6 is the same as before, but now we have written <"0 instead of <. " is an adverb which modified its right hand arguments’ rank such that it is the left hand value. The result of<"0 then is a verb with the same meaning as < except that it has rank 0. Verbs with rank 0 operate on the smallest possible cells of the array, so that

<"0(3 2 $ 1 2 3 4 5 6)
┌─┬─┐
│1│2│
├─┼─┤
│3│4│
├─┼─┤
│5│6│
└─┴─┘

each element of the input is boxed regardless of the incoming arrays shape or rank.

If we use a different rank:

<"1(3 2 $ 1 2 3 4 5 6)
┌───┬───┬───┐
│1 2│3 4│5 6│
└───┴───┴───┘

We get a different result. One-ranked verbs operate 1-cells (that is, the elements of rank 1) of the incoming array, in this case the arrays 1 2, 3 4, and 5 6.

The rules work for dyadic values too – each argument of the verb has a rank (a right rank and a left rank) which determines how the underlying logic represented by the verb is used to join elements from the right and left arguments.

By modifying verb rank you can custom tailor your iteration through argument arrays and avoid most explicit looping.

Puff

Puff is mostly aping the function level semantics of J but we can analogize verb rank too. Consider the Puff function map, which has a single argument meaning:

var plus1 = _p(plus,1);
map(plus1)([1,2,3]) -> [2,3,4]

plus1 above would have, in J an infinite rank: it always applies to its whole argument. When we say map(plus1) we have a new function which applies to the N-cells of its argument (in this case integers). In other words, map creates a new function which peels off one layer of its input and applies the original function, collecting the outputs.

What, then, is

var mm_plus1 = map(map(plus1))

?

(NB, we can denote this in Puff via rep(map,2,plus1))

Here is a hint:

mm_plus1([[1,2,3],[4,5,6]]) -> [[2,3,4],[5,6,7]]

Now we have a function operating on the N-2 cells of the input. Rank in J typically operates bottom up: we start at rank 0 operating on the 0 cells, and increasing rank operates on larger and larger chunks of the input array. In contrast, iterative application of map in Puff moves us from the larger chunks to smaller and smaller chunks, until a number of applications equal to the array rank has us operating on individual items.

J being J we can mimic this behavior using negative rank.

<"_2(3 2 $ 1 2 3 4 5 6)
┌─┬─┐
│1│2│
├─┼─┤
│3│4│
├─┼─┤
│5│6│
└─┴─┘

(_2 denotes the number -2 in J for possibly obscure reasons to do with making the parser simpler.)

Given that 3 2 $ 1 2 3 4 5 6 has rank 2, the verb <"_2 must operate on the 2-2=0 cells.

The J approach of, by default, thinking about rank from 0-cells up works well for that language because matrices in J are regular and they keep track of their rank. If we represent matrices as nested arrays in Javascript (this is not the only option, but it is the most idiomatic) then the real rank of a matrix cannot be known without a full traversal, which is prohibitive.

I might, one day, integrate a multidimensional matrix implementation into Puff and then enable rank modifying functions to work on that representation, but for now I want to focus on the successive use ofmap to simulate ranking down a function from infinite rank.

Consider Rank

Consider the following definition:

function rankedCall(f,n,a){
    if(n<0){
        return rep(map, n, f)(a);
    } else {
        throw new Error("Positive ranks not yet supported.");
    }
}

var rank = _c(rankedCall);

Such that:

rank(plusOne,1)([1,2,3]) -> [2,3,4]

Cute. This gets us somewhere. But what really makes rank useful is that each argument carries its own rank and the system resolves the looping for you. In J operators have at most two arguments (to which rank applies, simulating more arguments with lists of boxes bypasses ranked dispatch).

Dealing with multiple argument functions is tricky. Let’s start with two.

Consider:

// Puff provides a plus function
plus(1,3) -> 4
// but it doesn't work on arrays
plus([1,2,3],[4,5,6]) -> '1,2,34,5,6'

That last line is because Javascript idiotically interprets [1,2,3]+[4,5,6] to mean [1,2,3].toString()+[4,5,6].toString().

For these one dimensional arrays, we can get what we want with map which applies a function f of arity n to the items of n arrays.

map(plus,[1,2,3],[4,5,6]) -> [5,7,9]

(NB. In Puff we can also have said map(plus)([1,2,3],[4,5,6]))

What if we have [1,2,3] and [[4],[5],[6]], that is, the second argument is rank two?

Put aside questions of efficiency for a moment and consider the function:

function nextCellIndex(a, indexes){
    var indexes = indexes.map(id); // copy the array
    var delta = indexes.length-1;
    var subIndex = indexes.slice(0,delta);
    var indexedArray = index.apply(null, [a].concat(subIndex));
    var done = indexes[delta]+1 < indexedArray.length;
    while(!done){
      delta = delta -1;
      if(delta<0){
          return null;
      } else {
          indexedArray = index.apply(null, [a].concat(indexes.slice(0,delta)));
          done = indexes[delta]+1 < indexedArray.length;

      }
    }
    indexes[delta] = indexes[delta]+1;
    for(var i = delta+1; i<indexes.length; i = i + 1){
      indexes[i] = 0;
    }
    return indexes;
}

This function takes an array and an array of indexes and finds the next valid index into that array by incrementing the innermost index, checking whether that is in bounds, stopping if it is, or incrementing the next innermost and so on. If there is no valid next index, then null is returned.

If we want what J would call the -2 cells of an array a, we iteratively apply this function to a two element index vector.

var a = [[1],[2,3],[4]]
var indexes = repeatAccumulate(_p(nextCellIndex,a),3,[0,0])

Evaluating to:

indexes
[ [ 0, 0 ], [ 1, 0 ], [ 1, 1 ], [ 2, 0 ] ]

that is, the indexes of the -2 cells. We can get these by, for instance,

index.apply(null, a, indexes[0])

Note that a is not a regular matrix (the second item of a has a different length than the first and third – it has no obvious rank, but we can talk about its n-cells if we talk about them from outside in. We can write a function to give us these cells:

function cells(n, a){
    if(n<0){
      var nn = -n;
      var out = [];
      var indexes = initArray(nn,0);
      while(indexes){
          out.push(index.apply(null, [a].concat(indexes)));
          indexes = nextCellIndex(a,indexes)
      }
      return out;
    } else if (n===0){
      return a;
    } else {
      throw new Error("Positive cells not yet supported.");
    }
}

We can then just fall back onto map with the appropriate applications of cells:

map(plus,[1,2,3],cells(-2,[[1,2],[3]]))
-> [ 2, 4, 6 ]

Conceptually we’ve done well for ourselves: we’ve reproduced J‘s ability to change the way that functions join elements of arrays of arbitrary dimension. On top of that, by virtue of the arity of map, which can apply a function of any arity to any number of arrays, we have extended this idea to operators of any number of arguments (J is limited to monadic and dyadic verbs.)

In addition, Puff allows us to write the above function in a point free fashion:

var ex = f(2,au(map, al(plus), n0, r(n1,_p(cells, 2))));
ex([1,2,3],[[1,2],[3]])
-> [2, 4, 6]

(NB. al returns a function which always returns the argument to al, short for always, n0 returns the first element of a thing, n1 the second, etc. f (short for lambda) returns a function which first collects its arguments into an array and then passes them through its subsequent arguments as if via r (rCompose). Finally, au (short for augment) takes a set of functions and returns a new function which transforms its inputs via functions 1..n and applies function 0 to the that list.)

Positive Ranks

Using negative ranks is much more in-line with idiomatic javascript, since there are no native multidimensional arrays. We can produce a simple implementation of positive ranks if we make a few simple assumptions about usage. First consider:

function guessRank(a){
    var rank = 0;
    var done = false;
    while(!done){
        if(typeof a['length'] === 'number'){
          rank = rank + 1;
          a = a[0];
        } else {
          done = true;
        }
    }
    return rank;
}

Working like:

:Puff:> guessRank(10)
0
:Puff:> guessRank([1,2,3])
1
:Puff:> guessRank([[1,2],[2,3],[3,4]])
2

The assumption we are making here is that the rank of sub-elements is homogeneous (and hence, the first element is a sufficient indicator). Now that we can guess the rank of an array, we can fill in the positive rank branch of our cells function:

function cells(n, a){
    if(n<0){
      var nn = -n;
      var out = [];
      var indexes = initArray(nn,0);
      while(indexes){
          out.push(index.apply(null, [a].concat(indexes)));
          indexes = nextCellIndex(a,indexes)
      }
      return out;
    } else {
      var rank = n-guessRank(a);
      return cells(rank, a);
    }
}

Now we can finally write our implementation of J‘s conjunction ". Our version of " will be called rank and will take a function and a list of ranks and return a new function with the appropriate rank behavior.

function rank(f){
    var ranks = Array.prototype.slice.call(arguments,1,arguments.length);
    return function(){
    return map.apply(null,[f].concat(map(cells, 
                         Array.prototype.slice.call(arguments,0,arguments.length),
                         ranks)));
    }
}

We can now say:

rank(plus,0,0)([1,2,3],[[4],[5],[6]])

And get [5,7,9]. Just like J. Of course, as we’ve written the code here we won’t be anywhere near the efficiency of J – in particular we iterate over each argument array separately, where we could combine all those loops into just one. But performance isn’t everything and we can always optimize the Puff implementation as needed. Rewriting the approprite sequence functions (map,mapcat,crossMap) to handle lazy versions of the sequences and introducing a lazy cells operator would be the most elegant solution. I’m sure I’ll get there eventually.

In the meantime, I hope I’ve at least helped the reader understand J‘s rank concept in greater depth and also showed off some of the nice ways Puff can simulate J style while staying entirely in Javascript.


Compose is Better than Dot (or: Function Level Programming In Javascript)

That great minds think alike is a reflection of the fact that certain ideas have an appeal that is, if not innate, then compelling in context. Hence the use of dot notation in the creation of domain specific languages in Javascript: dot is a handy way of succinctly representing a computation with a context. This is closely related to monads, and many javascript libraries relying heavily on dot for syntactic sugar are very closely related to one monad or another (eg _ to the sequence monad, promises to a sort of continuation monad, etc).

(NB. Much of the code here is inspired by a pointfree/function level programming library I am building for Javascript called puff, available here);

What I aim to prove today is that function composition subsumes the functionality of dot in this regard and that we can embed in Javascript a powerful, succinct function-level programming language in the vein of APL, J, or K/Q based primarily on function composition.

First of all, the traditional definition of function composition:

/** compose functions
 *  return a new function which applies the last argument to compose
 *  to the values passed in and then applies each previous function
 *  to the previous result.  The final result is returned.
 */
function compose(){
    var fs = Array.prototype.slice.call(arguments, 0, arguments.length);
    return function(){
    var inArgs = Array.prototype.slice.call(arguments, 0, arguments.length);
    var i = fs.length-1;
    var res = fs[i].apply(null, inArgs);
    i = i - 1;
    for(; i>=0; i = i - 1){
        res = fs[i](res);
    }
    return res;
    }
}

This version of compose allows the programmer to pass in an unlimited number of functions and returns a new function which takes any number of arguments and then threads the result through all the previous functions, transforming it each time, starting with the last function and ending with the first.

This order of composition is inspired by the fact that the function in a traditional application expression precedes the argument (on the left), transforming:

f(h(g(o)))

“naturally” to

compose(f,h,g)(o)

or, if succinctness is of interest:

var c = compose;
c(f,h,g)(o)

In this way we drop a few parentheses and hopefully express more clearly our intent.

Of course we might not wish to apply our composition immediately: we can produce useful functions via composition, after all.

var plusOne(x){ return x + 1 };
var timesTen(x){ return x*10 };

var plusOneTimesTen = c(timesTen, plusOne);

We can now apply plusOneTimesTen to as many values as we wish. Handy. However, now our intuition about naming and the order of the arguments to c are at odds. Hence, we introduce:

function rCompose(){
   return compose.apply(null,
     Array.prototype.slice.call(arguments, 0, arguments.length).reverse());
}
var r = rCompose;

So that the above looks a bit nicer:

var plusOne(x){ return x + 1 };
var timesTen(x){ return x*10 };

var plusOneTimesTen = r(plusOne, timesTen);

This reverse composition operator is similar in many respects to dot in javascript except we have abstracted away this, to which each method invocation is implicitly addressed in a dot chain. In addition, instead of simple method names, each element in our r list can be any Javascript expression which evaluates to a function. This means that we can denote any sequence of operations this way without worrying whether or not they have been associated with any particular Javascript object.

With the right set of primitives and combinators, r forms the basis of a powerful, succinct function level programming language in which we can build, among other things, expressive domain specific languages. Or with which we can rapidly denote complex operations in a very small number of characters.

Well, first of all we want the bevy of basic functions:

plus, minus, div, times, split, join, toString, index, call,
apply, etc, array

These behave as you might expect, more or less (eg, some plausible implementations to give you the spirit of the idea):

function plus(){
   var out = arguments[0];
   Array.prototype.slice.call(arguments, 1, arguments.length)
    .forEach(function(x){
        out = out + x;
     });
   return out;
}

function index(o, i){
  return o[i];
}

function call(f){
  return f.apply(null, Array.prototype.slice.call(arguments, 0, arguments.length));
}

You get the idea.

Astute readers may realize that our composition function seems to only work with functions of a single argument. Remedying this will be a matter of some interest. The simplest approach is to provide for partial application:

/** partially fix arguments to f (on the right)
 *
 */
function partialRight(f /*... fixedArgs */){
    var fixedArgs = Array.prototype.slice.call(arguments, 1, arguments.length);
    var out = function(/*... unfixedArgs */){
    var unfixedArgs = Array.prototype.slice.call(arguments, 0, arguments.length);
    return f.apply(null,unfixedArgs.concat(fixedArgs));
    }
    out.toString = function(){
    return "partialRight("+f.toString()+","+fixedArgs.join(",")+")";
    }
    return out;
}

/** partially fix arguments to f (on the left)
 *
 */
function partialLeft(f /*... fixedArgs */){
    var fixedArgs = Array.prototype.slice.call(arguments, 1, arguments.length);
    var out = function(/*... unfixedArgs */){
    var unfixedArgs = Array.prototype.slice.call(arguments, 0, arguments.length);
    return f.apply(null,fixedArgs.concat(unfixedArgs));
    }
    out.toString = function(){
    return "partialLeft("+f.toString()+","+fixedArgs.join(",")+")";
    }
    return out;
}

These functions (they might be adverbs in J) take a function and a set of values and return a new function, “fixing” the arguments to the left or right of the argument list, depending on whether we’ve usedpartialLeft or partialRight.

Its handy to introduce the following bindings:

var p_ = partialRight;
var _p = partialLeft;

I hope these are relatively mnemonic (Javascript unfortunately isn’t an ideal environment for very short, expressive names).

We can get a surprising amount of mileage out of these ingredients already. For instance, a function to remove break tags from a string and replace them with newlines (sort of contrived):

var remBreaks = r(p_(split,'<br>'),p_(join,'\n'));

compared to

function remBreaks(s){
   return s.split('<br>').join('\n');
}

(NB. If split and join are written as curried functions, as they are in puff, the above is a little shorter:

var remBreaks = r(split('<br>'),join('\n'));

Providing a meaningful default currying (which args should be applied first) is a little tricky, though.)

Going Further

The above example demonstrates that we can do some handy work with r as long as it involves simply transforming a single value through a set of functions. What may not be obvious here is that a determined programmer can denote any computation whatsoever this way, even if multiple values might need to be kept around and transformed.

Consider the function which I will call augment or au:

/** given a function f and a additional functions gs
 *  return a new function h which applies each g
 *  to its single argument and then applies f to the 
 *  resulting list of values 
 */
function augment(f /*... gs*/){
    var gs = Array.prototype.slice.call(arguments, 1, arguments.length);
    var out = function(a){
    return f.apply(null, gs.map(function(g){
        return g(a);
    }));
    }
    out.toString = function(){
    return "augment("+f.toString()+","+gs.map(toString).join(", ")+")";
    }
    return out;
}

And a nice default currying of index:

function index(o, ix){
   if(typeof ix === "undefined"){
     var realIx = o;
     return function(o){
       return o[realIx];
     }
   } else {
     return o[ix];
   }
}
var ix = index;

Now:

var fullName = augment(join(' '), ix('first'), ix('last'));

Such that:

fullName({first:'Ronald',last:'Reagan'}) -> "Ronald Reagan"

What have we just accomplished? We’ve demonstrated that we can take an arbitrary function of any arity and automatically transform it into a function which reads its input arguments from a single object. The sort of single object which we might be passing from function to function via r.

Putting it Together

To go further we need to add just one more utility function: cleave

function cleave(v /*... fs*/){
    var fs = Array.prototype.slice.call(arguments, 1, arguments.length);
    return fs.map(function(f){
    return f(v);
    });
}

and the shortcut:

function cl_(){
  return function(f){
    return cleave.apply(null, [f].concat(Array.prototype.slice.call(arguments,0,arguments.length)));
  }
}

(This is just cleave curried on the right, eg in puff: c_(cleave).)

// replaceMiddleName name newMiddleName -> transformedName
var replaceMiddleName = r(args(2),
                          cl_(r(first, split(' ')), second),
                          cl_(r(first,first),second,r(first, third)),
                          au(join(' '), first, second, third));

Let’s go nuts with some of the extra utility functions in puff:

var replaceMiddleName = f(2,
                          cl_(r(n0,split(' ')),n1),
                          cl_(n00, n1, n02),
                          join(' '));

Puff lets you do all of this insanity quite easily. There are browser and nodejs builds. You use it by saying, in node:

require('puff').pollute(global);

Which pollutes the global namespace with the puff functions. Since terseness is the rule I can’t imagine why you’d do otherwise, but you can also say:

var puff = require('puff');
puff.r(...)

In the browser, add:

<script src="./build/puff-browser.js"></script>

To your head. This will define a global puff object, which you can then use directly or say:

puff.pollute(window);