Aping J’s Verb Rank in Puff

This blog post will sketch out some thoughts relating to Puff, a function level programming language I am embedding in Javascript and J‘s notion of operator rank.

Rank as it pertains to nouns in J is fairly easy to understand: it is just the number of dimensions of an array. Scalars, like 10, have rank 0, the empty value (denotable as 0 $ 0) has rank 0, a simple vector (0 0 0) has rank 1, a 2d matrix rank 2, etc.

But J also has rank for verbs. Consider the verb +.

(1 2 3) + (4 5 6)
-> 5 7 9

(For J tyros: + is called a verb in J and furthermore we use it in its dyadic sense, which is to say we pass it arguments on the left and the right.)

Informally we understand from this that in J + operates element-wise on both its left and right operands. This means its left and right rank are both zero and it operates, then, on the rank zero elements of its arguments: the individual scalar values.

But there is more to the story. As a side note, We can denote multi-dimensional arrays in J like so:

]example =. 2 3 $ 1 2 3 4 5 6 
1 2 3
4 5 6

(For the curious, that is “change the shape ($) for array 1 2 3 4 5 6 so that it is 2 3)

J has a box operator which is handy for demonstrating rank issues. It is denoted < and wraps any value into a box, which is a special scalar value type which holds something else.

<1
┌─┐
│1│
└─┘

Operators in J have rank and the rank of < is infinite. This means that it always operates on its argument as a whole.

<(1 2 3 4 5 6)
┌───────────┐
│1 2 3 4 5 6│
└───────────┘

But the smart thing about J is that you can modify verbs with adverbs one of which returns a new verb with a different rank. See if you can guess what all this means:

<"0(1 2 3 4 5 6)
┌─┬─┬─┬─┬─┬─┐
│1│2│3│4│5│6│
└─┴─┴─┴─┴─┴─┘

The array denotation 1 2 3 4 5 6 is the same as before, but now we have written <"0 instead of <. " is an adverb which modified its right hand arguments’ rank such that it is the left hand value. The result of<"0 then is a verb with the same meaning as < except that it has rank 0. Verbs with rank 0 operate on the smallest possible cells of the array, so that

<"0(3 2 $ 1 2 3 4 5 6)
┌─┬─┐
│1│2│
├─┼─┤
│3│4│
├─┼─┤
│5│6│
└─┴─┘

each element of the input is boxed regardless of the incoming arrays shape or rank.

If we use a different rank:

<"1(3 2 $ 1 2 3 4 5 6)
┌───┬───┬───┐
│1 2│3 4│5 6│
└───┴───┴───┘

We get a different result. One-ranked verbs operate 1-cells (that is, the elements of rank 1) of the incoming array, in this case the arrays 1 2, 3 4, and 5 6.

The rules work for dyadic values too – each argument of the verb has a rank (a right rank and a left rank) which determines how the underlying logic represented by the verb is used to join elements from the right and left arguments.

By modifying verb rank you can custom tailor your iteration through argument arrays and avoid most explicit looping.

Puff

Puff is mostly aping the function level semantics of J but we can analogize verb rank too. Consider the Puff function map, which has a single argument meaning:

var plus1 = _p(plus,1);
map(plus1)([1,2,3]) -> [2,3,4]

plus1 above would have, in J an infinite rank: it always applies to its whole argument. When we say map(plus1) we have a new function which applies to the N-cells of its argument (in this case integers). In other words, map creates a new function which peels off one layer of its input and applies the original function, collecting the outputs.

What, then, is

var mm_plus1 = map(map(plus1))

?

(NB, we can denote this in Puff via rep(map,2,plus1))

Here is a hint:

mm_plus1([[1,2,3],[4,5,6]]) -> [[2,3,4],[5,6,7]]

Now we have a function operating on the N-2 cells of the input. Rank in J typically operates bottom up: we start at rank 0 operating on the 0 cells, and increasing rank operates on larger and larger chunks of the input array. In contrast, iterative application of map in Puff moves us from the larger chunks to smaller and smaller chunks, until a number of applications equal to the array rank has us operating on individual items.

J being J we can mimic this behavior using negative rank.

<"_2(3 2 $ 1 2 3 4 5 6)
┌─┬─┐
│1│2│
├─┼─┤
│3│4│
├─┼─┤
│5│6│
└─┴─┘

(_2 denotes the number -2 in J for possibly obscure reasons to do with making the parser simpler.)

Given that 3 2 $ 1 2 3 4 5 6 has rank 2, the verb <"_2 must operate on the 2-2=0 cells.

The J approach of, by default, thinking about rank from 0-cells up works well for that language because matrices in J are regular and they keep track of their rank. If we represent matrices as nested arrays in Javascript (this is not the only option, but it is the most idiomatic) then the real rank of a matrix cannot be known without a full traversal, which is prohibitive.

I might, one day, integrate a multidimensional matrix implementation into Puff and then enable rank modifying functions to work on that representation, but for now I want to focus on the successive use ofmap to simulate ranking down a function from infinite rank.

Consider Rank

Consider the following definition:

function rankedCall(f,n,a){
    if(n<0){
        return rep(map, n, f)(a);
    } else {
        throw new Error("Positive ranks not yet supported.");
    }
}

var rank = _c(rankedCall);

Such that:

rank(plusOne,1)([1,2,3]) -> [2,3,4]

Cute. This gets us somewhere. But what really makes rank useful is that each argument carries its own rank and the system resolves the looping for you. In J operators have at most two arguments (to which rank applies, simulating more arguments with lists of boxes bypasses ranked dispatch).

Dealing with multiple argument functions is tricky. Let’s start with two.

Consider:

// Puff provides a plus function
plus(1,3) -> 4
// but it doesn't work on arrays
plus([1,2,3],[4,5,6]) -> '1,2,34,5,6'

That last line is because Javascript idiotically interprets [1,2,3]+[4,5,6] to mean [1,2,3].toString()+[4,5,6].toString().

For these one dimensional arrays, we can get what we want with map which applies a function f of arity n to the items of n arrays.

map(plus,[1,2,3],[4,5,6]) -> [5,7,9]

(NB. In Puff we can also have said map(plus)([1,2,3],[4,5,6]))

What if we have [1,2,3] and [[4],[5],[6]], that is, the second argument is rank two?

Put aside questions of efficiency for a moment and consider the function:

function nextCellIndex(a, indexes){
    var indexes = indexes.map(id); // copy the array
    var delta = indexes.length-1;
    var subIndex = indexes.slice(0,delta);
    var indexedArray = index.apply(null, [a].concat(subIndex));
    var done = indexes[delta]+1 < indexedArray.length;
    while(!done){
      delta = delta -1;
      if(delta<0){
          return null;
      } else {
          indexedArray = index.apply(null, [a].concat(indexes.slice(0,delta)));
          done = indexes[delta]+1 < indexedArray.length;

      }
    }
    indexes[delta] = indexes[delta]+1;
    for(var i = delta+1; i<indexes.length; i = i + 1){
      indexes[i] = 0;
    }
    return indexes;
}

This function takes an array and an array of indexes and finds the next valid index into that array by incrementing the innermost index, checking whether that is in bounds, stopping if it is, or incrementing the next innermost and so on. If there is no valid next index, then null is returned.

If we want what J would call the -2 cells of an array a, we iteratively apply this function to a two element index vector.

var a = [[1],[2,3],[4]]
var indexes = repeatAccumulate(_p(nextCellIndex,a),3,[0,0])

Evaluating to:

indexes
[ [ 0, 0 ], [ 1, 0 ], [ 1, 1 ], [ 2, 0 ] ]

that is, the indexes of the -2 cells. We can get these by, for instance,

index.apply(null, a, indexes[0])

Note that a is not a regular matrix (the second item of a has a different length than the first and third – it has no obvious rank, but we can talk about its n-cells if we talk about them from outside in. We can write a function to give us these cells:

function cells(n, a){
    if(n<0){
      var nn = -n;
      var out = [];
      var indexes = initArray(nn,0);
      while(indexes){
          out.push(index.apply(null, [a].concat(indexes)));
          indexes = nextCellIndex(a,indexes)
      }
      return out;
    } else if (n===0){
      return a;
    } else {
      throw new Error("Positive cells not yet supported.");
    }
}

We can then just fall back onto map with the appropriate applications of cells:

map(plus,[1,2,3],cells(-2,[[1,2],[3]]))
-> [ 2, 4, 6 ]

Conceptually we’ve done well for ourselves: we’ve reproduced J‘s ability to change the way that functions join elements of arrays of arbitrary dimension. On top of that, by virtue of the arity of map, which can apply a function of any arity to any number of arrays, we have extended this idea to operators of any number of arguments (J is limited to monadic and dyadic verbs.)

In addition, Puff allows us to write the above function in a point free fashion:

var ex = f(2,au(map, al(plus), n0, r(n1,_p(cells, 2))));
ex([1,2,3],[[1,2],[3]])
-> [2, 4, 6]

(NB. al returns a function which always returns the argument to al, short for always, n0 returns the first element of a thing, n1 the second, etc. f (short for lambda) returns a function which first collects its arguments into an array and then passes them through its subsequent arguments as if via r (rCompose). Finally, au (short for augment) takes a set of functions and returns a new function which transforms its inputs via functions 1..n and applies function 0 to the that list.)

Positive Ranks

Using negative ranks is much more in-line with idiomatic javascript, since there are no native multidimensional arrays. We can produce a simple implementation of positive ranks if we make a few simple assumptions about usage. First consider:

function guessRank(a){
    var rank = 0;
    var done = false;
    while(!done){
        if(typeof a['length'] === 'number'){
          rank = rank + 1;
          a = a[0];
        } else {
          done = true;
        }
    }
    return rank;
}

Working like:

:Puff:> guessRank(10)
0
:Puff:> guessRank([1,2,3])
1
:Puff:> guessRank([[1,2],[2,3],[3,4]])
2

The assumption we are making here is that the rank of sub-elements is homogeneous (and hence, the first element is a sufficient indicator). Now that we can guess the rank of an array, we can fill in the positive rank branch of our cells function:

function cells(n, a){
    if(n<0){
      var nn = -n;
      var out = [];
      var indexes = initArray(nn,0);
      while(indexes){
          out.push(index.apply(null, [a].concat(indexes)));
          indexes = nextCellIndex(a,indexes)
      }
      return out;
    } else {
      var rank = n-guessRank(a);
      return cells(rank, a);
    }
}

Now we can finally write our implementation of J‘s conjunction ". Our version of " will be called rank and will take a function and a list of ranks and return a new function with the appropriate rank behavior.

function rank(f){
    var ranks = Array.prototype.slice.call(arguments,1,arguments.length);
    return function(){
    return map.apply(null,[f].concat(map(cells, 
                         Array.prototype.slice.call(arguments,0,arguments.length),
                         ranks)));
    }
}

We can now say:

rank(plus,0,0)([1,2,3],[[4],[5],[6]])

And get [5,7,9]. Just like J. Of course, as we’ve written the code here we won’t be anywhere near the efficiency of J – in particular we iterate over each argument array separately, where we could combine all those loops into just one. But performance isn’t everything and we can always optimize the Puff implementation as needed. Rewriting the approprite sequence functions (map,mapcat,crossMap) to handle lazy versions of the sequences and introducing a lazy cells operator would be the most elegant solution. I’m sure I’ll get there eventually.

In the meantime, I hope I’ve at least helped the reader understand J‘s rank concept in greater depth and also showed off some of the nice ways Puff can simulate J style while staying entirely in Javascript.


One thought on “Aping J’s Verb Rank in Puff

Leave a Reply

Your email address will not be published. Required fields are marked *