
Commentary on Mike Bostock’s “Towards Reusable Charts”

A colleague recently asked me to comment on Mike Bostock’s “Towards Reusable Charts” since I have the distinction, at the company, perhaps not entirely deserved, of knowing Javascript and being something of a programming language theory dilettante. The upshot of the article is that we can represent configurable charts using that old PLT observation that there is a close relationship between closures and objects. The ordinary approach, outlined, for instance, in SICP, observes that you can encapsulate your object state and methods inside a closure, and provide an interface to that closure via the function which gives you a handle on it. In Scheme:


(define (Point x y)
  (lambda arguments
    (match arguments ; only barbarians suffer lack of pattern matching
      ((list 'x) x)
      ((list 'y) y)
      ((list 'set 'x value)
       (set! x value)) ; or purely (Point value y)
      ((list 'set 'y value)
       (set! y value))))) ; or purely (Point x value)

(define p1 (Point 0 0)) ; -> #<procedure>
(p1 'x) ; -> 0
(p1 'set 'x 100)
(p1 'x) ; -> 100
; and so forth

Who cannot look upon this and give a little chuckle? It is a neat little trick about which we can observe a variety of things. First of all, it is admirably encapsulating. There really is no way for a user to get their dirty little fingers into the inner workings of our instance. True to its name and to its mission in life as the fundamental unit of abstraction, our closure is sealed up quite tightly. The second observation is that we are basically implementing message dispatch by hand, a feature usually handled by the implementation language. That is a bit smelly, but it is the sort of thing people do in Scheme all the time. And anyway, the encapsulation might be a bit too heavy-handed: how do we debug or interactively play with such objects? This isn't exactly unsolvable, but it is a downside to the approach.

Bostock presents a slightly modified version of this trick anyway. In Javascript, functions are objects, and as such they may have methods attached to them. Rather than implementing dispatch himself by inspecting the arguments to the returned closure, he attaches methods to the function object. Since they capture the same closure as the returned function, they too can interact with the state of the simulated object. Eg:


function Point(x, y){
  var closure = function(){
    console.log("x:" + x + ", y:" + y);
  };
  closure.setX = function(newVal){
    x = newVal;
  };
  closure.setY = function(newVal){
    y = newVal;
  };
  return closure;
}

var p1 = Point(1, 2);
p1(); // -> prints x:1, y:2
p1.setX(10);
p1(); // -> prints x:10, y:2

So this version of the trick is a bit less encapsulating: the member variables x and y are still genuinely private, but the methods are accessible to the outside world.

This approach, while superficially clever, leaves some things to be desired, mostly related to integration with Javascript's admittedly confusing and arguably poorly designed object oriented programming "ecosystem." First and foremost, Javascript's instanceof operator will not give you any meaningful information (Point(1,2) instanceof Function and Point(1,2) instanceof Object will be true, but Point(1,2) instanceof Point will be false). The even less useful, but frequently used, typeof operator will also give you the unhelpful answer "function". Javascript pretty clearly exposes functionality to the user pertaining to object orientation, and I would argue it is just bad style to ignore it. On top of that, this technique doesn't have any obvious mechanism for implementing inheritance. While I agree that inheritance hierarchies should be shallow, this technique pretty much prevents any inheritance at all.
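To see this concretely, here is a small self-contained sketch (using a version of the closure-based Point above, with the missing return of the closure included so it actually works when called):

```javascript
// Closure-as-object Point, as in the example above.
function Point(x, y) {
  var closure = function () {
    console.log("x:" + x + ", y:" + y);
  };
  closure.setX = function (newVal) { x = newVal; };
  closure.setY = function (newVal) { y = newVal; };
  return closure; // the handle we give the caller
}

var p = Point(1, 2);
console.log(p instanceof Function); // true
console.log(p instanceof Object);   // true
console.log(p instanceof Point);    // false - not what we'd like
console.log(typeof p);              // "function"
```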

Even if we swallow these peculiarities in favor of the encapsulation this technique furnishes (its only major benefit, besides returning a callable object), we still must observe that there are efficiency concerns: each time we construct a new object we construct a new set of methods, because the methods must close over the enclosing scope. We can ameliorate the implementation somewhat, but not without making it more peculiar and sacrificing certain things, like callability. I leave how that might work to the reader, in favor of describing the contemporary standard method of implementing object oriented programs in Javascript.
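The cost is easy to demonstrate: two instances built with the closure technique do not share their method functions, whereas prototype methods (described below) exist once and are shared. A minimal sketch:

```javascript
// Closure-based construction: each call allocates fresh method closures.
function ClosurePoint(x, y) {
  var closure = function () { return [x, y]; };
  closure.setX = function (newVal) { x = newVal; };
  return closure;
}

var a = ClosurePoint(0, 0);
var b = ClosurePoint(0, 0);
console.log(a.setX === b.setX); // false - two distinct function objects

// Prototype-based construction: one shared method for all instances.
function ProtoPoint(x, y) { this.x = x; this.y = y; }
ProtoPoint.prototype.setX = function (newVal) { this.x = newVal; };

var c = new ProtoPoint(0, 0);
var d = new ProtoPoint(0, 0);
console.log(c.setX === d.setX); // true - same function object
```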

Javascript and Families have a problem in common: resolving inheritance

In Javascript, we generally create objects via an unholy collusion of the new operator and a regular function definition. Eg:

function Point(x,y){
  this.x = x;
  this.y = y;
}

Point
.prototype
.toString = function(){
  return "new Point("+this.x+","+this.y+")";
}

var p = new Point(1,2);
console.log(p.toString()); // -> new Point(1,2)

The new operator creates a fresh object for us, sets its `prototype` in such a way that it points to an object at Point.prototype, invokes the function Point in such a way that this is bound to the newly created object, and then finally returns the object, regardless of the return value of Point (strictly speaking, unless Point explicitly returns some other object), though I like to return this from constructors just for the sake of consistency.
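That description can be sketched as a function. This is an approximation for illustration, not the exact specification, and the name simulateNew is made up:

```javascript
function Point(x, y) {
  this.x = x;
  this.y = y;
}

// Roughly what `new ctor(arg1, arg2, ...)` does.
function simulateNew(ctor) {
  var args = Array.prototype.slice.call(arguments, 1);
  // 1. Make a fresh object whose prototype is ctor.prototype.
  var obj = Object.create(ctor.prototype);
  // 2. Invoke ctor with `this` bound to the fresh object.
  var result = ctor.apply(obj, args);
  // 3. Return the fresh object, unless ctor itself returned an object.
  return (result !== null && typeof result === "object") ? result : obj;
}

var p = simulateNew(Point, 1, 2);
console.log(p.x, p.y);           // 1 2
console.log(p instanceof Point); // true
```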

One does not access an object's prototype directly, but when a lookup on an object is attempted, the prototype will be checked if the object doesn't have the item, and then the prototype's prototype, etc. This is why we set Point.prototype.toString instead of putting the method directly on the object. This is good organization AND it means the method is only created once. This setting up of the prototype also ensures that when we use instanceof we can find that our objects are indeed instances of Point.
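The lookup chain can be observed directly: the method lives on the prototype rather than on the instance, yet property lookup still finds it.

```javascript
function Point(x, y) {
  this.x = x;
  this.y = y;
}
Point.prototype.toString = function () {
  return "new Point(" + this.x + "," + this.y + ")";
};

var p = new Point(1, 2);
// toString is not an "own" property of the instance...
console.log(p.hasOwnProperty("toString"));                 // false
// ...but it is found one step up the prototype chain:
console.log(Object.getPrototypeOf(p) === Point.prototype); // true
console.log(p.toString());                                 // new Point(1,2)
```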

If you know Javascript, all this probably seems uncannily clean. You might be wondering why we'd be tempted to cook up our own mechanisms of object oriented programming, given that this seems to be the way the language intends for us to do it. The rub is what happens when we want to do inheritance. Here is a naive attempt to extend Point, supposing we wish to embed the point in a mathematical Field:

function FieldPoint(x,y){
  Point.call(this, x, y);
  return this;
}
FieldPoint
 .prototype
 .multiply = function(byPt){
   var x = this.x; // save the old x, since we mutate it below
   this.x = x*byPt.x - this.y*byPt.y;
   this.y = x*byPt.y + this.y*byPt.x;
 }
FieldPoint
 .prototype
 .add = function(withPt){
   this.x = this.x + withPt.x;
   this.y = this.y + withPt.y;
 }

Astute readers will recognize that these points are like complex numbers, and indeed are a field. So far so good. However, things start going wrong when we experiment with instanceof. For instance

var p = new Point(1,2);
var fp = new FieldPoint(1,2);

console.log("p is instanceof Point: ", p instanceof Point)
console.log("fp is instanceof FieldPoint: ", fp instanceof FieldPoint)
console.log("fp is instance of Point", fp instanceof Point)

gives us:

p is instanceof Point:  true
fp is instanceof FieldPoint:  true
fp is instance of Point false

Once again we are breaking what I take to be a good rule of software engineering, which is to avoid violating expectations implied by some system. In this case, we would expect our subclasses to be instances of their superclass, which they are not. Here is the standard way that people do inheritance in Javascript (put your thinking caps on):

function extend(subClass, superClass){
	var inheritance = (function(){});
	inheritance.prototype = superClass.prototype;
	subClass.prototype = new inheritance();
	subClass.prototype.constructor = subClass;
	subClass.prototype.superConstructor = superClass;
	subClass.superClass = superClass.prototype;
}

function Point(x,y){
	this.x = x;
	this.y = y;
}

Point
	.prototype
	.toString = function(){
		return "new Point("+this.x+","+this.y+")";
	}

function FieldPoint(x,y){
	Point.call(this,x,y);
	return this;
}

// FieldPoint extends Point. Note that extend must be called before we
// attach methods to FieldPoint.prototype, since extend replaces that
// prototype object wholesale.
extend(FieldPoint, Point);

FieldPoint
	.prototype
	.multiply = function(byPt){
		var x = this.x; // save the old x, since we mutate it below
		this.x = x*byPt.x - this.y*byPt.y;
		this.y = x*byPt.y + this.y*byPt.x;
	}
FieldPoint
	.prototype
	.add = function(withPt){
		this.x = this.x+withPt.x;
		this.y = this.y+withPt.y;
	}

var pt = new Point(1,2);
var fp = new FieldPoint(1,2);

console.log("pt instanceof Point", pt instanceof Point);
console.log("fp instanceof Point", fp instanceof Point);
console.log("fp instanceof FieldPoint", fp instanceof FieldPoint);

This should result in the following being printed

pt instanceof Point true
fp instanceof Point true
fp instanceof FieldPoint true

As an aside, people usually put extend somewhere global, such as on Function.prototype, so that every constructor inherits it and you can simply say something like: var newClass = someClass.extend(newConstructor);.
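One way that call pattern could work is sketched below; this is a hypothetical rendering (real libraries vary in the details), hanging extend on Function.prototype so that every constructor picks it up:

```javascript
// Hypothetical sugar: every function, and hence every constructor,
// gets an extend method. The names here are illustrative.
Function.prototype.extend = function (subConstructor) {
  var inheritance = function () {};
  inheritance.prototype = this.prototype;
  subConstructor.prototype = new inheritance();
  subConstructor.prototype.constructor = subConstructor;
  subConstructor.prototype.superConstructor = this;
  return subConstructor;
};

function Point(x, y) { this.x = x; this.y = y; }

var FieldPoint = Point.extend(function (x, y) {
  Point.call(this, x, y);
});
// Methods are attached after extend, on the fresh prototype.
FieldPoint.prototype.add = function (withPt) {
  this.x += withPt.x;
  this.y += withPt.y;
};

var fp = new FieldPoint(1, 2);
console.log(fp instanceof FieldPoint); // true
console.log(fp instanceof Point);      // true
```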

Recapitulation

So, to wrap up, the Bostock model is a variation on the old "use closures as objects" model. Its advantages are that the object is callable and that its internal state is completely private. The downsides are that each object so constructed recreates its methods at creation time, which may be burdensome depending on the use case; that the data is entirely private, which complicates debugging; that there is no mechanism for inheritance; and that the objects do not behave meaningfully with respect to the usual expectations about how Javascript object oriented programming works.

The standard approach, more or less outlined above but present in variations throughout the Javascript world, is idiomatic, behaves meaningfully with respect to `instanceof`, and allows inheritance. There is less privacy of internal state, but this is generally not considered to be deeply problematic. In general, without a compelling reason to adopt the technique of punning a closure into an object, I'd favor the more standard approach. Its downside is that, like many things in Javascript, it's a bit weird. But at least it is weird with, rather than against, the grain.