7 min read

Latest attempt at D3

Lately, there’s been a particular kind of chart I want to make. I guess you can consider it a beeswarm plot with variable radius, or maybe a packed circle timeline. Whatever you call it, Keith Collins and Kevin Roose published a story with a spectacular one, tracking the proliferation of a right wing meme.

I encountered something similar at Tapestry Conference in November – where I gave a talk by the way. Mollie Pettit presented some great work she’s doing with the ACLU of Illinois, that examines racial bias in traffic incidents. Mollie’s presentation was a standout among two days of wonderful talks. And the standout charts from Mollie’s talk used beeswarms in the same manner the NYT journalists did (although her analysis doesn’t use time on the x-axis). Afterward, I asked her how they were made. She told me something more specific than “D3”, but that’s all I remembered.

I’ve still never made an actual, real data visualization in D3. But I’ve spent some time trying to understand it, and a lot more time understanding Javascript and web development (two things I didn’t know the first time tried to play with D3). So with some downtime around the holidays, I worked towards building this data vis. Not knowing what the hell I’m doing in D3, here was my strategy:

  1. Browse the D3 docs for something that might help, while simultanesouly googling for an open-sourced example that felt close enough.
  2. Paste the example into RStudio, get it to work using the D3 support.
  3. Poke at the code until I understood it well enough to change it in the ways I needed.

I ended up using Nate Vack’s block “Beeswarm plot with gravity and collisions”, The gravity and collisions aspect was important to prevent the circles from overlapping.

Getting that into RStudio’s way of working with D3 stumped me for a bit. The trick was to replace the section of the block where svg was defined, because the RStudio package rd2d3 handles that by exposing a predefined svg variable for you to reference in your Javascript code.

The code defines nodes, and each node is randomly assigned a y-value based on the norm() function defined at the top. Then the nodes are passed into D3.layout.force, and the layout gets simulated to

Here is what Nate Vack’s chart looks like:

It seemed like the right place to start, but the chart I have in my head is different from that in a few ways:

  1. Rather than being all circles being the same size, the circle size should encode a continuous variable.
  2. Rather than a single color, circle color should encode a categorical variable.
  3. I wanted the distribution to be horizontal, rather than vertical.

Step 1 was the easiest. The original block had a line that said var radius = 4, in the loop that randomly created the circles. I updated it to assign each circle a radius randomly based on a uniform distribution between zero and seven var radius = 7 * Math.random(),.

To encode color, I first added color as a property of each node, randomly assigning it one of two hex codes. To choose colors, I literally just typed random digits for the hex codes, having no idea what colors would come out. Then in the section of the code that starts with svg.selectAll, I replaced the fixed value of the fill style with a function that mapped to the color property I defined in the node code: .style("fill", d => d.color).

Then to swap the axis, I basically looked at everything defined in terms of x and y in the code and tried to flip them. I needed to switch x and y. I figured I needed to change true_x and true_y. After some poking, I got it. In addition to swapping coordinates in the nodes, I also needed to swap the coefficients on xerr and yerr in the while loop.

Next Steps I might not take

I’m excited by the progress I made, although this is still a ways from being usable to visualize data. There are a lot of places to go from here. Here is a list of good of where I should go with it.

  1. Move the random data generator into R and use R2D3 to pass that in, possibly in a Shiny app.
  2. Find some actual data to visualize.
  3. Just keep poking at params of the existing script, with the aim of understanding d3.layout.force and d3.geom.quadtree a little better.
  4. Add in mouseover or click interactions with the circles.
  5. Find another D3 example for packing circles that looks a little cleaner and start over.
  6. Rather than copy and paste from a block, just try to write some fresh D3 code.
  7. Package the existing script into an htmlwidget.
  8. Find an R package for collision detection.
  9. Just give up on another small project before it’s a real thing.
  10. Think a bit harder about how this visualization distorts the data, especially before I try to use it in my job.

Here’s the code:

function norm() {
  var res, i;
  res = 0;
  for (i = 0; i < 10; i += 1) {
    res += Math.random()*2-1
  }
  return res;
}

var w = 500,
    h = 500;

var nodes = d3.range(400).map(function() { 
  var true_y = (norm()*50)+250;
  return {
    radius: 7 * Math.random(), 
    x: true_y,
    true_x: true_y,
    true_y: 200,
    color: ['#123456', '#4bab56'][Math.floor(Math.random() + .5)] }
  });

var force = d3.layout.force()
    .gravity(0)
    .charge(0)
    .friction(0.9)
    .nodes(nodes)
    .size([w, h]);

var root = nodes[0];
root.radius = 0;
root.fixed = true;

force.start();

svg.selectAll("circle")
    .data(nodes)
  .enter().append("svg:circle")
    .attr("r", function(d) { return d.radius; })
    .style("fill", d => d.color)
    .style("stroke", "black")

force.on("tick", function(e) {
  var q,
    node,
    i = 0,
    n = nodes.length;
    
  var q = d3.geom.quadtree(nodes);

  while (++i < n) {
    node = nodes[i];
    q.visit(collide(node));
    xerr = node.x - node.true_x;
    yerr = node.y - node.true_y;
    node.x -= xerr*0.5;
    node.y -= yerr*0.008;
  }

  svg.selectAll("circle")
      .attr("cx", function(d) { return d.x; })
      .attr("cy", function(d) { return d.y; });
});

function collide(node) {
  var r = node.radius,
    nx1,
    nx2,
    ny1,
    ny2,
    xerr,
    yerr;
    
  nx1 = node.x - r;
  nx2 = node.x + r;
  ny1 = node.y - r;
  ny2 = node.y + r;
      
  return function(quad, x1, y1, x2, y2) {
    if (quad.point && (quad.point !== node)) {
      var x = node.x - quad.point.x,
          y = node.y - quad.point.y,
          l = Math.sqrt(x * x + y * y),
          r = node.radius + quad.point.radius;
      if (l < r) {
        // we're colliding.
        var xnudge, ynudge, nudge_factor;
        nudge_factor = (l - r) / l * .4;
        xnudge = x*nudge_factor;
        ynudge = y*nudge_factor;
        node.x -= xnudge;
        node.y -= ynudge;
        quad.point.x += xnudge;
        quad.point.y += ynudge;
      }
    }
    return x1 > nx2
        || x2 < nx1
        || y1 > ny2
        || y2 < ny1;
  };
}