Fetching the data
The easy way to do this would be to export some data from our analytics systems every once in a while. It felt a little too static, in my mind – I wanted something more dynamic, preferably with real-time data.
We’re using Varnish Cache at VG, so we can’t just parse the web server logs. However, a tool called vstatd/VCS (Varnish Custom Statistics) lets us log hits under different keys, filter on those keys and sort them by the number of hits. This gives us real-time statistics, and also lets us go a couple of minutes back in time, depending on the bucket size and the number of buckets set up in our configuration.
Every time someone reads one of our articles, the request hits Varnish, which assigns it a key containing the article ID – let’s say ARTICLE-<ID>. We can then filter on keys starting with ARTICLE- and sort the results in descending order by the number of hits. For every article in the top list, we fetch some basic article information, such as the title, the category and the “lead asset” (usually the main article image). I’ve written a simple Node.js application that performs these steps and polls for new information every 5 seconds.
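The filter-and-sort step could look roughly like this. It is a minimal sketch rather than the application's actual code: it assumes the statistics have already been fetched and reduced to a plain object mapping each key to its hit count, and the names `topArticles`, `stats` and `limit` are made up for illustration.

```javascript
// Hypothetical input shape: { 'ARTICLE-123': 42, 'FRONTPAGE': 900, ... }
// (the real VCS response format may well differ).
var ARTICLE_PREFIX = 'ARTICLE-';

function topArticles(stats, limit) {
  return Object.keys(stats)
    // Keep only keys that start with the article prefix
    .filter(function(key) { return key.indexOf(ARTICLE_PREFIX) === 0; })
    // Turn each key into { id, hits }
    .map(function(key) {
      return { id: key.slice(ARTICLE_PREFIX.length), hits: stats[key] };
    })
    // Most-read articles first
    .sort(function(a, b) { return b.hits - a.hits; })
    .slice(0, limit);
}

var sample = {
  'ARTICLE-101': 12,
  'FRONTPAGE': 900,
  'ARTICLE-202': 57,
  'ARTICLE-303': 31
};

console.log(topArticles(sample, 2));
// [ { id: '202', hits: 57 }, { id: '303', hits: 31 } ]
```

To poll every 5 seconds, a function like this would presumably be called from something like `setInterval(..., 5000)` each time fresh statistics arrive.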
Presenting the data
Once all the data has been retrieved and is available as a simple JSON array, presenting it with D3 is fairly straightforward. We group the results by category, then use D3’s enter/exit pattern to add, remove and update nodes. The treemap algorithm automatically calculates x, y, width and height for our nodes, based on the defined size of our treemap:
[code language="js"]
// Initialize the treemap
var treemap = d3.layout.treemap()
    .size([width, height])
    .sticky(false)
    .value(function(d) { return d.size; });

var leaves = treemap(jsonData);

// Node-positioning function
var position = function() {
    this.style('left', function(d) { return d.x + 'px'; })
        .style('top', function(d) { return d.y + 'px'; })
        .style('width', function(d) { return d.dx + 'px'; })
        .style('height', function(d) { return d.dy + 'px'; });
};

// Set background-image based on data
var getBackgroundStyle = function(d) {
    return 'url(' + d.img + ')';
};

// Select all nodes, join data on id
var nodes = domRoot
    .selectAll('.node')
    .data(leaves, function(d) { return d.id; });

// On new nodes...
nodes.enter().append('a')
    // Give new nodes the class that selectAll('.node') looks for,
    // so later updates match them instead of appending duplicates
    .attr('class', 'node')
    .attr('href', function(d) { return d.url; })
    .style('background', getBackgroundStyle)
    .text(function(d) { return d.name; })
    .call(position);

// Remove old nodes
nodes.exit().remove();

// Update existing nodes
nodes
    .style('background', getBackgroundStyle)
    .transition()
    .duration(750)
    .text(function(d) { return d.name; })
    .call(position);
[/code]
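The snippet above does not show the grouping by category. One way to do it, sketched here under assumptions, is to nest the flat article array into a root node with one child per category, since `d3.layout.treemap` follows each node's `children` array by default. The function name `groupByCategory` and the article fields `category` and `size` are assumptions for illustration, not necessarily what the prototype uses.

```javascript
// Hypothetical article shape: { id, name, category, size }
function groupByCategory(articles) {
  // Null-prototype object so any category name is a safe key
  var byCategory = Object.create(null);

  articles.forEach(function(article) {
    if (!byCategory[article.category]) {
      byCategory[article.category] = { name: article.category, children: [] };
    }
    byCategory[article.category].children.push(article);
  });

  // Root node wrapping one child node per category
  return {
    name: 'root',
    children: Object.keys(byCategory).map(function(category) {
      return byCategory[category];
    })
  };
}

var tree = groupByCategory([
  { id: 1, name: 'Article A', category: 'news', size: 57 },
  { id: 2, name: 'Article B', category: 'sport', size: 31 },
  { id: 3, name: 'Article C', category: 'news', size: 12 }
]);

console.log(JSON.stringify(tree, null, 2));
```

With a structure like this, articles in the same category end up next to each other in the treemap, because the layout subdivides each category's rectangle among its children.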
Conclusion
Having proven to myself that the visualization worked as I had hoped it would, I wanted to wrap it into a dashboard-like prototype that we could put up on a monitor in the office. Basically, there is now a node.js application doing four different tasks:
- Fetches new lists of the most read articles every few seconds
- Fetches article information for the top articles
- Provides a simple data endpoint to retrieve the data we need
- Serves a static web page that acts as our dashboard
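As a rough sketch of how these four tasks might fit together using only Node's built-in `http` module: the names `latest`, `refresh`, `handleRequest` and `start`, the `/data` path and the inline HTML are all assumptions for illustration, and the actual fetching of statistics and article information is left as a callback.

```javascript
var http = require('http');

// In-memory copy of the most recent results
var latest = { updated: null, articles: [] };

// Tasks 1 and 2: the real application would query the statistics service
// and the article API here; in this sketch that work is passed in as a
// hypothetical callback returning an array of articles.
function refresh(fetchTopArticles) {
  latest = { updated: Date.now(), articles: fetchTopArticles() };
}

// Tasks 3 and 4: a JSON data endpoint plus a static dashboard page
function handleRequest(req, res) {
  if (req.url === '/data') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(latest.articles));
  } else if (req.url === '/') {
    res.writeHead(200, { 'Content-Type': 'text/html' });
    res.end('<!doctype html><title>Dashboard</title><div id="treemap"></div>');
  } else {
    res.writeHead(404);
    res.end();
  }
}

// Starts polling and serving; not called automatically in this sketch
function start(port, fetchTopArticles, intervalMs) {
  refresh(fetchTopArticles);
  setInterval(function() { refresh(fetchTopArticles); }, intervalMs);
  return http.createServer(handleRequest).listen(port);
}
```

Calling something like `start(3000, myFetchFunction, 5000)` would then refresh the data every five seconds and serve both the dashboard page and the data it reads.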
The solution is now available for anyone who wants to give it a try.
Taking it further
I wanted to make it a little more interactive, so I added some options for toggling images and article titles on and off, setting the number of articles to show, frequency of updates and the data timeframe to fetch.
What I found was that with a small timeframe (say, 10 seconds), the data was very dynamic, but there might not be enough of it to represent the full picture. A timeframe of 30 seconds gives a clearer picture of what is going on. If you still update every 10 seconds, you get a moving window that remains fairly dynamic while being more statistically representative.
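To make the moving window concrete, here is a small sketch that assumes hits are counted in 10-second buckets (the bucket length and the function name `hitsInWindow` are assumptions). A 30-second timeframe is then the sum of the three most recent buckets, and each 10-second update slides the window forward by one bucket.

```javascript
// `buckets` holds per-bucket hit counts for one article, oldest first;
// each bucket covers `bucketSeconds` seconds of traffic (hypothetical layout).
function hitsInWindow(buckets, bucketSeconds, windowSeconds) {
  // Number of most-recent buckets needed to cover the window
  var count = Math.ceil(windowSeconds / bucketSeconds);
  return buckets.slice(-count).reduce(function(sum, hits) {
    return sum + hits;
  }, 0);
}

var buckets = [4, 9, 2, 7, 5]; // five 10-second buckets

console.log(hitsInWindow(buckets, 10, 10)); // 5  (latest bucket only)
console.log(hitsInWindow(buckets, 10, 30)); // 14 (latest three buckets)
```

The shorter window reacts to every spike, while the longer one smooths over them because each total draws on more requests.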
Take a look at the current prototype at mestlest.vg.no. It was a fun project to make, and I will definitely be using D3 more in the future! Hope you like the prototype 🙂