Digital data has no physical shape. While this allows us to easily manipulate large amounts of data, it poses a problem when it comes to understanding this data and assessing its state. The lack of physical shape renders useless our built-in skill of perceiving the world around us through visual stimuli.
Visualization aims to solve this problem by offering a visual skin to data. “A picture tells a thousand words” goes the old adage. And so it does, but only if the picture is the right one.
What makes a picture appropriate? Well, it has to focus on one or more relevant questions, and it has to take the particularities of data into account.
There are many tools out there providing nice and useful visualizations for interesting questions. However, many of them offer only limited customization possibilities, and this makes them less useful in particular circumstances.
To address this issue, in 2006, Michael Meyer and I (more recently joined by Alex, who took over the main maintenance tasks) built Mondrian, a scripting engine based on Smalltalk that allows us to craft custom visualizations for custom data. Mondrian is an important part of the Moose platform and can be tried by downloading the latest Moose release.
The name Mondrian comes from the famous Dutch painter who used to see the world in rectangles and lines. In a similar manner, our Mondrian views data through a graph lens, with nodes connected to each other via edges.
But enough talk, let’s draw a simple painting. For the purpose of this exercise, let’s take as a case study the task of exposing the dependencies between the namespaces (or packages) of a software system. One way to do that is to visualize the situation as a graph in which namespaces are nodes and the dependencies between them are edges.
We start with a fresh canvas:
| view |
view := MOViewRenderer new.
view open.
A fresh canvas is just too simple even for a beginner. So, let’s fill it with a node for each namespace from our model (in this case the model is provided by Moose):
| view namespaces |
namespaces := MooseModel root allModels first allModelNamespaces.
view := MOViewRenderer new.
view nodes: namespaces.
view open.
It starts to get interesting. The next thing we need is the edges. We know that from each namespace we can obtain the providerNamespaces, that is, the namespaces it depends on. Thus, we can create an edge for each such relationship:
| view namespaces |
namespaces := MooseModel root allModels first allModelNamespaces.
view := MOViewRenderer new.
view nodes: namespaces.
view edgesToAll: #providerNamespaces.
view open.
Now, to make the drawing more relevant, let us lay out the graph so that the namespaces that are not used by any other are at the top and those that use no other are at the bottom. We simply use a dominanceTreeLayout:
| view namespaces |
namespaces := MooseModel root allModels first allModelNamespaces.
view := MOViewRenderer new.
view nodes: namespaces.
view edgesToAll: #providerNamespaces.
view dominanceTreeLayout.
view open.
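To make the "top" and "bottom" of this layout concrete, here is a minimal sketch in plain Smalltalk that does not use Mondrian at all. It assumes a small, made-up dependency table (the namespace names UI, Core and Util are hypothetical) and only identifies which namespaces would end up on the first and last rows; it is not how the layout itself is implemented.

```smalltalk
| providers used roots leaves |
"Hypothetical data: each key uses the namespaces listed in its value."
providers := Dictionary new.
providers at: #UI put: #(#Core #Util).
providers at: #Core put: #(#Util).
providers at: #Util put: #().

"Every namespace that appears as a provider is used by some other one."
used := Set new.
providers do: [ :each | used addAll: each ].

"Top of the drawing: namespaces that nobody uses."
roots := providers keys reject: [ :each | used includes: each ].

"Bottom of the drawing: namespaces that use no other."
leaves := providers keys select: [ :each | (providers at: each) isEmpty ].
```

With this sample data, UI is the only namespace nobody depends on and Util is the only one that depends on nothing, so they would sit at the top and bottom respectively, with Core in between.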
To provide more context to the picture, we might also want to add more information to the nodes by mapping the number of classes, methods and lines of code to the width, height and color respectively:
| view namespaces |
namespaces := MooseModel root allModels first allModelNamespaces.
view := MOViewRenderer new.
view shape rectangle
	width: #numberOfClasses;
	height: #numberOfMethods;
	linearFillColor: #numberOfLinesOfCode within: namespaces.
view nodes: namespaces.
view edgesToAll: #providerNamespaces.
view dominanceTreeLayout.
view open.
Executing this code in a Smalltalk workspace reveals a visualization as shown below (provided you have a Moose model already loaded).
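A side note on the color mapping in the final script: judging by its name, linearFillColor:within: presumably scales the metric linearly between the smallest and the largest value found in the given collection. That reading is an assumption, not something taken from the Mondrian sources. Under that assumption, the idea can be sketched in plain Smalltalk with hypothetical line counts:

```smalltalk
| locs min max intensity |
"Hypothetical lines-of-code counts for three namespaces."
locs := #(120 480 1500).
min := locs inject: locs first into: [ :a :b | a min: b ].
max := locs inject: locs first into: [ :a :b | a max: b ].

"Map a count to a value between 0 (smallest) and 1 (largest).
 The guard avoids a division by zero when all counts are equal."
intensity := [ :loc |
	max = min
		ifTrue: [ 0 ]
		ifFalse: [ (loc - min) / (max - min) ] ].
```

Evaluating intensity value: 120 gives 0 and intensity value: 1500 gives 1; everything else falls in between. A shape could then use that number to pick a shade, so that the largest namespace gets the darkest fill.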
There are more visualization and interaction options offered by Mondrian, but we wanted to start with a small painting. Ok, maybe this painting was not that small, but it was rather simple. Once the data was available and we knew what we wanted, creating the view was a few keystrokes away. In fact, these keystrokes are so few that it’s hard to imagine any excuse to keep leaving graph data shapeless.