Statsbook

Hierarchical Edge Bundling Plot

A hierarchical edge bundling plot can be used to show correlations, similar to a correlation plot. Typically, it is used for larger data sets. To create the plot, install the edgebundleR package [1].
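The installation step can be sketched as a small helper that installs only what is missing. The helper name `install_if_missing` is hypothetical, and the sketch assumes the packages are available from your configured package repository.

```r
# Hypothetical helper: install only the packages that are not yet available.
# Assumption: the packages can be installed from your configured repository.
install_if_missing <- function(pkgs) {
  # requireNamespace() returns FALSE for packages that cannot be loaded
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing <- pkgs[!available]
  if (length(missing) > 0) {
    install.packages(missing)
  }
  invisible(missing)
}
```

Calling `install_if_missing(c("igraph", "edgebundleR"))` once before loading the libraries covers both packages used on this page.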

Base R contains a data frame ‘mtcars’ with information about different cars. To show the first six observations of the data frame with head() and inspect the type and first values of each variable with str():

head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
str(mtcars)
'data.frame':	32 obs. of  11 variables:
 $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
 $ disp: num  160 160 108 258 360 ...
 $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
 $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
 $ qsec: num  16.5 17 18.6 19.4 17 ...
 $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
 $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
 $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
 $ carb: num  4 4 1 1 2 1 4 2 2 4 ...

First, create a correlation matrix that contains the correlation coefficients between the different variables:

corr_matrix <- cor(mtcars)
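Before plotting, it can help to look at the matrix itself. The sketch below uses the name `cm` (an assumption, chosen so that `corr_matrix` above is left untouched) and shows that the matrix is symmetric and contains negative as well as positive coefficients.

```r
# Recompute the correlation matrix under a separate, illustrative name
cm <- cor(mtcars)              # cor() uses Pearson correlation by default

# Show a rounded corner of the matrix for readability
round(cm[1:4, 1:4], 2)

# A correlation matrix is symmetric, with ones on the diagonal
isSymmetric(cm)

# Correlations can be negative: heavier cars have lower mpg
cm["mpg", "wt"]

# Variables most strongly associated with mpg, by absolute correlation
# (column 1 is mpg itself and is excluded)
strongest <- sort(abs(cm["mpg", -1]), decreasing = TRUE)[1:3]
strongest
```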

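Convert the matrix to an igraph object for use with edgebundleR. Note that the code reuses the name corr_matrix, so from here on it refers to the graph rather than the matrix: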

library(igraph)
corr_matrix <- graph_from_adjacency_matrix(corr_matrix, mode = "undirected", weighted = TRUE, diag = FALSE)
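To check the conversion, the resulting graph can be inspected with igraph's accessor functions. This is a minimal sketch that rebuilds the graph under the illustrative name `g`. Each variable becomes a node and each pairwise correlation becomes a weighted edge.

```r
library(igraph)

# Rebuild the graph under a separate, illustrative name
g <- graph_from_adjacency_matrix(cor(mtcars), mode = "undirected",
                                 weighted = TRUE, diag = FALSE)

vcount(g)            # number of nodes (one per variable)
ecount(g)            # number of edges (one per pair of variables)
V(g)$name            # node names are taken from the column names
range(E(g)$weight)   # edge weights are the correlation coefficients
```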

Next, create the plot using the edgebundleR package [2]:

library(edgebundleR)
edgebundle(corr_matrix)

The plot should now appear in your web browser. The address bar of the browser shows the folder that contains the files used to create the plot. This folder is deleted when you close R, so leave R open. To publish the plot, copy that folder with all its subfolders to your web server and set up a link to it. The plot should then be available on your server:

Default Plot

By hovering over the nodes, the associations become visible.
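As a possible alternative to copying the temporary folder by hand, the plot can be written to a location of your choice. The sketch below assumes that edgebundle() returns an htmlwidget object, so that `saveWidget()` from the htmlwidgets package can be applied to it. The output path is hypothetical.

```r
library(igraph)
library(edgebundleR)

g <- graph_from_adjacency_matrix(cor(mtcars), mode = "undirected",
                                 weighted = TRUE, diag = FALSE)
widget <- edgebundle(g)

# Hypothetical output location. A temporary folder is used here only so the
# sketch runs anywhere; use a permanent folder in practice.
out_dir <- file.path(tempdir(), "edgebundle_export")
dir.create(out_dir, showWarnings = FALSE)
out_file <- file.path(out_dir, "index.html")

# selfcontained = FALSE writes index.html plus a companion folder with the
# JavaScript dependencies next to it; copy both to the server together.
htmlwidgets::saveWidget(widget, file = out_file, selfcontained = FALSE)
```

With `selfcontained = TRUE`, saveWidget() aims to produce a single HTML file instead, which requires pandoc to be installed.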

By default, the ‘tension’ of the blue lines between the nodes is 0.5, but it can be set to any value between 0 and 1. With the tension set to 1:

edgebundle(corr_matrix, tension = 1)

Plot with Tension = 1

It is also possible to specify the ‘cutoff’. To show only the associations where the correlation coefficient is larger than 0.7 (70%):

edgebundle(corr_matrix, cutoff = 0.7)

Plot with Cutoff 70%
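The text above does not say how the cutoff treats negative correlations. If you want to be explicit, one option is to filter the correlation matrix yourself before building the graph. The sketch below keeps pairs whose absolute correlation exceeds a threshold; the names `cm`, `threshold`, `filtered` and `g_strong` are illustrative.

```r
library(igraph)

cm <- cor(mtcars)
threshold <- 0.7

# Zero out weak correlations; zero entries produce no edge in the graph
filtered <- cm
filtered[abs(filtered) <= threshold] <- 0

g_strong <- graph_from_adjacency_matrix(filtered, mode = "undirected",
                                        weighted = TRUE, diag = FALSE)

# Number of strong pairs, counted in the graph and directly from the matrix
ecount(g_strong)
sum(abs(cm[upper.tri(cm)]) > threshold)
```

The filtered graph can then be passed to edgebundle() in the same way as the full graph.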