{"id":1805,"date":"2016-09-06T22:06:31","date_gmt":"2016-09-06T21:06:31","guid":{"rendered":"http:\/\/pcool.dyndns.org:8080\/statsbook\/?page_id=1805"},"modified":"2025-07-01T18:11:09","modified_gmt":"2025-07-01T17:11:09","slug":"mosaic-plot","status":"publish","type":"page","link":"https:\/\/pcool.dyndns.org\/index.php\/mosaic-plot\/","title":{"rendered":"Mosaic Plot"},"content":{"rendered":"\n<p>A mosaic plot combines several categorical variables in one plot and is a way of displaying contingency tables graphically. The widths and heights of the individual cells represent their proportions of the total. Within each column, every box has the same width, which is proportional to the total count of that column. The height of each cell represents the proportion of patients in that column who fall into that category. In effect, each column in a mosaic plot is a bar plot with the bins stacked on top of each other. The area of each cell therefore represents the proportion of that combination of categories relative to the total.<\/p>\n\n\n\n<p>To create a mosaic plot, the vcd (<strong>v<\/strong>isualise <strong>c<\/strong>ategorical <strong>d<\/strong>ata) package<sup class='sup-ref-note' id='note-zotero-ref-p1805-r1-o1'><a class='sup-ref-note' href='#zotero-ref-p1805-r1'>1<\/a><\/sup> can be used. 
The example below shows how to create a mosaic plot using the Titanic data set included in R.<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><mark style=\"background-color:rgba(0, 0, 0, 0);color:#f5070f\" class=\"has-inline-color\">library(vcd)<\/mark><mark style=\"background-color:rgba(0, 0, 0, 0);color:#2608f5\" class=\"has-inline-color\">\nLoading required package: grid\n<\/mark><mark style=\"background-color:rgba(0, 0, 0, 0);color:#f50717\" class=\"has-inline-color\">Titanic<\/mark><mark style=\"background-color:rgba(0, 0, 0, 0);color:#2608f5\" class=\"has-inline-color\">\n, , Age = Child, Survived = No\n\n      Sex\nClass  Male Female\n  1st     0      0\n  2nd     0      0\n  3rd    35     17\n  Crew    0      0\n\n, , Age = Adult, Survived = No\n\n      Sex\nClass  Male Female\n  1st   118      4\n  2nd   154     13\n  3rd   387     89\n  Crew  670      3\n\n, , Age = Child, Survived = Yes\n\n      Sex\nClass  Male Female<\/mark><\/em>\n<em><mark style=\"background-color:rgba(0, 0, 0, 0);color:#2608f5\" class=\"has-inline-color\">  1st     5      1\n  2nd    11     13\n  3rd    13     14\n  Crew    0      0\n\n, , Age = Adult, Survived = Yes\n\n      Sex\nClass  Male Female\n  1st    57    140\n  2nd    14     80\n  3rd    75     76\n  Crew  192     20<\/mark><\/em><\/code><\/pre>\n\n\n\n<p>Please note that the data are in contingency table format (not a data frame), as required by the mosaic function of the vcd package. 
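<\/p>\n\n\n\n<p>If the data are held in a data frame instead, they can be tabulated first. As a minimal sketch, the example below cross-tabulates a small data frame before plotting. The data frame patients and its columns treatment and outcome are invented for illustration only; table is the base R function and mosaic comes from the vcd package.<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">library(vcd)\n\n# Hypothetical data (invented for illustration): one row per patient\npatients &lt;- data.frame(\n  treatment = c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),\n  outcome   = c('Healed', 'Healed', 'Healed', 'Not healed',\n                'Healed', 'Not healed', 'Not healed', 'Not healed')\n)\n\n# Cross-tabulate the two columns into a contingency table\ntab &lt;- table(treatment = patients$treatment, outcome = patients$outcome)\n\n# Plot the contingency table as a mosaic plot\nmosaic(tab)<\/span><\/em><\/code><\/pre>\n\n\n\n<p>A table with more than two variables can be built in the same way by passing further columns to table. 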
<\/p>\n\n\n\n<p class=\"is-style-text-annotation is-style-text-annotation--1\">To convert data to a contingency table, use the table function in base R.<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">mosaic(Titanic)<\/span><\/em><\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"606\" src=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/TitanicMosaic1-1024x606.png\" alt=\"\" class=\"wp-image-3818\" srcset=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/TitanicMosaic1-1024x606.png 1024w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/TitanicMosaic1-300x177.png 300w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/TitanicMosaic1-768x454.png 768w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/TitanicMosaic1-1536x908.png 1536w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/TitanicMosaic1-2048x1211.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>The real strength of a mosaic plot is that it can display the residuals of a chi-square test graphically, by shading the cells according to the value of their Pearson residual. In effect, it is a graphical display of a chi-square test. 
To obtain such a plot, set the shade argument to TRUE:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">mosaic(Titanic, shade = TRUE)<\/span><\/em><\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"673\" src=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/TitanicMosaic2-1024x673.png\" alt=\"\" class=\"wp-image-3823\" srcset=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/TitanicMosaic2-1024x673.png 1024w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/TitanicMosaic2-300x197.png 300w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/TitanicMosaic2-768x505.png 768w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/TitanicMosaic2-1536x1010.png 1536w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/TitanicMosaic2-2048x1347.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>It is now easy to see which categories are under-represented (red) and over-represented (blue) relative to the counts expected if the variables were independent.<\/p>\n\n\n\n<p>In addition, it is possible to create custom mosaic plots with ggplot2. An example function for two categorical variables can be downloaded <a href=\"https:\/\/pcool.dyndns.org:\/wp-content\/R_functions\/mosaicGG.txt\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>. 
Please note that the function requires three arguments: a data frame, variable1 and variable2.<\/p>\n\n\n\n<p>For example, copy and paste the function into R and create a mosaic plot with the mtcars dataset:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><span style=\"color: #ff0000;\"><em>mosaicGG(mtcars,'cyl','am')<\/em><\/span>\n<em><mark style=\"background-color:rgba(0, 0, 0, 0);color:#1608f3\" class=\"has-inline-color\">   \n     4  6  8\n  0  3  4 12\n  1  8  3  2\n\n\tPearson's Chi-squared test\n\ndata:  table(data&#091;&#091;FILL]], data&#091;&#091;X]])\nX-squared = 8.7407, df = 2, p-value = 0.01265\n\n  FILL X    residual\n1    0 4 -1.38175267\n2    1 4  1.67045752\n3    0 6 -0.07664242\n4    1 6  0.09265616\n5    0 8  1.27898721\n6    1 8 -1.54622013\nWarning message:\nIn chisq.test(table(data&#091;&#091;FILL]], data&#091;&#091;X]])) :\n  Chi-squared approximation may be incorrect<\/mark><\/em><\/code><\/pre>\n\n\n\n<p class=\"is-style-text-annotation is-style-text-annotation--2\"><span style=\"color: #000000;\">Some of the expected frequencies are less than 5, hence the warning message.<\/span><\/p>\n\n\n\n<p>The resulting plot:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"755\" src=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/mosaic2-1024x755.png\" alt=\"\" class=\"wp-image-3280\" srcset=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/mosaic2-1024x755.png 1024w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/mosaic2-300x221.png 300w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/mosaic2-768x566.png 768w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/mosaic2.png 1092w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Or as PDF:<\/p>\n\n\n\n<p><a href=\"https:\/\/pcool.dyndns.org:\/wp-content\/uploads\/old\/mosaic2.pdf\" target=\"_blank\" rel=\"noreferrer 
noopener\">mosaic2<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>A mosaic plot combines several categorical variables in one plot and is a way of displaying contingency tables graphically. The widths and heights of the individual cells represent their proportions of the total. Within each column, every box has the same width, which is proportional to the total count of that [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"inline_featured_image":false,"footnotes":""},"class_list":["post-1805","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/pages\/1805","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/comments?post=1805"}],"version-history":[{"count":5,"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/pages\/1805\/revisions"}],"predecessor-version":[{"id":4716,"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/pages\/1805\/revisions\/4716"}],"wp:attachment":[{"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/media?parent=1805"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}