
The visualization of sets and their intersections is a common challenge for researchers who are dealing with biological and biomedical data. For example, a researcher might need to compare multiple algorithms that identify single nucleotide polymorphisms (Xu et al., 2012, Supplementary Fig. S1) or show orthologs of genes in newly sequenced species across genomes of related species (D’Hont et al., 2012, Supplementary Fig.). Although many alternative set visualization techniques exist (Alsallakh et al., 2016), such data are typically visualized using Venn and Euler diagrams.

Such diagrams can be generated with R packages such as venneuler (Wilkinson, 2012) and VennDiagram (Chen and Boutros, 2011). These closely related techniques have well-known shortcomings, as they are hard to generate for more than a small number of sets. The visual representation of intersection size by irregularly shaped and unaligned areas makes it hard to answer essential questions such as ‘What is the biggest intersection?’ or ‘Is intersection X larger than intersection Y?’ (Cleveland and McGill, 1984).

Here we present an R package named ‘UpSetR’ based on the ‘UpSet’ technique (Lex et al., 2014; Lex and Gehlenborg, 2014) that employs a matrix-based layout to show intersections of sets and their sizes. It is implemented using ggplot2 (Wickham, 2009) and allows data analysts to easily generate UpSet plots for their own data. UpSetR supports three input formats: (i) a table in which the rows represent elements and the columns include set assignments and additional attributes; (ii) sets of element names; and (iii) an expression describing the size of the set intersections, as introduced by the venneuler package (Wilkinson, 2012). UpSetR provides support for the visualization of attributes associated with the elements contained in the sets, enabling researchers to explore and characterize the intersections. UpSetR differs from the original UpSet technique in that it is optimized for static plots and for integration into typical bioinformatics workflows. We also provide a Shiny app that allows researchers to create publication-quality UpSet plots directly in a web browser. UpSetR visualizes intersections of sets as a matrix in which the rows represent the sets and the columns represent their intersections (Fig.; see Supplementary Figs. S1 and S2 for comparisons of Venn and Euler diagrams with UpSetR plots). For each set that is part of a given intersection, a black filled circle is placed in the corresponding matrix cell. If a set is not part of the intersection, a light gray circle is shown. A vertical black line connects the topmost black circle with the bottommost black circle in each column to emphasize the column-based relationships. The size of the intersections is shown as a bar chart placed on top of the matrix so that each column lines up with exactly one bar.
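
The relationship between input formats (i) and (ii) and the matrix layout described above can be illustrated with a short, language-neutral sketch. The Python code below is not part of UpSetR and does not use its API. The function names (`membership_table`, `intersection_sizes`, `render`) and the example gene identifiers are hypothetical. The sketch assumes that each column of the matrix corresponds to an exclusive intersection, that is, the elements that belong to exactly the sets marked with a filled circle.

```python
from collections import Counter


def membership_table(sets):
    """Convert named sets (format ii) into a binary table (format i).

    Returns the set names and a dict that maps each element to a 0/1
    tuple with one entry per set, in the order of the names.
    """
    names = list(sets)
    elements = sorted(set().union(*sets.values()))
    table = {e: tuple(int(e in sets[n]) for n in names) for e in elements}
    return names, table


def intersection_sizes(sets):
    """Count the elements that share each membership pattern.

    Each distinct pattern is one matrix column; its count is the bar height.
    Columns are ordered by decreasing size, ties broken by pattern.
    """
    names, table = membership_table(sets)
    counts = Counter(table.values())
    columns = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return names, columns


def render(sets):
    """Draw a plain-text matrix: sizes on top, one row of circles per set."""
    names, columns = intersection_sizes(sets)
    lines = ["size  " + " ".join(f"{size:>2}" for _, size in columns)]
    for i, name in enumerate(names):
        cells = " ".join(" ●" if pattern[i] else " ○" for pattern, _ in columns)
        lines.append(f"{name:<5} {cells}")
    return "\n".join(lines)


# Hypothetical example data: three small sets of gene identifiers.
example = {
    "A": {"g1", "g2", "g3", "g4"},
    "B": {"g3", "g4", "g5"},
    "C": {"g4", "g6"},
}

if __name__ == "__main__":
    print(render(example))
```

Because every bar sits directly above its column and all bars share a common baseline, questions such as ‘What is the biggest intersection?’ reduce to reading the first entry of the sorted list, which is the comparison that area-based diagrams make difficult.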
