Use the latest version of Circos and read Circos best practices—these list recent important changes and identify sources of common problems.
If you are having trouble, post your issue to the Circos Google Group and include all files and detailed error logs. Please do not email me directly unless it is urgent—you are much more likely to receive a timely reply from the group.
Don't know what question to ask? Read Points of View: Visualizing Biological Data by Bang Wong, myself and invited authors from the Points of View series.
Nature's special issue dedicated to the Encode Project uses the Circos motif on its cover as well as the interactive Encode Explorer, which is available as an app at iTunes.
In this tutorial, I'll show you how to automatically generate the image that appears on the cover. The original was created for Nature by Carl DeTorres.
The figure contains 23 segments — these are human chromosomes 1–22 and chromosome X. The proportions in the figure aren't exactly the same as the lengths of the assembled chromosomes in the hg19 human assembly. Here, we'll use the assembled lengths.
The color scheme used is is a pleasant range of pastel hues that cycle through orange, green, blue and purple. We'll redefine the default chromosome colors to make use of this color scheme.
The data in the figure is shown in six concentric tracks, whose spacing decreases slightly towards the inside of the circle. Each track appears to highlight fixed-width regions, colored after the chromosome color scheme. I don't know how the data tracks were populated, whether the data are randomly placed, designed for visual effect, or drawn from an actual Encode data set. In this example, we'll create the tracks using a random scheme.
I sampled the RGB colors from the cover image and obtained the following values, which I've placed in a <colors> block. The *
suffix overrides existing color definitions.
# circos.conf <<include etc/colors_fonts_patterns.conf>> <colors> chr1* = 163,132,130 chr2* = 188,162,118 chr3* = 216,196,96 chr4* = 233,212,56 chr5* = 229,229,50 chr6* = 212,222,56 chr7* = 195,215,57 chr8* = 177,209,58 chr9* = 160,204,61 chr10* = 139,198,61 chr11* = 128,193,95 chr12* = 115,186,126 chr13* = 102,183,152 chr14* = 91,178,176 chr15* = 61,174,199 chr16* = 36,170,224 chr17* = 75,129,194 chr18* = 85,111,180 chr19* = 92,92,168 chr20* = 98,70,156 chr21* = 101,45,145 chr22* = 121,74,141 chrx* = 140,104,137 </colors>
The background of the image is set in the <image> block. The *
suffix is used to overwrite values of parameters set elsewhere in the block (e.g., here defined in the included file etc/image.conf
).
<image> <<include etc/image.conf>> background* = black </image>
Each track has the same data source but has a different appearance because of dynamic rules that alter the data randomly.
The definition of each track is actually the same, but includes dynamically changing fields called counters that change the position of each track. The figure with 7 tracks is created like this
# variables used in each plot.conf block plot_width = 80 plot_padding = 25 num_plots = 6 <plots> type = highlight file = bins.txt stroke_thickness = 0 <<include plot.conf>> <<include plot.conf>> <<include plot.conf>> <<include plot.conf>> <<include plot.conf>> <<include plot.conf>> <<include plot.conf>> </plots>
where the plot.conf
file is
<plot> r1 = dims(ideogram,radius_inner) - conf(plot_padding)*eval(remap(counter(plot),0,conf(num_plots),1,0.9)) - eval((conf(plot_width)+conf(plot_padding))*counter(plot)*eval(remap(counter(plot),0,conf(num_plots),1,0.9))) r0 = conf(.,r1) - conf(plot_width)*eval(remap(counter(plot),0,conf(num_plots),1,0.9)) post_increment_counter = plot:1 <<include rules.conf>> </plot>
The inner and outer radii of the the track (r0
and r1
) are computed using the value of the plot_padding
and plot_width
parameters. Each time a plot is drawn, the value of the variable counter(plot
) is incremented by 1.
The dims(ideogram,radius_inner
) variable refers to the position of the inner radius of the ideograms. The function remap(VAR,MIN,MAX,TARGETMIN,TARGETMAX
) is used to remap the variable VAR
from the range [MIN,MAX]
to [TARGETMIN,TARGETMAX]
. The expressions are designed so that the spacing between the tracks decreases slightly towards the center of the circle, to match the appearance of the original image in Nature.
Each track uses the same data file, bins.txt
. How is it that the tracks look different, then?
The data file defines 7.5 Mb bins across the genome
hs1 0 7499999 hs1 7500000 14999999 hs1 15000000 22499999 hs1 22500000 29999999 ...
and the color associated with each bin is dynamically altered by rules that are included in each <plot> block, from the file rules.conf
.
# rules.conf <rules> <rule> # Rules with multiple condition # The first condition tests that bins are further than 5 Mb from the # start and end of each ideogram. This ensures that the color # for the first/last bin will be the same as the ideogram. condition = var(start) >= 5e6 && var(end) < chrlen(var(chr))-5e6 # The probability that the second condition is true is proportional to # the track counter. Bins in inner tracks are more likely to trigger # this rule. Here, rand() is a uniformly distributed random number in # the range [0,1). condition = rand() < remap(counter(plot),0,conf(num_plots)-1,1/conf(num_plots),1) # If this rule is true, the color of the bin is changed to that of a # random ideogram. fill_color = eval("chr" . (sort {rand() <=> rand()} (1..22,"x"))[0]) </rule> # If the above rule is not true, the color of the bin is assigned to # that of its ideogram. <rule> condition = 1 fill_color = eval("chr" . lc substr(var(chr),2)) </rule> </rules>
This is an advanced technique—one meant for designing illustrations more than for the display of data. However, next time you're in the need of random circular heatmaps, give it a try.