GBrowse Frequency Histograms

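GBrowse frequency histograms, also called frequency distribution plots (the GBrowse documentation covers them under "Generating Feature Frequency Histograms"), display summary statistics. They can convey information such as:

  • differences in the number of features (genes, SNPs, and so on) across genome intervals;
  • gene expression abundance;
  • sequence conservation.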

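Newer versions of GBrowse have strengthened this feature, although I am not sure exactly which version introduced each change. In the GFF2 era, you prepared the data with a script and then loaded it into the database. Later, the --summary option could be passed when building the database to add frequency data; today this is the default behavior. In Bio::DB::SeqFeature, the interval_stats table exists specifically to support these summary statistics.

It is still worth starting from GFF2, because that makes clear how statistical data is represented in GFF and how to configure it for display as an ordinary feature. Data such as sequence conservation or expression abundance needs special handling: when you generate the statistics yourself, this is the method to use.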

Frequency histograms in the GFF2 era

1. Prepare the data with bp_generate_histogram.pl. The script's usage:

    Usage: /usr/bin/bp_generate_histogram.pl [options] feature_type1 feature_type2...
    Dump out a GFF-formatted histogram of the density of the indicated set
    of feature types.
    Options:
      --dsn        <dsn>       Data source (default dbi:mysql:test)
      --adaptor    <adaptor>   Schema adaptor (default dbi::mysqlopt)
      --user       <user>      Username for mysql authentication
      --pass       <password>  Password for mysql authentication
      --bin        <bp>        Bin size in base pairs.
      --aggregator <list>      Comma-separated list of aggregators
      --sort                   Sort the resulting list by type and bin
      --merge                  Merge features with same method but different sources

For example:

    bp_generate_histogram.pl -merge -d <> -u <> -p <> -bin 10000 SNP >snp_density.gff

2. Note the format of the generated file:

    Chr1  SNP  bin  1      10000  49  +  .  bin Chr1:SNP
    Chr1  SNP  bin  10001  20000  29  +  .  bin Chr1:SNP

Note that the frequency value is stored in the sixth column (the score).
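
For statistics you compute yourself, lines in the same layout can be written directly. The following is a minimal Python sketch, not what bp_generate_histogram.pl does internally: it assumes each feature is counted in the bin that contains its start coordinate, and the function name and sample positions are made up for illustration.

```python
from collections import Counter


def histogram_gff2(seqid, source, starts, bin_size):
    """Return GFF2 lines giving the number of features per fixed-width bin.

    Assumption: a feature belongs to the bin that contains its 1-based
    start coordinate. Bins with no features are omitted.
    """
    counts = Counter((start - 1) // bin_size for start in starts)
    lines = []
    for index in sorted(counts):
        bin_start = index * bin_size + 1
        bin_end = (index + 1) * bin_size
        fields = [
            seqid, source, "bin",           # columns 1-3: reference, source, method
            str(bin_start), str(bin_end),   # columns 4-5: bin coordinates
            str(counts[index]),             # column 6: the frequency (score)
            "+", ".",                       # columns 7-8: strand, phase
            f"bin {seqid}:{source}",        # column 9: group
        ]
        lines.append("\t".join(fields))
    return lines


if __name__ == "__main__":
    # Hypothetical SNP start positions on Chr1.
    for line in histogram_gff2("Chr1", "SNP", [5, 120, 9999, 10001, 15000], 10000):
        print(line)
```

The method and source in columns 3 and 2 together form the feature type (method:source, here bin:SNP) that a track refers to in its feature option.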

3. Load the data into the database

    bp_seqfeature_load.pl -a DBI::mysql -d gb2 snp_density.gff

4. Configure GBrowse

    [SNP:overview]
    feature       = bin:SNP
    glyph         = xyplot
    graph_type    = boxes
    scale         = right
    bgcolor       = red
    fgcolor       = red
    height        = 20
    key           = SNP Density

5. Done.

Reference: http://gmod.org/wiki/GBrowse_Configuration/Feature_frequency_histograms

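Frequency histograms in newer versions (Summary Mode)

The latest versions of BioPerl and GBrowse store data in Bio::DB::SeqFeature. All you need to do is add the show summary option to the track settings.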

    [TRACK DEFAULTS]
    ...
    show summary = 1000000

Options of the xyplot glyph used for frequency plots

    The following options are standard among all Glyphs. See Bio::Graphics::Glyph for a full explanation.

      Option        Description                    Default
      ------        -----------                    -------

      -fgcolor      Foreground color               black

      -outlinecolor Synonym for -fgcolor

      -bgcolor      Background color               turquoise

      -fillcolor    Synonym for -bgcolor

      -linewidth    Line width                     1

      -height       Height of glyph                10

      -font         Glyph font                     gdSmallFont

      -label        Whether to draw a label        0 (false)

      -description  Whether to draw a description  0 (false)

      -hilite       Highlight color                undef (no color)

    In addition, the xyplot glyph recognizes the following glyph-specific options:

      Option        Description                    Default
      ------        -----------                    -------

      -max_score    Maximum value of the           Calculated
                    feature's "score" attribute

      -min_score    Minimum value of the           Calculated
                    feature's "score" attributes

      -graph_type   Type of graph to generate.     Histogram
                    Options are: "histogram",
                    "boxes", "line", "points",
                    or "linepoints".

      -point_symbol Symbol to use. Options are     none
                    "triangle", "square", "disc",
                    "filled_triangle",
                    "filled_square",
                    "filled_disc","point",
                    and "none".

      -point_radius Radius of the symbol, in       4
                    pixels (does not apply
                    to "point")

      -scale        Position where the Y axis      none
                    scale is drawn if any.
                    It should be one of
                    "left", "right", "both" or "none"

      -graph_height Specify height of the graph    Same as the
                                                   "height" option.

      -part_color   For boxes & points only,       none
                    bgcolor of each part (should
                    be a callback). Supersedes
                    -neg_color.

      -scale_color  Color of the scale             Same as fgcolor

      -clip         If min_score and/or max_score  false
                    are manually specified, then
                    setting this to true will
                    cause values outside the
                    range to be clipped.

      -bicolor_pivot                               0
                    Where to pivot the two colors
                    when drawing bicolor plots.
                    Scores greater than this value will
                    be drawn using -pos_color.
                    Scores lower than this value will
                    be drawn using -neg_color.

      -pos_color    When drawing bicolor plots,    same as bgcolor
                    the fill color to use for
                    values that are above
                    the pivot point.

      -neg_color    When drawing bicolor plots,    same as bgcolor
                    the fill color to use for values
                    that are below the pivot point.
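
To make the -clip and -bicolor_pivot descriptions above concrete, here is a small Python sketch of the behavior as the option table words it. It is an illustration only, not the Bio::Graphics code; the function names and the color values are hypothetical, and the handling of a score exactly equal to the pivot is an assumption because the table does not specify it.

```python
def clip_score(score, min_score, max_score):
    """Clamp a score into [min_score, max_score], as -clip is described."""
    return max(min_score, min(max_score, score))


def bicolor(score, pivot=0, pos_color="blue", neg_color="red"):
    """Pick the fill color for one value in a bicolor plot.

    Assumption: a score exactly equal to the pivot uses pos_color; the
    option table does not say which side the pivot itself falls on.
    """
    return pos_color if score >= pivot else neg_color


if __name__ == "__main__":
    # Illustrative scores, clipped to the range 0..100 and colored around 50.
    for raw in (-5, 42, 120):
        clipped = clip_score(raw, 0, 100)
        print(raw, clipped, bicolor(clipped, pivot=50))
```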

Source: http://boyun.sh.cn/bio/?p=1817
