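Given a sequence, you may only be interested in the first 10 bases, or you may want to extract one region of it. You may also want to print a sub-sequence to an output stream such as STDOUT. How do you do this?
BioJava uses the biological coordinate system to identify bases. The first base has index 1 and the last base has an index equal to the length of the sequence. Note that this differs from string indexing in most programming languages, which starts at zero. Both ends of a range are inclusive, so the range 1 to 3 covers the first three bases. If you try to read outside the range from 1 to the sequence length, an exception is thrown.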
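The relationship between biological coordinates and ordinary Java string indices can be sketched in plain Java, without BioJava. This is only an illustration: the class `BioCoordinates` and its `subStr` helper are hypothetical names made up for this example, and the choice to reject ranges whose start lies after their end is an assumption of the sketch, not a statement about BioJava.

```java
// Plain-Java sketch; BioCoordinates and subStr are illustrative names, not BioJava API.
class BioCoordinates {

    // Returns the bases from start to end, both 1-based and inclusive.
    static String subStr(String seq, int start, int end) {
        if (start < 1 || end > seq.length() || start > end) {
            throw new IndexOutOfBoundsException(
                "Range " + start + ".." + end + " is outside 1.." + seq.length());
        }
        // 1-based inclusive [start, end] maps to 0-based half-open [start - 1, end).
        return seq.substring(start - 1, end);
    }

    public static void main(String[] args) {
        String rna = "auggcaccguccagauu";
        System.out.println(subStr(rna, 1, 1)); // first base: a
        System.out.println(subStr(rna, 1, 3)); // first three bases: aug
    }
}
```

Only the start coordinate shifts by one in the conversion, because an inclusive 1-based end and an exclusive 0-based end are the same number.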
Getting a Sub-Sequence
[code lang="java"]
SymbolList symL = null;
//code here to generate a SymbolList
//get the first Symbol
Symbol sym = symL.symbolAt(1);
//get the first three bases (positions 1 to 3, both ends inclusive)
SymbolList symL2 = symL.subList(1, 3);
//get the last three bases (positions length - 2 to length, inclusive)
SymbolList symL3 = symL.subList(symL.length() - 2, symL.length());
[/code]
Printing a Sub-Sequence
[code lang="java"]
//print the last three bases of a SymbolList or Sequence
String s = symL.subStr(symL.length() - 2, symL.length());
System.out.println(s);
[/code]
Complete Listing
[code lang="java"]
import org.biojava.bio.seq.*;
import org.biojava.bio.symbol.*;

public class SubSequencing {
    public static void main(String[] args) {
        try {
            //generate an RNA SymbolList
            SymbolList symL = RNATools.createRNA("auggcaccguccagauu");

            //get the first Symbol
            Symbol sym = symL.symbolAt(1);

            //get the first three bases
            SymbolList symL2 = symL.subList(1, 3);

            //get the last three bases
            SymbolList symL3 = symL.subList(symL.length() - 2, symL.length());

            //print the last three bases
            String s = symL.subStr(symL.length() - 2, symL.length());
            System.out.println(s);
        }
        catch (IllegalSymbolException ex) {
            //thrown if the input string contains a character that is not an RNA symbol
            ex.printStackTrace();
        }
    }
}
[/code]
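Because both ends of a range are inclusive, a range from start to end covers end - start + 1 positions, which makes "the last n bases" easy to get wrong by one. The arithmetic can be checked with a small plain-Java sketch. The class `InclusiveRanges` and its methods are hypothetical names used only for illustration and are not part of BioJava.

```java
// Plain-Java sketch; InclusiveRanges and its methods are illustrative names, not BioJava API.
class InclusiveRanges {

    // Number of positions in the 1-based inclusive range start..end.
    static int count(int start, int end) {
        return end - start + 1;
    }

    // Start coordinate of the last n positions of a sequence of the given length.
    static int lastStart(int length, int n) {
        if (n < 1 || n > length) {
            throw new IndexOutOfBoundsException(
                "Cannot take " + n + " of " + length + " positions");
        }
        return length - n + 1;
    }

    public static void main(String[] args) {
        int length = 17; // length of "auggcaccguccagauu"
        System.out.println(count(length - 3, length));            // 4, one base too many
        System.out.println(lastStart(length, 3));                 // 15, that is, length - 2
        System.out.println(count(lastStart(length, 3), length));  // 3
    }
}
```

For a sequence of 17 bases, the last three bases are therefore positions 15 to 17, which is length - 2 to length.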