The formats for GenBank accession numbers are:
Type | Format |
---|---|
Nucleotide | 1 letter + 5 numerals OR 2 letters + 6 numerals |
Protein | 3 letters + 5 numerals |
WGS | 4 letters + 2 numerals for WGS assembly version + 6-8 numerals |
MGA | 5 letters + 7 numerals |
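The letter and numeral counts above translate directly into regular expressions. The following is a minimal sketch, not an official validator: the function name `classify_accession` is hypothetical, and stripping a trailing version suffix such as `.1` and upper-casing the input are assumptions that the format list itself does not state.

```python
import re

# One pattern per row of the format list. The classes cannot overlap
# because each one has a different number of leading letters.
ACCESSION_PATTERNS = [
    ("Nucleotide", re.compile(r"[A-Z]\d{5}|[A-Z]{2}\d{6}")),
    ("Protein", re.compile(r"[A-Z]{3}\d{5}")),
    # 2 numerals for the assembly version followed by 6-8 numerals.
    ("WGS", re.compile(r"[A-Z]{4}\d{2}\d{6,8}")),
    ("MGA", re.compile(r"[A-Z]{5}\d{7}")),
]


def classify_accession(accession):
    """Return the format class of an accession string, or None if no pattern fits."""
    # Assumption: a ".N" version suffix is not part of the accession itself.
    base = accession.strip().upper().split(".", 1)[0]
    for kind, pattern in ACCESSION_PATTERNS:
        if pattern.fullmatch(base):
            return kind
    return None
```

Under these assumptions, `classify_accession("AF123456.1")` yields `"Nucleotide"`, while a RefSeq-style identifier such as `NM_000546` yields `None` because the underscore matches none of the four formats.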
Distribution of accession prefixes across the databases:
Nucleotide Accession Prefixes
Prefix | Database | Type |
---|---|---|
BA,DF,DG | DDBJ | CON division |
AN | EMBL | CON division |
CH,CM,DS,EM,EN,EP,EQ,FA,GG,GL | NCBI | CON division |
C,AT,AU,AV,BB,BJ,BP,BW,BY,CI,CJ,DA,DB,DC,DK,FS | DDBJ | EST |
F | EMBL | EST |
H,N,T,R,W,AA,AI,AW,BE,BF,BG,BI,BM,BQ,BU,CA,CB,CD,CF,CK,CN,CO,CV,CX,DN,DR,DT,DV,DY,EB,EC,EE,EG,EH,EL,ES,EV,EW,EX,EY,FC,FD,FE,FF,FG,FK,FL,GD,GE,GH,GO | GenBank | EST |
D,AB | DDBJ | Direct submissions |
V,X,Y,Z,AJ,AM,FM | EMBL | Direct submissions |
U,AF,AY,DQ,EF,EU,FJ,GQ | GenBank | Direct submissions |
AP | DDBJ | Genome project data |
BS | DDBJ | Chimpanzee genome data |
AL,BX,CR,CT,CU | EMBL | Genome project data |
AE,CP,CY | GenBank | Genome project data |
AG,DE,DH,FT | DDBJ | GSS |
B,AQ,AZ,BH,BZ,CC,CE,CG,CL,CW,CZ,DU,DX,ED,EI,EJ,EK,ER,ET,FH,FI | GenBank | GSS |
AK | DDBJ | cDNA projects |
AC,DP | GenBank | HTGS |
E,BD,DD,DI,DJ,DL,DM,FU | DDBJ | Patents |
A,AX,CQ,CS,FB,GM,GN | EMBL | Patents (nucleotide only) |
I,AR,DZ,EA,GC,GP | GenBank | Patents (nucleotide) |
G,BV,GF | GenBank | STS |
BR | DDBJ | TPA |
BN | EMBL | TPA |
EZ | GenBank | TSA |
S | GenBank | From journal scanning |
AD | GenBank | From GSDB |
AH | GenBank | Segmented set header |
AS | GenBank | Other - not currently being used |
BC | GenBank | MGC project |
BK | GenBank | TPA |
BL,GJ,GK | GenBank | TPA CON division |
BT | GenBank | FLI-cDNA projects |
J,K,L,M | GenBank | From GSDB direct submissions |
N | GenBank and DDBJ | N0-N2 were used initially by both groups but have been removed from circulation; N2-N9 are ESTs |
AAAA-AZZZ | GenBank | WGS |
BAAA-BZZZ | DDBJ | WGS |
CAAA-CZZZ | EMBL | WGS |
DAAA-DZZZ | GenBank | WGS TPA |
AAAAA-AZZZZ | DDBJ | MGA |
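The table above can be used as a lookup from an accession's leading letters to its source database and record type. The sketch below is illustrative only: `lookup_nucleotide_prefix` is a hypothetical helper, and `ROWS` copies just the direct-submission and genome-project rows to keep the example short, so prefixes from the other rows return `None` until they are added.

```python
import re

# Abbreviated copy of the nucleotide prefix table (assumption: only the
# direct-submission and genome-project rows are included here).
ROWS = [
    ("D,AB", "DDBJ", "Direct submissions"),
    ("V,X,Y,Z,AJ,AM,FM", "EMBL", "Direct submissions"),
    ("U,AF,AY,DQ,EF,EU,FJ,GQ", "GenBank", "Direct submissions"),
    ("AP", "DDBJ", "Genome project data"),
    ("AL,BX,CR,CT,CU", "EMBL", "Genome project data"),
    ("AE,CP,CY", "GenBank", "Genome project data"),
]
PREFIX_TABLE = {
    prefix: (database, kind)
    for prefixes, database, kind in ROWS
    for prefix in prefixes.split(",")
}

# 4-letter WGS prefixes are allocated in ranges keyed by the first letter
# (AAAA-AZZZ, BAAA-BZZZ, CAAA-CZZZ, DAAA-DZZZ).
WGS_BY_FIRST_LETTER = {
    "A": ("GenBank", "WGS"),
    "B": ("DDBJ", "WGS"),
    "C": ("EMBL", "WGS"),
    "D": ("GenBank", "WGS TPA"),
}


def lookup_nucleotide_prefix(accession):
    """Return (database, type) for a nucleotide accession, or None if unknown."""
    match = re.match(r"[A-Z]+", accession.strip().upper())
    if match is None:
        return None
    prefix = match.group(0)
    if len(prefix) <= 2:
        return PREFIX_TABLE.get(prefix)
    if len(prefix) == 4:
        return WGS_BY_FIRST_LETTER.get(prefix[0])
    if len(prefix) == 5 and prefix[0] == "A":
        return ("DDBJ", "MGA")  # AAAAA-AZZZZ
    return None
```

Splitting the comma-separated prefix cells at load time keeps the data in the same shape as the table, which makes it easy to paste in the remaining rows.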
Protein Accession Prefixes
Prefix | Database | Type |
---|---|---|
BAA-BZZ | DDBJ | Protein ID |
CAA-CZZ | EMBL | Protein ID |
AAA-AZZ | GenBank | Protein ID |
AAE | GenBank | Protein ID for Patents (note that there are also some patent proteins with AAA and AAC) |
FAA-FZZ | DDBJ | TPA Protein ID |
DAA-DZZ | GenBank | TPA Protein ID |
GAA-GZZ | DDBJ | WGS Protein ID |
EAA-EZZ | GenBank | WGS Protein ID |
HAA-HZZ | GenBank | TPA WGS Protein ID |
O | Swiss-Prot | Protein |
P | Swiss-Prot (Geneva) | Protein |
Q | Swiss-Prot (Hinxton) | Protein |
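Because the 3-letter protein prefixes are allocated in ranges keyed by their first letter, the table above reduces to a small first-letter lookup plus the `AAE` patent special case. The following is a rough sketch under that reading; `lookup_protein_prefix` is a hypothetical name, and patent proteins filed under `AAA` or `AAC` cannot be told apart from ordinary GenBank protein IDs by prefix alone.

```python
import re

# First letter of a 3-letter prefix -> (database, type), per the table above.
PROTEIN_BY_FIRST_LETTER = {
    "A": ("GenBank", "Protein ID"),
    "B": ("DDBJ", "Protein ID"),
    "C": ("EMBL", "Protein ID"),
    "D": ("GenBank", "TPA Protein ID"),
    "E": ("GenBank", "WGS Protein ID"),
    "F": ("DDBJ", "TPA Protein ID"),
    "G": ("DDBJ", "WGS Protein ID"),
    "H": ("GenBank", "TPA WGS Protein ID"),
}

# Single-letter prefixes listed for Swiss-Prot.
SWISS_PROT = {
    "O": ("Swiss-Prot", "Protein"),
    "P": ("Swiss-Prot (Geneva)", "Protein"),
    "Q": ("Swiss-Prot (Hinxton)", "Protein"),
}


def lookup_protein_prefix(accession):
    """Return (database, type) for a protein accession, or None if unknown."""
    match = re.match(r"[A-Z]+", accession.strip().upper())
    if match is None:
        return None
    prefix = match.group(0)
    if len(prefix) == 3:
        if prefix == "AAE":
            return ("GenBank", "Protein ID for Patents")
        return PROTEIN_BY_FIRST_LETTER.get(prefix[0])
    if len(prefix) == 1:
        return SWISS_PROT.get(prefix)
    return None
```

For example, `lookup_protein_prefix("CAA12345")` gives `("EMBL", "Protein ID")`, and a prefix outside the listed ranges, such as `ZAA`, gives `None`.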
For a detailed description of the NCBI RefSeq naming format, see: https://www.plob.org/2012/02/24/3711.html