Problem
Sometimes we want to know which genes are associated with a particular GO term. To do that, we need a way to extract every gene annotated to that term.
 
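Solution
After some searching, I found that the following code does the job:

    library(tidyverse)     # loaded here for the %>% pipe
    library(org.Hs.eg.db)
    # GOID is a GO term ID given as a string, e.g. 'GO:0006260'
    GOgeneID <- get(GOID, org.Hs.egGO2ALLEGS) %>% mget(org.Hs.egSYMBOL) %>% unlist()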
 
Below, the biological process DNA replication (GO:0006260) serves as a worked example, using the human GO annotation:

    library(tidyverse)
    library(org.Hs.eg.db)

    # GO ID --> Entrez gene ID
    DNA_geneID <- get('GO:0006260', org.Hs.egGO2ALLEGS)

    > head(DNA_geneID)
      TAS   IEA   TAS   IMP   TAS   ISS
     "94" "466" "472" "545" "545" "546"

    > length(DNA_geneID)
    [1] 421
 
org.Hs.egGO2ALLEGS maps each GO ID to the Entrez gene IDs annotated to that term or to any of its child terms. The names of the returned vector are the evidence codes of the individual annotations, which include the following:

    IMP: inferred from mutant phenotype
    IGI: inferred from genetic interaction
    IPI: inferred from physical interaction
    ISS: inferred from sequence similarity
    IDA: inferred from direct assay
    IEP: inferred from expression pattern
    IEA: inferred from electronic annotation
    TAS: traceable author statement
    NAS: non-traceable author statement
    ND: no biological data available
    IC: inferred by curator
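Because the evidence codes are stored as the names of the vector, the gene list can be filtered by evidence with ordinary vector indexing. The following is a minimal sketch on a small hand-built vector that copies the six values from the head() output, not the full annotation. The names `toy_geneID`, `curated` and `curated_ids` are made up for illustration, and dropping IEA is only one possible filtering choice.

```r
# Toy stand-in for DNA_geneID (assumption: copied from the head() output);
# values are Entrez IDs, names are evidence codes
toy_geneID <- c(TAS = "94", IEA = "466", TAS = "472",
                IMP = "545", TAS = "545", ISS = "546")

# Drop annotations that were inferred purely electronically (IEA)
curated <- toy_geneID[names(toy_geneID) != "IEA"]

# A gene appears once per evidence code, so deduplicate the IDs
curated_ids <- unique(unname(curated))
print(curated_ids)
```

With the real data, the same indexing would presumably be applied to DNA_geneID directly.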
 
The full list of evidence codes is available at: http://geneontology.org/docs/guide-go-evidence-codes/
Next, we can convert the Entrez IDs to gene symbols:

    DNA_geneSYMBOL <- mget(DNA_geneID, org.Hs.egSYMBOL) %>% unlist()

    > head(DNA_geneSYMBOL)
          94      466      472      545      545      546
    "ACVRL1"   "ATF1"    "ATM"    "ATR"    "ATR"   "ATRX"
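Note that ATR (Entrez ID 545) appears twice in this output, once for each evidence code it carries. One way to get a non-redundant gene list is to collect IDs, symbols and evidence codes in a data frame and then drop the duplicate rows. The following is a sketch in base R, again on toy values hand-copied from the outputs shown in this post. `toy_ids`, `toy_symbols`, `annot` and `genes` are illustrative names only.

```r
# Toy stand-ins (assumption: copied from the head() outputs in this post)
toy_ids <- c(TAS = "94", IEA = "466", TAS = "472",
             IMP = "545", TAS = "545", ISS = "546")
toy_symbols <- c("94" = "ACVRL1", "466" = "ATF1", "472" = "ATM",
                 "545" = "ATR", "546" = "ATRX")

# One row per annotation: Entrez ID, gene symbol, evidence code
annot <- data.frame(
  entrez   = unname(toy_ids),
  symbol   = unname(toy_symbols[toy_ids]),
  evidence = names(toy_ids),
  stringsAsFactors = FALSE
)

# One row per gene, ignoring the evidence code
genes <- unique(annot[, c("entrez", "symbol")])
print(genes)
```

Keeping the `annot` table around as well preserves the evidence codes, in case the list needs to be filtered by evidence later.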
 
That's all.
Ref:
https://davetang.org/muse/2011/05/20/extract-gene-names-according-to-go-terms/
https://www.ebi.ac.uk/QuickGO/term/GO:0006260
http://geneontology.org/docs/guide-go-evidence-codes/