0%

R-SCENIC

文章接受,终于有空再发点笔记了😃

SCENIC_Tutorial

http://htmlpreview.github.io/?https://github.com/aertslab/SCENIC/blob/master/inst/doc/SCENIC_Running.html

image-20221106111655645

Setup

1
2
3
4
5
6
7
8
9
10
11
12
13
# http://htmlpreview.github.io/?https://github.com/aertslab/SCENIC/blob/master/inst/doc/SCENIC_Setup.html
## Required
BiocManager::install(c("RcisTarget"))
BiocManager::install(c("GENIE3")) # Optional. Can be replaced by GRNBoost
## Optional (but highly recommended):
# To score the network on cells (i.e. run AUCell):
BiocManager::install(c("zoo", "mixtools", "rbokeh"), update=F)
# For various visualizations and perform t-SNEs:
BiocManager::install(c("DT", "NMF", "ComplexHeatmap", "R2HTML", "Rtsne"), update=F)
BiocManager::install("doRNG")
devtools::install_github("aertslab/SCENIC")
devtools::install_github("aertslab/SCopeLoomR")
devtools::install_github("aertslab/AUCell")
1
2
3
#if (!dir.exists("SCENIC_MouseBrain/")) dir.create("SCENIC_MouseBrain/")
if (!dir.exists("int/")) dir.create("int/")
if (!dir.exists("cisTarget_databases/")) dir.create("cisTarget_databases/")

下载相应转录因子数据库,本文用的数据来自小鼠,所以下载小鼠的数据库

由于直接从R这里下载的话会下载不完整,所以这里使用 zsync_curl 下载数据库。

ubuntu下zsync_curl安装可以参考: https://github.com/AppImage/zsync-curl
相应数据库的下载方式:https://resources.aertslab.org/cistarget/help.html

以下为shell命令行

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
#!/usr/bin/bash
# Specify database name:
# for human
feather_database_url_hg=("https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-500bp-upstream-7species.mc9nr.feather"
"https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-tss-centered-10kb-7species.mc9nr.feather")
# for mouse
feather_database_url_mm=('https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-500bp-upstream-7species.mc9nr.feather' 'https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-tss-centered-10kb-7species.mc9nr.feather')
# for fly
feather_database_url_dm='https://resources.aertslab.org/cistarget/databases/drosophila_melanogaster/dm6/flybase_r6.02/mc8nr/gene_based/dm6-5kb-upstream-full-tx-11species.mc8nr.feather'
# download
for url in ${feather_database_url_mm[*]}; do
feather_database="${url##*/}"
echo "$feather_database"
zsync_curl "${url}.zsync"
done

下载后,将数据库置于当前工作目录的 cisTarget_databases/

其他的db和相关注释文件可从该网址下载https://resources.aertslab.org/cistarget/

Load Packages

1
2
3
4
5
library(SCopeLoomR)
library(SCENIC)
library(tidyverse)
library(AUCell)
library(foreach)

Input

Expression matrix

这里使用官方提供的小鼠大脑的单细胞表达矩阵(862 genes * 200 cells)进行演示.

1
2
3
4
5
6
7
loomPath <- system.file(package="SCENIC", "examples/mouseBrain_toy.loom")
loom <- open_loom(loomPath)
exprMat <- get_dgem(loom)
cellInfo <- get_cell_annotation(loom)
close_loom(loom)

dim(exprMat)
## [1] 862 200

Cell info/phenodata

1
head(cellInfo)
##                        CellType nGene nUMI
## 1772066100_D04     interneurons   170  509
## 1772063062_G01 oligodendrocytes   152  443
## 1772060224_F07        microglia   218  737
## 1772071035_G09    pyramidal CA1   265 1068
## 1772067066_E12 oligodendrocytes    81  273
## 1772066100_B01    pyramidal CA1   108  191

Numbers of different celltypes

1
table(cellInfo$CellType)
## 
## astrocytes_ependymal         interneurons            microglia 
##                   29                   43                   14 
##     oligodendrocytes        pyramidal CA1 
##                   55                   59

save cell information

1
saveRDS(cellInfo, file="int/cellInfo.Rds")

设置每个celltype的颜色

1
2
3
4
5
6
7
8
9
10
11
# Color to assign to the variables (same format as for NMF::aheatmap)
colVars <- list(CellType=c("microglia"="forestgreen",
"endothelial-mural"="darkorange",
"astrocytes_ependymal"="magenta4",
"oligodendrocytes"="hotpink",
"interneurons"="red3",
"pyramidal CA1"="skyblue",
"pyramidal SS"="darkblue"))
colVars$CellType <- colVars$CellType[intersect(names(colVars$CellType), cellInfo$CellType)]
saveRDS(colVars, file="int/colVars.Rds")
plot.new(); legend(0,1, fill=colVars$CellType, legend=names(colVars$CellType))

image-20221106111730508

Initialize SCENIC settings

在开始分析前,我们先初始化SCENIC分析的一些参数

1
2
3
4
5
6
7
8
9
10
11
12
13
org <- "mgi" # or hgnc, or dmel
dbDir <- "cisTarget_databases" # RcisTarget databases location
myDatasetTitle <- "SCENIC example on Mouse brain" # choose a name for your analysis
data(defaultDbNames)
dbs <- defaultDbNames[[org]]
scenicOptions <- initializeScenic(org=org,
dbDir=dbDir,
dbs=dbs['10kb'],
datasetTitle=myDatasetTitle,
nCores=6)

# Save to use at a later time...
saveRDS(scenicOptions, file="int/scenicOptions.Rds")

为了演示,我们只分析10kb的数据库

Co-expression network

SCENIC流程的第一步是基于表达数据推断潜在的转录因子靶标。这可以通过GENIE3或GRNBoost完成

这里我们需要输入的是:

  • 过滤后的单细胞表达矩阵
  • TF的列表(潜在的调控因子)

这一步完成后将会输出一个相关性矩阵,以生成基因共表达模块

Gene filter/selection

运行GENIE3/GRNBoost之前需要先对表达矩阵进行过滤,过滤的条件包括:

  1. 按每个基因的所有read counts过滤,以排除极低reads的测序噪声;
  2. 按检测到基因的细胞数过滤;
  3. 只保留在RcisTarget数据库中的基因进行分析。
1
2
3
4
# (Adjust minimum values according to your dataset)
genesKept <- geneFiltering(exprMat, scenicOptions=scenicOptions,
minCountsPerGene=3*.01*ncol(exprMat),
minSamples=ncol(exprMat)*.01)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0    17.0    48.0   146.8   137.0  5868.0 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00   10.00   25.00   39.01   60.00  180.00

需要注意的是过滤的条件需要根据数据集进行相应调整。

应用以上条件过滤,最终过滤后的表达矩阵为: 770 genes * 200 cells

1
2
exprMat_filtered <- exprMat[genesKept, ]
dim(exprMat_filtered)
## [1] 770 200

Correlation

计算TF和其靶标的Spearman correlation。这一步会在 int/ 文件夹下生成 corrMat.Rds

1
2
runCorrelation(exprMat_filtered, scenicOptions)
list.files("int/", pattern="corrMat")
## [1] "1.2_corrMat.Rds"

GENIE3

GENIE3是一种基于随机森林的算法,每次运行的结果都有些许差别,可以通过 set.seed 设置固定的种子以获得可重复的结果。

GENIE3使用转录因子的表达量作为变量预测每个靶基因的表达量,最后把训练输出模型的权重作为转录因子-靶基因的调控强度。

1
2
3
4
5
6
# Optional: add log (if it is not logged/normalized already)
exprMat_filtered <- log2(exprMat_filtered+1)

set.seed(1001)
# Run GENIE3
runGenie3(exprMat_filtered, scenicOptions)

Build and score the GRN

GENIE3/GRNBoost 完成后,就可以使用SCENIC推断基因调控网路(Gene Regulatory Network, GRN).

SCENIC的流程包括:

  1. 获取基因共表达模块
  2. 获取调控子 (with RcisTarget)
  3. 对每个细胞的GRN进行打分 (with AUCell)
  4. 根据GRN的活性对细胞进行聚类
1
2
3
4
5
6
loom <- open_loom(loomPath)
exprMat <- get_dgem(loom)
close_loom(loom)
# Optional: log expression (for TF expression plot, it does not affect any other calculation)
exprMat_log <- log2(exprMat+1)
dim(exprMat)
## [1] 862 200

运行SCENIC

  • runSCENIC_1_coexNetwork2modules: 将GENIE3/GRNBoost的结果转换为共表达模块
  • runSCENIC_2_createRegulons: 调用RcisTarget,保留通过TF-motif富集分析后的共表达模块
  • runSCENIC_3_scoreCells: 调用AUCell在每个细胞中对regulon进行打分,分数作为regulon在该细胞的活性
1
scenicOptions <- runSCENIC_1_coexNetwork2modules(scenicOptions)
##            [,1]
## nTFs          8
## nTargets    770
## nGeneSets    47
## nLinks    17308
1
scenicOptions <- runSCENIC_2_createRegulons(scenicOptions, coexMethod=c("top5perTarget")) #** Only for toy run!!
##               [,1]
## top5perTarget    8
## [1] 22058
1
2
scenicOptions@settings$nCores <- 1 # for runSCENIC_3_scoreCells
scenicOptions <- runSCENIC_3_scoreCells(scenicOptions, exprMat_log)
##    min     1%     5%    10%    50%   100% 
##  46.00  58.99  77.00  84.80 154.00 342.00
1
saveRDS(scenicOptions, file="int/scenicOptions.Rds") # To save status

上述每一步都会在int/中生成若干文件。

runSCENIC_1_coexNetwork2modules 计算了TF与每个基因的相关性权重,为了避免in-direct TF-target pairs的干扰,作者使用了几种过滤方法的组合来形成共表达基因模块(Modules)


方法 含义


w0.001 以TF为单位,保留weight > 0.01的基因形成modules

w0.005 以TF为单位,保留weight > 0.05的基因形成modules

top50 保留每个TF中weight top 50的基因

top5perTarget 以基因为单位,保留weight top 5的TF得到简化的TF-target pairs list,然后把基因分配给TF构建共表达模块

top10perTarget 以基因为单位,保留weight top 10的TF得到简化的TF-target pairs list,然后把基因分配给TF构建共表达模块

top50perTarget 以基因为单位,保留weight top 50的TF得到简化的TF-target pairs list,然后把基因分配给TF构建共表达模块

top1sd 保留每个TF中,weight大于 mean(weight) + sd(weight)的targets

top3sd 保留每个TF中,weight大于 mean(weight) + 3*sd(weight)的targets

runSCENIC_3_scoreCells使用AUCell计算每个regulon的活性

AUCell calculates the enrichment of the regulon as an area under the recovery curve (AUC) across the ranking of all genes in a particular cell, whereby genes are ranked by their expression value.

In brief, the scoring method is based on a recovery analysis where the x-axis (Supplementary Fig. 1c) is the ranking of all genes based on expression level (genes with the same expression value, e.g., ‘0’, are randomly sorted); and the y-axis is the number of genes recovered from the input set.

AUCell then uses the AUC to calculate whether a critical subset of the input gene set is enriched at the top of the ranking for each cell. In this way, the AUC represents the proportion of expressed genes in the signature and their relative expression values compared to the other genes within the cell.

get regulon activity matrix

runSCENIC_3_scoreCells将regulon打分的结果保存到’int/3.4_regulonAUC.Rds’中,我们可以读入这个数据进行分析。

1
2
3
4
5
regulonAUC <- readRDS('int/3.4_regulonAUC.Rds')
regulonAUC <- regulonAUC[onlyNonDuplicatedExtended(rownames(regulonAUC)),]

regAct <- getAUC(regulonAUC)
dim(regAct);regAct[,1:3]
## [1]   7 200

##              cells
## gene sets     1772066100_D04 1772063062_G01 1772060224_F07
##   Tef (405g)     0.595558153     0.46756283    0.275862069
##   Dlx5 (35g)     0.097560976     0.00000000    0.009059233
##   Irf1 (105g)    0.000000000     0.05610754    0.382817066
##   Stat6 (27g)    0.000000000     0.01399177    0.146502058
##   Sox9 (150g)    0.210987726     0.22676797    0.049678551
##   Sox10 (88g)    0.003506721     0.31092928    0.023962595
##   Sox8 (98g)     0.042080655     0.26241964    0.026300409

这里我们获得了7个regulon在200个细胞中的打分矩阵,其中每一行是一个regulon,每一列是一个细胞。以”Tef (405g)”为例,该regulon id表示的意思是转录因子Tef这个模块当前调控405个潜在的靶基因。

Optional steps

在获取调控模块(regulons)X 细胞(cells)的AUC矩阵后,SCENIC的主要分析已经完成了,后续可以根据个人需求进一步分析。例如以下的这些分析。

二值化网络活性(regulon on/off)

这一步可以手动查看每一个regulon的binarized threshold是否合理,AUCell自动判断的阈值有时候会过于严格,以致某一regulon在所有细胞都被判断为off的状态。

1
2
aucellApp <- plotTsne_AUCellApp(scenicOptions, exprMat_log)
savedSelections <- shiny::runApp(aucellApp)

image-20221106111821002

这一步会创建一个shinyapp,以探索定义基因模块激活的阈值。

1
2
3
4
# Save the modified thresholds:
newThresholds <- savedSelections$thresholds
saveRDS(newThresholds, file=getIntName(scenicOptions, "aucell_thresholds"))
saveRDS(scenicOptions, file="int/scenicOptions.Rds")

二值化AUC

1
scenicOptions <- runSCENIC_4_aucell_binarize(scenicOptions)

基于调控活性的聚类与降维

我们可以根据调控模块的活性对细胞进行聚类。

1
2
3
4
5
6
nPcs <- c(5)
# Run t-SNE with different settings:
fileNames <- tsneAUC(scenicOptions, aucType="AUC", nPcs=nPcs, perpl=c(5,15,50))
fileNames <- tsneAUC(scenicOptions, aucType="AUC", nPcs=nPcs, perpl=c(5,15,50), onlyHighConf=TRUE, filePrefix="int/tSNE_oHC")
# Plot as pdf (individual files in int/):
fileNames <- paste0("int/",grep(".Rds", grep("tSNE_", list.files("int"), value=T), value=T))
1
2
3
par(mfrow=c(length(nPcs), 3))
fileNames <- paste0("int/",grep(".Rds", grep("tSNE_AUC", list.files("int"), value=T, perl = T), value=T))
plotTsne_compareSettings(fileNames, scenicOptions, showLegend=FALSE, varName="CellType", cex=.5)

image-20221106111844875

image-20221106111853549

保存tSNE的参数

1
2
3
4
scenicOptions@settings$defaultTsne$aucType <- "AUC"
scenicOptions@settings$defaultTsne$dims <- 5
scenicOptions@settings$defaultTsne$perpl <- 15
saveRDS(scenicOptions, file="int/scenicOptions.Rds")

Exploring/interpreting the results

SCENIC 分析的结果在当前工作目录下的 output/ 中。

将AUC和TF的表达投影到tSNE

1
2
3
4
5
tSNE_scenic <- readRDS(tsneFileName(scenicOptions))
aucell_regulonAUC <- loadInt(scenicOptions, "aucell_regulonAUC")
# Show TF expression:
par(mfrow=c(2,3))
AUCell::AUCell_plotTSNE(tSNE_scenic$Y, exprMat, aucell_regulonAUC[onlyNonDuplicatedExtended(rownames(aucell_regulonAUC))[c("Dlx5", "Sox10", "Sox9","Irf1", "Stat6")],], plots="Expression")

image-20221106111901760

1
2
3
4
5
# Save AUC as PDF:
Cairo::CairoPDF("output/Step4_BinaryRegulonActivity_tSNE_colByAUC.pdf", width=20, height=15)
par(mfrow=c(4,6))
AUCell::AUCell_plotTSNE(tSNE_scenic$Y, cellsAUC=aucell_regulonAUC, plots="AUC")
dev.off()
## png 
##   2

以密度图的方式展示tSNE

1
2
3
4
5
library(KernSmooth)
library(RColorBrewer)
dens2d <- bkde2D(tSNE_scenic$Y, 1)$fhat
image(dens2d, col=brewer.pal(9, "YlOrBr"), axes=FALSE)
contour(dens2d, add=TRUE, nlevels=5, drawlabels=FALSE)

image-20221106111909826

展示多个调控子

这里注意原教程写于2020,用的是 SCENIC_1.2.0 。到了2022年,SCENIC已经更新到1.3.1.,而 plotTsne_rgb 已经被替换为 plotEmb_rgb

1
2
3
4
5
6
7
8
9
10
#par(bg = "black")
par(mfrow=c(1,2))

regulonNames <- c( "Dlx5","Sox10")
cellCol <- SCENIC::plotEmb_rgb(scenicOptions, regulonNames, aucType="AUC", aucMaxContrast=0.6)

regulonNames <- list(red=c("Sox10", "Sox8"),
green=c("Irf1"),
blue=c( "Tef"))
cellCol <- SCENIC::plotEmb_rgb(scenicOptions, regulonNames, aucType="Binary")

image-20221106111917525

GRN: Regulon targets and motifs

列出当前基因调控网路中某个调控模块下的基因:

1
2
regulons <- loadInt(scenicOptions, "regulons")
regulons[c("Dlx5", "Irf1")]
## $Dlx5
##  [1] "2610203C20Rik" "Adamts17"      "AI854703"      "Arhgef10l"    
##  [5] "Bahcc1"        "Cirbp"         "Cplx1"         "Dlx1"         
##  [9] "Gad1"          "Gadd45gip1"    "Hexim2"        "Igf1"         
## [13] "Iglon5"        "Ltbp3"         "Myt1"          "Npas1"        
## [17] "Nxph1"         "Peli2"         "Plekha6"       "Prkab1"       
## [21] "Ptchd2"        "Racgap1"       "Rgs8"          "Robo1"        
## [25] "Rpl34"         "Sema3c"        "Shisa9"        "Slc12a5"      
## [29] "Slc39a6"       "Spns2"         "Stox2"         "Syt6"         
## [33] "Unc5d"         "Wnt5a"        
## 
## $Irf1
##   [1] "4930523C07Rik" "Acsl5"         "Adrb1"         "Ahdc1"        
##   [5] "Ahnak"         "Akap1"         "Anxa2"         "Arhgap31"     
##   [9] "Arhgef10l"     "Arhgef19"      "Atg14"         "B2m"          
##  [13] "Bank1"         "Bcl2a1b"       "Cald1"         "Capsl"        
##  [17] "Ccl2"          "Ccl3"          "Ccl7"          "Ccnd1"        
##  [21] "Ccnd3"         "Cd163"         "Cd48"          "Cited2"       
##  [25] "Cmtm6"         "Ctgf"          "Ctnnd1"        "Ctss"         
##  [29] "Ddr2"          "E130114P18Rik" "Egfr"          "Ehd1"         
##  [33] "Esam"          "Fcer1g"        "Fcgr1"         "Fgf14"        
##  [37] "Fstl4"         "Fyb"           "Gabrb1"        "Gadd45g"      
##  [41] "Gja1"          "Gpr34"         "Gramd3"        "H2-K1"        
##  [45] "Hfe"           "Il6ra"         "Irf1"          "Itgb5"        
##  [49] "Kcne4"         "Kcnj16"        "Kcnj2"         "Laptm5"       
##  [53] "Lhfp"          "Lrp4"          "Map3k5"        "Mdk"          
##  [57] "Med13"         "Midn"          "Mr1"           "Ms4a7"        
##  [61] "Msr1"          "Myh9"          "Nin"           "Nptx1"        
##  [65] "Osbpl6"        "P2ry13"        "Parp12"        "Peli2"        
##  [69] "Plcl2"         "Plekha6"       "Prkab1"        "Rab3il1"      
##  [73] "Rgs5"          "Rhobtb2"       "Rnase4"        "Sepp1"        
##  [77] "Sertad2"       "Sgk3"          "Slamf9"        "Slc4a4"       
##  [81] "Slc7a7"        "Slco5a1"       "Slitrk2"       "Soat1"        
##  [85] "Srgn"          "St3gal6"       "St5"           "St8sia4"      
##  [89] "Stard8"        "Stat6"         "Stk17b"        "Tapbp"        
##  [93] "Tbxas1"        "Tgfa"          "Tgfb3"         "Tmem100"      
##  [97] "Tnfaip3"       "Tnfaip8"       "Trib1"         "Trpm7"        
## [101] "Txnip"         "Usp2"          "Vgll4"         "Vps37b"       
## [105] "Zfp36l2"

一个regulon可以对应多个靶标,而AUC则是在靶标大于10个的regulons中进行运算。

1
2
regulons <- loadInt(scenicOptions, "aucell_regulons")
head(cbind(onlyNonDuplicatedExtended(names(regulons))))
##       [,1]         
## Tef   "Tef (405g)" 
## Dlx5  "Dlx5 (35g)" 
## Sox9  "Sox9 (150g)"
## Sox8  "Sox8 (98g)" 
## Sox10 "Sox10 (88g)"
## Irf1  "Irf1 (105g)"

TF-target详细的对应关系在 output/Step2_regulonTargetsInfo.tsv

1
2
3
regulonTargetsInfo <- loadInt(scenicOptions, "regulonTargetsInfo")
tableSubset <- regulonTargetsInfo[TF=="Stat6" & highConfAnnot==TRUE]
head(tableSubset)
##       TF          gene highConfAnnot nMotifs         bestMotif  NES motifDb
## 1: Stat6 4930523C07Rik          TRUE      14 tfdimers__MD00051 3.39    10kb
## 2: Stat6       Bcl2a1b          TRUE      11 tfdimers__MD00051 3.39    10kb
## 3: Stat6          Ccl3          TRUE       8 tfdimers__MD00051 3.39    10kb
## 4: Stat6          Cd14          TRUE      14 tfdimers__MD00051 3.39    10kb
## 5: Stat6       Clec4a2          TRUE       1 tfdimers__MD00051 3.39    10kb
## 6: Stat6       Col27a1          TRUE       9 tfdimers__MD00051 3.39    10kb
##       coexModule   spearCor CoexWeight
## 1: top5perTarget 0.17259814 0.06908293
## 2: top5perTarget 0.08241394 0.15894490
## 3: top5perTarget 0.13977515 0.07461879
## 4: top5perTarget 0.07996444 0.15006861
## 5: top5perTarget 0.10529877 0.24615271
## 6: top5perTarget 0.17040821 0.18443827

Regulators for known cell types or clusters

  • 每个cluster的平均Regulon活性
1
2
3
4
5
6
7
regulonAUC <- loadInt(scenicOptions, "aucell_regulonAUC")
regulonAUC <- regulonAUC[onlyNonDuplicatedExtended(rownames(regulonAUC)),]
regulonActivity_byCellType <- sapply(split(rownames(cellInfo), cellInfo$CellType),
function(cells) rowMeans(getAUC(regulonAUC)[,cells]))
regulonActivity_byCellType_Scaled <- t(scale(t(regulonActivity_byCellType), center = T, scale=T))

ComplexHeatmap::Heatmap(regulonActivity_byCellType_Scaled, name="Regulon activity")

image-20221106111929359

  • 二值化版本(具有调节子活性的细胞类型/簇的细胞百分比)
1
2
3
4
5
6
7
minPerc <- .7
binaryRegulonActivity <- loadInt(scenicOptions, "aucell_binary_nonDupl")
cellInfo_binarizedCells <- cellInfo[which(rownames(cellInfo)%in% colnames(binaryRegulonActivity)),, drop=FALSE]
regulonActivity_byCellType_Binarized <- sapply(split(rownames(cellInfo_binarizedCells), cellInfo_binarizedCells$CellType),
function(cells) rowMeans(binaryRegulonActivity[,cells, drop=FALSE]))
binaryActPerc_subset <- regulonActivity_byCellType_Binarized[which(rowSums(regulonActivity_byCellType_Binarized>minPerc)>0),]
ComplexHeatmap::Heatmap(binaryActPerc_subset, name="Regulon activity (%)", col = c("white","pink","red"))

image-20221106111937144

  • 细胞特异性调控因子

我们还可以计算每个regulon在细胞中的特异性系数(Regulon Specificity Score, RSS)来衡量regulon在不同细胞类型间的特异程度。

RSS = 1-sqrt(JSD)

JSD: Jensen-Shannon Divergence

In probability theory and statistics, the JensenShannon divergence is a method of measuring the similarity between two probability distributions.

https://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence

1
2
3
4
# regulonAUC <- loadInt(scenicOptions, "aucell_regulonAUC")
rss <- calcRSS(AUC=getAUC(regulonAUC), cellAnnotation=cellInfo[colnames(regulonAUC), "CellType"])
rssPlot <- plotRSS(rss)
print(rssPlot$plot)

image-20221106111946554

以上就是SECENIC的分析流程,实际上计算AUC矩阵后就结束了,后续的分析都是根据课题需求进行的各种探索与可视化。

ref:

https://scenic.aertslab.org/tutorials/

http://htmlpreview.github.io/?https://github.com/aertslab/SCENIC/blob/master/inst/doc/SCENIC_Running.html

https://github.com/aertslab/SCENIC/blob/master/R/runSCENIC_1_coexNetwork2modules.R

https://cloud.tencent.com/developer/article/1692240

https://www.jianshu.com/p/093a835ab056