Liger单细胞多组学分析:整合单细胞RNA-seq数据集

介绍完liger的数学原理,我们便可以来跑一跑liger的流程,并对liger的数学原理和参数选择有更深刻的理解。我们可以在GitHub上找到liger的码源以及相关的教程,本文也主要翻译自GitHub上的教程,并加入了一些自己的函数和理解。

载入包和数据

本文的教程希望整合控制组以及干扰素刺激组的PBMC单细胞RNA-seq数据,数据来源于Kang et al, 2017,liger的作者则提供了以RData格式存储的经过down sample的表达矩阵,我们可以从这里下载并直接使用。当然,如果你使用10x的数据的话,则也可以使用read10X函数将数据读入。

  1. ############################################
  2. ## Project: Liger-learning
  3. ## Script Purpose: run pipeline of liger integration and further analysis
  4. ## Data: 2020.10.24
  5. ## Author: Yiming Sun
  6. ############################################
  7.  
  8. # general setting
  9. setwd('/data/User/sunym/project/liger_learning/')
  10. # library
  11. library(liger)
  12. library(Seurat)
  13. library(dplyr)
  14. library(tidyverse)
  15. library(viridis)
  16. library(parallel)
  17. #my function
  18. MySuggestLambda <- function(object,lambda,k,knn_k = 20,thresh = 1e-04,max.iters = 100,k2 = 500,ref_dataset = NULL,resolution = 1,parl_num){
  19. res <- matrix(ncol = 2,nrow = length(lambda))
  20. colnames(res) <- c('lambda','alignment')
  21. res[,'lambda'] <- lambda
  22. cl <- makeCluster(parl_num)
  23. clusterExport(cl,c('object','k','knn_k','thresh','max.iters','k2','ref_dataset','resolution'),envir = environment())
  24. clusterExport(cl,c('optimizeALS','quantileAlignSNF','calcAlignment'),envir = .GlobalEnv)
  25. alignment_score <- parLapply(cl = cl,lambda,function(x){
  26. object <- optimizeALS(object,k = k,lambda = x,thresh = thresh,max.iters = max.iters)
  27. object <- quantileAlignSNF(object,knn_k = knn_k,k2 = k2,ref_dataset = ref_dataset,resolution = resolution)
  28. return(calcAlignment(object))
  29. })
  30. stopCluster(cl)
  31. res[,'alignment'] <- (alignment_score %>% unlist())
  32. res <- as.data.frame(res)
  33. res[,1] <- as.numeric(as.character(res[,1]))
  34. res[,2] <- as.numeric(as.character(res[,2]))
  35. p <- ggplot(res,aes(x = lambda,y = alignment)) geom_point() geom_line()
  36. return(list(res,p))
  37. }
  38. # 1.Integrating Multiple Single-Cell RNA-seq Datasets
  39. #load data
  40. ctrl_dge <- readRDS("./data/PBMC_control.RDS")
  41. stim_dge <- readRDS("./data/PBMC_interferon-stimulated.RDS")

这里我写了一个函数MySuggestLambda用来优化λ的选择,其实liger包自带有suggestK和suggestLambda函数帮助我们优化K和λ的选择,但是suggestLambda函数似乎有什么bug,在一台服务器上跑完优化后会报一个莫名其妙的错误,而在另一台服务器上则完全正常,我还没有发现bug在哪里所以就自己写了个类似的函数帮助我优化λ的选择,如果你使用liger默认的函数也会报错的话不妨试试我的函数,虽然他内存运用效率特别低,建议配置好50G左右的swap区,因为当R跑多线程时,如果内存不够rsession会被直接挤掉,配置好swap能拯救一下它(血泪教训)。

标准化

在降维和聚类(整合)之前,我们通常都要对表达矩阵进行标准化、挑选variable gene最后再归一化。需要注意的是,liger使用的是非负矩阵分解的方法,因此我们在做归一化的时候注意不要做中心化,因为这会导致负数的出现。

  1. #initialize a liger object
  2. ifnb_liger <- createLiger(list(ctrl = ctrl_dge, stim = stim_dge))
  3. #explore liger object
  4. dim([email protected]$ctrl)
  5. head(colnames([email protected]$ctrl))
  6. head(rownames([email protected]$ctrl))
  7. dim([email protected]$stim)
  8. #normalize data
  9. ifnb_liger <- normalize(ifnb_liger)
  10. #select variable gene
  11. ifnb_liger <- selectGenes(ifnb_liger)
  12. #scale data but not center
  13. ifnb_liger <- scaleNotCenter(ifnb_liger)

可以简单关注一下liger的数据结构,原始的表达矩阵存储在raw.data中,聚类的信息存储在clusters中,降维的信息存储在tsne.coords中。这里就不得不提一下seurat,seurat有一个meta.data的表格来存储各个细胞的各种信息,因此在我们做个性化的比较分析的时候特别有用。liger虽然也有cell.data,但我们似乎不太能够利用liger自带的函数指定cell.data中的变量来进行比较,我想到的解决方法是把liger导出为seurat,在seurat中做一些个性化的比较,而liger则单纯用来做整合、降维以及聚类。

Liger单细胞多组学分析:整合单细胞RNA-seq数据集-图片1

非负矩阵分解

Liger利用非负矩阵分解的方法进行数据集之间的整合,我们需要提供的参数为K和λ。通常,K反映了数据集中真实cluster的个数,而λ则反映了数据集之间的异质性,如果你对你的数据完全没有把握,那么默认参数K=20,λ=5往往也能有不错的表现。当然我们也可以根据KL散度和Alignment score来优化K和λ的选择。

  1. #find the optimized k
  2. suggestK(ifnb_liger,k.test = seq(10,30,1),lambda = 5,plot.log2 = FALSE,num.cores = 10)

设置λ=5,调用10个进程,测试K从10到30时所计算得到的KL散度,最终选择20作为最优的K。

Liger单细胞多组学分析:整合单细胞RNA-seq数据集-图片2

使用liger自带的函数优化λ的选择,取K=20,测试lambda的范围从1到10,调用10个进程,最终选择5作为最优的λ。

  1. #find the optimized lambda
  2. #use liger function
  3. suggestLambda(object = ifnb_liger,k = 20,lambda.test = seq(1,10,1),num.cores = 10)

Liger单细胞多组学分析:整合单细胞RNA-seq数据集-图片3

使用我自己写的函数来优化λ的选择,

  1. #use my function
  2. suggest_lambda <- MySuggestLambda(ifnb_liger,lambda = seq(1,10,1),k = 20,parl_num = 10)
  3. suggest_lambda[[2]]

Liger单细胞多组学分析:整合单细胞RNA-seq数据集-图片4

官方的函数应该是在Alignment score上做了平均而使得曲线更加平滑了,总之建议使用官方的函数,如果出现报错的话,可以试试我的函数。最后我们选择K=20,λ=5进行非负矩阵分解。

  1. #use lambda 5 k 20
  2. #integrate NMF
  3. ifnb_liger <- optimizeALS(ifnb_liger,k = 20,lambda = 5,max.iters = 30,thresh = 1e-06)

聚类

做完非负矩阵分解后就可以进行聚类了,liger聚类的方法也比较有趣,可以参考liger的数学原理。quantile_norm函数会构建SFN图(shared factor neighborhood graph)对细胞进行聚类,在聚类完成后将同一cluster中不同数据集来源的细胞在坐标轴(W V矩阵)上进行分位数归一化。

  1. #Quantile Normalization and Joint Clustering
  2. ifnb_liger <- quantile_norm(ifnb_liger,knn_k = 20,quantiles = 50,min_cells = 20,do.center = FALSE,
  3. max_sample = 1000,refine.knn = TRUE,eps = 0.9)

这里需要讲两个参数,knn_k是在构建SFN图中选择邻居细胞的个数,do.center是指是否做中心化,一般默认为FALSE,但当我们处理一些比较dense的数据时,如DNA甲基化数据,metagene的loading值通常会很高,此时可以做一下中心化来进行矫正。当然,我们也可以用一些图聚类算法来进一步优化我们的聚类,这一步可做可不做。

  1. # you can use louvain cluster to detect and assign cluster
  2. ifnb_liger <- louvainCluster(ifnb_liger, resolution = 0.25)

可视化与下游分析

  • UMAP降维图
  1. #Visualization and Downstream Analysis
  2. ifnb_liger <- runUMAP(ifnb_liger, distance = 'cosine', n_neighbors = 30, min_dist = 0.3)
  3. all.plots <- plotByDatasetAndCluster(ifnb_liger, axis.labels = c('UMAP 1', 'UMAP 2'), return.plots = T)
  4. all.plots[[1]] all.plots[[2]]

Liger单细胞多组学分析:整合单细胞RNA-seq数据集-图片5

  • Gene loading plot

我们可以找到在各个cluster中高表达的metagene,以及这些metagene在不同的数据集中(W和V矩阵)对其贡献最高的基因。

  1. #find the maximun loading factor on each cluster and the maximun loading gene on the factor
  2. gene_loadings <- plotGeneLoadings(ifnb_liger, do.spec.plot = FALSE, return.plots = TRUE)
  3. gene_loadings[[4]]

Liger单细胞多组学分析:整合单细胞RNA-seq数据集-图片6

metagene_4在cluster_3中特异性高表达,那么metagene_4在W矩阵中对其贡献最大的基因(Shared),则很有可能是cluster_3的marker基因。类似的,metagene_4在V矩阵中对其贡献最大的基因,则反映了同一细胞类型数据集之间(或处理之间)的差异。我们也可以很容易得到图片背后的数据。

  1. #the metadata of the factor/gene loading plot
  2. #UMAP
  3. head(gene_loadings[[4]][[1]]$data)
  4. #gene loading value on the specific factor
  5. head(gene_loadings[[4]][[2]][[1]]$data[order(gene_loadings[[4]][[2]][[1]]$data$loadings,decreasing = TRUE),])
  6. head(gene_loadings[[4]][[2]][[2]]$data[order(gene_loadings[[4]][[2]][[2]]$data$loadings,decreasing = TRUE),])
  7. head(gene_loadings[[4]][[2]][[3]]$data[order(gene_loadings[[4]][[2]][[3]]$data$loadings,decreasing = TRUE),])
  8. # conclusion: gene_loadings object is a list of all the factor, specific the factor you want
  9. # conclusion: the factor object listed by gene_loadings has two lists, the first is the UMAP with the factor loading value, the second is the dataset-sepcific matrix W
  10. # the dataset specific matrix W is a list contains the all the dataset and shared gene loading value
  11. # what a mess!

总之就是list套list,很麻烦,可以对照着图和输出理解一下。

  1. > head(gene_loadings[[4]][[1]]$data)
  2. Factor4 tSNE1 tSNE2
  3. ctrlTCAGCGCTGGTCAT-1 0.0018384202 -9.5633178 -1.5025842
  4. ctrlTTATGGCTTCATTC-1 NA -7.4026990 -0.5618219
  5. ctrlACCCACTGCTTAGG-1 0.0007611497 3.7179575 4.9839707
  6. ctrlATGGGTACCCCGTT-1 0.0001820607 -11.0730367 -4.6024990
  7. ctrlTGACTGGACAGTCA-1 NA 0.5273923 -8.4171533
  8. ctrlGTGTAGTGGTTGTG-1 0.0013735353 8.8018236 5.2826267
  9. > #gene loading value on the specific factor
  10. > head(gene_loadings[[4]][[2]][[1]]$data[order(gene_loadings[[4]][[2]][[1]]$data$loadings,decreasing = TRUE),])
  11. loadings xpos top_k
  12. FAM43A 2.522310 1.0000000 TRUE
  13. FGFBP2 2.411532 0.9995883 TRUE
  14. SPON2 2.264957 0.9991766 FALSE
  15. ZNF638 1.925689 0.9987649 FALSE
  16. KARS 1.916834 0.9983532 FALSE
  17. PRDX5 1.889175 0.9979415 TRUE
  18. > head(gene_loadings[[4]][[2]][[2]]$data[order(gene_loadings[[4]][[2]][[2]]$data$loadings,decreasing = TRUE),])
  19. loadings xpos top_k
  20. GNLY 136.98236 1.0000000 TRUE
  21. NKG7 119.07990 0.9995883 TRUE
  22. CLIC3 117.71097 0.9991766 TRUE
  23. GZMB 108.58467 0.9987649 TRUE
  24. PRF1 108.14754 0.9983532 TRUE
  25. GZMA 93.22887 0.9979415 TRUE
  26. > head(gene_loadings[[4]][[2]][[3]]$data[order(gene_loadings[[4]][[2]][[3]]$data$loadings,decreasing = TRUE),])
  27. loadings xpos top_k
  28. PRF1 3.273654 1.0000000 TRUE
  29. SDF2L1 2.913147 0.9995883 FALSE
  30. CD38 2.204181 0.9991766 TRUE
  31. IFNG 2.111837 0.9987649 FALSE
  32. HSPA5 2.081298 0.9983532 FALSE
  33. CIAPIN1 2.002635 0.9979415 FALSE
  • 差异表达分析

转录组分析怎么离得开差异表达分析嘛,我们可以用runWilcoxon比较不同的cluster之间的差异表达基因,也可以比较同一个cluster的不同数据集来源之间的差异表达基因。

  1. #differential gene express
  2. # for different cluster
  3. cluster.results <- runWilcoxon(ifnb_liger, compare.method = "clusters")
  4. head(cluster.results)
  5. datasets.results <- runWilcoxon(ifnb_liger, compare.method = "datasets")
  6. head(datasets.results)

只需要指定compare.method就可以进行不同的比较,不过这个函数其实也只能做这两种比较,如果我想做之前提到的更加个性化的比较,就只能导入到seurat中再进行比较。

  1. > cluster.results <- runWilcoxon(ifnb_liger, compare.method = "clusters")
  2. [1] "Performing Wilcoxon test on ALL datasets: ctrl, stim"
  3. > head(cluster.results)
  4. feature group avgExpr logFC statistic
  5. 1 RP11-206L10.2 0 -23.01898 -0.005557170 4259604
  6. 2 RP11-206L10.9 0 -23.01914 -0.005979592 4259602
  7. 3 LINC00115 0 -23.00609 -0.031260202 4252898
  8. 4 NOC2L 0 -21.83167 0.292228474 4340248
  9. 5 KLHL17 0 -23.00556 0.003658342 4262141
  10. 6 PLEKHN1 0 -23.01194 -0.039663835 4249915
  11. auc pval padj pct_in pct_out
  12. 1 0.4998057 0.570575909 0.69419204 100 100
  13. 2 0.4998055 0.570110484 0.69419204 100 100
  14. 3 0.4990189 0.138680541 0.24813295 100 100
  15. 4 0.5092682 0.004585054 0.01435755 100 100
  16. 5 0.5001034 0.819517383 0.88363326 100 100
  17. 6 0.4986689 0.044537037 0.10099905 100 100
  18. > datasets.results <- runWilcoxon(ifnb_liger, compare.method = "datasets")
  19. [1] "Performing Wilcoxon test on ALL datasets: ctrl, stim"
  20. > head(datasets.results)
  21. feature group avgExpr logFC statistic
  22. 1 RP11-206L10.2 0-ctrl -23.02585 -0.01408078 665437.5
  23. 2 RP11-206L10.9 0-ctrl -23.02585 -0.01376717 665437.5
  24. 3 LINC00115 0-ctrl -22.99983 0.01283157 666564.5
  25. 4 NOC2L 0-ctrl -21.87312 -0.08498766 662383.5
  26. 5 KLHL17 0-ctrl -22.98625 0.03960509 667718.0
  27. 6 PLEKHN1 0-ctrl -23.02585 -0.02853147 664846.0
  28. auc pval padj pct_in pct_out
  29. 1 0.4995560 0.30577296 0.6990712 100 100
  30. 2 0.4995560 0.30577296 0.6990712 100 100
  31. 3 0.5004020 0.59231247 0.8964793 100 100
  32. 4 0.4972633 0.61893894 0.9110340 100 100
  33. 5 0.5012680 0.09102028 0.4580475 100 100
  34. 6 0.4991119 0.14726281 0.5773302 100 100

对差异表达进行筛选。

  1. > # filter the data to find the meaningful DEG for your own experiment
  2. > cluster.results <- cluster.results[cluster.results$padj < 0.05,]
  3. > cluster.results <- cluster.results[cluster.results$logFC > 3,]
  4. > wilcoxon.cluster_3 <- cluster.results[cluster.results$group == 3, ]
  5. > wilcoxon.cluster_3 <- wilcoxon.cluster_3[order(wilcoxon.cluster_3$padj), ]
  6. > markers <- wilcoxon.cluster_3[1:20, ]
  7. > head(markers)
  8. feature group avgExpr logFC statistic auc
  9. 41861 GNLY 3 -5.282466 16.147323 2379942 0.9647642
  10. 46904 CLIC3 3 -12.962070 9.485599 1946458 0.7890417
  11. 47130 PRF1 3 -12.260573 9.830082 1970124 0.7986352
  12. 49239 GZMB 3 -7.840488 13.697178 2231966 0.9047789
  13. 52832 NKG7 3 -6.594620 14.433185 2324832 0.9424243
  14. 48310 KLRC1 3 -18.000059 4.916239 1606731 0.6513252
  15. pval padj pct_in pct_out
  16. 41861 0.000000e 00 0.000000e 00 100 100
  17. 46904 0.000000e 00 0.000000e 00 100 100
  18. 47130 0.000000e 00 0.000000e 00 100 100
  19. 49239 0.000000e 00 0.000000e 00 100 100
  20. 52832 0.000000e 00 0.000000e 00 100 100
  21. 48310 1.839405e-289 4.099114e-286 100 100

可以看一下PRF1在不同的cluster之间的表达情况。

  1. # feature plot on UMAP
  2. PRF1 <- plotGene(ifnb_liger, "PRF1", axis.labels = c('UMAP 1', 'UMAP 2'), return.plots = T)
  3. PRF1[[1]] PRF1[[2]]

Liger单细胞多组学分析:整合单细胞RNA-seq数据集-图片7

可以看到PRF1在两个数据集的cluster_3中均显著高表达。也可以用小提琴图来展示。

  1. # vlnplot
  2. PRF1 <- plotGeneViolin(ifnb_liger,gene = 'PRF1',return.plots = TRUE)
  3. PRF1[[1]] PRF1[[2]]

Liger单细胞多组学分析:整合单细胞RNA-seq数据集-图片8

完整代码

  1. ############################################
  2. ## Project: Liger-learning
  3. ## Script Purpose: run pipeline of liger integration and further analysis
  4. ## Data: 2020.10.24
  5. ## Author: Yiming Sun
  6. ############################################
  7.  
  8. # general setting
  9. setwd('/data/User/sunym/project/liger_learning/')
  10. # library
  11. library(liger)
  12. library(Seurat)
  13. library(dplyr)
  14. library(tidyverse)
  15. library(viridis)
  16. library(parallel)
  17. #my function
  18. MySuggestLambda <- function(object,lambda,k,knn_k = 20,thresh = 1e-04,max.iters = 100,k2 = 500,ref_dataset = NULL,resolution = 1,parl_num){
  19. res <- matrix(ncol = 2,nrow = length(lambda))
  20. colnames(res) <- c('lambda','alignment')
  21. res[,'lambda'] <- lambda
  22. cl <- makeCluster(parl_num)
  23. clusterExport(cl,c('object','k','knn_k','thresh','max.iters','k2','ref_dataset','resolution'),envir = environment())
  24. clusterExport(cl,c('optimizeALS','quantileAlignSNF','calcAlignment'),envir = .GlobalEnv)
  25. alignment_score <- parLapply(cl = cl,lambda,function(x){
  26. object <- optimizeALS(object,k = k,lambda = x,thresh = thresh,max.iters = max.iters)
  27. object <- quantileAlignSNF(object,knn_k = knn_k,k2 = k2,ref_dataset = ref_dataset,resolution = resolution)
  28. return(calcAlignment(object))
  29. })
  30. stopCluster(cl)
  31. res[,'alignment'] <- (alignment_score %>% unlist())
  32. res <- as.data.frame(res)
  33. res[,1] <- as.numeric(as.character(res[,1]))
  34. res[,2] <- as.numeric(as.character(res[,2]))
  35. p <- ggplot(res,aes(x = lambda,y = alignment)) geom_point() geom_line()
  36. return(list(res,p))
  37. }
  38. # 1.Integrating Multiple Single-Cell RNA-seq Datasets
  39. #load data
  40. ctrl_dge <- readRDS("./data/PBMC_control.RDS")
  41. stim_dge <- readRDS("./data/PBMC_interferon-stimulated.RDS")
  42. #initialize a liger object
  43. ifnb_liger <- createLiger(list(ctrl = ctrl_dge, stim = stim_dge))
  44. #explore liger object
  45. dim([email protected]$ctrl)
  46. head(colnames([email protected]$ctrl))
  47. head(rownames([email protected]$ctrl))
  48. dim([email protected]$stim)
  49. #normalize data
  50. ifnb_liger <- normalize(ifnb_liger)
  51. #select variable gene
  52. ifnb_liger <- selectGenes(ifnb_liger)
  53. #scale data but not center
  54. ifnb_liger <- scaleNotCenter(ifnb_liger)
  55. #find the optimized k
  56. suggestK(ifnb_liger,k.test = seq(10,30,1),lambda = 5,plot.log2 = FALSE,num.cores = 10)
  57. #find the optimized lambda
  58. #use liger function
  59. suggestLambda(object = ifnb_liger,k = 20,lambda.test = seq(1,10,1),num.cores = 10)
  60. #use my function
  61. suggest_lambda <- MySuggestLambda(ifnb_liger,lambda = seq(1,10,1),k = 20,parl_num = 10)
  62. suggest_lambda[[2]]
  63. #use lambda 5 k 20
  64. #integrate NMF
  65. ifnb_liger <- optimizeALS(ifnb_liger,k = 20,lambda = 5,max.iters = 30,thresh = 1e-06)
  66. #Quantile Normalization and Joint Clustering
  67. ifnb_liger <- quantile_norm(ifnb_liger,knn_k = 20,quantiles = 50,min_cells = 20,do.center = FALSE,
  68. max_sample = 1000,refine.knn = TRUE,eps = 0.9)
  69. # you can use louvain cluster to detect and assign cluster
  70. ifnb_liger <- louvainCluster(ifnb_liger, resolution = 0.25)
  71. head([email protected])
  72. #Visualization and Downstream Analysis
  73. ifnb_liger <- runUMAP(ifnb_liger, distance = 'cosine', n_neighbors = 30, min_dist = 0.3)
  74. all.plots <- plotByDatasetAndCluster(ifnb_liger, axis.labels = c('UMAP 1', 'UMAP 2'), return.plots = T)
  75. #the place of each cell and the cluster it belongs
  76. head(all.plots[[1]]$data)
  77. pdf(file = './res/fig_201024/plot_by_dataset_and_cluster.pdf',width = 8,height = 4)
  78. all.plots[[1]] all.plots[[2]]
  79. dev.off()
  80. #find the maximun loading factor on each cluster and the maximun loading gene on the factor
  81. gene_loadings <- plotGeneLoadings(ifnb_liger, do.spec.plot = FALSE, return.plots = TRUE)
  82. pdf(file = './res/fig_201024/gene_loading_plot.pdf',width = 6.5,height = 6)
  83. gene_loadings[[4]]
  84. dev.off()
  85. #the metadata of the factor/gene loading plot
  86. #UMAP
  87. head(gene_loadings[[4]][[1]]$data)
  88. #gene loading value on the specific factor
  89. head(gene_loadings[[4]][[2]][[1]]$data[order(gene_loadings[[4]][[2]][[1]]$data$loadings,decreasing = TRUE),])
  90. head(gene_loadings[[4]][[2]][[2]]$data[order(gene_loadings[[4]][[2]][[2]]$data$loadings,decreasing = TRUE),])
  91. head(gene_loadings[[4]][[2]][[3]]$data[order(gene_loadings[[4]][[2]][[3]]$data$loadings,decreasing = TRUE),])
  92. # conclusion: gene_loadings object is a list of all the factor, specific the factor you want
  93. # conclusion: the factor object listed by gene_loadings has two lists, the first is the UMAP with the factor loading value, the second is the dataset-sepcific matrix W
  94. # the dataset specific matrix W is a list contains the all the dataset and shared gene loading value
  95. # what a mess!
  96. #differential gene express
  97. # for different cluster
  98. cluster.results <- runWilcoxon(ifnb_liger, compare.method = "clusters")
  99. head(cluster.results)
  100. datasets.results <- runWilcoxon(ifnb_liger, compare.method = "datasets")
  101. head(datasets.results)
  102. # filter the data to find the meaningful DEG for your own experiment
  103. cluster.results <- cluster.results[cluster.results$padj < 0.05,]
  104. cluster.results <- cluster.results[cluster.results$logFC > 3,]
  105. wilcoxon.cluster_3 <- cluster.results[cluster.results$group == 3, ]
  106. wilcoxon.cluster_3 <- wilcoxon.cluster_3[order(wilcoxon.cluster_3$padj), ]
  107. markers <- wilcoxon.cluster_3[1:20, ]
  108. head(markers)
  109. # feature plot on UMAP
  110. PRF1 <- plotGene(ifnb_liger, "PRF1", axis.labels = c('UMAP 1', 'UMAP 2'), return.plots = T)
  111. pdf(file = './res/fig_201024/feature_plot_PRF1.pdf',width = 8,height = 4)
  112. PRF1[[1]] PRF1[[2]]
  113. dev.off()
  114. # vlnplot
  115. PRF1 <- plotGeneViolin(ifnb_liger,gene = 'PRF1',return.plots = TRUE)
  116. pdf(file = './res/fig_201024/vlnplot_PRF1.pdf',width = 6,height = 3)
  117. PRF1[[1]] PRF1[[2]]
  118. dev.off()
  119. # compare cluster marker express winthin and across dataset
  120. IFIT3 <- plotGene(ifnb_liger, "IFIT3", axis.labels = c('UMAP 1', 'UMAP 2'), return.plots = TRUE)
  121. IFITM3 <- plotGene(ifnb_liger, "IFITM3", axis.labels = c('UMAP 1', 'UMAP 2'), return.plots = TRUE)
  122. pdf(file = './res/fig_201024/cluster_marker_within_between_dataset.pdf',width = 8,height = 8)
  123. plot_grid(IFIT3[[1]],IFIT3[[2]],IFITM3[[1]],IFITM3[[2]], ncol=2)
  124. dev.off()

发表评论

匿名网友

拖动滑块以完成验证
加载失败