Liger单细胞多组学分析:seurat与liger

使用liger整合单细胞RNA-seq的文章中,我提到liger的数据结构和函数调用不及seurat那么方便和个性化,因此将两者的优势结合起来能够大大便利我们的单细胞数据分析。本文主要介绍以下两种方法:

  1. 使用SeuratWrappers在seurat中直接调用liger进行降维聚类
  2. 使用liger内置的函数在liger对象和seurat对象之间进行转换

SeuratWrappers

SeuratWrappers相当于seurat的社区工具,能够使得我们很方便地调用别的包的一些代码和方法来处理seurat对象。SeuratWrappers目前支持Monocle3、glmpca、LIGER、Harmony以及Velocity等13个常用的单细胞转录组相关的包,具体的细节可以参考GitHub。这里不再对代码做任何讲解,因为使用方法和参数与liger基本完全一致,如果有困惑可以参考使用liger整合单细胞RNA-seq

  1. ############################################
  2. ## Project: Liger-learning
  3. ## Script Purpose: Integrating Seurat objects using LIGER
  4. ## Data: 2020.11.01
  5. ## Author: Yiming Sun
  6. ############################################
  7.  
  8. # general setting
  9. setwd('~/sunym/project/liger_learning/')
  10. # library
  11. library(liger)
  12. library(Seurat)
  13. library(dplyr)
  14. library(tidyverse)
  15. library(viridis)
  16. library(SeuratData)
  17. library(SeuratWrappers)
  18. # 1.Systematic comparative analysis of human PBMC
  19. data("pbmcsca")
  20. #pbmcsca is a seurat object
  21. pbmcsca <- NormalizeData(pbmcsca)
  22. pbmcsca <- FindVariableFeatures(pbmcsca,selection.method = 'vst',nfeatures = 2000)
  23. #scale by dfferent methods --> intagrate different methods
  24. pbmcsca <- ScaleData(pbmcsca,split.by = 'Method',do.center = FALSE)
  25. pbmcsca <- RunOptimizeALS(pbmcsca,k = 20,lambda = 5,split.by = 'Method',max.iters = 30,thresh = 1e-06)
  26. pbmcsca <- RunQuantileNorm(pbmcsca,split.by = 'Method',knn_k = 20,quantiles = 50,min_cells = 20,do.center = FALSE,
  27. max_sample = 1000,refine.knn = TRUE,eps = 0.9)
  28. #can further cluster the data and find neighbours
  29. pbmcsca <- FindNeighbors(pbmcsca,reduction = 'iNMF',dims = 1:20)
  30. pbmcsca <- FindClusters(pbmcsca,resolution = 0.3)
  31. #dimension reduction and plotting
  32. pbmcsca <- RunUMAP(pbmcsca,dims = 1:20,reduction = 'iNMF')
  33. pdf(file = './res/fig_201101/pbmc_split_by_methods.pdf',width = 18,height = 5)
  34. DimPlot(pbmcsca,group.by = c('Method','RNA_snn_res.0.3','CellType'),ncol = 3)
  35. dev.off()

Liger单细胞多组学分析:seurat与liger-图片1

Liger内置函数

Liger包中内置了两个函数ligerToSeurat和seuratToLiger,通常我们用的比较多的是将降维聚类过后的liger对象转换成seurat对象用于做后续的差异表达分析。我们可以简单的来看一下这两个函数的效果。首先先创建一个liger对象。

  1. ############################################
  2. ## Project: Liger-learning
  3. ## Script Purpose: liger and seurat
  4. ## Data: 2020.11.14
  5. ## Author: Yiming Sun
  6. ############################################
  7.  
  8. # general setting
  9. setwd('/data/User/sunym/project/liger_learning/')
  10.  
  11. #libarry
  12. library(liger)
  13. library(Seurat)
  14. library(dplyr)
  15. library(tidyverse)
  16. library(viridis)
  17.  
  18.  
  19. #######################################
  20. #liger to seurat
  21. #######################################
  22.  
  23. #load data
  24. ctrl_dge <- readRDS("./data/PBMC_control.RDS")
  25. stim_dge <- readRDS("./data/PBMC_interferon-stimulated.RDS")
  26. #initialize a liger object
  27. ifnb_liger <- createLiger(list(ctrl = ctrl_dge, stim = stim_dge))
  28. #explore liger object
  29. dim([email protected]$ctrl)
  30. head(colnames([email protected]$ctrl))
  31. head(rownames([email protected]$ctrl))
  32. dim([email protected]$stim)
  33. #normalize data
  34. ifnb_liger <- normalize(ifnb_liger)
  35. #select variable gene
  36. ifnb_liger <- selectGenes(ifnb_liger)
  37. #scale data but not center
  38. ifnb_liger <- scaleNotCenter(ifnb_liger)
  39. #integrate NMF
  40. ifnb_liger <- optimizeALS(ifnb_liger,k = 20,lambda = 5,max.iters = 30,thresh = 1e-06)
  41. #Quantile Normalization and Joint Clustering
  42. ifnb_liger <- quantile_norm(ifnb_liger,knn_k = 20,quantiles = 50,min_cells = 20,do.center = FALSE,
  43. max_sample = 1000,refine.knn = TRUE,eps = 0.9)
  44. # you can use louvain cluster to detect and assign cluster
  45. ifnb_liger <- louvainCluster(ifnb_liger, resolution = 0.25)
  46. #Visualization and Downstream Analysis
  47. ifnb_liger <- runUMAP(ifnb_liger, distance = 'cosine', n_neighbors = 30, min_dist = 0.3)
  48. all.plots <- plotByDatasetAndCluster(ifnb_liger, axis.labels = c('UMAP 1', 'UMAP 2'), return.plots = T)
  49. pdf(file = './res/fig_201114/plot_by_dataset_and_cluster.pdf',width = 8,height = 4)
  50. all.plots[[1]] all.plots[[2]]
  51. dev.off()

Liger单细胞多组学分析:seurat与liger-图片2

  1. #liger to seurat
  2. #use nms
  3. ifnb_seurat <- ligerToSeurat(ifnb_liger,use.liger.genes = TRUE,by.dataset = FALSE,renormalize = TRUE)
  4. table([email protected])
  5. head([email protected])
  6. head(colnames(ifnb_seurat))

可以直接用ligerToSeurat函数进行转换,use.liger.genes参数表示是否保留variable gene的信息,by.dataset参数表示是否在cluster的名字之前加入dataset的名字以作区分,另外默认nms参数为names([email protected]),这个参数会在细胞的barcode之前加入dataset的名称并在orig.ident中标注出数据集的来源。可以看下输出简单理解下。

  1. > #liger to seurat
  2. > #use nms
  3. > ifnb_seurat <- ligerToSeurat(ifnb_liger,use.liger.genes = TRUE,by.dataset = FALSE,renormalize = TRUE)
  4. Warning: No assay specified, setting assay as RNA by default.
  5. Warning: No columnames present in cell embeddings, setting to 'iNMF_1:20'
  6. Warning: No assay specified, setting assay as RNA by default.
  7. Warning: No columnames present in cell embeddings, setting to 'tSNE_1:2'
  8. Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
  9. Performing log-normalization
  10. 0% 10 20 30 40 50 60 70 80 90 100%
  11. [----|----|----|----|----|----|----|----|----|----|
  12. **************************************************|
  13. > table([email protected])
  14.  
  15. 1 0 2 9 4 3 6 7 5 8 10
  16. 1500 2309 609 37 384 444 238 140 251 56 32
  17. > head([email protected])
  18. orig.ident nCount_RNA nFeature_RNA
  19. ctrl_ctrlTCAGCGCTGGTCAT-1 ctrl 2232 815
  20. ctrl_ctrlTTATGGCTTCATTC-1 ctrl 2466 760
  21. ctrl_ctrlACCCACTGCTTAGG-1 ctrl 1085 452
  22. ctrl_ctrlATGGGTACCCCGTT-1 ctrl 3242 925
  23. ctrl_ctrlTGACTGGACAGTCA-1 ctrl 635 333
  24. ctrl_ctrlGTGTAGTGGTTGTG-1 ctrl 1462 549
  25. > head(colnames(ifnb_seurat))
  26. [1] "ctrl_ctrlTCAGCGCTGGTCAT-1" "ctrl_ctrlTTATGGCTTCATTC-1"
  27. [3] "ctrl_ctrlACCCACTGCTTAGG-1" "ctrl_ctrlATGGGTACCCCGTT-1"
  28. [5] "ctrl_ctrlTGACTGGACAGTCA-1" "ctrl_ctrlGTGTAGTGGTTGTG-1"

如果令nms = NULL。

  1. > #not use nms
  2. > ifnb_seurat <- ligerToSeurat(ifnb_liger,nms = NULL,use.liger.genes = TRUE,by.dataset = FALSE,renormalize = TRUE)
  3. Warning: No assay specified, setting assay as RNA by default.
  4. Warning: No columnames present in cell embeddings, setting to 'iNMF_1:20'
  5. Warning: No assay specified, setting assay as RNA by default.
  6. Warning: No columnames present in cell embeddings, setting to 'tSNE_1:2'
  7. Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
  8. Performing log-normalization
  9. 0% 10 20 30 40 50 60 70 80 90 100%
  10. [----|----|----|----|----|----|----|----|----|----|
  11. **************************************************|
  12. > table([email protected])
  13.  
  14. 1 0 2 9 4 3 6 7 5 8 10
  15. 1500 2309 609 37 384 444 238 140 251 56 32
  16. > head([email protected])
  17. orig.ident nCount_RNA nFeature_RNA
  18. ctrlTCAGCGCTGGTCAT-1 SeuratProject 2232 815
  19. ctrlTTATGGCTTCATTC-1 SeuratProject 2466 760
  20. ctrlACCCACTGCTTAGG-1 SeuratProject 1085 452
  21. ctrlATGGGTACCCCGTT-1 SeuratProject 3242 925
  22. ctrlTGACTGGACAGTCA-1 SeuratProject 635 333
  23. ctrlGTGTAGTGGTTGTG-1 SeuratProject 1462 549
  24. > head(colnames(ifnb_seurat))
  25. [1] "ctrlTCAGCGCTGGTCAT-1" "ctrlTTATGGCTTCATTC-1"
  26. [3] "ctrlACCCACTGCTTAGG-1" "ctrlATGGGTACCCCGTT-1"
  27. [5] "ctrlTGACTGGACAGTCA-1" "ctrlGTGTAGTGGTTGTG-1"

使用seurat的函数做个性化的差异表达分析,参考文章Seurat进行单细胞RNA-seq聚类分析

  1. #use liger cluster as cell type and do the DE analysis
  2. new.cluster.ids <- c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k")
  3. names(new.cluster.ids) <- levels([email protected])
  4. ifnb_seurat <- RenameIdents(ifnb_seurat, new.cluster.ids)
  5. ifnb_seurat$cell_type <- [email protected]
  6. Idents(ifnb_seurat) <- 'cell_type'
  7. all.marker <- FindAllMarkers(ifnb_seurat,only.pos = TRUE,min.pct = 0.25,logfc.threshold = 0.25)
  8. cluster_a_vs_b_marker <- FindMarkers(ifnb_seurat,group.by = 'cell_type',ident.1 = 'a',ident.2 = 'b',only.pos = TRUE)

手动导出liger中的降维图并导入seurat。

  1. #get the tsne manually
  2. tsne.obj <- CreateDimReducObject(embeddings = [email protected],key = 'testUMAP_',global = TRUE)
  3. ifnb_seurat[['tsne']] <- tsne.obj
  4. pdf(file = './res/fig_201114/dimplot_tsetUMAP.pdf',width = 9,height = 5)
  5. DimPlot(ifnb_seurat,group.by = 'cell_type')
  6. dev.off()

注意tsne.obj的barcode要与seurat中的barcode相对应,因此可以将nms设为NULL或者为tsne.obj手动加上dataset的标签。

Liger单细胞多组学分析:seurat与liger-图片3

可以看到key参数中的内容被成功导入进去了。最后我们也可以将seurat对象转换为liger对象。

  1. > ###################################################
  2. > #seurat to liger
  3. > ###################################################
  4. > ifnb_liger <- seuratToLiger(ifnb_seurat,combined.seurat = TRUE,meta.var = 'orig.ident',renormalize = TRUE)
  5. > head([email protected])
  6. ctrlTCAGCGCTGGTCAT-1 ctrlTTATGGCTTCATTC-1 ctrlACCCACTGCTTAGG-1
  7. a a b
  8. ctrlATGGGTACCCCGTT-1 ctrlTGACTGGACAGTCA-1 ctrlGTGTAGTGGTTGTG-1
  9. a c b
  10. Levels: a b c d e f g h i j k
  11. > head([email protected])
  12. tSNE_1 tSNE_2
  13. ctrlTCAGCGCTGGTCAT-1 -9.5633178 -1.5025842
  14. ctrlTTATGGCTTCATTC-1 -7.4026990 -0.5618219
  15. ctrlACCCACTGCTTAGG-1 3.7179575 4.9839707
  16. ctrlATGGGTACCCCGTT-1 -11.0730367 -4.6024990
  17. ctrlTGACTGGACAGTCA-1 0.5273923 -8.4171533
  18. ctrlGTGTAGTGGTTGTG-1 8.8018236 5.2826267

写在最后

这部分的内容比较枯燥,主要是我自己探索了一下seurat和liger的数据结构以及他们之间如何进行相互转换,想在这里记录一下以免自己忘了。

参考链接

Integrating Seurat objects using LIGER

发表评论

匿名网友

拖动滑块以完成验证
加载失败