
Single-cell in practice (3): analyzing single-cell data with STAR

Posted by a user on 2022-12-03 02:37


Answered by a community member on 2023-09-16 15:44

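When cellranger aligns single-cell reads, you can see STAR processes running as part of the pipeline. So can STAR itself be used to align single-cell data? The STAR 2.7 series (2.7.6a is used here) includes STARsolo, which aligns and quantifies single-cell data directly, another sign of how capable STAR is.

Before using STAR, let's first look at the cellranger output.

For convenience, cellranger provides a web (HTML) report. Here we mainly look at the estimated numbers of cells and genes; the downstream clustering is done with Seurat.

The results directory contains the following two subdirectories: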

raw_feature_bc_matrix

filtered_feature_bc_matrix

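The raw directory holds every barcode, both cell-associated barcodes and background barcodes, while the filtered directory holds only the cell-associated barcodes. Each directory contains three files: barcodes.tsv.gz, features.tsv.gz and matrix.mtx.gz.

The file with the mtx suffix records the gene expression counts and can be loaded into R or Python for inspection. Each barcode corresponds to one cell, and each feature is a gene. The barcodes file is needed later by STARsolo, which is why the cellranger output is described first.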
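To make the file layout concrete, here is a minimal sketch that sums the counts per barcode using only the Python standard library. It assumes the three gzip-compressed files sit in one directory and that matrix.mtx.gz is in Matrix Market coordinate format with features as rows and barcodes as columns; the function names are made up for illustration.

```python
import gzip
from pathlib import Path


def read_lines(path):
    """Return the lines of a gzip-compressed text file without newlines."""
    with gzip.open(path, "rt") as handle:
        return [line.rstrip("\n") for line in handle]


def umis_per_barcode(matrix_dir):
    """Sum the counts stored in matrix.mtx.gz for every barcode (column)."""
    matrix_dir = Path(matrix_dir)
    barcodes = read_lines(matrix_dir / "barcodes.tsv.gz")
    totals = dict.fromkeys(barcodes, 0)
    with gzip.open(matrix_dir / "matrix.mtx.gz", "rt") as handle:
        # Skip blank lines and the banner/comment lines, which start with '%'.
        rows = (line.split() for line in handle
                if line.strip() and not line.startswith("%"))
        next(rows)  # size line: n_features n_barcodes n_entries
        for _feature_idx, barcode_idx, count in rows:
            # Indices are 1-based; the second column is the barcode index.
            totals[barcodes[int(barcode_idx) - 1]] += int(float(count))
    return totals
```

Calling `umis_per_barcode("filtered_feature_bc_matrix")` would return a dictionary mapping each barcode to its total count. For real analyses a sparse-matrix reader in R or Python is the more usual choice.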

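STARsolo is designed as a replacement for the alignment and gene quantification steps of 10X CellRanger. It is also said to be about ten times faster than cellranger (I have not verified this myself).

Building the index:

In principle the index built by cellranger, located under GRCh38/star/, can also be used. In practice it may not be recognized, because the STAR version cellranger used to build it differs from the STAR version we run ourselves.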
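The original index command is not shown, so the following is only a sketch of what building a fresh index with your own STAR version might look like. The paths `star_index`, `genome.fa` and `genes.gtf` are placeholders, and the thread count is an assumption; the flags themselves are standard STAR options.

```python
import shlex


def star_index_command(genome_dir, fasta, gtf, threads=8):
    """Build the argument list for STAR's genomeGenerate run mode."""
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,      # output directory for the index
        "--genomeFastaFiles", fasta,    # reference genome FASTA
        "--sjdbGTFfile", gtf,           # gene annotation used for splice junctions
    ]


# Placeholder paths; replace them with your own reference files.
cmd = star_index_command("star_index", "genome.fa", "genes.gtf")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.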

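STARsolo differs from ordinary RNA-seq alignment in that you must supply a barcode whitelist at alignment time. The whitelist file format is described on the 10X website; here we derive the whitelist from cellranger's barcodes.tsv.gz file.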
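One way to derive the whitelist is sketched below. It assumes the barcodes in barcodes.tsv.gz carry a GEM-well suffix such as `-1`, which is removed so that the whitelist contains one bare barcode sequence per line; the function name and file names are illustrative.

```python
import gzip


def write_whitelist(barcodes_gz, whitelist_txt):
    """Write one bare barcode sequence per line and return how many were written."""
    seen = set()
    with gzip.open(barcodes_gz, "rt") as src, open(whitelist_txt, "w") as dst:
        for line in src:
            # "AAACCTGAGAAACCAT-1" -> "AAACCTGAGAAACCAT"
            barcode = line.strip().split("-")[0]
            if barcode and barcode not in seen:
                seen.add(barcode)
                dst.write(barcode + "\n")
    return len(seen)
```

Note that a whitelist built from the filtered barcodes restricts STARsolo to the cells cellranger already called; the full 10X whitelist for the chemistry is the alternative when an independent cell call is wanted.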

Note the order of the files given to --readFilesIn: the cDNA reads come first and the barcode+UMI reads second, that is, R2 first and then R1.
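The original command is not shown, so here is a hedged sketch of a STARsolo call that respects this read order. All file names are placeholders, `--readFilesCommand zcat` assumes gzip-compressed FASTQ files, and the UMI length of 10 matches the STAR default; adjust it to your library chemistry.

```python
def starsolo_command(genome_dir, cdna_fastq, barcode_fastq, whitelist,
                     threads=8, umi_len=10, out_prefix="./"):
    """Build a STARsolo argument list for droplet data (CB_UMI_Simple)."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        # cDNA read (R2) first, barcode+UMI read (R1) second.
        "--readFilesIn", cdna_fastq, barcode_fastq,
        "--readFilesCommand", "zcat",   # assumes gzipped FASTQ input
        "--soloType", "CB_UMI_Simple",
        "--soloCBwhitelist", whitelist,
        "--soloCBstart", "1",
        "--soloCBlen", "16",
        "--soloUMIstart", "17",
        "--soloUMIlen", str(umi_len),
        "--outFileNamePrefix", out_prefix,
    ]


# Placeholder file names for illustration only.
solo_cmd = starsolo_command("star_index", "sample_R2.fastq.gz",
                            "sample_R1.fastq.gz", "whitelist.txt")
```

Keeping the command as a list makes it easy to inspect before handing it to `subprocess.run(solo_cmd, check=True)`.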

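By default the results are written to the Solo.out directory. With 8 threads the run took only about 10 minutes, so it is indeed faster.

Let's look at what the Summary file contains.

Compared with cellranger, STAR recovers slightly fewer cells and slightly fewer reads per cell; the other metrics are about the same. The two result sets will be compared later with the Seurat package.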
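Before moving to Seurat, a quick sanity check is to see how many cell barcodes the two tools agree on. The sketch below assumes each tool's filtered barcodes have already been read into a list of strings, and it strips any `-1` style suffix so the two lists are comparable; the function names are illustrative.

```python
def normalize(barcode):
    """Strip whitespace and any GEM-well suffix such as '-1'."""
    return barcode.strip().split("-")[0]


def compare_barcodes(cellranger_barcodes, starsolo_barcodes):
    """Count barcodes shared by both tools and those unique to each."""
    cr = {normalize(b) for b in cellranger_barcodes if b.strip()}
    solo = {normalize(b) for b in starsolo_barcodes if b.strip()}
    return {
        "shared": len(cr & solo),
        "cellranger_only": len(cr - solo),
        "starsolo_only": len(solo - cr),
    }
```

A large shared fraction suggests the two pipelines mostly disagree on low-count barcodes near the filtering threshold rather than on well-covered cells.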

The main STARsolo parameters, as described in the STAR manual:

--soloType
    default: None
    string(s): type of single-cell RNA-seq
    CB_UMI_Simple
        (a.k.a. Droplet) one UMI and one Cell Barcode of fixed length in read2, e.g. Drop-seq and 10X Chromium.
    CB_UMI_Complex
        one UMI of fixed length, but multiple Cell Barcodes of varying length, as well as adapter sequences are allowed in read2 only, e.g. inDrop.
    CB_samTagOut
        output Cell Barcode as CR and/or CB SAM tag. No UMI counting. --readFilesIn cDNA_read1 [cDNA_read2 if paired-end] CellBarcode_read. Requires --outSAMtype BAM Unsorted [and/or SortedByCoordinate]
    SmartSeq
        Smart-seq: each cell in a separate FASTQ (paired- or single-end), barcodes are corresponding read-groups, no UMI sequences, alignments deduplicated according to alignment start and end (after extending soft-clipped bases)

--soloCBwhitelist
    default: -
    string(s): file(s) with whitelist(s) of cell barcodes. Only --soloType CB_UMI_Complex allows more than one whitelist file.
    None
        no whitelist: all cell barcodes are allowed

--soloCBstart
    default: 1
    int>0: cell barcode start base

--soloCBlen
    default: 16
    int>0: cell barcode length

--soloUMIstart
    default: 17
    int>0: UMI start base

--soloUMIlen
    default: 10
    int>0: UMI length

--soloBarcodeReadLength
    default: 1
    int: length of the barcode read
    1
        equal to sum of soloCBlen+soloUMIlen
    0
        not defined, do not check
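The four position parameters are 1-based, which is easy to get wrong when slicing in code. The sketch below shows how the defaults would split a barcode read into cell barcode and UMI, with an optional length check that mirrors the idea behind --soloBarcodeReadLength 1. It illustrates the coordinate convention only and is not STAR's own code; the function name is made up.

```python
def split_barcode_read(seq, cb_start=1, cb_len=16, umi_start=17, umi_len=10,
                       check_length=True):
    """Return (cell_barcode, umi) using 1-based start positions."""
    if check_length and len(seq) != cb_len + umi_len:
        raise ValueError(
            f"barcode read is {len(seq)} bases, expected {cb_len + umi_len}"
        )
    # Convert the 1-based starts to Python's 0-based slices.
    cell_barcode = seq[cb_start - 1:cb_start - 1 + cb_len]
    umi = seq[umi_start - 1:umi_start - 1 + umi_len]
    return cell_barcode, umi
```

With a 28-base barcode read (16-base barcode plus 12-base UMI), the defaults would fail the length check, so `umi_len` would need to be set to 12.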

--soloCBposition
    default: -
    string(s): position of Cell Barcode(s) on the barcode read.
    Presently only works with --soloType CB_UMI_Complex, and barcodes are assumed to be on Read2.
    Format for each barcode: startAnchor_startPosition_endAnchor_endPosition
    start(end)Anchor defines the Anchor Base for the CB: 0: read start; 1: read end; 2: adapter start; 3: adapter end
    start(end)Position is the 0-based position of the CB start(end) with respect to the Anchor Base
    Strings for different barcodes are separated by space.
    Example: inDrop (Zilionis et al, Nat. Protocols, 2017):
    --soloCBposition 0_0_2_-1 3_1_3_8

--soloUMIposition
    default: -
    string: position of the UMI on the barcode read, same as soloCBposition

--soloAdapterSequence
    default: -
    string: adapter sequence to anchor barcodes.

--soloAdapterMismatchesNmax
    default: 1
    int>0: maximum number of mismatches allowed in adapter sequence.

--soloCBmatchWLtype
    default: 1MM_multi
    string: matching the Cell Barcodes to the WhiteList
    Exact
        only exact matches allowed
    1MM
        only one match in whitelist with 1 mismatched base allowed. Allowed CBs have to have at least one read with exact match.
    1MM_multi
        multiple matches in whitelist with 1 mismatched base allowed, posterior probability calculation is used to choose one of the matches. Allowed CBs have to have at least one read with exact match. Similar to CellRanger 2.2.0
    1MM_multi_pseudocounts
        same as 1MM_multi, but pseudocounts of 1 are added to all whitelist barcodes. Similar to CellRanger 3.x.x
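To illustrate the difference between the Exact and 1MM modes, here is a simplified sketch of whitelist matching by Hamming distance. It deliberately leaves out two things the manual mentions: the requirement that an allowed barcode has at least one exactly matching read, and the posterior probability step of the 1MM_multi modes. The function names are illustrative.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def match_barcode(cb, whitelist, mode="1MM"):
    """Return the whitelist barcode that cb maps to, or None if it is rejected."""
    if cb in whitelist:
        return cb
    if mode == "Exact":
        return None
    # 1MM: accept only when exactly one whitelist entry is one mismatch away.
    candidates = [w for w in whitelist
                  if len(w) == len(cb) and hamming(w, cb) == 1]
    return candidates[0] if len(candidates) == 1 else None
```

A barcode that is one mismatch away from two whitelist entries is ambiguous and is rejected here; that ambiguous case is what the 1MM_multi modes are designed to resolve.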

--soloStrand
    default: Forward
    string: strandedness of the solo libraries:
    Unstranded
        no strand information
    Forward
        read strand same as the original RNA molecule
    Reverse
        read strand opposite to the original RNA molecule

--soloFeatures
    default: Gene
    string(s): genomic features for which the UMI counts per Cell Barcode are collected
    Gene
        genes: reads match the gene transcript
    SJ
        splice junctions: reported in SJ.out.tab
    GeneFull
        full genes: count all reads overlapping genes' exons and introns

--soloUMIdedup
    default: 1MM_All
    string(s): type of UMI deduplication (collapsing) algorithm
    1MM_All
        all UMIs with 1 mismatch distance to each other are collapsed (i.e. counted once)
    1MM_Directional
        follows the "directional" method from the UMI-tools by Smith, Heger and Sudbery (Genome Research 2017).
    Exact
        only exactly matching UMIs are collapsed
    NoDedup
        no deduplication of UMIs, count all reads. Allowed for --soloType SmartSeq
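The effect of the deduplication modes is easiest to see on a small example. The sketch below counts molecules for one cell and one gene under NoDedup, Exact and 1MM_All, where 1MM_All is read here as "UMIs connected through chains of single mismatches count once". That reading is an assumption about the description above and not a copy of STAR's implementation; the directional method is not covered.

```python
from itertools import combinations


def count_umis(umis, mode="1MM_All"):
    """Count molecules from a list of UMI sequences (one entry per read)."""
    if mode == "NoDedup":
        return len(umis)                 # every read counts
    unique = sorted(set(umis))
    if mode == "Exact":
        return len(unique)               # identical UMIs collapse
    if mode != "1MM_All":
        raise ValueError(f"unsupported mode: {mode}")

    # Union-find over UMIs; link any pair that differs at exactly one base.
    parent = {u: u for u in unique}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for a, b in combinations(unique, 2):
        if len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1:
            parent[find(a)] = find(b)
    return len({find(u) for u in unique})
```

For the reads `["AAAA", "AAAT", "AATT", "GGGG", "GGGG"]` the three modes give 5, 4 and 2 molecules respectively, since the first three UMIs form one chain of single mismatches.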

--soloUMIfiltering
    default: -
    string(s): type of UMI filtering
    -
        basic filtering: remove UMIs with N and homopolymers (similar to CellRanger 2.2.0)
    MultiGeneUMI
        remove lower-count UMIs that map to more than one gene (introduced in CellRanger 3.x.x)
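The basic filter is simple enough to sketch in a couple of lines: a UMI is dropped if it contains an N or consists of a single repeated base. This is an illustration of the rule as described, with a made-up function name; the MultiGeneUMI option is not covered.

```python
def passes_basic_filter(umi):
    """True if the UMI has no N and is not a homopolymer such as 'AAAAAAAAAA'."""
    umi = umi.upper()
    return "N" not in umi and len(set(umi)) > 1


umis = ["ACGTACGTAC", "AAAAAAAAAA", "ACGNACGTAC"]
kept = [u for u in umis if passes_basic_filter(u)]
```

Only the first UMI in the example list survives the filter.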

--soloOutFileNames
    default: Solo.out/ features.tsv barcodes.tsv matrix.mtx
    string(s): file names for STARsolo output:
    file_name_prefix gene_names barcode_sequences cell_feature_count_matrix

--soloCellFilter
    default: CellRanger2.2 3000 0.99 10
    string(s): cell filtering type and parameters
    CellRanger2.2
        simple filtering of CellRanger 2.2, followed by three numbers: number of expected cells, robust maximum percentile for UMI count, maximum to minimum ratio for UMI count
    TopCells
        only report top cells by UMI count, followed by the exact number of cells
    None
        do not output filtered cells
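The three numbers after CellRanger2.2 are easier to understand with a sketch of the kind of thresholding they describe. The code below assumes the following interpretation: take the UMI count at the top (1 - percentile) position among the expected cells as a robust maximum, divide it by the ratio to get a minimum, and keep every barcode at or above that minimum. Exact rounding and edge-case handling in STAR or Cell Ranger may differ, and the function names are illustrative.

```python
def cellranger22_filter(umi_counts, expected_cells=3000,
                        max_percentile=0.99, max_min_ratio=10):
    """Return the set of barcodes whose UMI count passes the threshold."""
    ranked = sorted(umi_counts.values(), reverse=True)
    if not ranked:
        return set()
    # Position of the robust maximum among the expected cells (0-based).
    idx = min(len(ranked) - 1, round(expected_cells * (1 - max_percentile)))
    robust_max = ranked[idx]
    threshold = max(robust_max / max_min_ratio, 1)
    return {bc for bc, n in umi_counts.items() if n >= threshold}


def top_cells(umi_counts, n):
    """TopCells: keep exactly the n barcodes with the highest UMI counts."""
    return set(sorted(umi_counts, key=umi_counts.get, reverse=True)[:n])
```

With the default values the robust maximum is the count of the barcode ranked 31st, and any barcode with at least one tenth of that count is called a cell, which is why the number of reported cells can differ from the expected 3000.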

--soloOutFormatFeaturesGeneField3
    default: "Gene Expression"
    string(s): field 3 in the Gene features.tsv file. If "-", then no 3rd field is output.
