Plotting feature distribution across UTR regions with the R package Guitar


1 Quick Start with Guitar

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("Guitar")

 

This is a manual for the Guitar package. Guitar is designed for RNA landmark-guided transcriptomic analysis of RNA-related genomic features. It can compare multiple sets of genomic features, which need to be stored in a named list. The following example reads 1000 RNA m6A methylation sites into R. In real data analysis, the features may of course come from several sources.

library(Guitar)

# genomic features imported into named list
stBedFiles <- list(system.file("extdata", "m6A_mm10_exomePeak_1000peaks_bed12.bed",
                               package = "Guitar"))
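As a minimal sketch of the point that features may come from several sources, the list can hold more than one BED file. Both files below ship with Guitar (they are the ones used in section 2); the list names BED12 and BED6 are labels chosen here for illustration. Since system.file() returns an empty string when a file is not found, it is worth checking the paths before plotting.

```r
library(Guitar)

# Two example BED files shipped with the Guitar package
bed12 <- system.file("extdata", "m6A_mm10_exomePeak_1000peaks_bed12.bed",
                     package = "Guitar")
bed6 <- system.file("extdata", "m6A_mm10_exomePeak_1000peaks_bed6.bed",
                    package = "Guitar")

# The names are illustrative labels chosen for this sketch
myBedFiles <- list(BED12 = bed12, BED6 = bed6)

# system.file() returns "" for a missing file; stop early if that happens
paths <- unlist(myBedFiles)
stopifnot(all(nzchar(paths)), all(file.exists(paths)))
```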

With the following script, we generate the transcriptomic distribution of the genomic features to be tested. When miscOutFilePrefix is set to a prefix such as "example", the result is saved automatically into a PDF file under the working directory with that prefix; the call below passes NA instead. GuitarPlot can also download the gene annotation from the internet automatically when a genome assembly name is provided; this requires a working internet connection and may take some time. The toy Guitar coordinates generated internally should never be re-used in real data analysis.

count <- GuitarPlot(txGenomeVer = "mm10",
                    stBedFiles = stBedFiles,
                    miscOutFilePrefix = NA)

In a more efficient protocol, to re-use the gene annotation and the Guitar coordinates, you build the Guitar coordinates from a TxDb object in a separate step. The TxDb object contains the gene annotation information and can be obtained in a number of ways, e.g., by downloading the complete gene annotation of a species from UCSC automatically, which may take a few minutes. In the following analysis, we load the TxDb object from a toy dataset provided with the Guitar package. Please note that this is only a very small part of the complete mm10 transcriptome, and the TxDb object provided with the Guitar package should not be used in real data analysis. With a TxDb object that contains the gene annotation, we next build the Guitar coordinates, which are essentially a bridge connecting the transcriptomic landmarks and the genomic coordinates.

txdb_file <- system.file("extdata", "mm10_toy.sqlite",
                         package = "Guitar")
txdb <- loadDb(txdb_file)
guitarTxdb <- makeGuitarTxdb(txdb = txdb, txPrimaryOnly = FALSE)
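Because obtaining the annotation is the slow step, it can help to cache the TxDb on disk and reload it in later sessions. The sketch below assumes that AnnotationDbi (which provides loadDb and saveDb) is installed alongside Guitar. The cache path is a hypothetical location chosen for illustration, and the commented-out UCSC download uses makeTxDbFromUCSC, which lives in GenomicFeatures in older Bioconductor releases and in txdbmaker in newer ones.

```r
library(Guitar)
library(AnnotationDbi)  # provides loadDb() and saveDb()

# For real data one would build a full TxDb instead, for example (needs internet):
# txdb <- GenomicFeatures::makeTxDbFromUCSC(genome = "mm10", tablename = "knownGene")

# Toy TxDb shipped with Guitar (not for real analysis)
txdb_file <- system.file("extdata", "mm10_toy.sqlite", package = "Guitar")
txdb <- loadDb(txdb_file)

# Cache the TxDb at a hypothetical location, writing it only once
cache_file <- file.path(tempdir(), "my_txdb.sqlite")
if (!file.exists(cache_file)) {
    saveDb(txdb, file = cache_file)
}

# Later sessions can start from the cached file
txdb_cached <- loadDb(cache_file)

# Build the Guitar coordinates from the reloaded annotation
guitarTxdb <- makeGuitarTxdb(txdb = txdb_cached, txPrimaryOnly = FALSE)
```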

You may now generate the Guitar plot from the named list of genome-based features.

GuitarPlot(txTxdb = txdb,
           stBedFiles = stBedFiles,
           miscOutFilePrefix = "example")

You may also include the promoter DNA region and the tail DNA region on the 5’ and 3’ sides of a transcript in the plot by setting the parameter headOrtail = TRUE.

GuitarPlot(txTxdb = txdb,
           stBedFiles = stBedFiles,
           headOrtail = TRUE)

[Figure 1]

You may also control whether the confidence interval is drawn on the Guitar plot with the parameter enableCI. The call below switches it off with enableCI = FALSE; set enableCI = TRUE to draw it.

GuitarPlot(txTxdb = txdb,
           stBedFiles = stBedFiles,
           headOrtail = TRUE,
           enableCI = FALSE)

 

[Figure 2]

2 Supported Data Format

Besides BED files, the Guitar package also supports the GRangesList and GRanges data structures. Please see the following examples.

# import different data formats into a named list object.
# These genomic features are using mm10 genome assembly
stBedFiles <- list(system.file("extdata", "m6A_mm10_exomePeak_1000peaks_bed12.bed",
                               package = "Guitar"),
                   system.file("extdata", "m6A_mm10_exomePeak_1000peaks_bed6.bed",
                               package = "Guitar"))

# Build Guitar Coordinates
txdb_file <- system.file("extdata", "mm10_toy.sqlite",
                         package = "Guitar")
txdb <- loadDb(txdb_file)

# Guitar Plot
GuitarPlot(txTxdb = txdb,
           stBedFiles = stBedFiles,
           headOrtail = TRUE,
           enableCI = FALSE,
           mapFilterTranscript = TRUE,
           pltTxType = c("mrna"),
           stGroupName = c("BED12", "BED6"))

 

[Figure 3]
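To illustrate the non-BED route, the sketch below imports the BED6 file into a GRanges object with rtracklayer's import() and passes it through stGRangeLists, the argument used for GRanges-based input in section 5. It assumes that rtracklayer and AnnotationDbi are installed alongside Guitar; the group label is arbitrary.

```r
library(Guitar)
library(rtracklayer)    # import()
library(AnnotationDbi)  # loadDb()

# Toy annotation shipped with Guitar (not for real analysis)
txdb <- loadDb(system.file("extdata", "mm10_toy.sqlite", package = "Guitar"))

# Read the BED6 peaks as a GRanges object instead of handing over the file path
bed6 <- system.file("extdata", "m6A_mm10_exomePeak_1000peaks_bed6.bed",
                    package = "Guitar")
sites <- import(bed6)

# Features already held in memory go in stGRangeLists, one list entry per group
GuitarPlot(txTxdb = txdb,
           stGRangeLists = list(sites),
           stGroupName = c("BED6 as GRanges"),
           pltTxType = c("mrna"),
           enableCI = FALSE)
```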

3 Processing of sampling sites information

The samplePoints function exposes the parameters that control how the sites are sampled.

# import each BED file and split its features into blocks
stGRangeLists <- vector("list", length(stBedFiles))
sitesPoints <- list()
for (i in seq_along(stBedFiles)) {
    stGRangeLists[[i]] <- blocks(import(stBedFiles[[i]]))
}

# sample points from each group of sites
for (i in seq_along(stGRangeLists)) {
    sitesPoints[[i]] <- samplePoints(stGRangeLists[i],
                                     stSampleNum = 10,
                                     stAmblguity = 5,
                                     pltTxType = c("mrna"),
                                     stSampleModle = "Equidistance",
                                     mapFilterTranscript = FALSE,
                                     guitarTxdb = guitarTxdb)
}
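To give an intuition for what stSampleNum and stSampleModle control, the base R sketch below draws a fixed number of positions from a single interval, either evenly spaced or at random. It illustrates the idea only and is not Guitar's implementation; the function name sample_positions and its mode labels are hypothetical.

```r
# Draw n positions from the closed interval [start, end] (illustration only)
sample_positions <- function(start, end, n = 10,
                             mode = c("equidistant", "random")) {
    mode <- match.arg(mode)
    if (mode == "equidistant") {
        # n evenly spaced positions, rounded to whole bases
        round(seq(start, end, length.out = n))
    } else {
        # n positions drawn uniformly at random (with replacement), sorted
        candidates <- seq(start, end)
        sort(candidates[sample.int(length(candidates), n, replace = TRUE)])
    }
}

# Evenly spaced positions across a 100-base site
even_pts <- sample_positions(101, 200, n = 10)

# Random positions across the same site
set.seed(1)
rand_pts <- sample_positions(101, 200, n = 10, mode = "random")
```

For the interval 101 to 200 with n = 10, the evenly spaced positions are 101, 112, 123 and so on up to 200, while the random positions vary with the seed.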

 

4 Guitar Coordinates – Transcriptomic Landmarks Projected on Genome

The guitarTxdb object contains the genome-projected transcriptome coordinates, which can be valuable for applications that evaluate transcriptomic information, such as checking the quality of MeRIP-Seq data. The Guitar coordinates are essentially the genomic projection of standardized transcript-based coordinates, making a viable bridge between the landmarks on transcripts and genome-based coordinates. makeGuitarTxdb takes the txdb object as input, extracts its transcript information, selects the transcripts that match the component parameters set by the user, and saves the result by transcript type (tx, mrna, ncrna).

guitarTxdb <- makeGuitarTxdb(txdb = txdb,
                             txAmblguity = 5,
                             txMrnaComponentProp = c(0.1, 0.15, 0.6, 0.05, 0.1),
                             txLncrnaComponentProp = c(0.2, 0.6, 0.2),
                             pltTxType = c("tx", "mrna", "ncrna"),
                             txPrimaryOnly = FALSE)
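The component proportions determine how much of the standardized axis each transcript component occupies. The base R sketch below shows the arithmetic, assuming that the five values of txMrnaComponentProp refer, in order, to promoter, 5'UTR, CDS, 3'UTR and tail. That ordering and the helper to_standard are assumptions made for illustration, not taken from the package source.

```r
# Assumed component order for the five mRNA proportions (illustration only)
prop <- c(promoter = 0.1, utr5 = 0.15, cds = 0.6, utr3 = 0.05, tail = 0.1)
stopifnot(isTRUE(all.equal(sum(prop), 1)))

# Left and right boundary of each component on the 0..1 axis
right_edge <- cumsum(prop)
left_edge <- right_edge - prop

# Map a relative position (0..1) inside one component onto the standardized axis
to_standard <- function(component, rel_pos) {
    unname(left_edge[component] + rel_pos * prop[component])
}

# Midpoint of the CDS on the standardized axis
mid_cds <- to_standard("cds", 0.5)
```

With these proportions the CDS spans roughly 0.25 to 0.85 of the axis, so its midpoint lands near 0.55.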

5 Check the Overlapping between Different Components

We can also check the distribution of the Guitar coordinates built.

gcl <- list(guitarTxdb$tx$tx)
GuitarPlot(txTxdb = txdb,
           stGRangeLists = gcl,
           stSampleNum = 200,
           enableCI = TRUE,
           pltTxType = c("tx"),
           txPrimaryOnly = FALSE)

 

[Figure 4]

Alternatively, we can extract the RNA components and check the distribution of the tx components in the transcriptome.

GuitarCoords <- guitarTxdb$tx$txComponentGRange
# label every range by its component type and transcript type
type <- paste(mcols(GuitarCoords)$componentType, mcols(GuitarCoords)$txType)
key <- unique(type)
# one list entry per distinct label
landmark <- vector("list", length(key))
names(landmark) <- key
for (i in seq_along(key)) {
    landmark[[i]] <- GuitarCoords[type == key[i]]
}
GuitarPlot(txTxdb = txdb,
           stGRangeLists = landmark[1:3],
           pltTxType = c("tx"),
           enableCI = FALSE)

 

[Figure 5]
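The index ranges 1:3, 4:8 and 9:11 used to subset the landmark list in this section depend on the order in which the component labels first appear. A quick check, sketched below under the assumption that the toy TxDb and the Guitar coordinates are built as in section 1, is to list the labels and their counts before subsetting.

```r
library(Guitar)
library(AnnotationDbi)  # loadDb()

# Toy annotation and Guitar coordinates, built as in section 1
txdb <- loadDb(system.file("extdata", "mm10_toy.sqlite", package = "Guitar"))
guitarTxdb <- makeGuitarTxdb(txdb = txdb, txPrimaryOnly = FALSE)

GuitarCoords <- guitarTxdb$tx$txComponentGRange
type <- paste(mcols(GuitarCoords)$componentType, mcols(GuitarCoords)$txType)

# Labels in order of first appearance, with the number of ranges carrying each
label_counts <- table(factor(type, levels = unique(type)))
print(label_counts)
```

If the number or order of labels differs from what the index ranges expect, subsetting by name, e.g. landmark[names(label_counts)[1:3]], makes the selection explicit.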

Check the distribution of mRNA components in the transcriptome:

GuitarPlot(txTxdb = txdb,
           stGRangeLists = landmark[4:8],
           pltTxType = c("mrna"),
           enableCI = FALSE)

 

[Figure 6]

Check the distribution of lncRNA components in the transcriptome:

GuitarPlot(txTxdb = txdb,
           stGRangeLists = landmark[9:11],
           pltTxType = c("ncrna"),
           enableCI = FALSE)

 

[Figure 7]

6 Session Information

sessionInfo()
