Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Optimization of de novo short read assembly of seabuckthorn (Hippophae rhamnoides L.) transcriptome

Ghangal, Rajesh, Chaudhary, Saurabh, Jain, Mukesh, Purty, Ram Singh and Chand Sharma, Prakash 2013. Optimization of de novo short read assembly of seabuckthorn (Hippophae rhamnoides L.) transcriptome. PLoS ONE 8 (8), e72516. 10.1371/journal.pone.0072516

PDF - Published Version
Available under License Creative Commons Attribution.

Abstract

Seabuckthorn (Hippophae rhamnoides L.) has been known for its medicinal, nutritional and environmental importance since ancient times. However, very limited efforts have been made to characterize the genome and transcriptome of this wonder plant. Here, we report the use of next-generation massively parallel sequencing technology (Illumina platform) and de novo assembly to gain a comprehensive view of the seabuckthorn transcriptome. We assembled 86,253,874 high-quality short reads using six assembly tools. In our hands, assembly of non-redundant short reads following a two-step procedure was found to be the best across the assembly quality parameters considered. First, the ABySS tool was used following an additive k-mer approach. The assembled transcripts were then subjected to the TGICL suite. The final de novo short read assembly yielded 88,297 transcripts (> 100 bp), representing about 53 Mb of the seabuckthorn transcriptome. The average transcript length was 610 bp, the N50 length was 1,198 bp, and 91% of the short reads mapped uniquely back to the seabuckthorn transcriptome. A total of 41,340 (46.8%) transcripts showed significant similarity to sequences present in the nr protein databases of NCBI (E-value < 1E-06). We also screened the assembled transcripts for the presence of transcription factors and simple sequence repeats. Our strategy, which uses a short read assembler (ABySS) followed by TGICL, will be useful to researchers working with the transcriptome of a non-model organism, as it saves time and reduces the complexity of data management. The seabuckthorn transcriptome data generated here provide a valuable resource for gene discovery and the development of functional molecular markers.
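
The abstract reports the average transcript length and the N50 length as assembly quality parameters. As a minimal sketch of how such summary statistics are commonly computed from a list of transcript lengths, the Python snippet below may help; the function name `assembly_stats`, the `min_length` parameter and the toy lengths are illustrative assumptions and are not taken from the article or from the ABySS or TGICL tools.

```python
from typing import Dict, Iterable


def assembly_stats(lengths: Iterable[int], min_length: int = 100) -> Dict[str, float]:
    """Summarise transcript lengths (hypothetical helper, not from the article).

    Only transcripts strictly longer than min_length are kept, mirroring the
    "> 100 bp" cutoff quoted in the abstract.
    """
    kept = sorted((n for n in lengths if n > min_length), reverse=True)
    if not kept:
        raise ValueError("no transcripts longer than min_length")

    total = sum(kept)

    # N50: walking from the longest transcript downwards, the length at which
    # the running sum first reaches at least half of the total assembled bases.
    running = 0
    n50 = 0
    for n in kept:
        running += n
        if 2 * running >= total:
            n50 = n
            break

    return {
        "count": len(kept),
        "total_bp": total,
        "mean_bp": total / len(kept),
        "n50_bp": n50,
    }


# Toy lengths chosen for illustration only; they are not data from the study.
toy_lengths = [90, 150, 200, 300, 500, 1200, 2000]
stats = assembly_stats(toy_lengths)
print(stats)
```

In practice the lengths would be read from the assembled FASTA file; that step is omitted here to keep the sketch self-contained.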

Item Type: Article
Date Type: Publication
Status: Published
Schools: Biosciences
Additional Information: This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Publisher: Public Library of Science
ISSN: 1932-6203
Date of First Compliant Deposit: 23 November 2020
Date of Acceptance: 9 July 2013
Last Modified: 07 May 2023 02:55
URI: https://orca.cardiff.ac.uk/id/eprint/136586

