Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Lessons learnt on the analysis of large sequence data in animal genomics

Biscarini, Filippo, Cozzi, P. and Orozco Ter Wengel, Pablo 2018. Lessons learnt on the analysis of large sequence data in animal genomics. Animal Blood Groups and Biochemical Genetics 49 (3), pp. 147-158. 10.1111/age.12655

PDF - Accepted Post-Print Version

Abstract

The ’omics revolution has made a large amount of sequence data available to researchers and the industry. This has had a profound impact on the field of bioinformatics, stimulating unprecedented advancements in the discipline. This impact is usually looked at from the perspective of human ’omics, in particular human genomics. Plant and animal genomics, however, have also been deeply influenced by next‐generation sequencing technologies, with several genomics applications now popular among researchers and the breeding industry. Genomics tends to generate huge amounts of data, and genomic sequence data account for an increasing proportion of big data in the biological sciences, due largely to decreasing sequencing and genotyping costs and to large‐scale sequencing and resequencing projects. The analysis of big data poses a challenge to scientists: data gathering currently outpaces data processing and analysis, and the associated computational burden is increasingly taxing, making even simple manipulation, visualization and transfer of data cumbersome. The time consumed by processing and analysing huge data sets may come at the expense of data quality assessment and critical interpretation. Additionally, when analysing large amounts of data, something is likely to go awry (the software may crash or stop), and it can be very frustrating to track down the error. We herein review the most relevant issues related to tackling these challenges and problems, from the perspective of animal genomics, and provide researchers who lack extensive computing experience with guidelines that will help when processing large genomic data sets.

Item Type: Article
Date Type: Publication
Status: Published
Schools: Medicine, Biosciences
Publisher: Wiley: No OnlineOpen
ISSN: 0268-9146
Date of First Compliant Deposit: 10 April 2018
Date of Acceptance: 11 February 2018
Last Modified: 29 Jun 2019 15:23
URI: http://orca.cf.ac.uk/id/eprint/110601

Citation Data

Cited 2 times in Scopus.
