P-arch - Digital Repository

Experiences with workflows for automating data-intensive bioinformatics

dc.contributor.author Spjuth, Ola
dc.contributor.author Bongcam-Rudloff, Erik
dc.contributor.author Carrasco Hernández, Guillermo
dc.contributor.author Forer, Lucas
dc.contributor.author Giovacchini, Mario
dc.contributor.author Valls Guimera, Roman
dc.contributor.author Kallio, Aleksi
dc.contributor.author Korpelainen, Eija
dc.contributor.author Kanduła, Maciej M
dc.contributor.author Krachunov, Milko
dc.contributor.author Kreil, David P.
dc.contributor.author Kulev, Ognyan
dc.contributor.author Łabaj, Pavel P.
dc.contributor.author Lampa, Samuel
dc.contributor.author Pireddu, Luca
dc.contributor.author Schönherr, Sebastian
dc.contributor.author Siretskiy, Alexey
dc.contributor.author Vassilev, Dimitar
dc.date.accessioned 2015-09-17T10:26:44Z
dc.date.available 2015-09-17T10:26:44Z
dc.date.issued 2015-08-19
dc.identifier.issn 1745-6150
dc.identifier.uri http://hdl.handle.net/11050/1148
dc.description.abstract High-throughput technologies, such as next-generation sequencing, have turned molecular biology into a data-intensive discipline, requiring bioinformaticians to use high-performance computing resources and carry out data management and analysis tasks on a large scale. Workflow systems can simplify the construction of analysis pipelines that automate tasks, support reproducibility and provide measures for fault tolerance. However, workflow systems can incur significant development and administration overhead, so bioinformatics pipelines are often still built without them. We present experiences with workflows and workflow systems within the bioinformatics community participating in a series of hackathons and workshops of the EU COST action SeqAhead. The organizations are working on similar problems, but have addressed them with different strategies and solutions. This fragmentation of efforts is inefficient and leads to redundant and incompatible solutions. Based on our experiences, we define a set of recommendations for future systems to enable efficient yet simple bioinformatics workflow construction and execution. IT
dc.language.iso en IT
dc.publisher BioMed Central IT
dc.relation.ispartof Biology Direct IT
dc.relation.ispartofseries 10;34
dc.rights Attribution - NonCommercial - ShareAlike 3.0 Italy *
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/it/ *
dc.subject workflow IT
dc.subject automation IT
dc.subject big data IT
dc.subject reproducibility IT
dc.subject high-performance computing IT
dc.subject data-intensive IT
dc.title Experiences with workflows for automating data-intensive bioinformatics IT
dc.type Article IT
dc.description.status Published IT
dc.identifier.doi 10.1186/s13062-015-0071-8 IT
dc.subject.een-cordis EEN CORDIS::BIOLOGICAL SCIENCES ::Biology / biotechnology ::Molecular Design IT
dc.subject.een-cordis EEN CORDIS::BIOLOGICAL SCIENCES ::Genome research ::Bioinformatics IT
dc.subject.program Program::Data Fusion::Visual Computing (VIC) IT

