PIPAx: next-generation sequencing data management and analytics

Data management

Management and bioinformatics analysis of next-generation sequencing data are an ongoing challenge. State-of-the-art computational methods that address individual analysis steps, such as quality control, alignment of sequences to the reference genome, and identification of differentially expressed genes, are currently available either as separate components or as parts of scripting suites that often lack simple graphical user interfaces. The principal challenge in the field is to design dedicated tools that are user-friendly and accessible to biologists, and that implement potentially complex data analysis pipelines while hiding unnecessary computational details from the user. To address these needs, we have constructed PIPAx, an integrated environment for RNA sequence data management and analysis.

Data analytics

PIPAx is composed of a computational server and a web client that features a simple and interactive graphical user interface. It implements both local and remote data uploads, experiment annotation, resolution of multiplexed data, alignment of reads to a reference genome, and computation of transcript abundance. The framework integrates various state-of-the-art data processing and analysis tools, such as Bowtie, Cufflinks, HTSeq and DESeq. PIPAx results can be visualized in a genome browser, downloaded as tab-delimited files, or accessed programmatically through an HTTP application programming interface (API). PIPAx also implements an elaborate but simple-to-use permission hierarchy together with user-customizable annotation of experiments. The entire pipeline is lightweight and focuses on transcription analytics.
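As an illustration of working with the downloadable results, the following minimal sketch reads a tab-delimited expression table in Python. The column names (`gene_id`, `rpkm`), the function name and the sample rows are assumptions made for this example; the actual layout of PIPAx exports is not described here.

```python
import csv
import io


def read_expression_table(handle, id_column="gene_id", value_column="rpkm"):
    """Return a dict mapping gene identifiers to float expression values.

    The default column names are hypothetical, not a documented PIPAx format.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[id_column]: float(row[value_column]) for row in reader}


# Made-up data standing in for a downloaded tab-delimited file.
example = io.StringIO("gene_id\trpkm\ngeneA\t12.5\ngeneB\t0.0\n")
table = read_expression_table(example)
```

For a real export, the in-memory example would be replaced by an opened file, and the column names adjusted to match its header row.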

Software technology

The PIPAx server side is written in Python, with the web part attached to the Apache server through WSGI. The client side, developed in Adobe Flash Builder 4.6, communicates with the server using JSON objects. This simple and lightweight framework gives developers the opportunity to write their own client side if needed. The graphically rich Flash client provides easy access to all data and analytics.
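Because the client and server exchange JSON objects over HTTP, an alternative client only needs to send requests and decode JSON responses. The sketch below shows the general pattern of a Python WSGI application that answers with a JSON object. The `/experiments` path, the payload fields and the helper `call_app` are hypothetical and do not reflect the actual PIPAx interface.

```python
import json


def application(environ, start_response):
    """Minimal WSGI callable that returns a JSON document."""
    path = environ.get("PATH_INFO", "/")
    if path == "/experiments":  # hypothetical endpoint, for illustration only
        status = "200 OK"
        payload = {"experiments": [{"id": 1, "name": "example"}]}
    else:
        status = "404 Not Found"
        payload = {"error": "unknown path"}
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(path):
    """Invoke the WSGI callable directly, the way a simple test client would."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    environ = {"PATH_INFO": path, "REQUEST_METHOD": "GET"}
    body = b"".join(application(environ, start_response))
    return captured["status"], json.loads(body)
```

Any client that can issue HTTP requests and parse JSON could talk to a server built along these lines, which is what makes replacing the Flash front end feasible.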

Connectivity

Explore analysis results using the built-in dictyExpress analytics toolbox. For advanced statistical and data mining analytics, connect gene expression data to Orange.