SCAI - SuperComputing Applications and Innovation

HPC for Next Generation Sequencing

High-Performance Bioinformatics Services for Next Generation Sequencing data analysis in Public Health and Research

Next-generation DNA sequencing (NGS) has greatly accelerated biological and biomedical research by allowing the comprehensive analysis of genomes, transcriptomes and interactomes. Managing the huge amount of data produced by new sequencing platforms requires non-trivial skills, strong computational power and storage capacity, which are generally not available in most research labs. Our consortium has been recognized as the big data and HPC analysis center for the Italian epigenomics flagship project Epigen.
The CINECA centralized bioinformatics core facility provides shared resources to meet the computational and IT requirements of NGS data analysis.

Whole-Exome Sequencing

Whole Exome Sequencing (WES) analysis is now available for several research purposes. A frequently updated pipeline, WEP, is used to call variants, both SNPs and indels. Variants are then filtered against several public databases, including dbSNP, the 1000 Genomes Project and HapMap exomes. Variant prioritization is obtained by comparing disease samples with healthy controls and by functionally annotating the variants (e.g. the functional relevance of a protein variant is assessed with the SIFT software). Moreover, for family-based samples, advanced analysis of haplotype phasing and detection of compound heterozygous or homozygous mutations is available as well.
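The filtering and prioritization step described above can be illustrated with a minimal sketch. This is not the WEP implementation: the `Variant` record, the `prioritize` function and the idea of representing a public database as a plain set of known variants are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (not the WEP data model)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def prioritize(case_variants, control_variants, known_common):
    """Keep variants seen in cases but absent from controls and from a
    set of known common variants (standing in for a public database)."""
    excluded = set(control_variants) | set(known_common)
    candidates = {v for v in case_variants if v not in excluded}
    # Sort by chromosome name and position for a stable, readable output.
    return sorted(candidates, key=lambda v: (v.chrom, v.pos))
```

A real pipeline would read variants from VCF files and match database entries on normalized alleles; the set-difference logic is only the core idea.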


Ultra-Deep Targeted Sequencing

A new sequencing platform, the Illumina MiSeq sequencer, makes it possible to identify known causative mutations by producing ultra-deep coverage of a selected list of targeted genomic regions (Ultra-Deep Targeted Sequencing, UDT-Seq). UDT-Seq is becoming particularly suitable for clinical diagnostic applications, since it implies full coverage of the sequenced regions and guarantees that no mutation in those regions is missed by the analysis. COVACS is a new automated high-performance bioinformatics web platform, developed for analyzing target genes at high coverage through deep sequencing with maximum usability, and focused on diagnostics in support of targeted therapies. It identifies Single Nucleotide Variants (SNVs) and Deletion/Insertion Variants (DIVs), classified by several scores (e.g. depth of coverage).
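Since full coverage of the targeted regions is what makes UDT-Seq attractive for diagnostics, a coverage check is a natural first control. The sketch below is an illustration only, not part of COVACS: the function names, the per-base depth list and the half-open region coordinates are assumptions.

```python
def region_coverage(depths, start, end):
    """Return (mean depth, minimum depth) over the half-open interval
    [start, end) of a per-base depth list."""
    window = depths[start:end]
    if not window:
        raise ValueError("empty region")
    return sum(window) / len(window), min(window)


def under_covered(depths, regions, min_depth):
    """List the target regions in which at least one base falls below
    min_depth; regions are (name, start, end) tuples."""
    return [
        (name, start, end)
        for name, start, end in regions
        if region_coverage(depths, start, end)[1] < min_depth
    ]
```

For example, with `depths = [500, 520, 480, 90, 510, 505]` and a threshold of 100, a region spanning positions 3 to 6 would be reported because of the single base at depth 90.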


RNA-Seq

RNA-Seq (transcriptome) analysis is now available for transcriptome structural analysis and quantification. It allows the identification of known or novel expressed transcript variants and their quantification.
RNA-Seq, unlike microarrays, does not require prior knowledge of the genome and therefore offers several advantages. Our service, RAP, profiles the transcriptome of each sample and performs differential gene expression analysis as well as detection of cassette exons, chimeric transcripts and polyA sites.
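To give a feel for what differential gene expression analysis computes, here is a simplified sketch under stated assumptions: it normalizes raw read counts to counts per million (CPM) and reports a log2 fold change between two samples. It is not how RAP works; production analyses use replicates and statistical tests, and the function names and pseudocount are illustrative choices.

```python
import math


def cpm(counts):
    """Scale raw per-gene read counts to counts per million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: n * 1e6 / total for gene, n in counts.items()}


def log2_fold_change(counts_a, counts_b, pseudocount=1.0):
    """log2(B / A) of CPM values for genes present in both samples.
    The pseudocount avoids taking the logarithm of zero."""
    cpm_a, cpm_b = cpm(counts_a), cpm(counts_b)
    shared = cpm_a.keys() & cpm_b.keys()
    return {
        gene: math.log2((cpm_b[gene] + pseudocount) / (cpm_a[gene] + pseudocount))
        for gene in shared
    }
```

Positive values indicate higher expression in the second sample, negative values lower expression.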

RNA editing (from RNA-seq data)

RNA editing is a post-transcriptional mechanism that challenges the central dogma of molecular biology. Nowadays, the term RNA editing is also used to indicate post-transcriptional changes due to specific base substitutions. Such alterations may affect coding as well as non-coding RNAs located in different cellular compartments, and they occur in a variety of organisms. ExpEdit is a web application for assessing RNA editing in humans at known or user-specified sites, supported by transcript data obtained from RNA-Seq experiments. Either mapping data or raw sequence reads can be provided as input to carry out a comparative analysis against a large collection of known editing sites from the DARNED database, as well as other user-provided, potentially edited positions. Results are shown as dynamic tables containing links to the University of California, Santa Cruz (UCSC) Genome Browser for a quick examination of the genomic context.
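Assessing editing at a known site comes down to counting how many reads support the edited base there. The following sketch assumes A-to-I editing, which appears as an A-to-G mismatch in sequencing reads; the pileup dictionary, function names and thresholds are hypothetical and do not describe ExpEdit's internals.

```python
def editing_level(base_counts, ref="A", edited="G"):
    """Fraction of reads supporting the edited base among reads showing
    either the reference or the edited base; None if the site is uncovered."""
    ref_n = base_counts.get(ref, 0)
    edit_n = base_counts.get(edited, 0)
    covered = ref_n + edit_n
    return edit_n / covered if covered else None


def call_edited_sites(pileup, known_sites, min_reads=10, min_level=0.1):
    """Report editing levels at known sites with enough supporting reads.
    pileup maps (chrom, pos) to a dict of per-base read counts."""
    calls = {}
    for site in known_sites:
        counts = pileup.get(site)
        if counts is None:
            continue  # no reads mapped over this site
        covered = counts.get("A", 0) + counts.get("G", 0)
        level = editing_level(counts)
        if level is not None and covered >= min_reads and level >= min_level:
            calls[site] = level
    return calls
```

Sites on the reverse strand would show T-to-C changes instead, which this sketch does not handle.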


ChIP-Seq

ChIP-Seq is widely used to analyze DNA-protein interactions. It combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins, and can be used to precisely map genome-wide binding sites for any protein of interest. Our bioinformatics service, CAST, provides the genome-wide distribution of ChIP sequencing reads, peak identification and differential analysis across different samples.
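Peak identification can be pictured as looking for genomic windows with many more ChIP reads than a control sample. The sketch below is a deliberately naive fixed-window version and not the method used by CAST; the use of a control sample, the window size and the thresholds are all assumptions.

```python
from collections import Counter


def window_counts(read_starts, window):
    """Count reads per fixed-size window, keyed by window index."""
    return Counter(pos // window for pos in read_starts)


def call_peaks(chip_starts, control_starts, window=200, min_reads=5, min_ratio=3.0):
    """Return (start, end, chip_reads) for windows enriched in the ChIP
    sample relative to the control."""
    chip = window_counts(chip_starts, window)
    ctrl = window_counts(control_starts, window)
    peaks = []
    for idx in sorted(chip):
        n = chip[idx]
        # Adding 1 to the control count avoids division by zero.
        ratio = n / (ctrl.get(idx, 0) + 1)
        if n >= min_reads and ratio >= min_ratio:
            peaks.append((idx * window, (idx + 1) * window, n))
    return peaks
```

Dedicated peak callers additionally model background noise statistically and normalize for sequencing depth, which matters when comparing samples.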