Frequently asked questions
We warmly invite you to read our Help section too. If you still have doubts about our tool, please write to us from our feedback page.
- What input files are supported?
- How many samples can be uploaded and analyzed? How many analyses can be run? Are there any time limits?
- Can I upload multiple files? And multiple samples?
- How long does a typical analysis take?
- What do you recommend before running an analysis?
- How can I best use the standard pipeline and MACS software?
- How can my parameters influence analysis results?
- Why is CAST forcing me to apply a control file for MACS peak calling?
- Do I need an input DNA as control?
- What is a study? What's the difference with a project?
- How can I quickly learn how to browse results?
- Is it possible to download the results of my analysis?
- What makes it so difficult to find significant peaks?
- What is the FDR column? What is a reasonable threshold?
- Why am I getting high FDR for every peak?
Analysis
1. What input files are supported?
The CAST tool supports:
- short-read data sets produced by Illumina sequencing platforms (FASTQ)
- several standard file formats (SRA, BAM)
- compressed archives (zip, tar, gzip, bz or bz2 compression are accepted)
CAST automatically detects the type of the uploaded file and chooses the program needed to decompress it.
Warning: compressed archives created on Mac OS X require the Windows-compatibility flag.
2. How many samples can be uploaded and analyzed? How many analyses can be run? Are there any time limits?
A User account can:
- create up to 2 studies
- upload up to 12 files
- build up to 2 analyses
- run 1 analysis at a time
3. Can I upload multiple files? And multiple samples?
Each sample has to be uploaded as a single file; you can upload multiple files if you have multiple samples.
The general rule is: 1 sample = 1 file. Our system does not perform any merge operation.
4. How long does a typical analysis take?
The time required to run an analysis depends on several factors, such as:
- The number of files uploaded
- The sequencing region (genome-wide or targeted)
- The number of jobs waiting for execution on our servers
You can find an estimated execution time on your analysis monitoring page.
5. What do you recommend before running an analysis?
We recommend that you understand and tweak the peak finder parameters for your data set.
Once the reads have been aligned to the reference genome, you can run different analyses on the same BAM file and compare the results.
6. How can I best use the standard pipeline and MACS software?
We recommend that you read Zhang et al. (2008) for a detailed explanation of the MACS peak finding algorithm.
We strongly recommend that you always look at the MACS log file to see how well MACS did on your data set: failures to complete the MACS analysis are often related to the experimental data and/or the chosen analysis parameters.
7. How can my parameters influence analysis results?
Setting the right bandwidth and mfold parameters for your data set is important.
If these parameters are set too stringently, MACS is unable to find enough high-quality peaks and will exit with an error.
Moreover, setting the bandwidth as close as possible to the DNA sonication size helps MACS call the right peaks.
8. Why is CAST forcing me to apply a control file for MACS peak calling?
Because the control is used to calculate enrichment significance, which provides more rigorous filtering of false positives and more accurate methods for ranking high-confidence peak calls.
We recommend that you use a real control in your experimental setup.
The experiment and control samples should have a comparable (high) number of reads.
MACS simply scales (normalizes) the number of reads linearly, so noise is scaled in the same way as signal.
9. Do I need an input DNA as control?
No. MACS can also be applied to identify differential peaks between two conditions by treating one of the samples as the control.
However, the calculated FDR value should be ignored in this case, as peaks from either sample are likely to be biologically meaningful.
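As a rough sketch of what treating one condition as the control could look like on the command line, the Python helper below only assembles an argument list and does not run anything. It assumes the MACS 1.4 executable `macs14` and its `-t`, `-c`, `--format`, `--name`, `--bw` and `--mfold` options. The helper name `build_macs_command`, the file names and the default values are hypothetical, and this is not necessarily how CAST invokes MACS.

```python
def build_macs_command(treatment, control, name,
                       bandwidth=300, mfold=(10, 30), fmt="BAM"):
    """Assemble a MACS 1.4 style argument list (assumption: `macs14` CLI).

    `treatment` and `control` are paths to aligned read files. For a
    differential comparison, pass condition A as treatment and
    condition B as control.
    """
    low, high = mfold
    if low >= high:
        raise ValueError("mfold lower bound must be below the upper bound")
    return [
        "macs14",
        "-t", treatment,             # ChIP (or condition A) reads
        "-c", control,               # control (or condition B) reads
        "--format", fmt,
        "--name", name,              # prefix for the output files
        "--bw", str(bandwidth),      # ideally close to the sonication size
        "--mfold", f"{low},{high}",  # enrichment range used to build the model
    ]
```

For example, `build_macs_command("condA.bam", "condB.bam", "A_vs_B")` returns a list that could be passed to `subprocess.run`. As noted above, the FDR reported for such a comparison should be ignored.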
Archive
1. What is a study? What's the difference with a project?
According to the EBI/ENA data format standard, a study contains information about a single sequencing project (several analyses can be run in a single study). So, in practice, a study contains all the information about a project.
Results
1. How can I quickly learn how to browse results?
Click on Results example (on the top navigation menu) to follow a guided tour of the results pages.
2. Is it possible to download the results of my analysis?
Yes, you have two options:
- Downloading the results directly from the analysis monitoring page.
- From the Results page, after applying any filter, with the
Data browsing and downloading has been optimized with caching/sessioning. While this ensures better performance in page loading, as a drawback opening different tabs on different samples may lead to data misconfiguration.
Please make sure to filter and download one sample at a time.
3. What makes it so difficult to find significant peaks?
Not all genomic regions are equal.
Things like sequencing biases, mapping biases, chromatin and copy number variations, and repeat structures create regional differences in ChIP-Seq data.
However, MACS addresses some of these issues by looking at the background noise in a control sample and the direct surroundings of a potential peak.
4. What is the FDR column? What is a reasonable threshold?
In an ideal experiment, with control and samples well balanced, peaks with FDR < 1% are unlikely to be false positives.
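A minimal Python sketch of applying such a threshold to downloaded results follows. The peak records and the field name `fdr` (expressed as a percentage) are hypothetical; the actual column names in a CAST download may differ.

```python
def filter_by_fdr(peaks, max_fdr_percent=1.0):
    """Keep peaks whose FDR (in percent) is strictly below the threshold."""
    return [p for p in peaks if p["fdr"] < max_fdr_percent]


# Illustrative, made-up records; not real CAST output.
example_peaks = [
    {"chrom": "chr1", "start": 1000, "end": 1400, "fdr": 0.2},
    {"chrom": "chr1", "start": 5000, "end": 5300, "fdr": 4.5},
    {"chrom": "chr2", "start": 200, "end": 650, "fdr": 100.0},
]

confident = filter_by_fdr(example_peaks)
```

With these made-up values only the first record passes the 1% cut-off. The right threshold still depends on how well balanced the ChIP and control samples are.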
5. Why am I getting high FDR for every peak?
MACS computes FDR based on the theoretical Poisson distribution of the ChIP and control libraries.
In reality, ChIP usually has a much smaller coverage than the control.
Therefore, there will be many regions that show enrichment only in the control library.
Sometimes these enrichments can be highly significant, depending on the experimental design, antibody efficiency, sequencing machine, etc.
As a result, many significant positive peaks can be tagged with "100%" FDR.
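One way to see how this happens is the sample-swap empirical FDR described by Zhang et al. (2008): peaks are also called with ChIP and control exchanged, and the FDR at a given score is the number of control peaks divided by the number of ChIP peaks scoring at least that well. The Python sketch below is a simplified illustration of that ratio with made-up scores, not the actual MACS code; the function name `empirical_fdr` is hypothetical.

```python
from bisect import bisect_left


def empirical_fdr(chip_scores, control_scores):
    """Return an FDR (%) for each ChIP peak score.

    FDR at score s = (# control peaks with score >= s)
                     / (# ChIP peaks with score >= s), capped at 100%.
    """
    chip_sorted = sorted(chip_scores)
    ctrl_sorted = sorted(control_scores)
    fdrs = []
    for s in chip_scores:
        # Count peaks scoring at least s in each sorted list.
        n_chip = len(chip_sorted) - bisect_left(chip_sorted, s)
        n_ctrl = len(ctrl_sorted) - bisect_left(ctrl_sorted, s)
        fdrs.append(min(100.0, 100.0 * n_ctrl / n_chip))
    return fdrs


# Few control-only enrichments: the best ChIP peaks keep a low FDR.
balanced = empirical_fdr([50, 100, 200, 400], [60])

# Many strong control-only enrichments: every ChIP peak hits the cap.
unbalanced = empirical_fdr([50, 100], [60, 70, 150, 300])
```

In the second case the control library yields more and stronger peaks than the ChIP library, so the ratio reaches the 100% cap for every ChIP peak, even those with a high score.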