The Norwegian Microarray Consortium
A National Platform for MicroArray Technology and High-Throughput Genomics
 
The Norwegian Microarray Consortium's Quality Control Tool.


Purpose
Quickly and easily assess the quality of your microarray hybridizations. Three scores are calculated per array and sent to a central database for comparison with other arrays.


Download
Standalone program with example files here. You need R installed on your computer.
For the BASE plugin, contact vegard@microarray.no for files and instructions.
If your data is in a BASE installation run by NMC, the tool should already be available there.

Help
We (NMC) are very keen to make this work for you. Do not hesitate to ask for help, especially if you have a commercial platform with another type of spikes. Contact us and we will see if we can incorporate it into the QC tool. Contact vegard@microarray.no in the first instance, or another member of the NMC bioinformatics support.


About the NMC Quality Control Tool.
The microarray is evaluated for precision and accuracy using spikes. Other approaches can also be used for quality measurements, and we aim to develop the control procedure further by observing repeatedly printed spots on the arrays. The spikes used for the present quality control are 'cDNA' sequences unrelated to the genes on the array; they correspond to synthetic cDNA or RNA added to the sample at known ratios. How well these ratios are reproduced indicates the quality of the microarray slide itself and of the processing stages after the spiking material was added.

ABSOLUTE QUALITY SCORES.
Three numbers are calculated for each array: bias, scatter and weight. Bias and scatter are spike-based quality criteria. Bias corresponds to systematic deviations from the true ratios, often caused by dye bias or errors in the normalisation. Scatter is the deviation between individual spike spots and the median of all spike spots. It measures the random errors in individual spots, independent of normalisation. High scatter may indicate a large amount of random noise, malformed spots or a poor signal-to-noise ratio. Weight is calculated from all the spots. To assess array quality independently of the external controls, we use a spot quality metric derived from the uncertainty of the spot. This spot quality is the inverse of the uncertainty of the log2 ratio, which is computed using a Taylor expansion of the ratio expression.
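
The three scores can be illustrated with a minimal sketch. This is not the tool's actual code (the tool itself runs in R), and the exact formulas are assumptions: bias is taken as the median deviation of the spike spots from their expected log2 ratios, scatter as the median absolute distance of each spike spot from that median, and the spot uncertainty comes from a first-order Taylor expansion of log2(red/green) with the two channels assumed independent. The function names and arguments are hypothetical, and how the tool combines spot qualities into the array weight is not described here.

```python
import math
from statistics import median


def spike_scores(observed_log2, expected_log2):
    """Return (bias, scatter) for one array from its spike spots.

    Assumed definitions, not taken from the tool's source.
    """
    # Deviation of each spike spot from its known (expected) log2 ratio.
    deviations = [o - e for o, e in zip(observed_log2, expected_log2)]
    centre = median(deviations)
    # Bias: the systematic shift shared by the spike spots (signed).
    bias = centre
    # Scatter: typical distance of one spike spot from the median of all spike spots.
    scatter = median(abs(d - centre) for d in deviations)
    return bias, scatter


def spot_quality(red, green, sd_red, sd_green):
    """Inverse of the first-order (Taylor) uncertainty of log2(red/green)."""
    # d/dR log2(R/G) = 1/(R ln 2) and d/dG log2(R/G) = -1/(G ln 2),
    # so var(log2 ratio) ~ ((sd_R/R)^2 + (sd_G/G)^2) / (ln 2)^2.
    variance = ((sd_red / red) ** 2 + (sd_green / green) ** 2) / math.log(2) ** 2
    return 1.0 / math.sqrt(variance)
```

With these assumed definitions, a constant offset added to every spike changes the bias but leaves the scatter untouched, which matches the description of scatter as independent of normalisation.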

RELATIVE QUALITY
In order to better assess what the absolute quality scores for each array mean, the values are exported to the NMC quality control database and compared to scores from other hybridizations that share these common traits:
Design: the array design.
Batch: the print series. Might differ slightly from design.
Owner: other scores already in the database from the same person identified as the owner of this hybridization, regardless of design or batch.
Run: the arrays that are in the same run of the QC report. These are not retrieved from or stored in the NMC QC database.

This means that your three quality scores will be stored in the database and used in other people's quality comparison (in an anonymous way).

Example of report for one hybridization.

Array info
serial: 240805-10
design: 21k human oligo v2.3
batch: 240805
owner: exampleuser
source id: - Standalone_somewhere

Comparative means
         Array  Design  Batch  Owner  Run
bias     0.24   0.23    0.26   0.11   0.18
scatter  0.16   0.21    0.18   0.18   0.17
weight   1.11   1.01    1.07   1.08   1.13
count    1      70      9      5      2

Mean of bias is from the absolute values.
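
As a rough illustration of how one column of the comparative means could be computed, the sketch below averages the scores of a group of arrays and takes the bias as an absolute value first, as noted above. The function name and the dictionary layout are hypothetical, and the second array in the usage example is invented for illustration only.

```python
from statistics import mean


def comparative_means(arrays):
    """Summarise a group of arrays (for example one design, batch, owner or run).

    `arrays` is a list of dicts with the keys "bias", "scatter" and "weight".
    """
    return {
        # Bias is signed per array, so the mean is taken over absolute values.
        "bias": mean(abs(a["bias"]) for a in arrays),
        "scatter": mean(a["scatter"] for a in arrays),
        "weight": mean(a["weight"] for a in arrays),
        "count": len(arrays),
    }


# Usage: the first array carries the scores from the example report,
# the second one is made up.
run = [
    {"bias": 0.24, "scatter": 0.16, "weight": 1.11},
    {"bias": -0.12, "scatter": 0.18, "weight": 1.15},
]
summary = comparative_means(run)
```

Averaging absolute values prevents arrays with opposite dye bias from cancelling each other out in the group mean.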






Prerequisites
R with the packages limma, MASS and boot. Download R at http://cran.r-project.org/.
See the R documentation for how to install packages.
Ghostscript, http://www.cs.wisc.edu/~ghost/.
Internet connection.
Two channel microarrays.
Result files in GenePix (.gpr) format.
Spikes with expected values.


Limitations and known problems
Works for .gpr files only.

Do not use unsafe characters in filenames! The program may not work with files whose names contain the '#' character. Try to avoid unusual characters such as #$@{}* in filenames. To be safe, you should also avoid æøå %/&? and spaces. This applies not just to this program but to computer use in general. Most programs handle 'åæø' in filenames nowadays, but some do not, and you never know in advance which program you will need. To make the rest of your (and my) digital life slightly easier, stick to English letters, numbers and .-_ only (a-z A-Z 0-9 _ . -).
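
A quick way to check filenames against the safe character set (a-z A-Z 0-9 _ . -) before a run is sketched below. The helper names are hypothetical and this check is not part of the tool itself.

```python
import re

# Only English letters, digits, underscore, dot and hyphen are considered safe.
SAFE_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def is_safe_filename(name):
    """True if every character of the filename is in the safe set."""
    return SAFE_NAME.fullmatch(name) is not None


def unsafe_characters(name):
    """Return the distinct characters that should be replaced, sorted."""
    return sorted(set(re.sub(r"[A-Za-z0-9_.-]", "", name)))
```

For example, `is_safe_filename("240805-10.gpr")` passes, while a name such as `array #1.gpr` is rejected because of the space and the '#'.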

Spikes must be used. The scores are generated from spike data compared to the expected values. Thus your arrays need spikes that were actually used, together with a description of them.

Today the program works only on two-channel arrays, but we are willing to extend it to one-channel arrays if desired.

Use only arrays with the same array design in a run, in order to make sure that every gpr file has the same reporter_ID in each row.
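
Whether a set of gpr files shares the same reporter order can be checked before a run with a sketch like the one below. It assumes the usual ATF layout of GenePix result files (a version line, a line giving the number of optional header records, those records, then a tab-separated table with column titles) and assumes that the reporter identifier is in the column titled "ID". Both function names are hypothetical, and the check is not part of the tool.

```python
import csv


def reporter_ids(gpr_text, id_column="ID"):
    """Return the reporter IDs, in row order, from the text of one .gpr file."""
    lines = gpr_text.splitlines()
    # Line 2 of an ATF file starts with the number of optional header records.
    n_header = int(lines[1].split("\t")[0])
    # The column titles follow the two fixed lines and the header records.
    table = csv.reader(lines[2 + n_header:], delimiter="\t")
    titles = next(table)
    column = titles.index(id_column)
    # Skip empty lines, keep the ID of every data row.
    return [row[column] for row in table if row]


def same_reporter_order(gpr_texts):
    """True if all files list the same reporter IDs in the same rows."""
    ids = [reporter_ids(text) for text in gpr_texts]
    return all(current == ids[0] for current in ids)
```

The functions take the file contents as text, so a caller would read each file first, for example with `open(path).read()`.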

The Ghostscript program used to make plots sometimes produces strange errors. If some plots are missing, try a re-run.


License
This software is free. Three scores per array are uploaded to a central database. By using this tool you agree that your scores can be used as a comparison for other arrays. None of your experimental data is uploaded.