When and when not to
Sample-to-sample variation across a plate is inevitable. "Hot" and "cold" samples arise for a variety of reasons, e.g. nonuniform temperature in the plate, sample collection methodology, amount of starting material, RNA extraction efficiency, manual pipetting variations, and enzymatic efficiency.
The figure below shows how two samples can have a similar pattern of probe expression while one has an overall higher signal level than the other, due to technical variation introduced by the factors above. To find biologically meaningful variation in probe expression between samples, it is necessary to first compensate for the technical variation.
[Figure panels: Well B02 - "hot" well; Well B03 - "cold" well]
The effect of normalization can be seen in an overlay of the two wells, with and without normalization. In the overlay images, the bars from both samples are drawn on top of each other. Before normalization, the bars differ considerably in height, and it is hard to tell which differences come from the overall sample signal level rather than from differential expression.
After normalization, the heights are comparable for almost every bar, and the remaining differences are potentially significant.
[Figure panels: Wells B02 & B03 overlaid, no normalization; Wells B02 & B03 overlaid, with normalization]
A set of samples is normalized if the value of some quantity is the same in all samples.
Typically the quantity is the level of a particular probe, but it could also be total RNA content, sample optical density, or some other measurement.
The definition is motivated by the idea that if one knew that a certain probe "should" be the same in every sample, then one would adjust each sample up or down in overall expression level in order to make the measured value of that probe the same in each sample.
In practice, it is often not known what probe should be the same in every sample (or even whether there is any such probe). Instead, one chooses some set of normalization probes, numerically synthesizes a composite probe from those probes, and adjusts the sample levels so that the synthetic probe is the same across all samples.
Several approaches to choosing normalization probes are provided; they are discussed below.
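The adjustment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workbench's implementation: the composite probe is taken to be the geometric mean of the chosen normalization probes, the common target level is the geometric mean of the composites, and the sample names, probe names and signal values are made up.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def normalize(samples, normalizers):
    """Scale each sample so that its composite probe matches a common target.

    samples: dict mapping sample name -> dict of probe name -> positive signal
    normalizers: list of probe names used to build the composite probe
    """
    # Composite probe per sample (assumption: geometric mean of the normalizers).
    composite = {name: geometric_mean([probes[p] for p in normalizers])
                 for name, probes in samples.items()}
    # Common target level (assumption: geometric mean of the composites).
    target = geometric_mean(list(composite.values()))
    # Every probe in a sample is multiplied by the same per-sample factor.
    return {name: {p: v * target / composite[name] for p, v in probes.items()}
            for name, probes in samples.items()}

# Hypothetical data: B02 is "hot" (twice B03) for probe_a and probe_b,
# while probe_c moves the other way.
samples = {
    "B02": {"probe_a": 200.0, "probe_b": 800.0, "probe_c": 100.0},
    "B03": {"probe_a": 100.0, "probe_b": 400.0, "probe_c": 150.0},
}
normalized = normalize(samples, ["probe_a", "probe_b"])
```

After the adjustment, probe_a and probe_b agree between the two wells, and the difference that remains in probe_c is the part not explained by the overall signal level.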
The primary advantage of normalization is that it compensates for technical variation between samples. In addition, normalization usually reduces the inter-sample standard deviation. For the same reason, differential expression analysis between sample groups is more likely to find significant probes, because the sample-to-sample "noise" is reduced.
There are some situations when normalization needs to be treated with caution. If, for instance, a particular drug treatment causes higher expression of every probe, or most of the probes, in a treated sample, then normalization will tend to wash out the difference between the treated and untreated samples.
In a dilution series, normalization will by definition make all the dilutions look the same.
Overall, if there is good reason to believe that the variation in overall probe expression level between samples is meaningful to the experiment and not just an artifact (that is, there really should be "hot" and "cold" wells), normalization is not recommended.
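A small numerical illustration of the caution above, assuming a simple probe-averaging normalization and a hypothetical treatment that doubles every probe (all names and values are made up):

```python
def normalize_by_average(sample):
    """Divide every probe by the sample's mean signal (probe averaging)."""
    mean_signal = sum(sample.values()) / len(sample)
    return {probe: value / mean_signal for probe, value in sample.items()}

untreated = {"probe_a": 100.0, "probe_b": 200.0, "probe_c": 400.0}
# Hypothetical treatment that doubles the level of every probe.
treated = {probe: 2.0 * value for probe, value in untreated.items()}

# Treated/untreated fold change per probe before normalization.
fold_before = {p: treated[p] / untreated[p] for p in untreated}

# The same fold change after each sample is normalized to its own average.
norm_untreated = normalize_by_average(untreated)
norm_treated = normalize_by_average(treated)
fold_after = {p: norm_treated[p] / norm_untreated[p] for p in untreated}
```

Here every entry of fold_before is 2, while every entry of fold_after is 1 up to rounding: the global treatment effect is indistinguishable from a "hot" well and is removed along with it. The same arithmetic applies to a dilution series.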
There are several approaches to choosing normalization probes. We focus here on endogenous controls. Exogenous controls (spike-in probes) compensate only for particular sources of variation and cannot account for others, such as sample purification efficiency. They may even vary inversely with the expression level of endogenous controls if they compete with the endogenous probes for enzymes and reagents, making them wholly unrepresentative of the actual sample data.
Any choice of normalizer requires certain assumptions. Manual selection requires the assumption that the chosen probes are actually the same in all samples, and that the measured differences are entirely due to technical variation. Probe averaging requires the assumption that the overall expression level should be the same in every sample, and that the measured differences are technical. Algorithmic approaches assume that the set of probes used in the experiment includes some subset which is the same in all samples, except for technical variation.
With that in mind, a judgement call is always required. In the absence of any other information, a probe averaging approach is often a safe place to start, especially if only a small number of probes are expected to change between one condition and another. If there is reason to believe a large fraction of the probes in the set may change in response to experimental variables such as treatment type, or time after dosing, then a norm-finding algorithm may be preferred, in order to single out the relatively few probes which are invariant.
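To make the idea of an algorithmic choice concrete, the sketch below ranks probes by a pairwise stability score loosely modeled on the geNorm measure M of Vandesompele et al. It is a simplified illustration, not the workbench's algorithm: it computes the score once and does not iteratively eliminate the least stable probe, and the probe names and values are made up.

```python
import math
import statistics

def stability_scores(data):
    """Return a geNorm-style stability score per probe (lower = more stable).

    data: dict mapping probe name -> list of positive signals, one per sample,
          with the same sample order for every probe.
    """
    scores = {}
    for probe, values in data.items():
        spreads = []
        for other, other_values in data.items():
            if other == probe:
                continue
            # Log ratio of the two probes in each sample. If the ratio is
            # constant, both probes track the same technical variation.
            log_ratios = [math.log2(a / b) for a, b in zip(values, other_values)]
            spreads.append(statistics.stdev(log_ratios))
        # Score: average spread of the log ratios against all other probes.
        scores[probe] = sum(spreads) / len(spreads)
    return scores

# Hypothetical data: probe_a, probe_b and probe_c rise and fall together
# across the four samples; probe_d does not follow them.
data = {
    "probe_a": [100.0, 200.0, 50.0, 400.0],
    "probe_b": [300.0, 600.0, 150.0, 1200.0],
    "probe_c": [80.0, 160.0, 40.0, 320.0],
    "probe_d": [500.0, 90.0, 700.0, 60.0],
}
scores = stability_scores(data)
ranked = sorted(scores, key=scores.get)  # most stable first
```

In this example probe_d ends up last in the ranking, so it would be the first candidate to exclude from the normalizer set.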
In FirePlex Analysis Workbench, all three methods above are supported. In the probe table, each probe has a checkbox to indicate whether or not it is being used as a normalization probe.
To choose probes manually, check off the desired probes. The data set is recalculated on the fly each time a probe is checked or unchecked; there is no "apply" button.
To choose probes automatically, use the normalization button. The workbench will check off the probes it selects, as if you had checked them manually. You can inspect the algorithm's choices, and override them with the checkboxes.
To choose the algorithm used, use the top menu item Analyze -> Choose Normalization Scheme.
There are situations where it is not appropriate to expect some quantity to be the same across all the samples in an experiment. For instance, in a dilution series, the replicates within a dilution cohort would be expected to have comparable signal levels for all targets, but between cohorts, the levels should be distinctly different. Or, in samples originating from serum vs samples originating from plasma, it is not a priori obvious that the mix of targets in the replicates of one should have the same average as the mix of targets in the replicates of the other.
In such situations, where there is a mix of apples and a mix of oranges, so to speak, rather than trying to normalize so that all the apples and oranges have the same value of some quantity, it may be preferable to do a more limited normalization in which each apple is normalized using only data from apples, each orange is normalized using only data from oranges, and no attempt is made to force the apples to look like the oranges.
Logically, the approach is the same as splitting the experiment into an apple experiment and an orange experiment, normalizing each experiment individually, and then re-assembling the data into a single experiment.
The workbench automates this type of normalization if (a) a variable called Replicate is defined for the samples and (b) normalization by replicates is chosen as the normalization scheme. Any of the usual normalization schemes (geNorm, average) can be combined with replicates; it simply means that the usual scheme is only applied within the sub-experiment consisting of one replicate group.
The overall benefits of normalization will still apply within the groups; the difference between "hot" and "cold" wells will be taken out, and only the true sample-to-sample variation in each probe remains.
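The split, normalize, and re-assemble logic described above can be sketched as follows. This is an illustration only, not the workbench's code: it assumes probe averaging as the underlying scheme, represents the Replicate variable as a plain dictionary, and uses made-up sample names and values.

```python
from collections import defaultdict

def normalize_by_average(samples):
    """Scale each sample so its mean signal equals the mean over these samples."""
    means = {name: sum(p.values()) / len(p) for name, p in samples.items()}
    target = sum(means.values()) / len(means)
    return {name: {probe: v * target / means[name] for probe, v in p.items()}
            for name, p in samples.items()}

def normalize_within_groups(samples, replicate):
    """Apply normalize_by_average separately inside each replicate group."""
    # Split the experiment into one sub-experiment per replicate group.
    groups = defaultdict(dict)
    for name, probes in samples.items():
        groups[replicate[name]][name] = probes
    # Normalize each sub-experiment on its own, then re-assemble.
    result = {}
    for members in groups.values():
        result.update(normalize_by_average(members))
    return result

# Hypothetical dilution series with two replicates per dilution.
samples = {
    "A01": {"probe_a": 100.0, "probe_b": 300.0},
    "A02": {"probe_a": 150.0, "probe_b": 450.0},  # "hot" replicate of A01
    "B01": {"probe_a": 10.0, "probe_b": 30.0},
    "B02": {"probe_a": 20.0, "probe_b": 60.0},    # "hot" replicate of B01
}
replicate = {"A01": "undiluted", "A02": "undiluted", "B01": "1:10", "B02": "1:10"}
normalized = normalize_within_groups(samples, replicate)
```

Replicates within a group end up at the same level, while the difference between the undiluted and diluted groups is preserved.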
 Vandesompele et al, "Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes", Genome Biology 2002, 3(7)
 Andersen C.L., Ledet-Jensen J., Ørntoft T.: "Normalization of real-time quantitative RT-PCR data: a model based variance estimation approach to identify genes suited for normalization - applied to bladder- and colon-cancer data-sets". Cancer Research. 2004 (64): 5245-5250