General Considerations

The methods used for analysis of high-throughput screening data are as important as the screening protocols. There is no one correct method, and different possibilities should be evaluated for individual screens during the screen development process. Some general considerations are discussed below.

Many assays involve a time-dependent readout and therefore have background and signal intensity levels that vary over time and from plate to plate. Any screen with appreciable plate-to-plate changes in signal intensity and background should first be rescaled as fold induction, that is, by dividing each observed well value by the plate median or by the median of that plate's control wells (depending on the experimental design). In general, the plate median is a more reliable reference than the plate mean for rescaling or normalization because it is less affected by outlier values.
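The fold-induction scaling described above can be sketched as follows. This is a minimal illustration rather than a prescribed implementation: the function name fold_induction, the representation of a plate as a list of rows, and the example readings are all assumptions made for the sketch.

```python
from statistics import median


def fold_induction(plate, control_wells=None):
    """Divide every well by a per-plate reference value.

    plate: list of rows, each a list of raw well readings (assumed layout).
    control_wells: optional list of (row, column) positions of control wells.
    With no control wells given, the median of the whole plate is used.
    """
    if control_wells is None:
        reference = median(value for row in plate for value in row)
    else:
        reference = median(plate[r][c] for r, c in control_wells)
    return [[value / reference for value in row] for row in plate]


# Hypothetical readings; the last well is a strong outlier.
example_plate = [[100.0, 110.0, 90.0],
                 [105.0, 95.0, 400.0]]

by_plate_median = fold_induction(example_plate)
by_controls = fold_induction(example_plate,
                             control_wells=[(0, 0), (0, 1), (0, 2)])
```

Because the reference is a median, the single high well in the example barely shifts it, whereas it would pull a plate mean upward and compress every fold-induction value on the plate.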

Screens without appreciable time- or plate-based variation in signal intensity should forgo the fold-induction calculation and simply be normalized on a plate-by-plate basis by calculating a z-score for each readout value, that is, the number of standard deviations by which the value differs from the plate mean. Assuming the readout values are approximately normally distributed, these z-scores can be used to estimate the probability that a screening positive is genuine and not due to background noise.
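A plate-by-plate z-score calculation might look like the sketch below. The function name plate_z_scores and the example readings are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which estimator to use.

```python
from statistics import mean, stdev


def plate_z_scores(plate):
    """Return z-scores for one plate, computed from that plate's own wells.

    plate: list of rows, each a list of well readings (assumed layout).
    """
    values = [value for row in plate for value in row]
    plate_mean = mean(values)
    plate_sd = stdev(values)  # sample SD (n - 1); an assumption, see lead-in
    return [[(value - plate_mean) / plate_sd for value in row]
            for row in plate]


# Hypothetical readings for a single small plate.
example_plate = [[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]]
z = plate_z_scores(example_plate)
```

Each plate is processed separately, so the mean and standard deviation are never pooled across plates.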

As mentioned earlier, duplicate data points are very important for determining which compounds are genuine positives meriting follow-up. As an example, consider one pair of duplicate data points with z-scores of 0.5 and 2.0. Assuming normally distributed values and counting deviations in either direction (a two-tailed probability), a z-score of 0.5 corresponds to a deviation that would arise by chance about 61.7% of the time, whereas a z-score of 2.0 corresponds to a deviation that would arise by chance about 4.5% of the time. Averaging these probabilities gives a 33.1% chance of random occurrence. In comparison, a compound with a z-score of 1.5 in both duplicates has a 13.4% chance of random occurrence for each measurement, and therefore a 13.4% average, making it a better screening positive to follow up.
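The probabilities in this example can be checked with a few lines of code. The sketch assumes a standard normal distribution and a two-tailed probability, which is the interpretation that matches the quoted figures; the function name two_tailed_p is hypothetical.

```python
import math


def two_tailed_p(z):
    """Probability of a deviation at least as large as |z| in either
    direction under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Duplicate pair with discordant z-scores.
p_low = two_tailed_p(0.5)    # approximately 0.617
p_high = two_tailed_p(2.0)   # approximately 0.0455
discordant_average = (p_low + p_high) / 2   # approximately 0.331

# Duplicate pair with a z-score of 1.5 in both measurements.
p_concordant = two_tailed_p(1.5)            # approximately 0.134
concordant_average = (p_concordant + p_concordant) / 2

print(f"{discordant_average:.3f} vs {concordant_average:.3f}")
```

The consistent pair has the lower averaged probability of random occurrence even though its highest single z-score is lower, which is the point the example makes about the value of duplicates.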