A T-score is a standardized score that lets you compare scores from tests with different scales and from different classes. T-scores are scaled so that the test mean is 50 and the standard deviation is 10, so a T-score indicates how far a particular score lies from the average. When the scores are normally distributed, approximately 68% of the students have T-scores between 40 and 60. The T-score is similar to the Z-score, which is scaled so that the mean is 0 and the standard deviation is 1.
The relation between a Z-score and a T-score is as follows: if $\mu$ is the mean of the test scores and $\sigma$ is their standard deviation, then the Z-score $z$ of an individual test score $x$ is calculated as
$$z = \frac{x - \mu}{\sigma}$$
and then the T-score $t$ is calculated as
$$t = 50 + 10z.$$
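As a rough illustration, the conversion from raw scores to T-scores can be sketched in Python. The function name `t_scores`, the sample scores, and the use of the population standard deviation (`statistics.pstdev`) are assumptions made for this example; the text does not say which form of the standard deviation is used.

```python
from statistics import mean, pstdev


def t_scores(scores):
    """Convert raw scores to T-scores (mean 50, standard deviation 10)."""
    mu = mean(scores)
    sigma = pstdev(scores)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("all scores are identical; Z-scores are undefined")
    # z = (x - mu) / sigma, then t = 50 + 10 * z
    return [50 + 10 * (x - mu) / sigma for x in scores]


raw = [55, 60, 65, 70, 75, 80, 85]  # made-up scores for illustration
print([round(t, 1) for t in t_scores(raw)])
# prints [35.0, 40.0, 45.0, 50.0, 55.0, 60.0, 65.0]
```

In this made-up class the mean is 70 and the standard deviation is 10, so a raw score of 80 sits one standard deviation above the mean and maps to a T-score of 60.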
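The point-biserial correlation coefficient is another measure of the relationship between the score on the item and the score on the test. The formula to calculate the correlation is
$$r_{pb} = \frac{M_p - M_q}{s}\sqrt{pq}$$
where $M_p$ is the mean test score of the students who answered the item correctly, $M_q$ is the mean test score of the students who answered the item incorrectly, $s$ is the standard deviation of the test scores for all students, $p$ is the proportion of students who answered the item correctly, and $q = 1 - p$.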
The value of this statistic ranges from -1.00 to 1.00. A high positive value indicates that those who answered the item correctly also received higher scores on the test than those who answered the item incorrectly. A high negative value indicates that those who answered the item correctly received low scores on the test and those who answered the item incorrectly did well on the test. A near zero value indicates that there is little relationship between the score on the item and the score on the test. It is desirable to retain items with a high positive correlation coefficient and to eliminate those with near zero or negative values. As a rough guide, it is suggested that items with large negative or near-zero correlations be eliminated or substantially revised and those with low positive correlations be studied to determine how improvement might be accomplished.
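The following is a minimal sketch of the point-biserial calculation, assuming the common form $r_{pb} = \frac{M_p - M_q}{s}\sqrt{pq}$, where $M_p$ and $M_q$ are the mean test scores of students who got the item right and wrong, $s$ is the population standard deviation of all test scores, $p$ is the proportion who got the item right, and $q = 1 - p$. The function name and the sample data are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(item_correct, total_scores):
    """Point-biserial correlation between one item (1 = correct, 0 = incorrect)
    and the total test score."""
    right = [t for c, t in zip(item_correct, total_scores) if c]
    wrong = [t for c, t in zip(item_correct, total_scores) if not c]
    s = pstdev(total_scores)  # assumption: population standard deviation
    if not right or not wrong or s == 0:
        raise ValueError("need both correct and incorrect answers and varying scores")
    p = len(right) / len(total_scores)
    q = 1 - p
    return (mean(right) - mean(wrong)) / s * sqrt(p * q)


item = [1, 1, 1, 0, 0]    # made-up item results for five students
totals = [9, 8, 7, 4, 2]  # made-up total test scores
print(round(point_biserial(item, totals), 3))  # prints 0.939
```

Here the students who answered correctly are also the ones with the highest totals, so the coefficient is strongly positive and the item would be retained.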
The validity, or discrimination index, for an individual item (i.e. question) compares two groups of students: those who scored in the top 27% of the class and those who scored in the bottom 27%. The value is calculated by subtracting the fraction of the bottom 27% who answered correctly on the item from the fraction of upper 27% who answered correctly.
The value of this statistic ranges from –1.0 to 1.0. If an item carries a high validity, it means that overall, high scoring individuals (i.e., those with high scores on the total test) answered the item correctly while low scoring individuals tended to miss the item. Therefore, an item with high validity has a high correlation with the total test score. If one considers the total test score to be a better indicator of a student’s knowledge, then the higher the relationship between the item and the total test, the more valid the item.
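The upper and lower group comparison can be sketched as follows. The function name, the rounding rule for the group size, and the handling of tied scores (students keep their input order) are assumptions for this example, not details taken from Opscan.

```python
def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Fraction correct in the top group minus fraction correct in the bottom group."""
    n = len(total_scores)
    if n == 0:
        raise ValueError("no students")
    k = max(1, round(n * fraction))  # group size; the rounding rule is an assumption
    # Student indices ordered from highest to lowest total score.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper, lower = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower


totals = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]  # made-up total scores
item = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]     # made-up results on one item
print(round(discrimination_index(item, totals), 2))  # prints 0.67
```

With ten students, each group holds three students: all three in the top group answered correctly and one of three in the bottom group did, giving 1.0 minus 0.33.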
The difficulty of an item is the proportion of the entire class who answered the item correctly.
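As a small sketch, the difficulty of every item can be computed from a table of scored responses. The layout assumed here (one row per student, one 0/1 entry per item) and the function name are illustrative choices.

```python
def item_difficulty(responses):
    """Proportion of the class answering each item correctly.

    responses: one list per student, with 1 for a correct answer and 0 otherwise.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        sum(student[i] for student in responses) / n_students
        for i in range(n_items)
    ]


responses = [  # made-up data: four students, three items
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
print(item_difficulty(responses))  # prints [0.75, 0.75, 0.25]
```

Note that with this definition a higher value means an easier item: the third item, answered correctly by only a quarter of the class, is the hardest.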
The percentile rank for a particular grade is calculated as follows:
![]()
where
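One commonly used definition of percentile rank counts the scores below a given score plus half of the scores equal to it, divided by the total number of scores and multiplied by 100. The sketch below assumes that convention, which may differ from the one Opscan applies; the function name and data are hypothetical.

```python
def percentile_rank(score, all_scores):
    """Percentile rank of `score` within `all_scores`.

    Assumed convention: 100 * (count below + 0.5 * count equal) / N.
    """
    below = sum(1 for s in all_scores if s < score)
    equal = sum(1 for s in all_scores if s == score)
    return 100 * (below + 0.5 * equal) / len(all_scores)


scores = [50, 60, 70, 70, 80]  # made-up class scores
print(percentile_rank(70, scores))  # prints 60.0
```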
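Although Frederic Kuder and M. Richardson developed a number of formulas for estimating reliability, formula 20 (KR-20) is the one used by Opscan. The formula is as follows:
$$r = \frac{k}{k - 1}\left(1 - \frac{\sum_{i=1}^{k} p_i q_i}{\sigma^2}\right)$$
where $r$ is the reliability estimate, $k$ is the number of items on the test, $p_i$ is the proportion of students who answered item $i$ correctly, $q_i = 1 - p_i$, and $\sigma^2$ is the variance of the total test scores.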
Reliability estimates how much of the variability in test scores results from real differences among the students taking the test rather than from random error.
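A minimal sketch of Kuder-Richardson formula 20, $r = \frac{k}{k-1}\left(1 - \frac{\sum p_i q_i}{\sigma^2}\right)$, is shown below for dichotomously scored items. The function name, the data layout (one row of 0/1 item scores per student), and the use of the population variance for $\sigma^2$ are assumptions; Opscan's exact variance convention is not stated here.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for a table of 0/1 item scores."""
    n_students = len(responses)
    k = len(responses[0])  # number of items
    totals = [sum(student) for student in responses]
    var_total = pvariance(totals)  # assumption: population variance
    if k < 2 or var_total == 0:
        raise ValueError("need at least two items and varying total scores")
    pq_sum = 0.0
    for i in range(k):
        p = sum(student[i] for student in responses) / n_students
        pq_sum += p * (1 - p)  # p_i * q_i for item i
    return (k / (k - 1)) * (1 - pq_sum / var_total)


responses = [  # made-up data: five students, four items
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 3))  # prints 0.8
```

In this example the total scores are 4, 3, 2, 1, and 0 (variance 2) and the item variances sum to 0.8, so the estimate is (4/3) × (1 − 0.4) = 0.8.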
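For Opscan, the standard error of measurement is calculated as follows:
$$\mathrm{SEM} = \sigma\sqrt{1 - r}$$
where $\sigma$ is the standard deviation of the test scores and $r$ is the reliability of the test.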
The standard error of measurement is a measure of test consistency that is not affected by score variability. When the reliability is 1, the standard error of measurement is 0. When the reliability is 0, the standard error of measurement is the same as the standard deviation.
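The two limiting cases above follow from the usual relation $\mathrm{SEM} = \sigma\sqrt{1 - r}$, where $\sigma$ is the standard deviation of the test scores and $r$ is the reliability. A short sketch, with a hypothetical function name and made-up inputs:

```python
from math import sqrt


def standard_error_of_measurement(std_dev, reliability):
    """SEM = standard deviation * sqrt(1 - reliability)."""
    if reliability > 1:
        raise ValueError("reliability cannot exceed 1")
    return std_dev * sqrt(1 - reliability)


print(round(standard_error_of_measurement(10, 0.84), 2))  # prints 4.0
print(standard_error_of_measurement(10, 1))               # prints 0.0
print(standard_error_of_measurement(10, 0))               # prints 10.0
```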