Introduction to Item Response Theory
Item Response Theory (IRT) has many attractive features and advantages over Classical Test Theory, which have contributed to its popularity in many measurement applications. Although IRT relies on some strong assumptions, it is useful and practical in many settings, such as educational and psychological testing. IRT models describe a probabilistic relationship between the response an examinee gives to a test item (or items) and a latent trait, such as math or reading ability or a personality trait. This article defines the popular unidimensional models and interprets their parameters.
The most general of the common dichotomous models is the three-parameter logistic model (3PLM), which is given below:
$$P_i(\theta) = c_i + (1 - c_i)\,\frac{e^{a_i(\theta - b_i)}}{1 + e^{a_i(\theta - b_i)}}$$
where $P_i(\theta)$ is the probability of a correct response to item $i$, given an ability level of $\theta$. The item parameters are $a_i$, $b_i$, and $c_i$ and refer to characteristics of the items themselves.
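The model can be sketched in a few lines of Python. This is a minimal illustration that assumes the usual logistic form, P = c + (1 - c) / (1 + exp(-a(θ - b))), without the optional scaling constant D = 1.7 that some texts include. The function name p_3pl and the item values are hypothetical and chosen only for illustration.

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    Assumes the logistic form with no scaling constant D.
    """
    z = a * (theta - b)
    logistic = 1.0 / (1.0 + math.exp(-z))  # same as exp(z) / (1 + exp(z))
    return c + (1.0 - c) * logistic

# Hypothetical item (illustrative values only): a=1.2, b=0.5, c=0.2
for theta in (-2.0, 0.0, 0.5, 2.0):
    print(f"theta={theta:+.1f}  P={p_3pl(theta, a=1.2, b=0.5, c=0.2):.3f}")
```

The probability rises with ability and never drops below c, which is the behavior the parameter descriptions rely on.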
The b-parameter is often referred to as the item difficulty, and it is the point on the ability scale where the examinee has a probability of $(1 + c_i)/2$ of answering the item correctly. When $c_i$ is zero, this is the point where the examinee has a 50% chance of getting the item correct. The b-parameter is on the same metric as the student ability parameter, $\theta$, so a student whose ability equals the item difficulty has a probability of $(1 + c_i)/2$ of answering the item correctly. A student with a higher ability has a higher probability of a correct response, and a student with a lower ability has a lower probability. Though the scale for the ability and difficulty parameters is arbitrary, most IRT software scales the student parameters so that the mean ability is 0 and the standard deviation is 1. This means that an item with a b-parameter of 0 is usually considered to be of average difficulty, and that b-values generally fall in the range of -3 to +3.
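The $(1 + c_i)/2$ value follows directly from the logistic form of the model, assumed here without a scaling constant. Setting $\theta = b_i$ makes the exponent zero:

$$P_i(b_i) = c_i + (1 - c_i)\,\frac{e^{0}}{1 + e^{0}} = c_i + \frac{1 - c_i}{2} = \frac{1 + c_i}{2}$$

For a hypothetical item with $c_i = 0.25$, for instance, an examinee with $\theta = b_i$ would have a probability of $0.625$ of answering correctly.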
The a-parameter is commonly referred to as the item discrimination parameter, and is proportional to the slope of the tangent line at the point on the $\theta$ scale equal to the b-parameter. The higher the value, the more an item contributes to a student's ability estimate. Typically, it is desirable to have items with a-values of 1 or higher, but content constraints and the difficulty of creating highly discriminating items at differing ability levels generally mean that items with a-values lower than 1 are often used.
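The proportionality can be made explicit with a short derivation. This sketch assumes the logistic form without a scaling constant; if a constant $D$ multiplies the exponent, the slope is multiplied by $D$ as well. Writing $L(\theta) = 1 / (1 + e^{-a_i(\theta - b_i)})$, so that $P_i(\theta) = c_i + (1 - c_i)L(\theta)$:

$$\frac{dP_i}{d\theta} = a_i (1 - c_i)\, L(\theta)\,\bigl(1 - L(\theta)\bigr), \qquad \left.\frac{dP_i}{d\theta}\right|_{\theta = b_i} = \frac{a_i (1 - c_i)}{4}$$

since $L(b_i) = 1/2$. The slope at the item difficulty therefore grows linearly with $a_i$ and shrinks as $c_i$ increases.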
The c-parameter is the pseudo-guessing parameter, often simply called the guessing parameter, and is the height of the lower asymptote of the curve. This value gives the probability that a person of very low ability answers the item correctly. The curve generated by these item parameters is referred to as the Item Characteristic Curve (ICC) or the Item Characteristic Function (ICF). A graphical representation of an ICC, with the corresponding parameters, is presented in Figure 1 below:
Figure 1: Graphical Representation of Item Characteristic Curve
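The shape of the curve can also be traced numerically. The sketch below tabulates a few points on an ICC and prints a crude text bar for each, assuming the logistic form without a scaling constant. The helper name icc_points, the grid, and the item parameters are hypothetical choices made only for illustration.

```python
import math

def icc_points(a, b, c, lo=-4.0, hi=4.0, step=1.0):
    """Return (theta, P) pairs tracing the item characteristic curve."""
    points = []
    n_steps = int(round((hi - lo) / step))
    for k in range(n_steps + 1):
        theta = lo + k * step
        p = c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
        points.append((theta, p))
    return points

# Hypothetical item parameters, chosen only for illustration.
for theta, p in icc_points(a=1.0, b=0.0, c=0.2):
    bar = "#" * int(round(p * 40))  # bar length proportional to P
    print(f"{theta:+.1f}  {p:.3f}  {bar}")
```

At the low end of the grid the probabilities flatten out just above c, and at the high end they approach 1, which is the S-shape an ICC plot shows.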
Other popular IRT models are special cases of the more general three-parameter logistic model. The two-parameter model is the case where the c-parameter is set equal to zero. The one-parameter model, or the Rasch model, is obtained when the c-parameter is zero and the a-parameter is set equal to 1 for all items.
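The nesting of the models can be shown directly in code. In this sketch, which assumes the logistic form without a scaling constant, the two- and one-parameter models are thin wrappers that fix parameters of the three-parameter function. The function names are hypothetical.

```python
import math

def p_3pl(theta, a, b, c):
    """Three-parameter logistic model (no scaling constant assumed)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def p_2pl(theta, a, b):
    """Two-parameter model: the 3PL with c fixed at 0."""
    return p_3pl(theta, a, b, c=0.0)

def p_1pl(theta, b):
    """One-parameter (Rasch) model: the 3PL with c = 0 and a = 1."""
    return p_3pl(theta, a=1.0, b=b, c=0.0)
```

With c fixed at zero, an examinee whose ability equals the item difficulty has exactly a 50% chance of a correct response under either reduced model.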
Use of these less general models is often called for in specific situations. For an open-response item, where the student has little chance of guessing the right answer, a two-parameter model is a more appropriate choice. When there are too few examinees to obtain high-quality estimates of the a- and c-parameters, the one-parameter model is often chosen. Though the one-parameter model provides less information about the item's true nature, the quality of the parameter estimates is often much higher under this condition, and using less information of higher quality is often preferable to using more information of lesser quality.

