Score tests for identifying locally dependent item pairs have been proposed for binary item response models. In this article, both the bifactor and the threshold shift score tests are generalized to the graded response model. For the bifactor test, the generalization is straightforward: it adds one secondary dimension associated with only one pair of items. For the threshold shift test, however, multiple generalizations are possible; in particular, conditional, uniform, and linear shift tests are discussed in this article. Simulation studies show that all of the score tests have accurate Type I error rates given large enough samples, although their small-sample behaviour is not as good as that of Pearson's X² and M₂, which have been proposed in other studies for the purpose of local dependence (LD) detection. Each score test has its highest power when the LD is consistent with its parametric form, and in this case it is uniformly more powerful than X² and M₂; even wrongly specified score tests are more powerful than X² and M₂ in most conditions. An example using empirical data is provided for illustration.