The current study examined the reliability of the Autism Diagnostic Observation Schedule (ADOS) across the continuum of severity of autism spectrum disorder (ASD) core deficits. Modules 3 and 4 of the ADOS assess the deficits of ASD in 2 core domains (Social Affect and Restrictive and Repetitive Behaviors [RRB]) among verbally fluent children, adolescents, and adults, and ADOS diagnostic classification of ASD is based on a total score that combines the 2 domains. Currently, the total and domain scores are calculated using only a subset of the administered items. This study used an item response theory (IRT) approach to examine whether scores from the ADOS Modules 3 and 4 item sets under the revised scoring algorithm are adequately reliable around the diagnostic threshold of the total score, as well as across the hypothesized continuum of the Social Affect and RRB domains. The study also examined whether the reliability of the ASD domains measured by the ADOS improves when items that are administered but excluded from the current diagnostic algorithm are incorporated. Measurement precision was estimated using IRT models, which allow reliability to be examined at each point along the continuum of ASD domain severity rather than as a single summary value. Results suggest that although the ADOS Modules 3 and 4 are reliable at the diagnostic threshold using only the scoring algorithm items, including additional items can improve the reliability of scores at moderately low and moderately high levels of ASD severity. However, even with additional items, the ADOS Modules 3 and 4 do not allow for adequately reliable measurement of restrictive and repetitive behaviors.