To develop prediction models for Chlamydia trachomatis (Ct) infection with different levels of detail in information, that is, from readily available data in registries and from additional questionnaires.
Methods
All inhabitants of Rotterdam and Amsterdam aged 16–29 were invited yearly from 2008 until 2011 for home-based testing. Their registry data included gender, age, ethnicity and neighbourhood-level socioeconomic status (SES). Participants were asked to fill in a questionnaire on education, sexually transmitted infection history, symptoms, partner information and sexual behaviour. We developed two prediction models for Ct infection using first-time participant data, one including registry variables only and one with additional questionnaire variables, by multilevel logistic regression analysis to account for clustering within neighbourhoods. We assessed the discriminative ability by the area under the receiver operating characteristic curve (AUC).
Results
Four per cent (3540/80 385) of the participants were infected. The strongest registry predictors for Ct infection were young age (especially for women) and Surinamese, Antillean or sub-Saharan African ethnicity. Neighbourhood-level SES was of minor importance. Strong questionnaire predictors were low to intermediate education level, ethnicity of the partner (non-Dutch) and having sex with casual partners. When using a prediction model including questionnaire risk factors (AUC 0.74, 95% CI 0.736 to 0.752) for selective screening, 48% of the participating population needed to be screened to find 80% (95% CI 78.4% to 81.0%) of Ct infections. The model with registry risk factors only (AUC 0.67, 95% CI 0.656 to 0.675) required 60% to be screened to find 78% (95% CI 76.6% to 79.4%) of Ct infections.
Conclusions
A registry-based prediction model can facilitate selective Ct screening at population level, with further refinement at the individual level by including questionnaire risk factors.
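The two summary measures reported in the Results, the AUC and the share of the population that must be screened to find a given share of infections, can be illustrated with a minimal sketch. This is not the study's analysis code: the function names, the toy risk scores and the toy infection labels below are all hypothetical, and the risk scores are assumed to come from an already fitted prediction model.

```python
def auc(scores, labels):
    """AUC as the probability that an infected person is ranked above an
    uninfected one (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fraction_screened(scores, labels, target_sensitivity=0.80):
    """Screen from highest to lowest predicted risk and return
    (share of population screened, share of infections found) at the
    first point where the target share of infections is reached."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_infected = sum(labels)
    found = 0
    for n_screened, (_, infected) in enumerate(ranked, start=1):
        found += infected
        if found / total_infected >= target_sensitivity:
            return n_screened / len(ranked), found / total_infected
    return 1.0, 1.0


# Hypothetical toy data: predicted risks and true infection status (1 = infected).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]

toy_auc = auc(toy_scores, toy_labels)
screened, sensitivity = fraction_screened(toy_scores, toy_labels, 0.80)
print(f"AUC={toy_auc:.2f}, screened={screened:.0%}, infections found={sensitivity:.0%}")
```

In this toy example, screening the 60% of people with the highest predicted risk finds 80% of the infections; a model with better discrimination would reach the same sensitivity while screening a smaller share.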