Applying deep neural networks to unstructured text notes in electronic medical records for phenotyping youth depression




We report a study of machine learning applied to the phenotyping of psychiatric diagnosis for research recruitment in youth depression, conducted with 861 labelled electronic medical record (EMR) documents. A model was built that could accurately identify individuals who were suitable candidates for a study on youth depression.


Our objective was to build a model that identifies individuals who meet inclusion criteria as well as unsuitable patients who would require exclusion.


Our methods included applying a system that coded the EMR documents by removing personally identifying information, having two psychiatrists label a set of EMR documents (from which the 861 came), running a brute-force search, and training a deep neural network for this task.


In a cross-validation evaluation, one model had a specificity of 97% and a sensitivity of 45%, and a second model had a specificity of 53% and a sensitivity of 89%. We combined these two models into a third (sensitivity 93.5%; specificity 68%; positive predictive value (precision) 77%) to generate a list of the most suitable candidates in support of research recruitment.
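For readers less familiar with the metrics quoted above, the following minimal Python sketch shows how sensitivity, specificity and positive predictive value are computed from binary labels and predictions. It is an illustration only, not the study's evaluation code: the function names and the toy labels are hypothetical, and the label convention (1 = suitable candidate, 0 = unsuitable) is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ConfusionCounts:
    tp: int  # suitable candidates correctly flagged
    fp: int  # unsuitable patients wrongly flagged
    tn: int  # unsuitable patients correctly rejected
    fn: int  # suitable candidates missed


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionCounts:
    """Tally the four confusion-matrix cells (assumed convention: 1 = suitable)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return ConfusionCounts(tp, fp, tn, fn)


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of truly suitable candidates that the model flags."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of truly unsuitable patients that the model rejects."""
    return c.tn / (c.tn + c.fp)


def positive_predictive_value(c: ConfusionCounts) -> float:
    """Fraction of flagged patients who are truly suitable (precision)."""
    return c.tp / (c.tp + c.fp)


# Toy, made-up labels purely for illustration; not data from the study.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_counts = confusion_counts(toy_true, toy_pred)
```

A model with high specificity but low sensitivity, like the first one reported, rarely flags unsuitable patients but misses many suitable ones; the second model shows the opposite trade-off.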


Our efforts are meant to demonstrate the potential of this type of approach for patient recruitment, but a larger sample size is required to build a truly reliable recommendation system.

Clinical implications

Future efforts will employ alternative neural network algorithms and other machine learning methods.
