Parallel antibody germline gene and haplotype analyses support the validity of immunoglobulin germline gene inference and discovery
Analysis of antibody repertoire development and of specific antibody responses important in, for example, autoimmune conditions, allergy, and protection against disease is supported by high-throughput sequencing and associated bioinformatics pipelines that describe the diversity of the encoded antibody variable domains. Proper assignment of sequences to germline genes is important for many such processes, for instance in the analysis of somatic hypermutation. Germline gene inference from antibody-encoding transcriptomes, using tools such as TIgGER or IgDiscover, has the potential to enhance the quality of such analyses. These tools may also be used to identify previously unknown germline genes. In this study, we used such software for germline gene inference and defined aspects of analysis settings and pre-existing knowledge of germline genes that affect the outcome of gene inference. Furthermore, we demonstrate the capacity of IGHJ and IGHD haplotype inference, whenever subjects are heterozygous with respect to such genes, to lend support to IGHV gene inference in general, and to the identification of novel alleles not presently recognized by germline gene reference directories. We propose that such haplotype analysis should, whenever possible, be used in future best practice to support the outcome of germline gene inference. IGHJ-directed haplotype inference was also used to identify haplotypes not expressing some IGHV germline genes. In particular, we identified a haplotype that did not express several major germline genes such as IGHV1-8, IGHV3-9, IGHV3-15, IGHV1-18, IGHV3-21, and IGHV3-23. We envisage that haplotype analysis will provide an efficient approach to identify subjects for further studies of the link between the available immunoglobulin repertoire and outcomes of immune responses.
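The principle behind IGHJ-directed haplotype inference can be illustrated with a minimal sketch. It is not the pipeline used in this study, and it does not call TIgGER or IgDiscover. It assumes that each rearrangement has already been assigned an IGHV allele and an IGHJ allele, and that the subject is heterozygous for one IGHJ gene. Because a rearranged IGHV gene and its IGHJ gene lie on the same chromosome, an IGHV allele that associates almost exclusively with one of the two IGHJ alleles can be placed on that haplotype. The function names, the thresholds `min_fraction` and `min_count`, and the toy allele counts are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def infer_haplotypes(rearrangements, anchor_alleles, min_fraction=0.9, min_count=5):
    """Assign each IGHV allele to one of two anchor IGHJ alleles.

    rearrangements: iterable of (v_allele, j_allele) pairs, one per rearrangement.
    anchor_alleles: the two IGHJ alleles for which the subject is heterozygous.
    min_fraction, min_count: illustrative thresholds, not values from the study.
    """
    a, b = anchor_alleles
    counts = defaultdict(Counter)
    for v_allele, j_allele in rearrangements:
        # Only rearrangements using one of the anchor alleles are informative.
        if j_allele in (a, b):
            counts[v_allele][j_allele] += 1

    assignment = {}
    for v_allele, c in counts.items():
        total = c[a] + c[b]
        if total < min_count:
            assignment[v_allele] = "insufficient data"
        elif c[a] / total >= min_fraction:
            assignment[v_allele] = a
        elif c[b] / total >= min_fraction:
            assignment[v_allele] = b
        else:
            assignment[v_allele] = "both"
    return assignment


def genes_per_haplotype(assignment, anchor_alleles):
    """Collect the IGHV genes observed on each anchor-defined haplotype."""
    present = {anchor: set() for anchor in anchor_alleles}
    for v_allele, call in assignment.items():
        gene = v_allele.split("*")[0]
        if call == "both":
            for anchor in anchor_alleles:
                present[anchor].add(gene)
        elif call in present:  # skips "insufficient data"
            present[call].add(gene)
    return present


# Toy data, invented for illustration only.
anchors = ("IGHJ6*02", "IGHJ6*03")
toy_rearrangements = (
    [("IGHV1-69*01", "IGHJ6*02")] * 10
    + [("IGHV1-69*02", "IGHJ6*03")] * 10
    + [("IGHV3-23*01", "IGHJ6*02")] * 12
    + [("IGHV3-23*01", "IGHJ6*03")] * 1
    + [("IGHV4-34*01", "IGHJ6*02")] * 8
    + [("IGHV4-34*01", "IGHJ6*03")] * 8
    + [("IGHV1-2*02", "IGHJ4*02")] * 5  # not an anchor allele, ignored
)

toy_assignment = infer_haplotypes(toy_rearrangements, anchors)
toy_present = genes_per_haplotype(toy_assignment, anchors)
# Genes seen on the IGHJ6*02 haplotype but not on the IGHJ6*03 haplotype.
toy_missing = toy_present["IGHJ6*02"] - toy_present["IGHJ6*03"]
```

In this toy case the two alleles of one gene segregate to opposite haplotypes, which is the pattern expected of two genuine alleles, whereas a gene observed on only one haplotype becomes a candidate for a gene that is not expressed from the other.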