A large collection of natural HIV-1 integrase (IN) sequences has not previously been described. We reasoned that analysis of such sequences would address whether natural variation of HIV-1 IN contributes to the pathogenesis of AIDS and might also identify amino acid residues important for IN function. Sequences encoding HIV-1 IN were amplified from cryopreserved lymphocytes or plasma obtained at different times from 10 hemophilia patients who had been observed for up to 17 years. The region of the HIV-1 genome that encodes the 288-amino-acid IN protein was sequenced from a total of 102 clones; information was obtained for 99.97% of 29,478 amino acid positions. Phylogenetic analysis indicated that patient samples were unique. Interpatient nucleic acid distances ranged from 0.8% to 4.9%, highlighting the tight conservation of this genomic region. No major differences were found between DNA and RNA or between early and late time points from the same patient. Significantly, no amino acid changes that might account for the variable rate of disease progression between patients were evident. Only one amino acid substitution involved a highly conserved residue known to be important for enzymatic activity. However, several interesting amino acid substitutions were noted, including substitutions at residues within the C-terminal region of the protein, for which sequence comparisons between animal retroviruses have not been very informative. These results should encourage the pursuit of anti-integrase therapies, particularly because the apparent biologic constraints on the IN sequence may deter the development of drug resistance.