There are an increasing number of clinical, administrative and trial databases that can be used for research. These are particularly valuable if there are opportunities for linkage to other databases. This paper describes examples of the use of large diabetes databases for research. It reviews the advantages and disadvantages of using large diabetes databases for research and suggests solutions for some challenges. Large, high-quality databases offer potential sources of information for research at relatively low cost. Fundamental issues for using databases for research are the completeness of capture of cases within the population and time period of interest and accuracy of the diagnosis of diabetes and outcomes of interest. The extent to which people included in the database are representative should be considered if the database is not population based and there is the intention to extrapolate findings to the wider diabetes population. Information on key variables such as date of diagnosis or duration of diabetes may not be available at all, may be inaccurate or may contain a large amount of missing data. Information on key confounding factors is rarely available for the nondiabetic or general population limiting comparisons with the population of people with diabetes. However comparisons that allow for differences in distribution of important demographic factors may be feasible using data for the whole population or a matched cohort study design. In summary, diabetes databases can be used to address important research questions. Understanding the strengths and limitations of this approach is crucial to interpret the findings appropriately.