Social Media Crowdsourcing in Health Care Research
In this issue of Medical Care, Mortensen et al9 report an extremely informative and useful analysis of self-rated health status, health-related behaviors, and other characteristics of a population surveyed with an innovative, commercially available, low-cost data collection platform, the Amazon Mechanical Turk. The comparison with conventionally collected reference datasets reveals both contrasts and similarities. To begin with, the populations differ substantially. The crowd-sourced population is younger, more female, whiter, more educated, less likely to be married, and lower in socioeconomic status than those in the reference weighted Medical Expenditure Panel Survey (MEPS) and Behavioral Risk Factor Surveillance System (BRFSS). Respondents rate their health status lower, smoke less, and drink more. One is tempted to caricature the Mechanical Turk population as hypochondriac, underemployed hipsters, self-identified as women, sipping craft beer and nibbling kale salads while sharing avocado toast in the cafes of gentrifying neighborhoods, perhaps in Brooklyn, Austin, or Portland, and to dismiss the platform's potential for improving the efficiency and efficacy of population surveys in health care research. To do so would be to fall prey to the representativeness heuristic. Still more importantly, we would fail to recognize the potential of this emerging disruptive technology.
The Amazon Mechanical Turk (MTurk) platform, one of many crowd-sourced tools,10 is named after an 18th-century hoax that purported to be an automaton capable of playing chess at the master level; in fact, a human chess player was concealed within the deliberately opaque and confusing apparatus, which left space inside for the actual operator.11 By contrast, the Amazon Mechanical Turk website is transparent with respect to its internal mechanisms, up to a point.12
Businesses or other customers, such as health care researchers, vendors, manufacturers, or providers, pay people in cyberspace to perform “Human Intelligence Tasks” requiring judgment or other cognitive skills not yet performed by artificial intelligence. These services, such as categorization, sentiment analysis, tagging, and feedback, are commonly carried out by knowledge workers in other business contexts. Customers might provide additional compensation to reach specific demographic categories defined by age, industry and job category, wealth proxies, income level, sex, marital status, parenthood, and other variables. In our medical research domain, the human is not the vehicle but the subject of the survey, and these variables might be adapted to construct representative samples of target populations prospectively or, with modeling and its attendant underlying assumptions, to back out a pseudorepresentative population from a nonprobability convenience sample acquired without prospectively defined population characteristics.
So how do crowd-sourced data compare with more conventional methodologies? Pretty well, actually, considering that the gold standard surveys themselves rely on internal weighting. However, some caution is required before leaping into this brave new world. First, we must learn to validate these new survey tools externally; second, we must determine the risks involved in specific use cases for surveys conducted with crowdsourcing.