MTurk justification

Recruitment and sampling

“A sample is representative of the population from which it is selected if the aggregate characteristics of the sample closely approximate those same aggregate characteristics in the population” (Babbie, 2012). The decision to conduct the study using MTurk was based on a review of academic papers concerning the validity of online research. Over time, MTurk has been deemed to offer randomly assigned samples that are generally more diverse than the samples recruited for research reported in top academic journals with respect to gender, socioeconomic status, geographic region, and age. In addition, the system allows for the collection of large samples, which would be cost-prohibitive using traditional recruiting methods. A summary of the tradeoffs among recruiting methods is available in the Appendix (Paolacci et al., 2010).

A pilot test of the study will be conducted to allow for refinement of the study design if indicated. A separate pilot test of the recruitment protocol alone has already been performed to see how a randomly assigned EPSEM (equal probability of selection method) sample is constructed in MTurk. The process proved uncomplicated.

Once any refinements are implemented as a result of pilot testing, a sample of 500 adults will be recruited from MTurk in return for a small cash payment. Using the service’s built-in screening criteria, only U.S. users will be eligible to participate. Participants will be excluded if they fail to correctly answer a comprehension check that asks them to identify a default option; should any exclusions occur, the analysis will be repeated with those participants included to determine whether the results change substantively. The sampling frame will be a list of unique ID numbers issued by MTurk and associated with the responses to demographic questions and the aforementioned measures of cognitive reflection and information-seeking behavior. The final sample will be analyzed for relative representativeness of gender, age, education level, household income, and self-reported level of online search expertise.
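The exclusion check described above can be sketched as a comparison of the same statistic computed with and without the excluded participants. This is only a minimal illustration: the field names (`passed_check`, `queries`), the Worker IDs, and the values are hypothetical placeholders rather than study data, and the mean stands in for whichever statistic the analysis actually reports.

```python
from statistics import mean


def sensitivity_check(records, statistic):
    """Compute a statistic on the retained sample and on all respondents."""
    included = [r for r in records if r["passed_check"]]
    return {
        "included_only": statistic(included),
        "all_respondents": statistic(records),
        "n_excluded": len(records) - len(included),
    }


def mean_queries(records):
    """Example statistic (an assumption): mean number of queries issued."""
    return mean(r["queries"] for r in records)


# Hypothetical records; "passed_check" marks the comprehension check result.
records = [
    {"worker": "W1", "passed_check": True, "queries": 4},
    {"worker": "W2", "passed_check": True, "queries": 6},
    {"worker": "W3", "passed_check": False, "queries": 2},
    {"worker": "W4", "passed_check": True, "queries": 8},
]

result = sensitivity_check(records, mean_queries)
print(result)
```

If the two values differ only trivially, the exclusions are unlikely to have changed the results substantively; what counts as a substantive difference is a judgment the sketch does not make.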

The ethical principle of autonomy contained in the Belmont Report (1979) requires informed consent so participants can judge for themselves the relevance and weight of risks. The consent form to be used in this study is derived from the UCLA IRB template and is available in the Appendix. The consent form will appear on respondents’ screens prior to the start of the survey, and will require an acceptance action and submission of the form before the survey begins.

Apart from the ability to recruit large samples, perhaps the greatest benefit of using MTurk for this study is that procedures are already in place for protecting anonymity and confidentiality. Shadish, Cook & Campbell admonish researchers to use “research procedures that can ensure confidentiality, such as using randomized response methods or determining not to gather any data about potential identifiers, and, just as important, they should ensure that such procedures are followed” (Shadish et al., 2002). The only identifying information available to the researcher will be the unique MTurk Worker ID number, which will be associated with each respondent’s answers to the survey questions.

MTurk respondents can complete experiments without interacting with researchers. This circumvents concerns about experimenter obtrusion, subject cross-communication, and reactance.  “Mechanical Turk is a reliable source of experimental data in judgment and decision-making. Results obtained in Mechanical Turk did not substantially differ from results obtained in a subject pool at a large Midwestern U.S. university. Moreover, response error was significantly lower in Mechanical Turk than in Internet discussion boards” (Paolacci et al., 2010).

The nomothetic correlational design is intended to be highly replicable. Further, its internal validity is relatively high compared to other quasi-experiments, because the study is designed to determine whether a correlation exists between a measure and observed behavior, not to assert causality. Note, however, that two aspects of internal validity are compromised by the nature of the correlational design. First, the distribution of high- and low-CRT respondents within groups defined according to gender, age, education, household income, and search expertise will not be determined until after the quasi-experiment. Therefore, differences between these groups may not necessarily be helpful in post-experimental analysis. Second, the respondents may circumvent the MTurk platform to look up the answers to the CRT questions, which are available on the Internet. Generally, the amount of time spent on the CRT questions should be less than the duration of the search tasks, although this will not necessarily be the case. Nevertheless, given that the survey will be administered in a standard format, the study will have few other threats to internal validity.

To ensure high external validity, the search tasks were drawn from the Repository of Assigned Search Tasks (RepAST), a collaborative project being conducted at the University of North Carolina at Chapel Hill, the University of British Columbia, and the University of Sheffield. The tasks are constructed in such a way as to minimize situational, confounding variables. Further, the participant sample is constrained to U.S.-based participants between the ages of 18 years and 74 years. Therefore, though this study may or may not be generalizable to a global population, it will be applicable to numerous scenarios in the United States. One threat to external validity may be that the workers on MTurk may be more technology-savvy than the general online information-seeking population. This aptitude threat is not unlike that which occurs with academic research using undergraduate student pools.

Thus, the study will attain internal validity by measuring behavior and aptitude in a way that can be easily replicated, and will attain external validity equivalent to that of current quasi-experimental research on information-seeking behavior, in demonstrating whether or not a correlation exists between CRT and information-seeking behavior.

Data collection

After submitting the consent form, participants will be presented with three sets of questions: demographic, CRT, and search tasks. The instructions will not indicate these categories. The order in which each set of questions appears will be random to control for ordering effects, but the questions within each set will be presented in the same order for all participants. The survey is available in the Appendix.  The demographic and CRT questions are presented without the ability to search for information online. The search tasks, however, instruct the respondents to use the Google search engine to find the answers to five questions.
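The ordering scheme described above, in which the order of the sets is random while the order within each set is fixed, can be sketched as follows. The question identifiers and the number of demographic and CRT items are hypothetical placeholders; only the five search tasks come from the study description, and the survey platform's own randomization feature may be used instead.

```python
import random

# Placeholder item IDs (assumptions); within-set order is fixed as listed.
QUESTION_SETS = {
    "demographic": ["D1", "D2", "D3", "D4", "D5"],
    "crt": ["C1", "C2", "C3"],
    "search": ["S1", "S2", "S3", "S4", "S5"],
}


def build_sequence(rng):
    """Shuffle the order of the sets, keeping each set's items in order."""
    set_order = list(QUESTION_SETS)
    rng.shuffle(set_order)
    return [item for name in set_order for item in QUESTION_SETS[name]]


# One sequence per participant; seeding makes an assignment reproducible.
for participant_seed in range(3):
    print(build_sequence(random.Random(participant_seed)))
```

With three sets there are six possible set orders, so a sample of 500 should populate each order well enough to check for ordering effects.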

The respondents’ search behavior will be tracked using the Wrapper, an embedded deployment of an open-source software program within the MTurk interface, which has been used to good effect in other information-seeking behavior studies (Jensen, 2006). Data for each variable will be stored in the MTurk database for 120 days, during which time the researcher may download a spreadsheet of the survey results and associated MTurk Worker IDs in CSV format. The study protocol is designed to minimize the need to collect and maintain identifiable information about research participants. Data will be collected anonymously, and access to research data will be based on a “need to know” and “minimum necessary” standard.

Data analysis

Results will first be examined using exploratory data analysis to gain insight into how the data are distributed. To assess the degree of association between cognitive reflection and the aspects of information-seeking behavior measured in the study, results will be analyzed using Pearson's product-moment correlation coefficient, which measures linear association, and Spearman's rank-order correlation coefficient, which measures monotonic association. The findings of these analyses will inform subsequent study design, particularly with regard to the presence of confounding factors that should be controlled, if possible, in future experiments.
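As a minimal sketch of the two coefficients named above, the following computes Pearson's r directly from its definition and Spearman's rho as Pearson's r applied to ranks, with tied values sharing their average rank. The variable names and values are hypothetical illustrations, not study data, and in practice a statistics package would likely be used instead.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's product-moment correlation (undefined for constant input)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank-order correlation: Pearson's r on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical values: CRT score and number of queries per respondent.
crt_scores = [0, 1, 1, 2, 3, 3, 0, 2]
queries_issued = [2, 3, 5, 4, 9, 7, 1, 6]
print(round(pearson_r(crt_scores, queries_issued), 3))
print(round(spearman_rho(crt_scores, queries_issued), 3))
```

Reporting both is useful because CRT scores take only a few distinct values: a relationship that is monotonic but not linear would show a higher rho than r.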