October 14, 2011
When a survey is conducted in a litigation setting, it is often suggested that the survey be administered anonymously. With anonymous surveys, as their name suggests, survey respondents’ answers cannot be traced back to personally identifying information. While this type of administration may help in limited ways to protect respondent identities, it is not without drawbacks. This article discusses the reasons why anonymous survey administration is not recommended.
When a survey is conducted anonymously, it becomes nearly impossible to identify which respondents have completed the survey and which have not. Many surveys are administered in waves, where respondents are contacted on more than one occasion if they do not initially respond. Those who have already responded to the survey are typically removed from the follow-up contact list. Without identifying information for each respondent, there is no way to know which respondents should be excluded from receiving subsequent invitations to participate. Thus, respondents who have already completed the survey will receive correspondence asking them to complete the survey for a second or third time. In addition to being a nuisance to participants in the survey, repeat mailings can have implications for the quality and integrity of the collected survey data. When respondents receive a survey invitation more than once, some respondents will inevitably complete the survey instrument multiple times. From an analyst’s perspective, there is no way to determine which submissions are duplicative or how many duplicate instruments have been provided. Having multiple submissions from the same individual can bias the survey results and will likely affect the statistical precision of any calculations that use the survey data.
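The bookkeeping that identifying information makes possible can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the survey IDs, the `hours_per_week` field, and both function names are hypothetical, and the rule of keeping only the first submission per ID is an assumption made for the example.

```python
def drop_duplicate_submissions(submissions):
    """Keep only the first submission received for each survey ID."""
    seen = set()
    unique = []
    for submission in submissions:
        if submission["survey_id"] not in seen:
            seen.add(submission["survey_id"])
            unique.append(submission)
    return unique


def follow_up_list(invited_ids, completed_ids):
    """Return the invited IDs that have not yet responded, in original order."""
    completed = set(completed_ids)
    return [sid for sid in invited_ids if sid not in completed]


# Hypothetical contact list and first-wave returns.
invited = ["A101", "A102", "A103", "A104"]
wave_one = [
    {"survey_id": "A102", "hours_per_week": 45},
    {"survey_id": "A104", "hours_per_week": 50},
    {"survey_id": "A102", "hours_per_week": 45},  # second submission, same ID
]

unique_responses = drop_duplicate_submissions(wave_one)
wave_two_contacts = follow_up_list(
    invited, [r["survey_id"] for r in unique_responses]
)
print(wave_two_contacts)  # ['A101', 'A103']
```

Without the `survey_id` field, neither step is possible: the duplicate from A102 would be counted twice, and all four invitees would be contacted again.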
Survey researchers are often interested in validating the responses offered by respondents on a survey. It is common to validate responses using an external data source that can confirm a randomly selected portion of the responses. Validation is useful for a number of purposes, including determining how well respondents understood the survey questions and how likely respondents were to misremember or misreport information. A well-known example of response validation comes from the National Election Study (NES), a biennial study of the American electorate. Researchers at the NES found that voter turnout reported on the survey was typically several points higher than voter turnout in the actual election. Having collected names as part of the survey, NES researchers were able to cross-reference survey responses with actual county-level voter records. Using this method, researchers identified survey respondents who misreported their vote and could identify characteristics associated with misreporting turnout. 1
In the context of litigation-related surveys, results may also need to be validated. For example, if a survey respondent reported working seven days per week, it might be useful to measure this survey response against time clock data, site logs, or work schedules. Doing so would help to determine the degree to which survey responses correspond with data recorded in actual records. If such data are not available, it may be useful to validate survey responses using testimony provided in depositions, declarations, or other relevant sources of information. This type of validation need not take place for all of the survey respondents, but validating a randomly chosen subset of the surveys can establish a baseline for the level of measurement error, which can be used to make mathematical adjustments to the results, if needed. In the absence of identifying information, performing these corrections can become very difficult.
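As a rough illustration of validating a random subset, the Python sketch below compares survey-reported hours with recorded hours for a sample of respondents and uses the average ratio as a simple adjustment factor. All figures, IDs, and function names are hypothetical, and the ratio adjustment is only one simple possibility chosen for the example; an actual correction would depend on the data and the expert's methodology.

```python
import random
from statistics import mean


def draw_validation_sample(survey_ids, n, seed=2011):
    """Pick n survey IDs at random (seeded so the draw is reproducible)."""
    rng = random.Random(seed)
    return rng.sample(sorted(survey_ids), n)


def overreport_ratio(survey_hours, record_hours, sample_ids):
    """Average ratio of survey-reported hours to recorded hours in the sample."""
    return mean(survey_hours[i] / record_hours[i] for i in sample_ids)


# Hypothetical weekly hours reported on the survey, keyed by survey ID.
survey_hours = {"A101": 44, "A102": 50, "A103": 55, "A104": 40, "A105": 48}
# Hypothetical time clock records; the match relies on identifying information.
record_hours = {"A101": 40, "A102": 40, "A103": 50, "A104": 40, "A105": 40}

sample_ids = draw_validation_sample(survey_hours.keys(), n=3)
ratio = overreport_ratio(survey_hours, record_hours, sample_ids)

# Scale every reported figure by the error rate observed in the validated subset.
adjusted_hours = {sid: hours / ratio for sid, hours in survey_hours.items()}
print(sample_ids, round(ratio, 3))
```

The lookup `record_hours[i]` is the step an anonymous survey cannot perform, because there is no key connecting a response to an employee's records.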
There are several reasons why responses offered on a survey may not comport with information found in other available sources of information (e.g. time clock data, deposition testimony). One simple explanation is that survey respondents may not understand the questions being posed to them. In this case, respondents may misreport information due to confusion about what they are being asked. The ability to establish a link between actual survey responses and other information sources helps to mitigate this problem by allowing the researcher to identify which portions of the survey, if any, were unclear to respondents.
Bias, another type of measurement error, is typically more difficult to address in survey responses. One type of bias is referred to as “recall bias.” When recall bias is present, respondents do not accurately recall the information the survey is trying to elicit and may systematically inflate or deflate their estimates. In work-related surveys, research has shown that the bias is more likely to run in the positive direction; that is, respondents are more likely to overestimate the amount of time they spend working. For example, a study published in the Monthly Labor Review found that respondents’ survey-reported estimates of time spent working were systematically higher than estimates obtained through more comprehensive time diaries. 2 This suggests that applying a correction to the hours reported on a survey may be necessary. However, without identifying information, it is nearly impossible to assess bias in the responses.
An additional confounding factor, unique to litigation environments, is that some respondents may believe that they have a financial incentive for reporting more time worked or more missed meal periods on a litigation-related survey. When respondents are aware that the survey is related to litigation, they may believe, correctly or not, that reporting extra hours or more meal penalties will lead to greater financial recovery. This possibility should be examined in any potential survey and would likely require the availability of identifying information. Identifying information can also help to assess any potential impact of contact with or correspondence from counsel.
Further, anonymous surveys can make it difficult to assess the representativeness of a collected sample. Before conducting analyses, it is usually important to ensure that the survey data are representative, especially when the results are touted as representative testimony or when they are used to establish liability or to project damages to a larger population. When survey participation is voluntary, for example, those who respond to the survey may possess systematically different attributes from those who do not respond. When such systematic differences are present, using the sample to project to a larger population can provide estimates that are inaccurate or misleading. This phenomenon is referred to as non-response bias.
To test the representativeness of a sample, analysts typically merge data from the survey with various forms of company data to ensure that the distribution of attributes found in the respondent population (e.g., date of hire, age, gender, department) does not differ considerably from the distribution found in the employee population at large. If the surveyed population differs in meaningful ways, techniques can be implemented to correct (or “weight”) the sample. 3 When a survey is conducted anonymously, the analyst may not be able to identify important employee attributes and may encounter much more difficulty in applying such corrections.
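One common form of such a correction is post-stratification, where each respondent is weighted by the ratio of a group's share in the population to its share among respondents. The Python sketch below is a minimal illustration under assumed data: the department labels, headcounts, and reported hours are hypothetical, and a real analysis would typically weight on several attributes and check cell sizes first.

```python
from collections import Counter


def poststratification_weights(population_groups, respondent_groups):
    """Weight for each group = population share / respondent share."""
    pop_counts = Counter(population_groups)
    resp_counts = Counter(respondent_groups)
    n_pop = len(population_groups)
    n_resp = len(respondent_groups)
    return {
        group: (pop_counts[group] / n_pop) / (count / n_resp)
        for group, count in resp_counts.items()
    }


# Hypothetical company roster: 60 warehouse employees, 40 office employees.
population = ["warehouse"] * 60 + ["office"] * 40
# Hypothetical respondents (department, reported weekly hours), linked by ID.
respondents = [
    ("warehouse", 50), ("warehouse", 54),
    ("office", 40), ("office", 42), ("office", 38), ("office", 40),
]

weights = poststratification_weights(
    population, [dept for dept, _ in respondents]
)

unweighted_mean = sum(h for _, h in respondents) / len(respondents)
weighted_mean = (
    sum(weights[dept] * h for dept, h in respondents)
    / sum(weights[dept] for dept, _ in respondents)
)
print(round(unweighted_mean, 1), round(weighted_mean, 1))
```

In this made-up example, warehouse employees are underrepresented among respondents, so weighting moves the average from 44 hours to roughly 47. The department attribute comes from company data, which can only be attached to a response when the respondent is identifiable.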
Gathering identifying information about survey respondents need not serve as a roadblock to the effective administration of the survey, as long as proper steps are taken from the outset. The introduction to the survey should clearly present it to respondents as a straightforward exercise in the collection of facts relevant to the case. The preamble to the survey should ask that respondents provide the most accurate, honest answers they can, to the best of their recollection. Further, current employees taking the survey should be informed that their responses will have no bearing on their current or future employment with the company. Respondents should be assured that their responses will not be reported on individually, but rather used in the calculation of aggregate-level estimates. Finally, each survey can be assigned and coded with a random number that links back to the employee’s name in a separate database. This way, respondent names do not reside on the physical survey document. In no case should the appeal of “anonymity” trump analytical considerations, which are of far greater importance.
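The random-number coding described above can be sketched as follows. This is an illustrative Python example only: the employee names are placeholders, the function name is hypothetical, and the use of eight-character hexadecimal codes is an assumption. In practice the linkage table would be stored separately from the survey responses with restricted access.

```python
import secrets


def assign_survey_codes(employee_names):
    """Return a linkage table mapping a unique random code to each name."""
    linkage = {}
    for name in employee_names:
        code = secrets.token_hex(4)  # 8 hexadecimal characters
        while code in linkage:       # redraw in the rare case of a collision
            code = secrets.token_hex(4)
        linkage[code] = name
    return linkage


# Placeholder roster; real names would come from company records.
roster = ["Employee One", "Employee Two", "Employee Three"]

# Kept in a separate database, apart from the survey documents.
linkage_table = assign_survey_codes(roster)

# Only the code is printed on each survey form.
survey_forms = [{"survey_code": code, "answers": {}} for code in linkage_table]
print(len(survey_forms))
```

The survey forms carry no names, yet the analyst can still remove duplicates, validate responses, and apply weights by looking up each code in the linkage table.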
1 See, for example: Belli, Robert F., Michael W. Traugott, Margaret Young, and Katherine A. McGonagle. 1999. “Reducing Vote Over-Reporting in Surveys: Social Desirability, Memory Failure, and Source Monitoring.” Public Opinion Quarterly 63:90–108; Sigelman, Lee. 1982. “The Nonvoting Voter in Voting Research.” American Journal of Political Science 26:47–56; Anderson, Barbara A., and Brian D. Silver. 1986. “Measurement and Mismeasurement of the Validity of Self-Reported Vote.” American Journal of Political Science 30:771–785.
2 Robinson, John P. and Ann Bostrom. The Overestimated Workweek? What Time Diary Measures Suggest. Monthly Labor Review, August 1994.
3 Dey, Eric. Working With Low Survey Response Rates: The Efficacy of Weighting Adjustments. Research in Higher Education, Vol. 38. No. 2. 1997.