The National Survey on Drug Use and Health (NSDUH) series, formerly titled National Household Survey on Drug Abuse, is a major source of statistical information on the use of illicit drugs, alcohol, and tobacco and on mental health issues among members of the U.S. civilian, non-institutional population aged 12 or older. The survey tracks trends in specific substance use and mental illness measures and assesses the consequences of these conditions by examining mental and/or substance use disorders and treatment for these disorders.

Examples of uses of NSDUH data include the identification of groups at high risk for initiation of substance use and issues among those with co-occurring substance use disorders and mental illness.

NSDUH public-use data files are available for download in SAS, SPSS, Stata, and ASCII formats and for online analysis with SDA. NSDUH restricted-use data files are available for online analysis with the R-DAS.

The NSDUH is sponsored by the Center for Behavioral Health Statistics and Quality (formerly Office of Applied Studies), Substance Abuse and Mental Health Services Administration. For more information, visit the NSDUH website.

NSDUH Questionnaire Details

The population of the NSDUH series is the general civilian population aged 12 and older in the United States. Questions include age at first use, as well as lifetime, annual, and past-month usage for the following drugs: alcohol, marijuana, cocaine (including crack), hallucinogens, heroin, inhalants, tobacco, pain relievers, tranquilizers, stimulants, and sedatives. The survey covers substance abuse treatment history and perceived need for treatment, and includes questions from the Diagnostic and Statistical Manual (DSM) of Mental Disorders that allow diagnostic criteria to be applied.

Respondents were also asked about personal and family income sources and amounts, health care access and coverage, illegal activities and arrest record, problems resulting from the use of drugs, perceptions of risks, and needle-sharing. Demographic data include gender, race, age, ethnicity, educational level, job status, income level, veteran status, household composition, and population density.

The questionnaire was significantly redesigned in 1994. The 1994 survey included for the first time a rural population supplement to allow separate estimates to be calculated for this population. Other modules were added in later years and retained thereafter: mental health and access to care (1994-B); risk/availability of drugs (1996); cigar smoking and new questions on marijuana and cocaine use (1997); question series asked only of respondents aged 12 to 17 (1997); questions on tobacco brand (1999); marijuana purchase questions (2001); prior marijuana and cigarette use, additional questions on drug treatment, adult mental health services, and social environment (2003); and adult and adolescent depression questions derived from the National Comorbidity Survey, Replication (NCS-R) and National Comorbidity Survey, Adolescent (NCS-A) (2004).

Survey administration and sample design were improved with the implementation of the 1999 survey, and additional improvements were made in 2002. Since 1999, the survey sample has employed a 50-state design with an independent, multistage area probability sample for each of the 50 states and the District of Columbia. Also in 1999, the collection mode of the survey changed from personal interviews and self-enumerated answer sheets to computer-assisted personal interviews and audio computer-assisted self-interviews. In 2002, the survey’s title was officially changed to the National Survey on Drug Use and Health (NSDUH).

Since 2002, participants have been given $30 for participating in the study. This resulted in an increase in participation rates relative to the years before 2002. Also, in 2002 and 2011, new population data from the 2000 and 2010 decennial Censuses, respectively, became available for use in the sample weighting procedures. For these reasons, data gathered for 2002 and beyond cannot validly be compared to data collected before 2002.

The National Survey on Drug Use and Health (NSDUH) series (formerly titled National Household Survey on Drug Abuse) primarily measures the prevalence and correlates of drug use in the United States. The surveys are designed to provide quarterly, as well as annual, estimates. Information is provided on the use of illicit drugs, alcohol, and tobacco among members of United States households aged 12 and older.
Questions included age at first use as well as lifetime, annual, and past-month usage for the following drug classes: marijuana, cocaine (and crack), hallucinogens, heroin, inhalants, alcohol, tobacco, and nonmedical use of prescription drugs, including pain relievers, tranquilizers, stimulants, and sedatives. The survey covered substance abuse treatment history and perceived need for treatment, and included questions from the Diagnostic and Statistical Manual of Mental Disorders (DSM) that allow diagnostic criteria to be applied. The survey included questions concerning treatment for both substance abuse and mental health-related disorders. Respondents were also asked about personal and family income sources and amounts, health care access and coverage, illegal activities and arrest record, problems resulting from the use of drugs, and needle-sharing.
Questions introduced in previous administrations were retained in the 2005 survey, including questions asked only of respondents aged 12 to 17. These "youth experiences" items covered a variety of topics, such as neighborhood environment, illegal activities, drug use by friends, social support, extracurricular activities, exposure to substance abuse prevention and education programs, and perceived adult attitudes toward drug use and activities such as school work. Several measures in this section focused on prevention-related themes. Also retained were questions on mental health and access to care, perceived risk of using drugs, perceived availability of drugs, driving and personal behavior, and cigar smoking. Questions on the tobacco brand used most often were introduced with the 1999 survey.
Background information includes gender, race, age, ethnicity, marital status, educational level, job status, veteran status, and current household composition.

United States
survey data
The civilian, noninstitutionalized population of the United States aged 12 and older, including residents of noninstitutional group quarters such as college dormitories, group homes, shelters, rooming houses, and civilians dwelling on military installations.
Data were collected and prepared for release by Research Triangle Institute, Research Triangle Park, North Carolina.
Prior to the 2002 survey, this series was titled National Household Surveys on Drug Abuse.
Although the design of the 2005 survey is similar to the design of the 1999 through 2001 surveys, important methodological changes introduced in 2002 affect the 2005 estimates. Each NSDUH respondent since 2002 has been given an incentive payment of $30. This change resulted in an improvement in the survey response rate. In addition, in 2002 new population data from the 2000 decennial Census became available for use in NSDUH sample weighting procedures. Therefore, the data from 2002 and later should not be compared with data collected in 2001 or earlier to assess changes over time.
For selected variables, statistical imputation was performed following logical inference to replace missing responses. These variables are identified in the codebook as "...LOGICALLY ASSIGNED" for the logical procedure, or by the designation "IMPUTATION-REVISED" in the variable label when the statistical procedure was also performed. The names of statistically imputed variables begin with the letters "IR". For each imputation-revised variable, a corresponding imputation indicator variable indicates whether a case's value on the variable resulted from an interview response or was imputed. Missing values for some demographic variables were imputed by the unweighted hot-deck technique used in previous surveys. Beginning in 1999, imputation of missing values for most variables was accomplished using predictive mean neighborhoods (PMN), a new procedure developed specifically for this survey. Both the hot-deck and PMN imputation procedures are described in the codebook.
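As a small illustration of the naming convention just described, the following sketch picks out candidate imputation-revised variables by their "IR" prefix. The variable names in the example are hypothetical placeholders, and a prefix match is only a first pass: the codebook labels remain the authoritative way to confirm which variables were imputed.

```python
def imputation_revised(variable_names):
    """Return the names that follow the "IR" prefix convention.

    A matching prefix is only a hint; the codebook label
    ("IMPUTATION-REVISED") is what confirms the variable's status.
    """
    return [name for name in variable_names if name.upper().startswith("IR")]


# Hypothetical variable names, for illustration only.
columns = ["QUESTID", "IRAGE", "IRSEX", "INCOME", "ANALWT_C"]
print(imputation_revised(columns))
```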
To protect the privacy of respondents, all variables that could be used to identify individuals have been encrypted or collapsed in the public use file. To further ensure respondent confidentiality, the data producer used data substitution and deletion of state identifiers and a subsample of records in the creation of the public use file.
Previously published estimates may not be exactly reproducible from the variables in the public use file due to the disclosure protection procedures that were implemented.
The data definition and dictionary files for Stata are designed to be compatible with StataSE, Version 8. This is a large data file requiring that approximately 250 megabytes of Random Access Memory be allocated to Stata. Operations within Stata, including conversion of the ASCII data to Stata format, are likely to be slow. Analysts may wish to download subsets of data from the SAMHDA Data Analysis System (DAS) for use with Stata.
Since 1999, the survey sample has employed a 50-State design with an independent, multistage area probability sample for each of the 50 States and the District of Columbia.
audio computer-assisted self interview (ACASI)
A multistage area probability sample for each of the 50 states and the District of Columbia has been used since 1999. The 2005 NSDUH is the first survey in a coordinated five-year sample design. Although there is no overlap with the 1999-2004 samples, the coordinated design for 2005 through 2009 facilitated a 50 percent overlap in second-stage units (area segments [see below]) between each two successive years from 2005 through 2009. This design was intended to increase the precision of estimates in year-to-year trend analyses because of the expected positive correlation resulting from the overlapping sample between successive survey years.
The 2005 design allows for computation of estimates by state in all 50 states plus the District of Columbia. States may therefore be viewed as the first level of stratification as well as a reporting variable. Eight states, referred to as the large sample states, had a sample designed to yield 3,600 respondents per state for the 2005 survey. This sample size was considered adequate to support direct state estimates. The remaining 43 states (which include the District of Columbia) had a sample designed to yield 900 respondents per state in the 2005 survey. In these 43 states, adequate data were available to support reliable state estimates based on small area estimation (SAE) methodology.
Within each state, sampling strata called state sampling (SS) regions were formed. Based on a composite size measure, states were partitioned geographically into roughly equal-sized regions. In other words, regions were formed such that each area yielded, in expectation, roughly the same number of interviews during each data collection period. The eight large sample states were divided into 48 SS regions each. The remaining states were divided into 12 SS regions each. Therefore, the partitioning of the United States resulted in the formation of a total of 900 SS regions.
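The region and sample-size figures in the design description can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the function and constant names are illustrative and do not come from the codebook.

```python
# Figures quoted in the 2005 sample design description.
LARGE_STATES = 8
OTHER_STATES = 43  # includes the District of Columbia
REGIONS_PER_LARGE_STATE = 48
REGIONS_PER_OTHER_STATE = 12
TARGET_PER_LARGE_STATE = 3600
TARGET_PER_OTHER_STATE = 900


def total_ss_regions():
    """Total number of state sampling (SS) regions across all 51 areas."""
    return (LARGE_STATES * REGIONS_PER_LARGE_STATE
            + OTHER_STATES * REGIONS_PER_OTHER_STATE)


def total_target_respondents():
    """Designed (not achieved) number of respondents for the year."""
    return (LARGE_STATES * TARGET_PER_LARGE_STATE
            + OTHER_STATES * TARGET_PER_OTHER_STATE)


def target_per_region(large_state):
    """Designed respondents per SS region for either type of state."""
    if large_state:
        return TARGET_PER_LARGE_STATE / REGIONS_PER_LARGE_STATE
    return TARGET_PER_OTHER_STATE / REGIONS_PER_OTHER_STATE


print(total_ss_regions())          # 8 * 48 + 43 * 12 = 900
print(total_target_respondents())  # 8 * 3600 + 43 * 900 = 67500
print(target_per_region(True), target_per_region(False))  # 75.0 75.0
```

Both types of state work out to the same designed yield per SS region, which is consistent with the statement that each region was expected to produce roughly the same number of interviews.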
Unlike the 1999 through 2004 surveys, the 2005 through 2009 NSDUHs used Census tracts as the first-stage sampling units. The first stage of selection began with the construction of an area sample frame that contained one record for each Census tract in the United States. If necessary, Census tracts were aggregated within SS regions until each tract had, at a minimum, 150 dwelling units in urban areas and 100 dwelling units in rural areas. These Census tracts served as the primary sampling units (PSUs) for the coordinated five-year sample. One area segment (one or more Census blocks) was selected within each sampled Census tract. In advance of the survey period, specially trained listers visited each area segment and listed all addresses for housing units and eligible group quarters units in a prescribed order. Systematic sampling was used to select the allocated sample of addresses from each segment. Each respondent who completed a full interview was given a $30 cash payment as a token of appreciation for his or her time.
To improve the precision of the estimates, the sample allocation process targeted five age groups: 12 to 17 years, 18 to 25 years, 26 to 34 years, 35 to 49 years, and 50 years or older. The size measures used in selecting the area segments were coordinated with the dwelling unit and person selection process so that a nearly self-weighting sample could be achieved in each of the five age groups.
The achieved sample size for the 2005 survey was 68,308 persons. The public use file contains 55,905 records due to a subsampling step used in the disclosure protection procedures. A key step in the data processing procedures established the minimum item response requirements in order for cases to be retained for weighting and further analysis (i.e., "usable" cases). These requirements, as well as full sampling methodology, are detailed in the codebook.
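For readers unfamiliar with systematic sampling, the following is a minimal generic sketch of the technique: a random start followed by a fixed selection interval over an ordered list. It is not the NSDUH selection program; the function name, the address list, and the sample size are hypothetical, and the actual procedure is documented in the codebook.

```python
import random


def systematic_sample(units, n, rng=None):
    """Select n units from an ordered list with a random start and fixed interval."""
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and the number of listed units")
    rng = rng or random.Random()
    interval = len(units) / n          # may be fractional
    start = rng.random() * interval    # random start within the first interval
    last = len(units) - 1
    # The k-th selection is the unit at position floor(start + k * interval);
    # min() guards against floating-point rounding at the end of the list.
    return [units[min(int(start + k * interval), last)] for k in range(n)]


# Hypothetical listing of 120 addresses in one area segment, in listing order.
addresses = [f"address-{i:03d}" for i in range(120)]
print(systematic_sample(addresses, 10, random.Random(2005)))
```

Because the interval is at least 1 whenever n does not exceed the list length, no address is selected twice, and the selections are spread evenly through the listing order.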
Due to unequal selection probabilities at multiple stages of sample selection and various adjustments, such as those for nonresponse and poststratification, the 2005 NSDUH sample design is not self-weighting. Analysts are advised to use the final sample weight when attempting to use the 2005 NSDUH data to draw inferences about the target population or any subdomains of the target population. All estimates published in SAMHSA reports (such as the results from the 2005 NSDUH) are weighted using the final analysis weight for the full sample (ANALWT). For the public use file, the corresponding final sample weight is denoted as ANALWT_C, with the "C" denoting confidentiality protection. This sample weight is the number of persons in the target population that each record on the file represents. Note that the sum of ANALWT_C, over all records on the data file, represents an estimate of the total number of people in the target population.
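The role of the final sample weight can be illustrated with a minimal sketch that computes a weighted population total and a weighted proportion. Only the weight name ANALWT_C comes from the description above; the indicator name USED_FLAG, the record values, and the function names are hypothetical. The sketch produces point estimates only and does not attempt variance estimation, which depends on the complex sample design.

```python
def weighted_total(records, weight_key="ANALWT_C"):
    """Sum of weights: an estimate of the number of people represented."""
    return sum(r[weight_key] for r in records)


def weighted_proportion(records, indicator_key, weight_key="ANALWT_C"):
    """Weighted share of the represented population with indicator == 1."""
    total = weighted_total(records, weight_key)
    if total == 0:
        raise ValueError("weights sum to zero")
    hits = sum(r[weight_key] for r in records if r[indicator_key] == 1)
    return hits / total


# Toy records with made-up weights; USED_FLAG is a hypothetical 0/1 indicator.
records = [
    {"ANALWT_C": 1000.0, "USED_FLAG": 1},
    {"ANALWT_C": 3000.0, "USED_FLAG": 0},
    {"ANALWT_C": 1000.0, "USED_FLAG": 1},
]

print(weighted_total(records))                    # 5000.0
print(weighted_proportion(records, "USED_FLAG"))  # 0.4
```

In this toy example the unweighted share is two of three records, while the weighted estimate is 0.4, which shows why ignoring the weight can give misleading population estimates.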
  • Strategies for ensuring high rates of participation resulted in a weighted screening response rate of 91 percent and a weighted interview response rate for the CAI of 76 percent. (Note that these response rates reflect the original sample, not the subsampled data file referenced in this document.)
  • Performed consistency checks.
  • Created online analysis version with question text.
