The National Survey on Drug Use and Health (NSDUH) series, formerly titled National Household Survey on Drug Abuse, is a major source of statistical information on the use of illicit drugs, alcohol, and tobacco and on mental health issues among members of the U.S. civilian, non-institutional population aged 12 or older. The survey tracks trends in specific substance use and mental illness measures and assesses the consequences of these conditions by examining mental and/or substance use disorders and treatment for these disorders.

Examples of uses of NSDUH data include the identification of groups at high risk for initiation of substance use and issues among those with co-occurring substance use disorders and mental illness.

NSDUH public-use data files are available for download in SAS, SPSS, Stata, and ASCII formats and for online analysis with SDA. NSDUH restricted-use data files are available for online analysis with the R-DAS.

The NSDUH is sponsored by the Center for Behavioral Health Statistics and Quality (formerly Office of Applied Studies), Substance Abuse and Mental Health Services Administration. For more information, visit the NSDUH website.

NSDUH Questionnaire Details

The population of the NSDUH series is the general civilian population aged 12 and older in the United States. Questions include age at first use, as well as lifetime, annual, and past-month usage for the following drugs: alcohol, marijuana, cocaine (including crack), hallucinogens, heroin, inhalants, tobacco, pain relievers, tranquilizers, stimulants, and sedatives. The survey covers substance abuse treatment history and perceived need for treatment, and includes questions from the Diagnostic and Statistical Manual of Mental Disorders (DSM) that allow diagnostic criteria to be applied.

Respondents were also asked about personal and family income sources and amounts, health care access and coverage, illegal activities and arrest record, problems resulting from the use of drugs, perceptions of risks, and needle-sharing. Demographic data include gender, race, age, ethnicity, educational level, job status, income level, veteran status, household composition, and population density.

The questionnaire was significantly redesigned in 1994. The 1994 survey included for the first time a rural population supplement to allow separate estimates to be calculated for this population. Other modules have been added each year and retained in subsequent years: mental health and access to care (1994-B); risk/availability of drugs (1996); cigar smoking and new questions on marijuana and cocaine use (1997); question series asked only of respondents aged 12 to 17 (1997); questions on tobacco brand (1999); marijuana purchase questions (2001); prior marijuana and cigarette use, additional questions on drug treatment, adult mental health services, and social environment (2003); and adult and adolescent depression questions derived from the National Comorbidity Survey, Replication (NCS-R) and National Comorbidity Survey, Adolescent (NCS-A) (2004).

Survey administration and sample design were improved with the implementation of the 1999 survey, and additional improvements were made in 2002. Since 1999, the survey sample has employed a 50-state design with an independent, multistage area probability sample for each of the 50 states and the District of Columbia. At that time, the collection mode of the survey changed from personal interviews and self-enumerated answer sheets to computer-assisted personal interviews and audio computer-assisted self-interviews. In 2002, the survey’s title was officially changed to the National Survey on Drug Use and Health (NSDUH).

Since 2002, participants have been given $30 for participating in the study. This incentive increased participation rates relative to the years prior to 2002. Also, in 2002 and 2011, the new population data from the 2000 and 2010 decennial Censuses, respectively, became available for use in the sample weighting procedures. For these reasons, data gathered for 2002 and beyond cannot validly be compared to data prior to 2002.

This series measures the prevalence and correlates of drug use in the United States. The surveys are designed to provide quarterly, as well as annual, estimates. Information is provided on the use of illicit drugs, alcohol, and tobacco among members of United States households aged 12 and older. Questions include age at first use as well as lifetime, annual, and past-month usage for the following drug classes: marijuana, cocaine (and crack), hallucinogens, heroin, inhalants, alcohol, tobacco, and nonmedical use of prescription drugs, including psychotherapeutics. Respondents were also asked about substance abuse treatment history, illegal activities, problems resulting from the use of drugs, personal and family income sources and amounts, need for treatment for drug or alcohol use, criminal record, and needle-sharing. Questions on mental health and access to care, which were introduced in the 1994-B questionnaire (see NATIONAL HOUSEHOLD SURVEY ON DRUG ABUSE, 1994), were retained in this administration of the survey. In 1996, the section on risk/availability of drugs was reintroduced, and sections on driving behavior and personal behavior were added (see NATIONAL HOUSEHOLD SURVEY ON DRUG ABUSE, 1996). The 1997 questionnaire continued the risk/availability section along with new items about the use of cigars, people present when respondents used marijuana or cocaine for the first time (if applicable), reasons for using these two drugs the first time, reasons for using these two drugs in the past year, reasons for discontinuing use of these two drugs (for lifetime but not past-year users), and reasons respondents never used these two drugs. In addition, a new series of questions asked only of respondents aged 12 to 17 was introduced. These items covered a variety of topics that may be associated with substance use and related behaviors, such as exposure to substance abuse prevention and education programs, gang involvement, relationship with parents, and substance use by friends. Demographic data include gender, race, age, ethnicity, marital status, educational level, job status, income level, veteran status, and current household composition.

United States
survey data
The civilian, noninstitutionalized population of the United States aged 12 and older, including residents of noninstitutional group quarters, such as college dormitories, group homes, shelters, rooming houses, and civilians dwelling on military installations.
Data were collected and prepared for release by Research Triangle Institute, Research Triangle Park, NC.
The National Household Survey on Drug Abuse questionnaire and estimation methodology changed with the implementation of the 1994-B survey. Therefore, estimates produced from the 1997 survey are not comparable to those produced from the 1994-A and earlier surveys.
For selected variables, statistical imputation was performed following logical imputation to replace missing responses. Variables that received only the logical procedure are identified in the codebook as "...LOGICALLY IMPUTED" and "...imputed"; variables for which the statistical procedure was also performed carry the designation "IMPUTATION-REVISED" in the variable label. The names of statistically imputed variables begin with the letters "IR". For each imputation-revised variable there is a corresponding imputation indicator variable that indicates whether a case's value on the variable resulted from an interview response or was imputed by the hot-deck technique. Hot-deck imputation is described in the codebook.
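As a rough illustration of the hot-deck idea mentioned above, the following Python sketch fills each missing value with a value drawn from a respondent ("donor") in the same imputation class. It is a simplified sketch and not the procedure documented in the codebook: the class variable REGION, the flag name suffix "_SOURCE", and the random choice of donor are assumptions made for illustration. Only the "IR" prefix comes from the description above.

```python
import random

def hot_deck_impute(records, var, class_var, seed=0):
    """Simplified hot-deck: fill missing values of `var` from a random donor
    in the same `class_var` group. Returns new records with an "IR"-prefixed
    variable and a source flag (the flag name is hypothetical)."""
    rng = random.Random(seed)

    # Collect reported (non-missing) values by imputation class.
    donors = {}
    for rec in records:
        if rec[var] is not None:
            donors.setdefault(rec[class_var], []).append(rec[var])

    ir_name = "IR" + var               # e.g. AGE -> IRAGE
    flag_name = ir_name + "_SOURCE"    # hypothetical indicator name
    result = []
    for rec in records:
        new = dict(rec)
        pool = donors.get(rec[class_var], [])
        if rec[var] is not None:
            new[ir_name], new[flag_name] = rec[var], "reported"
        elif pool:
            new[ir_name], new[flag_name] = rng.choice(pool), "imputed"
        else:
            # No donor available in this class: leave the value missing.
            new[ir_name], new[flag_name] = None, "missing"
        result.append(new)
    return result

# Made-up example records; REGION is a hypothetical class variable.
sample = [
    {"AGE": 34, "REGION": "A"},
    {"AGE": None, "REGION": "A"},
    {"AGE": 19, "REGION": "B"},
    {"AGE": None, "REGION": "C"},
]
imputed = hot_deck_impute(sample, "AGE", "REGION")
```

Keeping the source flag alongside the revised variable mirrors the role of the imputation indicator: analysts can always tell reported values from imputed ones.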
The "basic sampling weights" are equal to the inverse of the probabilities of selection of sample respondents. To obtain "final NHSDA weights," the basic weights were adjusted to take into account dwelling unit-level and individual-level nonresponse and then further adjusted to ensure consistency with intercensal population projections from the United States Bureau of the Census.
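The weighting steps described above can be sketched in Python as follows. This is a minimal illustration under simplifying assumptions, not the NHSDA estimation procedure itself: it uses a single weighting-class nonresponse adjustment and a simple ratio adjustment to population control totals, and the field names (cell, weight, responded) and the example numbers are invented.

```python
def basic_weight(selection_prob):
    """Basic sampling weight: inverse of the probability of selection."""
    return 1.0 / selection_prob

def _weight_sums(units):
    """Sum of weights within each adjustment cell."""
    sums = {}
    for u in units:
        sums[u["cell"]] = sums.get(u["cell"], 0.0) + u["weight"]
    return sums

def nonresponse_adjust(units):
    """Within each cell, inflate respondent weights so that they also
    represent the nonrespondents; nonrespondents are dropped."""
    all_sums = _weight_sums(units)
    respondents = [u for u in units if u["responded"]]
    resp_sums = _weight_sums(respondents)
    return [
        dict(u, weight=u["weight"] * all_sums[u["cell"]] / resp_sums[u["cell"]])
        for u in respondents
    ]

def ratio_adjust(units, control_totals):
    """Scale weights so each cell sums to its population control total."""
    sums = _weight_sums(units)
    return [
        dict(u, weight=u["weight"] * control_totals[u["cell"]] / sums[u["cell"]])
        for u in units
    ]

# Invented example: two sampled persons in one cell, one nonrespondent.
sampled = [
    {"cell": "12-17", "weight": basic_weight(0.25), "responded": True},
    {"cell": "12-17", "weight": basic_weight(0.25), "responded": False},
]
adjusted = nonresponse_adjust(sampled)           # respondent weight 4 -> 8
final = ratio_adjust(adjusted, {"12-17": 10.0})  # cell now sums to 10
```

In this sketch each adjustment preserves or fixes the weight total within a cell, which is the property that keeps weighted totals consistent with the population projections.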
To protect the privacy of respondents, all variables that could be used to identify individuals have been encrypted or collapsed in the public use file. These modifications should not affect analytic uses of the public use file.
Users who wish to replicate results published in the NHSDA Main Findings Report or other SAMHSA reports should use the 1997 NHSDA imputed data for prevalence estimates rather than raw data from the questionnaire or drug answer sheets.
For some drugs that have multiple names, questions regarding the use of that drug may be asked for each distinct name. For example, even though methamphetamine, methedrine and desoxyn are the same drug, their use was measured in three separate variables.
Multistage area probability sample design involving five selection stages: (1) primary sampling unit (PSU) areas (e.g., counties), (2) subareas within primary areas (e.g., blocks or block groups), (3) listing units within subareas, (4) age domains within sampled listing units, and (5) eligible individuals within sampled age domains. The 1997 NHSDA used the same 115 PSUs as the 1995 and 1996 NHSDAs, plus a total of 18 supplemental PSUs from Arizona and California. The 115 PSUs were selected to represent the nation's total eligible population, including areas of high Hispanic concentration. These PSUs were defined as metropolitan areas, counties, groups of counties, Census tracts, and independent cities. Of the 115 PSUs, 43 were selected with certainty and 72 were randomly selected with probability proportional to size (PPS). The national sample was supplemented by a PPS selection of 14 noncertainty PSUs from Arizona plus 4 noncertainty PSUs from California. Because the national sample provided representation for certainty PSUs in each state, no additional certainty PSUs were added to either sample. Unlike NHSDAs prior to 1996, the 1996 and 1997 NHSDAs did not oversample cigarette smokers aged 18-34. Unlike the 1996 NHSDA, which reused about 95 percent of the sample segments used in 1995, the 1997 NHSDA basically surveyed a new segment sample. Only 96 segments in the 1997 NHSDA overlapped with 1996 segments. Beginning in quarter two of the 1997 NHSDA, residents of Arizona and California were oversampled to provide direct survey estimates for these states. Due to confidentiality concerns, there is no variable on the public use data file to indicate a state identifier. The five age groups were: ages 12-17, 18-25, 26-34, 35-49, and 50 and older. The three race/ethnic groups were: Whites/others, non-Hispanic Blacks, and Hispanics. Blacks and Hispanics were oversampled. 
The study yielded an 85.0 percent eligibility rate for sample households and a 92.7 percent completion rate for screening eligible households.
Data were weighted based on the five stages of sampling that were used. Adjustments were made to compensate for nonresponse and sampling error. Adjustments also included trimming sample weights to reduce excessive weight variation and a post-stratification to Census population estimates. The final weight variable to be used in analysis is ANALWT.
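As an illustration of how the final weight enters an analysis, the sketch below computes a weighted prevalence estimate as the weighted sum of a 0/1 indicator divided by the sum of the weights. ANALWT is the weight variable named above; the indicator name MRJ_PAST_MONTH and the example rows are hypothetical. The sketch gives a point estimate only and does not address variance estimation, which depends on the sample design.

```python
def weighted_prevalence(rows, indicator, weight="ANALWT"):
    """Weighted proportion of cases with indicator == 1."""
    total_weight = sum(r[weight] for r in rows)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(r[weight] * r[indicator] for r in rows) / total_weight

# Made-up rows; MRJ_PAST_MONTH is a hypothetical 0/1 indicator.
rows = [
    {"ANALWT": 100.0, "MRJ_PAST_MONTH": 1},
    {"ANALWT": 300.0, "MRJ_PAST_MONTH": 0},
]
estimate = weighted_prevalence(rows, "MRJ_PAST_MONTH")
```

With these example rows the unweighted proportion would be one half, while the weighted estimate is lower because the case reporting use carries the smaller weight.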
  • The interview response rates for the three racial/ethnic groups were: 75.5 percent for Whites/others, 81.8 percent for Blacks, and 82.5 percent for Hispanics. The overall unweighted interview response rate was 78.3 percent. A completed interview had to contain, at a minimum, data on the recency of use of marijuana, cocaine, and alcohol.
  • 2013-05-06: Data collection instrument released.
  • 2008-10-23: New files were added. These files included one or more of the following: Stata setup, SAS transport (CPORT), SPSS system, Stata system, SAS supplemental syntax, and Stata supplemental syntax files, and a tab-delimited ASCII data file. Value labels and missing values for the variable GQTYPE were modified to correct previous errors. The variable CASEID was also added to the dataset.
  • 2000-08-04: Erroneous codes for missing values were deleted for the variable IRAGE in the SAS and SPSS setup files.
  • Performed consistency checks.
  • Standardized missing values.
  • Created online analysis version with question text.
  • Checked for undocumented or out-of-range codes.

