National Survey on Drug Use and Health (NSDUH-2002-2016)

Parent Series Details:


The National Survey on Drug Use and Health (NSDUH) series, formerly titled National Household Survey on Drug Abuse, is a major source of statistical information on the use of illicit drugs, alcohol, and tobacco and on mental health issues among members of the U.S. civilian, non-institutional population aged 12 or older. The survey tracks trends in specific substance use and mental illness measures and assesses the consequences of these conditions by examining mental and/or substance use disorders and treatment for these disorders.

Examples of uses of NSDUH data include the identification of groups at high risk for initiation of substance use and issues among those with co-occurring substance use disorders and mental illness.

NSDUH public-use data files are available for download in SAS, SPSS, Stata, and ASCII formats, and for online analysis with SDA. NSDUH restricted-use data files are available for online analysis with the R-DAS.

The NSDUH is sponsored by the Center for Behavioral Health Statistics and Quality (formerly Office of Applied Studies), Substance Abuse and Mental Health Services Administration. For more information, visit the NSDUH website.

NSDUH State and Substate Estimates

The following links provide more information about the NSDUH state and substate estimates:

1999-2015 NSDUH Small Area Estimation


NSDUH Variable Crosswalk Charts


NSDUH Reports and Detailed Tables

NSDUH Questionnaire Details

The population of the NSDUH series is the general civilian population aged 12 and older in the United States. Questions include age at first use, as well as lifetime, annual, and past-month usage for the following drugs: alcohol, marijuana, cocaine (including crack), hallucinogens, heroin, inhalants, tobacco, pain relievers, tranquilizers, stimulants, and sedatives. The survey covers substance abuse treatment history and perceived need for treatment, and includes questions from the Diagnostic and Statistical Manual (DSM) of Mental Disorders that allow diagnostic criteria to be applied.

Respondents were also asked about personal and family income sources and amounts, health care access and coverage, illegal activities and arrest record, problems resulting from the use of drugs, perceptions of risks, and needle-sharing. Demographic data include gender, race, age, ethnicity, educational level, job status, income level, veteran status, household composition, and population density.

The questionnaire was significantly redesigned in 1994. The 1994 survey included for the first time a rural population supplement to allow separate estimates to be calculated for this population. Other modules have been added each year and retained in subsequent years: mental health and access to care (1994-B); risk/availability of drugs (1996); cigar smoking and new questions on marijuana and cocaine use (1997); question series asked only of respondents aged 12 to 17 (1997); questions on tobacco brand (1999); marijuana purchase questions (2001); prior marijuana and cigarette use, additional questions on drug treatment, adult mental health services, and social environment (2003); and adult and adolescent depression questions derived from the National Comorbidity Survey, Replication (NCS-R) and National Comorbidity Survey, Adolescent (NCS-A) (2004).

Survey administration and sample design were improved with the implementation of the 1999 survey, and additional improvements were made in 2002. Since 1999, the survey sample has employed a 50-state design with an independent, multistage area probability sample for each of the 50 states and the District of Columbia. At that time, the collection mode of the survey changed from personal interviews and self-enumerated answer sheets to computer-assisted personal interviews and audio computer-assisted self-interviews. In 2002, the survey’s title was officially changed to the National Survey on Drug Use and Health (NSDUH).

Since 2002, participants have been given $30 for participating in the study. This resulted in higher participation rates than in the years prior to 2002. Also, in 2002 and 2011, new population data from the 2000 and 2010 decennial Censuses, respectively, became available for use in the sample weighting procedures. For these reasons, data gathered for 2002 and beyond cannot validly be compared to data prior to 2002.

NSDUH underwent a partial redesign in 2015, so several measures “broke trends” in 2015, meaning that estimates from 2015 and later are no longer comparable to their 2014 and earlier counterparts. This also means that you cannot pool data across years that are not comparable. For affected measures, you will likely be limited to the 2002-2014 timeframe when pooling enough years of comparable data to get a sufficient sample size at the county level. Measures that were not affected can be pooled through 2015. More information on the partial 2015 redesign and its effects on estimates is available here:

Study Details:

This file includes data from the 2002 through 2016 National Survey on Drug Use and Health (NSDUH) survey. 

The majority of the variables found in each of the single-year public-use files (PUFs) are included on the 2002-2016 combined PUF. For the most part, as long as a variable was available on more than 1 year of the data files, it was included on this combined file. Retaining or dropping variables from the combined PUF was based on the analytic utility for multiple-year data analysis. A few variables that had low analytic utility or were available for only 1 or 2 years were not included on this combined PUF. These variables can be obtained from the individual PUFs.

Some variables that were not included on the 2002-2015 combined PUF are now retained on the 2002-2016 PUF because they are available on 2 years of data. These include many 2015 NSDUH variables that were either new or considered not comparable with their counterparts from 2002-2014 due to the partial questionnaire redesign in the 2015 NSDUH (e.g., the cancer diagnostic variables and those dealing with the use of prescription psychotherapeutics). In addition, some demographic variables, such as education and employment status, and several substance use outcome variables, such as "recoded any illicit drug use in past month" and "recoded binge alcohol use in past 30 days," which were excluded for 2015 from the 2002-2015 combined PUF, are now available on the 2002-2016 combined PUF.

The marital status variable IRMARIT was recreated for the 2016 PUF (similar to 2002-2014) and reflects the move of the marital status questions from self-administration in 2015 back to interviewer administration in 2016. The 2015 marital status variable IRMARITSTAT was also retained on the 2002-2016 combined PUF. Analytic goals should be considered prior to pooling or comparing marital status data from 2015 with data from other years. For details on how these two marital status variables differ, see the 2016 PUF codebook.


[1] Center for Behavioral Health Statistics and Quality. (2017). National Survey on Drug Use and Health: 2016 public use file and codebook. Retrieved from

Study Scope

Time period: 
Collection date: 
Geographic coverage : 
United States
Unit of observation: 
Data types: 
Survey Data

It is worth noting that prior to 2016, several variables that were not comparable across time were retained on the file. Reasons for variables not being comparable across years may include questionnaire changes, skip logic (i.e., routing) changes, or changes in how recoded variables were created. Additionally, for 2015 and 2016, COUTYP4 (COUNTY METRO/NONMETRO STATUS [2013 3-LEVEL]), which was created based on the 2013 Rural/Urban Continuum Codes (RUCC13), was included to replace COUTYP2, which was created based on the 2003 Rural/Urban Continuum Codes (RUCC03). COUTYP2 was still retained for the 2002-2014 data. Also, the poverty variable, POVERTY3 (RC-POVERTY LEVEL-NEW INC [% OF US CENSUS POVERTY THRESHOLD]) created for 2015 and 2016, is in fact comparable with POVERTY2 in previous years; for details regarding this variable, see the 2015 PUF codebook.[1] A crosswalk chart (referred to as the "combined PUF measles chart") in the documentation provided for the combined 2002-2016 PUF indicates the variables that are present and comparable across the different years. Users are encouraged to look carefully at this crosswalk to ensure that comparisons across time are valid for given variables.
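
The year-dependent choice between COUTYP2 and COUTYP4 described above can be sketched in Python. This is only an illustration: the function names are hypothetical, and the presence of a YEAR field on each record is an assumption about how the combined file has been loaded.

```python
def county_type_variable(year: int) -> str:
    """Return the county metro/nonmetro variable documented for a survey year."""
    if 2002 <= year <= 2014:
        return "COUTYP2"  # based on the 2003 Rural/Urban Continuum Codes
    if year in (2015, 2016):
        return "COUTYP4"  # based on the 2013 Rural/Urban Continuum Codes
    raise ValueError(f"year {year} is outside the 2002-2016 combined PUF")


def county_type(record: dict):
    """Read the county-type value from a record, using its YEAR to pick the variable.

    Assumes each record is a dict with a "YEAR" key (hypothetical layout).
    """
    return record[county_type_variable(record["YEAR"])]
```

Reading both variables through one helper does not make them comparable, since they rest on different code vintages; the crosswalk chart remains the reference for whether a comparison across years is valid.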

Subject Terms: 
  • addiction
  • alcohol
  • alcohol abuse
  • alcohol consumption
  • amphetamines
  • barbiturates
  • cocaine
  • controlled drugs
  • crack cocaine
  • demographic characteristics
  • depression (psychology)
  • drinking behavior
  • drug abuse
  • drug dependence
  • drug use
  • drugs
  • employment
  • hallucinogens
  • health

Study Methodology

Time method: 
For variance estimation, no adjustment needs to be made to the sample design variables VESTR (variance estimation [pseudo] stratum) and VEREP (variance estimation [pseudo] replicate within stratum). Note that there are 60 pseudo strata (resulting in 60 degrees of freedom for variance estimation) per year for the 2013 and earlier PUF data and 50 pseudo strata (resulting in 50 degrees of freedom) per year for the 2014 and subsequent data. This change is due to the sample redesign implemented in the 2014 NSDUH. When combining any pair of years of data (e.g., 2015 and 2016), the degrees of freedom remain the same as if the pair of years were a single year (e.g., 50 for national estimates) when these years are part of the same sample design. When combining years with different degrees of freedom (e.g., 2013 and 2014), the specific number of degrees of freedom can be computed by counting the unique values of VESTR. For example, when combining data for 2015 and 2016, DDF=50 can be used because the sample design remained the same across those 2 years. When combining data for 2013 and 2014, DDF=110 can be used because the sample design changed in 2014. When comparing estimates in two domains with different degrees of freedom, researchers should err on the conservative side and use the smaller degrees of freedom. For example, when comparing 2013 estimates with 2014 estimates, DDF=50 should be used. For details about degrees of freedom, see Section 6 in the 2016 statistical inference report. As with the single-year PUFs, users of multiyear PUFs should first sort the combined data by the sample design variables, then specify them in a statistical software package, such as SUDAAN, to estimate variances and standard errors (SEs).
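
The degrees-of-freedom rules above can be illustrated with a minimal Python sketch. The function names, the YEAR field, and the stratum codes in the toy data are all assumptions made for illustration; real VESTR values come from the PUF itself.

```python
def design_degrees_of_freedom(records, years):
    """Count the unique VESTR values among records from the given survey years."""
    return len({r["VESTR"] for r in records if r["YEAR"] in years})


def comparison_degrees_of_freedom(df_a, df_b):
    """Use the smaller value when comparing two domains (the conservative choice)."""
    return min(df_a, df_b)


# Toy data: these stratum codes are made up and only mimic the documented counts.
old_design = range(30001, 30061)  # 60 pseudo strata (2013 and earlier)
new_design = range(40001, 40051)  # 50 pseudo strata (2014 and later)
records = (
    [{"YEAR": 2013, "VESTR": s} for s in old_design]
    + [{"YEAR": y, "VESTR": s} for y in (2014, 2015, 2016) for s in new_design]
)

# Pooling two years of the same design: strata repeat, so the count stays at 50.
ddf_2015_2016 = design_degrees_of_freedom(records, {2015, 2016})

# Pooling across the 2014 redesign: strata differ, so the counts add up to 110.
ddf_2013_2014 = design_degrees_of_freedom(records, {2013, 2014})

# Comparing 2013 with 2014 estimates: take the smaller of 60 and 50.
ddf_compare = comparison_degrees_of_freedom(
    design_degrees_of_freedom(records, {2013}),
    design_degrees_of_freedom(records, {2014}),
)
```

The sketch only counts strata; variance estimation itself still requires survey software that takes VESTR and VEREP as design variables.
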
When analyzing any single year of data, the variable ANALWC1 should be used. This variable is the same as the variable ANALWT_C that is found on the single-year PUFs. However, with a combined file, analysts have the option of using pooled data from 2 or more years. Therefore, in addition to the analysis weights for a single year of data (ANALWC1), additional weight variables, ANALWC2 to ANALWC15, were created to allow for multiple-year data analysis. These additional weight variables were created by dividing the single-year weights by a scalar factor (i.e., the number of years of data used) so that the estimated numbers of individuals are representative of the national population. For example, ANALWC2, which can be used for producing estimates using any combination of 2 years of data (e.g., pooled 2002-2003 data, pooled 2015-2016 data, or even 2008 and 2010 combined data), was obtained by dividing the single-year weight ANALWC1 by 2. Similarly, ANALWC15 was obtained by dividing ANALWC1 by 15 and can be used for producing estimates using all 15 years of NSDUH data (i.e., combined data from 2002 to 2016).
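
The relationship between ANALWC1 and the pooled weights can be sketched as follows. The helper names and the toy weights are hypothetical; on the actual file, ANALWC2 through ANALWC15 are already provided and do not need to be recomputed.

```python
def pooled_weight_name(n_years: int) -> str:
    """Name of the weight variable to use when pooling n_years years of data."""
    if not 1 <= n_years <= 15:
        raise ValueError("the combined PUF covers 1 to 15 pooled years")
    return f"ANALWC{n_years}"


def pooled_weight(analwc1: float, n_years: int) -> float:
    """Divide the single-year weight by the number of pooled years."""
    pooled_weight_name(n_years)  # reuse the range check
    return analwc1 / n_years


# Toy single-year weights (made up): each year's weights sum to 2000 people.
weights_by_year = {2015: [1200.0, 800.0], 2016: [900.0, 1100.0]}
n = len(weights_by_year)

# With the pooled weight, the two-year file still represents 2000 people,
# i.e. an average annual population rather than a doubled one.
pooled_total = sum(
    pooled_weight(w, n) for ws in weights_by_year.values() for w in ws
)
```

Without the division, summing the single-year weights across both years would count the population twice.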