Weighting the BRFSS Data
When data are unweighted, each record counts the same as any other record. Unweighted data
analyses make the assumption that each record has an equal probability of being selected and that
noncoverage and nonresponse are equal among all segments of the population. When deviations from
these assumptions are large enough to affect the results from a data set, weighting each record
appropriately can help to adjust for assumption violations. In the BRFSS, such weighting serves as a
blanket adjustment for noncoverage and nonresponse and forces the total number of cases to equal
population estimates for each geographic region, which for the BRFSS sums to the state population.
Regardless of state sample design, use of the final weight in analysis is necessary if users are to make
generalizations from the sample to the population.
The following is a general description of the 2017 BRFSS weighting process. Where a factor does not
apply, processors set its value to one for the calculation. Design weighting reduces bias due to unequal
probability of selection. The BRFSS also uses iterative proportional fitting, or “raking,” to adjust for
demographic differences between the persons who are sampled and the population that they represent.
The weighting methodology therefore comprises two parts: design weighting and raking.
Design weights are calculated using the weight of each geographic stratum (_STRWT), the number of
landline phones within a household (NUMPHON2), and the number of adults who use those phones
(NUMADULT). For cellphone respondents, both NUMPHON2 and NUMADULT are set to 1. The formula
for the design weight is:
Design Weight = _STRWT * (1/NUMPHON2) * NUMADULT
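As a minimal sketch of the formula above, the following Python function computes the design weight for a single record. The function name, argument names, and sample values are illustrative assumptions and are not part of the BRFSS processing code; only the formula and the cellphone rule come from the text.

```python
def design_weight(strwt, numphon2, numadult, is_cellphone=False):
    """Design Weight = _STRWT * (1 / NUMPHON2) * NUMADULT."""
    if is_cellphone:
        # For cellphone respondents, NUMPHON2 and NUMADULT are both set to 1.
        numphon2, numadult = 1, 1
    if numphon2 < 1 or numadult < 1:
        raise ValueError("NUMPHON2 and NUMADULT must be at least 1")
    return strwt * (1.0 / numphon2) * numadult


# Hypothetical records: a landline household with 2 phone numbers and
# 3 adults, and a cellphone respondent.
landline_wt = design_weight(strwt=120.0, numphon2=2, numadult=3)
cellphone_wt = design_weight(strwt=80.0, numphon2=2, numadult=3, is_cellphone=True)
```

In this sketch the landline record is weighted up for the number of adults and down for the number of phone numbers, while the cellphone record keeps its stratum weight unchanged.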
In 2017, the inclusion of cellular telephone respondents who also have landline telephones in their
residence and landline telephone respondents who also have a cellular telephone in their residence
required an adjustment to the design weights to account for the overlapping sample frames. From each
of the two sample frames, a compositing factor was calculated for the telephone dual sampling frame
users. BRFSS multiplied the design weight by the compositing factor to generate a composite weight for
the records in the overlapping sample frames as described in the section below. BRFSS then truncated
the design weight based on quartiles within geographic region; processors used the truncated weight
as the raking input weight.
The stratum weight (_STRWT) accounts for differences in the probability of selection among strata
(subsets of area code/prefix combinations). It is the inverse of the sampling fraction of each stratum.
There is rarely a complete correspondence between strata, defined by subsets of area code/prefix
combinations, and regions, defined by the boundaries of government entities.
BRFSS calculates the stratum weight (_STRWT) using the following items:
Number of available records (NRECSTR) and the number of records selected (NRECSEL)
within each geographic stratum and density stratum.
Geographic strata (_GEOSTR), which may be the entire state or a geographic subset such as
counties or census tracts.
Density strata (_DENSTR) indicating the density of the phone numbers for a given block of
numbers as listed or not listed.
Within each _GEOSTR*_DENSTR combination, BRFSS calculates the stratum weight (_STRWT) from the
average of the NRECSTR and the sum of all sample records used to produce the NRECSEL. The stratum
weight is equal to NRECSTR / NRECSEL.
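The ratio above can be sketched in a few lines of Python. The stratum keys and record counts below are made-up values for illustration, and the helper name is an assumption rather than anything defined by BRFSS.

```python
def stratum_weight(nrecstr, nrecsel):
    """Stratum weight = NRECSTR / NRECSEL (inverse of the sampling fraction)."""
    if nrecsel <= 0:
        raise ValueError("NRECSEL must be positive")
    return nrecstr / nrecsel


# Hypothetical counts keyed by (_GEOSTR, _DENSTR): (NRECSTR, NRECSEL).
strata = {
    (1, 1): (50_000, 2_500),
    (1, 2): (200_000, 2_000),
}
stratum_weights = {key: stratum_weight(*counts) for key, counts in strata.items()}
```

A stratum sampled at a lower rate, such as the second one here, receives a larger weight, so each selected record stands in for more available records.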
1/NUMPHON2: The inverse of the number of residential telephone numbers in the respondent’s
household.
NUMADULT: The number of adults 18 years and older in the respondent’s household.
FINAL WEIGHT: BRFSS rakes the design weight to 8 margins (gender by age group, race/ethnicity,
education, marital status, tenure, gender by race/ethnicity, age group by race/ethnicity, and phone
ownership). If BRFSS includes geographic regions, it includes four additional margins (region, region by
age group, region by gender, and region by race/ethnicity). If at least one county has 500 or more
respondents, BRFSS includes four additional margins (county, county by age group, county by gender,
and county by race/ethnicity).
_LLCPWT: The final weight assigned to each respondent.
BRFSS uses weight trimming to increase the value of extremely low weights and decrease the value of
extremely high weights. The objective of weight trimming is to reduce errors in the outcome estimates
caused by unusually high or low weights in some categories.
2017 design weight correction for overlapping sample frame:
The partially overlapping sample frames required an adjustment to address the respondents’ probability
of selection in both the landline sample frame and cell phone sample frame. The adjustment to the
design weights was made to records identified as available in both sample frames. Three possible
telephone source contact categories were included for this adjustment:
1. Landline frame with a cell phone
2. Cell phone frame with a landline
9. No Dual Phone Use
The adjustment to the design weight applied to the records identified as a landline sample record with a
cell phone or a cell phone sample record with a landline. A compositing factor was calculated for the
overlapping sample frame users, based on the effective sample size. For each overlapping sample frame
telephone service category, the effective sample size is calculated as:
N effective = N / DEFF, where N is the unweighted number of interviews, and
DEFF = 1 + (standard deviation of design weight / mean value of design weight)^2.
_DUALUSE is the variable used to identify the dual phone use categories (_DUALUSE = 1 Landline with
a Cell Phone, _DUALUSE = 2 Cell Phone with a Landline, _DUALUSE = 9 No Dual Phone Use).
For _DUALUSE category 1 (Landline with a Cell Phone), calculate the composite weight:
Composite_wt = DESIGN_WT x (N effective value for category 1 / (N effective value for category 1 + N
effective value for category 2)).
For _DUALUSE category 2 (Cell Phone with a Landline), calculate: Composite_wt = DESIGN_WT x (N
effective value for category 2 / (N effective value for category 1 + N effective value for category 2)).
The corresponding SAS code is similar to:
IF _DUALUSE = 1 OR _DUALUSE = 2 THEN _WT2RAKE_C = _WT2RAKE * _DUALCOR;
ELSE _WT2RAKE_C = _WT2RAKE;
Where _WT2RAKE is the design weight, _DUALCOR is the composite factor calculated to adjust the
design weight for the records collected from overlapping sample frames.
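The compositing calculation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the BRFSS production code: the function names and sample weights are hypothetical, and the population standard deviation is used because the text does not say whether the sample or population form applies.

```python
from statistics import mean, pstdev


def effective_n(design_weights):
    """N effective = N / DEFF, with DEFF = 1 + (SD / mean)^2."""
    n = len(design_weights)
    # Assumption: population standard deviation (the text does not specify).
    cv = pstdev(design_weights) / mean(design_weights)
    deff = 1 + cv ** 2
    return n / deff


def compositing_factors(weights_cat1, weights_cat2):
    """Return the factors for _DUALUSE categories 1 and 2."""
    n1 = effective_n(weights_cat1)
    n2 = effective_n(weights_cat2)
    return {1: n1 / (n1 + n2), 2: n2 / (n1 + n2)}


def composite_weight(design_wt, dualuse, factors):
    """Scale the design weight for categories 1 and 2; leave others unchanged."""
    return design_wt * factors.get(dualuse, 1.0)


# Hypothetical design weights for the two overlapping categories.
landline_with_cell = [100, 100, 100, 100]  # _DUALUSE = 1
cell_with_landline = [50, 150]             # _DUALUSE = 2
factors = compositing_factors(landline_with_cell, cell_with_landline)
adjusted = composite_weight(150, 2, factors)
unchanged = composite_weight(150, 9, factors)
```

Because the two factors sum to one, each dual-frame adult is counted once across the two frames rather than twice, and the frame with the larger effective sample size contributes more.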
2017 design weight truncation:
The design weight calculation is implemented separately for the landline sample (within _GEOSTR) and
the cell phone sample (within _GEOSTR). In addition to the overlapping sample frame correction to the
design weight, the combined landline and cell phone design weight has been truncated within _REGION
prior to raking. The primary purpose of the design weight truncation is to prevent any adults in a state
from carrying extremely large weights into the raking. A secondary goal is to prevent any adults from
having extremely small design weights (i.e., the responses should not completely disappear at this point).
The combined landline and cell phone design weights are truncated within _REGION, based on quartiles,
prior to raking. _LLCPWT2 holds the truncated design weight.
The child design weights have not been truncated prior to raking.
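The text states only that truncation is based on quartiles within _REGION and does not give the exact bounds. The Python sketch below therefore substitutes an assumed rule, clamping weights to the range from Q1 - 1.5 x IQR to Q3 + 1.5 x IQR, purely to show the general shape of a quartile-based truncation. The function name, the multiplier k, and the sample weights are all hypothetical.

```python
from statistics import quantiles


def truncate_by_quartiles(weights, k=1.5):
    """Clamp weights to [Q1 - k*IQR, Q3 + k*IQR].

    Assumed rule for illustration only; the BRFSS bounds are not
    given in the text.
    """
    q1, _median, q3 = quantiles(weights, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [min(max(w, lower), upper) for w in weights]


# Hypothetical design weights for one _REGION, including one extreme value.
region_weights = [10, 12, 14, 16, 18, 20, 400]
truncated = truncate_by_quartiles(region_weights)
```

In practice the function would be applied separately to the combined landline and cell phone design weights of each _REGION, so that a single extreme weight cannot dominate the raking input.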
2017 Integrated weight:
The 2017 integrated weight includes the eight state-level margins and allows up to eight additional
margins to take advantage of additional adjustments to sub-state populations within the raking. There
are four additional margins if a county has at least 500 interviews available. There are also four
additional margins for _REGION, if multiple regions have been defined for a state and each region has at
least 500 interviews.
Order of margins:
Ideally, convergence would be obtained quickly and all margins would achieve agreement with specified
population control totals. In practice, however, given the complexity of this weighting system, this may
not be feasible for all 16 margins. Thus, certain margins should match population control totals exactly
(e.g., age*gender and age*race/ethnicity), and in the few difficult cases where the raking algorithm has
not completely converged or has reached the point of diminishing returns, the algorithm may stop
without matching a few of the margins exactly. The last margin will achieve exact agreement with the
population control totals. Margins close to the last margin will almost always be very close to the
population control totals.
For the 2017 integrated weight, the key state-level margin, age by gender, is last and other key state-
level margins are included close to the last margin. The order is shown below.
FIRST MARGIN: sixteenth margin (county by sex)
Fifteenth margin (county by agecat7)
Fourteenth margin (county by race6cat)
Thirteenth margin (county, no collapsing)
Twelfth margin (region by race6cat)
Eleventh margin (region by sex)
Tenth margin (region by agecat7)
Ninth margin (region, no collapsing)
Eighth margin (telephone service)
Seventh margin (age3 by race6cat)
Sixth margin (sex by race6cat)
Fifth margin (own/rent)
Fourth margin (marital, no collapsing)
Third margin (education, no collapsing)
Second margin (race6cat)
LAST MARGIN: first margin (sex by agecat7, no collapsing)
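The effect of margin order can be illustrated with a small raking (iterative proportional fitting) sketch in Python. This is a simplified, assumed implementation with two made-up margins and toy control totals; it omits category collapsing and the other details of the BRFSS system, and the function and variable names are hypothetical.

```python
def margin_totals(records, weights, var):
    """Sum the weights within each category of one margin variable."""
    totals = {}
    for rec, w in zip(records, weights):
        totals[rec[var]] = totals.get(rec[var], 0.0) + w
    return totals


def rake(records, input_weights, margins, max_iter=100, tol=1e-10):
    """Adjust weights to each margin in turn until the factors settle.

    margins is an ordered list of (variable, {category: control_total});
    the final entry is the margin that is matched last in every pass.
    """
    weights = list(input_weights)
    for _ in range(max_iter):
        largest_adjustment = 0.0
        for var, targets in margins:
            totals = margin_totals(records, weights, var)
            for i, rec in enumerate(records):
                factor = targets[rec[var]] / totals[rec[var]]
                weights[i] *= factor
                largest_adjustment = max(largest_adjustment, abs(factor - 1.0))
        if largest_adjustment < tol:
            break
    return weights


# Toy data: four respondents, two margins, made-up control totals.
records = [
    {"sex": "M", "age": "18-44"},
    {"sex": "M", "age": "45+"},
    {"sex": "F", "age": "18-44"},
    {"sex": "F", "age": "45+"},
]
margins = [
    ("age", {"18-44": 60.0, "45+": 40.0}),  # raked first in each pass
    ("sex", {"M": 48.0, "F": 52.0}),        # raked last in each pass
]
final_weights = rake(records, [10.0, 20.0, 30.0, 40.0], margins)
sex_totals = margin_totals(records, final_weights, "sex")
age_totals = margin_totals(records, final_weights, "age")
```

Whenever the loop stops, the margin adjusted last agrees with its control totals up to floating-point rounding, while earlier margins agree only as closely as convergence allows. This matches the reasoning in the text for placing the key state-level margin last.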
Population estimates:
The population estimates obtained for building the target totals come from sources similar to those
used in previous years. Intercensal population estimates were purchased from The Nielsen Company, LLC
at the county level for age, race/ethnicity, and gender. These population estimates are used as the population
totals for a state across all margins. The five-year American Community Survey PUMS data set (2012-
2016) was used to obtain estimates for margins 3, 4, and 5 (education, marital status, tenure). The non-
institutionalized adults were weighted by the person-level weights to generate the population
estimates. The percentages were then used in the raking margins. The telephone type estimates for
margin 8 were taken from the state wireless estimate percentages produced by NCHS and released in
December 2017 (http://www.cdc.gov/nchs/data/nhis/earlyrelease/wireless_state_201712.pdf).
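As a rough sketch of how survey-weighted percentages might be turned into raking control totals, the Python below computes weighted category shares from person-level weights and scales them to a state population total. The category labels, weights, and population figure are invented, the helper names are assumptions, and the scaling step is one plausible reading of the text rather than a documented BRFSS procedure.

```python
def weighted_shares(categories, person_weights):
    """Share of the weighted population falling in each category."""
    totals = {}
    for category, weight in zip(categories, person_weights):
        totals[category] = totals.get(category, 0.0) + weight
    grand_total = sum(totals.values())
    return {category: total / grand_total for category, total in totals.items()}


def control_totals(shares, population_total):
    """Scale shares to a population total (assumed step, see lead-in)."""
    return {category: share * population_total for category, share in shares.items()}


# Hypothetical person-level records for one margin (education).
education = ["HS or less", "College", "HS or less", "College"]
person_wt = [100.0, 300.0, 200.0, 400.0]
shares = weighted_shares(education, person_wt)
targets = control_totals(shares, population_total=1_000_000)
```

Using shares rather than raw weighted counts keeps every margin consistent with the same state population total, even when the margins come from different sources.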
Calculation of a child weight
BRFSS calculates the design weight for child weighting from the stratum weight times the inverse of the
number of telephones in the household and then multiplies by the number of children:
Child Design Weight = _STRWT * (1/NUMPHON2) * CHILDREN
CHILDWT: BRFSS rakes the child design weight to 5 margins: age by gender, race/ethnicity,
gender by race/ethnicity, age by race/ethnicity, and phone ownership.
_CLLCPWT is the weight assigned for each child interview.