Function to conduct multivariate regression analyses of survey data with the item count technique, also known as the list experiment and the unmatched count technique.
ictreg(formula, data = parent.frame(), treat = "treat", J, method = "ml", weights, h = NULL, group = NULL, matrixMethod = "efficient", overdispersed = FALSE, constrained = TRUE, floor = FALSE, ceiling = FALSE, ceiling.fit = "glm", floor.fit = "glm", ceiling.formula = ~1, floor.formula = ~1, fit.start = "lm", fit.nonsensitive = "nls", multi.condition = "none", maxIter = 5000, verbose = FALSE, ...)
formula | An object of class "formula": a symbolic description of the model to be fitted. |
---|---|
data | A data frame containing the variables in the model |
treat | Name of treatment indicator as a string. For single sensitive item models, this refers to a binary indicator, and for multiple sensitive item models it refers to a multi-valued variable with zero representing the control condition. This can be an integer (with 0 for the control group) or a factor (with "control" for the control group). |
J | Number of non-sensitive (control) survey items. |
method | Method for regression, either |
weights | Name of the weights variable as a string, if weighted regression is desired. Not implemented for the ceiling/floor models, multiple sensitive item design, or for the modified design. |
h | Auxiliary data functionality. Optional named numeric vector with length equal to number of groups. Names correspond to group labels and values correspond to auxiliary moments. |
group | Auxiliary data functionality. Optional character vector of group labels with length equal to number of observations. |
matrixMethod | Auxiliary data functionality. Procedure for estimating optimal weighting matrix for generalized method of moments. One of "efficient" for two-step feasible and "cue" for continuously updating. Default is "efficient". Only relevant if |
overdispersed | Indicator for the presence of overdispersion. If |
constrained | A logical value indicating whether the control group parameters are constrained to be equal. Not relevant for the |
floor | A logical value indicating whether the floor liar model should be used to adjust for the possible presence of respondents dishonestly reporting a negative preference for the sensitive item among those who hold negative views of all the non-sensitive items. |
ceiling | A logical value indicating whether the ceiling liar model should be used to adjust for the possible presence of respondents dishonestly reporting a negative preference for the sensitive item among those who hold affirmative views of all the non-sensitive items. |
ceiling.fit | Fit method for the M step in the EM algorithm used to fit the ceiling liar model. |
floor.fit | Fit method for the M step in the EM algorithm used to fit the floor liar model. |
ceiling.formula | Covariates to include in ceiling liar model. These must be a subset of the covariates used in |
floor.formula | Covariates to include in floor liar model. These must be a subset of the covariates used in |
fit.start | Fit method for starting values for standard design |
fit.nonsensitive | Fit method for the non-sensitive item fit for the |
multi.condition | For the multiple sensitive item design, covariates representing the estimated count of affirmative responses for each respondent can be included directly as a level variable by choosing |
maxIter | Maximum number of iterations for the Expectation-Maximization algorithm of the ML estimation. The default is 5000. |
verbose | A logical value indicating whether model diagnostics are printed out during fitting. |
... | Further arguments to be passed to NLS regression commands. |
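The two codings accepted for the treatment indicator can be illustrated with a small base R sketch. The labels ("treatment", "itemA", "itemB") and column names below are hypothetical and chosen only for illustration; only the conventions "0 for control" and "control" as the control label come from the argument description above.

```r
# Single sensitive item: a factor with "control" marking the control group
# (the label "treatment" is a hypothetical example)
treat_factor <- factor(c("control", "treatment", "control", "treatment"))

# Equivalent integer coding: 0 for control, 1 for treatment
treat_int <- as.integer(treat_factor != "control")

# Multiple sensitive items: a multi-valued variable with 0 for control.
# Making "control" the first level lets the integer codes start at 0.
multi_factor <- factor(c("control", "itemA", "itemB", "control"))
multi_int <- as.integer(relevel(multi_factor, ref = "control")) - 1L

# Either coding would be stored as a column in the data frame, and the
# column name (not the vector itself) is what is passed as a string
survey <- data.frame(treat_f = treat_factor, treat_i = treat_int)
print(table(survey$treat_f, survey$treat_i))
```

With a data frame like this, the column would be referenced by name, for example treat = "treat_i".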
ictreg returns an object of class "ictreg". The function summary is used to obtain a table of the results. The object ictreg is a list that contains the following components. Some of these elements are not available depending on which method is used (lm, nls or ml), which design is used (standard, modified), whether multiple sensitive items are included (multi), and whether the constrained model is used (constrained = TRUE).
point estimate for effect of covariate on item count fitted on treatment group
standard error for estimate of effect of covariate on item count fitted on treatment group
point estimate for effect of covariate on item count fitted on control group
standard error for estimate of effect of covariate on item count fitted on control group
variable names as defined in the data frame
call indicating whether the standard design as proposed in Imai (2010) or the modified design as proposed in Corstange (2009) is used
call of the method used
call indicating whether data is overdispersed
call indicating whether the constrained model is used
call indicating whether the floor/ceiling boundary models are used
indicator for whether multiple sensitive items were included in the data frame
the matched call
the data argument
the design matrix
the response vector
the vector indicating treatment status
Number of non-sensitive (control) survey items set by the user or detected.
a vector of the names used by the treat vector for the sensitive item or items. These are the names from the treat indicator if it is a factor, or the number of the item if it is numeric.
a vector of the names used by the treat vector for the control items. These are the names from the treat indicator if it is a factor, or the number of the item if it is numeric.
posterior predicted probability of answering "yes" to the sensitive item. The weights from the E-M algorithm.
call indicating whether the assumption of no ceiling liars is relaxed, and ceiling parameters are estimated
point estimate for effect of covariate on whether respondents who answered affirmatively to all non-sensitive items and hold a true affirmative opinion toward the sensitive item lied and reported a negative response to the sensitive item
standard error for estimate for effect of covariate on whether respondents who answered affirmatively to all non-sensitive items and hold a true affirmative opinion toward the sensitive item lied and reported a negative response to the sensitive item
call indicating whether the assumption of no floor liars is relaxed, and floor parameters are estimated
point estimate for effect of covariate on whether respondents who answered negatively to all non-sensitive items and hold a true affirmative opinion toward the sensitive item lied and reported a negative response to the sensitive item
standard error for estimate for effect of covariate on whether respondents who answered negatively to all non-sensitive items and hold a true affirmative opinion toward the sensitive item lied and reported a negative response to the sensitive item
variable names from the ceiling liar model fit, if applicable
variable names from the floor liar model fit, if applicable
point estimate for effect of covariate on item count fitted on treatment group
standard error for estimate of effect of covariate on item count fitted on treatment group
point estimate for effect of covariate on item count fitted on treatment group
standard error for estimate of effect of covariate on item count fitted on treatment group
the log likelihood of the model, if ml is used
the residual standard error, if nls or lm is used. This will be a scalar if the standard design was used, and a vector if the multiple sensitive item design was used
the residual degrees of freedom, if nls or lm is used. This will be a scalar if the standard design was used, and a vector if the multiple sensitive item design was used
logical value indicating whether estimation incorporates auxiliary moments
integer count of the number of auxiliary moments
procedure used to estimate the optimal weight matrix
numeric value of the Sargan Hansen overidentifying restriction test statistic
corresponding p-value for the Sargan Hansen test
This function allows the user to perform regression analysis on data from the item count technique, also known as the list experiment and the unmatched count technique.
Three list experiment designs are accepted by this function: the standard design; the multiple sensitive item standard design; and the modified design proposed by Corstange (2009).
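For the standard design, the setup and the simple difference-in-means estimator can be written out as follows. The notation is an assumption introduced here for illustration and does not appear in this help page: T_i is the treatment indicator, Y_i^* is the number of affirmative answers to the J control items, Z_i is the latent answer to the sensitive item, and N_1 and N_0 are the treatment and control group sizes.

```latex
% Illustrative notation (assumed, not taken from this help page)
\begin{align*}
Y_i &= Y_i^{*} + T_i Z_i,
  \qquad Y_i^{*} \in \{0, 1, \dots, J\}, \quad Z_i \in \{0, 1\}, \\
\hat{\tau} &= \frac{1}{N_1} \sum_{i=1}^{N} T_i Y_i
  \;-\; \frac{1}{N_0} \sum_{i=1}^{N} (1 - T_i)\, Y_i .
\end{align*}
```

Under random assignment, and assuming respondents answer truthfully and the control-item answers are unaffected by adding the sensitive item, the difference in means estimates the proportion answering affirmatively to the sensitive item.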
For the standard design, the following methods are implemented in this function: the linear model; the Maximum Likelihood (ML) estimation with the Expectation-Maximization (EM) algorithm and the nonlinear least squares (NLS) estimation with the two-step procedure, both proposed in Imai (2010); and the Maximum Likelihood (ML) estimator in the presence of two types of dishonest responses, "ceiling" and "floor" liars. The ceiling model, floor model, or both, as described in Blair and Imai (2010), can be activated by using the ceiling and floor options. The constrained and unconstrained ML models presented in Imai (2010) are available through the constrained option, and the user can specify whether overdispersion is present in the data for the no-liars models using the overdispersed option, which controls whether a beta-binomial or binomial model is used in the EM algorithm to model the item counts.
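The idea behind the linear model for the standard design can be sketched in base R on simulated data. This is a minimal illustration under stated assumptions, not the package's implementation: the data-generating values, the data frame sim, and the object names are all hypothetical, and ictreg itself may compute standard errors differently. The sketch shows that a regression of the item count on the treatment indicator reproduces the difference in means, and that interacting the treatment indicator with a covariate gives covariate effects for the sensitive item.

```r
set.seed(42)

# Simulated standard-design list experiment (all values are hypothetical)
n <- 2000
J <- 3                                          # number of control items
x <- rnorm(n)                                   # a single covariate
treat <- rbinom(n, 1, 0.5)                      # random assignment
y_star <- rbinom(n, J, plogis(0.2 + 0.3 * x))   # control-item count
z <- rbinom(n, 1, plogis(-1 + 0.8 * x))         # latent sensitive answer
y <- y_star + treat * z                         # observed item count
sim <- data.frame(y = y, treat = treat, x = x)

# Difference in means between treatment and control groups
dim_est <- mean(sim$y[sim$treat == 1]) - mean(sim$y[sim$treat == 0])

# The same quantity as the coefficient on treat in a linear regression
fit_dim <- lm(y ~ treat, data = sim)

# With a covariate: the main effects describe the control items, and the
# treat and x:treat terms describe the sensitive item
fit_lm <- lm(y ~ x + treat + x:treat, data = sim)

print(dim_est)
print(coef(fit_dim))
print(coef(fit_lm))
```

In this parameterization the coefficients on treat and x:treat play the role of the "Sensitive item" block in the summary output, while the intercept and x play the role of the "Control items" block.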
The modified design and the multiple sensitive item design are automatically detected by the function, and only the binomial model without overdispersion is available.
Blair, Graeme and Kosuke Imai. (2012) "Statistical Analysis of List Experiments." Political Analysis. Forthcoming. Available at http://imai.princeton.edu/research/listP.html
Imai, Kosuke. (2011) "Multivariate Regression Analysis for the Item Count Technique." Journal of the American Statistical Association, Vol. 106, No. 494 (June), pp. 407-416. Available at http://imai.princeton.edu/research/list.html
predict.ictreg for fitted values
data(race)

set.seed(1)

# Calculate list experiment difference in means

diff.in.means.results <- ictreg(y ~ 1, data = race,
                                treat = "treat", J=3, method = "lm")

summary(diff.in.means.results)
#>
#> Item Count Technique Regression
#>
#> Call: ictreg(formula = y ~ 1, data = race, treat = "treat", J = 3,
#>     method = "lm")
#>
#> Sensitive item
#>               Est.    S.E.
#> (Intercept) 0.0678 0.04962
#>
#> Control items
#>                Est.    S.E.
#> (Intercept) 2.13413 0.03317
#>
#> Residual standard error: 0.866365 with 1211 degrees of freedom
#>
#> Number of control items J set to 3. Treatment groups were indicated by '' and '' and the control group by ''.
#>

# Fit linear regression
# Replicates Table 1 Columns 1-2 Imai (2011); note that age is divided by 10

lm.results <- ictreg(y ~ south + age + male + college, data = race,
                     treat = "treat", J=3, method = "lm")

summary(lm.results)
#>
#> Item Count Technique Regression
#>
#> Call: ictreg(formula = y ~ south + age + male + college, data = race,
#>     treat = "treat", J = 3, method = "lm")
#>
#> Sensitive item
#>                 Est.    S.E.
#> (Intercept) -0.43430 0.16033
#> south        0.20198 0.11760
#> age          0.07309 0.03051
#> male         0.18023 0.09846
#> college      0.11446 0.09775
#>
#> Control items
#>                 Est.    S.E.
#> (Intercept)  2.40606 0.10511
#> south       -0.18021 0.07450
#> age          0.02047 0.01998
#> male        -0.20177 0.06522
#> college     -0.39408 0.06406
#>
#> Residual standard error: 0.837231 with 1203 degrees of freedom
#>
#> Number of control items J set to 3. Treatment groups were indicated by '' and '' and the control group by ''.
#>

# Fit two-step non-linear least squares regression
# Replicates Table 1 Columns 3-4 Imai (2011); note that age is divided by 10

nls.results <- ictreg(y ~ south + age + male + college, data = race,
                      treat = "treat", J=3, method = "nls")

summary(nls.results)
#>
#> Item Count Technique Regression
#>
#> Call: ictreg(formula = y ~ south + age + male + college, data = race,
#>     treat = "treat", J = 3, method = "nls")
#>
#> Sensitive item
#>                 Est.    S.E.
#> (Intercept) -7.08431 3.66927
#> south        2.48985 1.26819
#> age          0.26094 0.31467
#> male         3.09687 2.82923
#> college      0.61232 1.02951
#>
#> Control items
#>                 Est.    S.E.
#> (Intercept)  1.38811 0.18683
#> south       -0.27655 0.11617
#> age          0.03307 0.03503
#> male        -0.33223 0.10702
#> college     -0.66175 0.11314
#>
#> Residual standard error: 0.900805 with 619 degrees of freedom
#>
#> Number of control items J set to 3. Treatment groups were indicated by '' and '' and the control group by ''.
#>

# NOT RUN {
# Fit EM algorithm ML model with constraint
# Replicates Table 1 Columns 5-6, Imai (2011); note that age is divided by 10

ml.constrained.results <- ictreg(y ~ south + age + male + college, data = race,
                                 treat = "treat", J=3, method = "ml",
                                 overdispersed = FALSE, constrained = TRUE)

summary(ml.constrained.results)

# Fit EM algorithm ML model with no constraint
# Replicates Table 1 Columns 7-10, Imai (2011); note that age is divided by 10

ml.unconstrained.results <- ictreg(y ~ south + age + male + college, data = race,
                                   treat = "treat", J=3, method = "ml",
                                   overdispersed = FALSE, constrained = FALSE)

summary(ml.unconstrained.results)

# Fit EM algorithm ML model for multiple sensitive items
# Replicates Table 3 in Blair and Imai (2010)

multi.results <- ictreg(y ~ male + college + age + south + south:age,
                        treat = "treat", J = 3, data = multi, method = "ml",
                        multi.condition = "level")

summary(multi.results)

# Fit standard design ML model
# Replicates Table 7 Columns 1-2 in Blair and Imai (2010)

noboundary.results <- ictreg(y ~ age + college + male + south,
                             treat = "treat", J = 3, data = affirm,
                             method = "ml", overdispersed = FALSE)

summary(noboundary.results)

# Fit standard design ML model with ceiling effects alone
# Replicates Table 7 Columns 3-4 in Blair and Imai (2010)

ceiling.results <- ictreg(y ~ age + college + male + south,
                          treat = "treat", J = 3, data = affirm,
                          method = "ml", fit.start = "nls",
                          ceiling = TRUE, ceiling.fit = "bayesglm",
                          ceiling.formula = ~ age + college + male + south)

summary(ceiling.results)

# Fit standard design ML model with floor effects alone
# Replicates Table 7 Columns 5-6 in Blair and Imai (2010)

floor.results <- ictreg(y ~ age + college + male + south,
                        treat = "treat", J = 3, data = affirm,
                        method = "ml", fit.start = "glm",
                        floor = TRUE, floor.fit = "bayesglm",
                        floor.formula = ~ age + college + male + south)

summary(floor.results)

# Fit standard design ML model with floor and ceiling effects
# Replicates Table 7 Columns 7-8 in Blair and Imai (2010)

both.results <- ictreg(y ~ age + college + male + south,
                       treat = "treat", J = 3, data = affirm,
                       method = "ml", floor = TRUE, ceiling = TRUE,
                       floor.fit = "bayesglm", ceiling.fit = "bayesglm",
                       floor.formula = ~ age + college + male + south,
                       ceiling.formula = ~ age + college + male + south)

summary(both.results)
# }