Table 11.1 Advantages and disadvantages of three methods of questionnaire administration

| | Face-to-face | Telephone | Mail |
|---|---|---|---|
| Response rates | | | |
| General samples | Good | Good | Good |
| Specialized samples | Good | Good | Good |
| Representative samples | | | |
| Avoidance of refusal bias | Good | Good | Poor |
| Control over who completes the questionnaire | Good | Good | Satisfactory |
| Gaining access to the selected person | Satisfactory | Good | Good |
| Locating the selected person | Satisfactory | Good | Good |
| Effects on questionnaire design | | | |
| Ability to handle: | | | |
| Long questionnaires | Good | Satisfactory | Satisfactory |
| Complex questions | Good | Poor | Satisfactory |
| Boring questions | Good | Satisfactory | Poor |
| Item non-response | Good | Good | Satisfactory |
| Filter questions | Good | Good | Satisfactory |
| Question sequence control | Good | Good | Poor |
| Open ended questions | Good | Good | Poor |
| Quality of answers | | | |
| Minimize socially desirable responses | Poor | Satisfactory | Good |
| Ability to avoid distortion due to: | | | |
| Interviewer characteristics | Poor | Satisfactory | Good |
| Interviewer opinions | Satisfactory | Satisfactory | Good |
| Influence of other people | Satisfactory | Good | Poor |
| Allows opportunities to consult | Satisfactory | Poor | Good |
| Avoids subversion | Poor | Satisfactory | Good |
| Implementing the survey | | | |
| Ease of finding suitable staff | Poor | Good | Good |
| Speed | Poor | Good | Satisfactory |
| Cost | Poor | Satisfactory | Good |

Source: adapted from Dillman, 1978
B.
Sample Size
1.
Depends on funds, time, access to participants, planned
methods of analysis and the degree of precision and accuracy required.
2.
The larger the sample the better.
3.
You can easily reduce sample size by designing a long, complex
questionnaire that requires 30 minutes of participant time to complete. (I
recently hung up on a telephone interviewer because the survey took too much
time.)
C.
Precision and Accuracy of Estimates
1.
Larger sample means more precision.
2.
Table 11.2 shows this.
Table 11.2 Sample sizes required for various sampling errors at 95 per cent confidence level (simple random sampling)

| Sampling error¹ (%) | Sample size² | Sampling error¹ (%) | Sample size² |
|---|---|---|---|
| 1.0 | 9,200 | 5.5 | 330 |
| 1.5 | 4,500 | 6.0 | 277 |
| 2.0 | 2,400 | 6.5 | 237 |
| 2.5 | 1,600 | 7.0 | 204 |
| 3.0 | 1,100 | 7.5 | 178 |
| 3.5 | 816 | 8.0 | 156 |
| 4.0 | 625 | 8.5 | 138 |
| 4.5 | 494 | 9.0 | 123 |
| 5.0 | 400 | 9.5 | 110 |
| | | 10.0 | 100 |

¹ This is in fact two standard errors.
² This assumes a 50/50 split on the variable. These sample sizes would be smaller for more homogeneous samples.
3.
Example: most national surveys reported in the press have a
sample size of 625 individuals, achieving the often-reported error of ±4.0%.
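Several rows of Table 11.2 can be reproduced from the usual simple random sampling relationship between sample size and sampling error. The sketch below assumes the table's own conventions (an error band of two standard errors and a 50/50 split), so that n = (z/e)² p(1−p). The function name and parameters are illustrative and do not come from deVaus. The 4.0, 5.0 and 10.0 per cent rows come out exactly; some other rows in the table differ from this formula by rounding or more.

```python
def required_sample_size(error_pct, p=0.5, z=2.0):
    """Approximate n for a given sampling error (in per cent), simple random sampling.

    Assumptions (mirroring the table footnotes): the error band is z standard
    errors wide (z = 2), and the variable splits p / (1 - p) in the population.
    """
    # The standard error of a proportion is sqrt(p(1-p)/n).
    # Setting z * SE equal to the error and solving for n gives the line below.
    return (z * 100.0 / error_pct) ** 2 * p * (1.0 - p)


for error in (4.0, 5.0, 10.0):
    print(error, round(required_sample_size(error)))
```

A more homogeneous split, say p = 0.2, lowers the requirement, which is the point of the second table footnote.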
D.
Optimal Sample Size
1.
The optimal sample size is the size at which marginal cost equals
marginal benefit.
2.
If you have a large budget, marginal cost may be low.
3.
If accuracy requires a larger n in some cells, the marginal
benefit may be high.
E.
Variation in Key Variables
1.
We need variation in the dependent variable.
2.
We also need variation in the independent variables. Otherwise, they won’t be able to explain
much.
F.
Statistical Controls
1.
Since the controls will be applied after the data is
collected, you must think about what controls you will need before you design
the survey instrument.
2.
Once you’ve thought about them, be sure to include them in
your survey instrument.
G.
Length
1.
Questionnaires can’t be too long or your response rate will
suffer.
2.
However, if your topic is of interest to people, you can get
away with a somewhat longer survey instrument.
H.
Types of Data
1.
Must choose between open-ended and forced-choice responses.
2.
Must also choose between nominal and ordinal choices.
a)
Economists generally prefer ordinal data because nominal data, which enters a
regression as a set of dummy variables, rapidly uses up degrees of freedom.
b)
However, some data (e.g., gender) must be nominal.
VI.
Cross-Sectional Analysis: Descriptive Phase
A.
Counting
1.
We count responses to different questions because we want to
look at the distribution of responses.
2.
Where population data is available, response frequency should
be compared to population frequency as one measure of how representative your
sample is.
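As a minimal sketch of the counting step, the snippet below tallies responses and sets the sample shares beside known population shares. The function name, the ten responses and the population shares are all made up for illustration.

```python
from collections import Counter


def compare_to_population(responses, population_shares):
    """Return {category: (sample_share, population_share, difference)}."""
    counts = Counter(responses)
    n = len(responses)
    report = {}
    for category, pop_share in population_shares.items():
        sample_share = counts.get(category, 0) / n
        report[category] = (sample_share, pop_share, sample_share - pop_share)
    return report


# Hypothetical data: ten respondents and invented population shares.
responses = ["female"] * 7 + ["male"] * 3
population = {"female": 0.5, "male": 0.5}

for category, (s, p, d) in compare_to_population(responses, population).items():
    print(f"{category}: sample {s:.2f}, population {p:.2f}, difference {d:+.2f}")
```

Large differences suggest that the sample over- or under-represents a group; how large is too large is a judgment call that this sketch does not make.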
B.
Collapsing Interval Variables
1.
Be careful doing this. The way variables are collapsed can
change your interpretation of the results.
2.
Table 12.1 (p. 197) is an excellent illustration of this.
3.
One solution: let the distribution of the data determine your
intervals rather than assigning them yourself.
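One way to let the data determine the intervals is to cut at sample quantiles, so that each collapsed category holds roughly the same number of cases. This is a sketch of that idea using only the Python standard library. The function name and the ages are hypothetical, and quantile cuts are only one of several data-driven choices.

```python
import bisect
import statistics


def collapse_by_quantiles(values, groups=4):
    """Assign each value to one of `groups` bins bounded by sample quantiles."""
    cuts = statistics.quantiles(values, n=groups)  # groups - 1 cut points
    # bisect_right places a value equal to a cut point in the upper bin.
    bins = [bisect.bisect_right(cuts, v) for v in values]
    return bins, cuts


# Hypothetical ages for twelve respondents.
ages = [18, 21, 22, 25, 29, 33, 38, 41, 47, 52, 60, 74]
bins, cuts = collapse_by_quantiles(ages)
print(cuts)
print(bins)
```

Changing `groups` shows how sensitive a cross-tabulation can be to the way the variable is collapsed.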
C.
Form of Data
1.
In general, the statistical software will handle problems in
this area.
2.
However, you may want to rescale a variable to make its
coefficient easier to interpret.
3.
deVaus suggests normalizing variables, viz.,
.
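The notes do not show the formula. One common normalization, assumed here and not confirmed as the one deVaus uses, is the z-score: subtract the mean and divide by the standard deviation, so that a coefficient reads as the effect of a one-standard-deviation change. The function name and the income figures are illustrative.

```python
import statistics


def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(v - mean) / sd for v in values]


# Hypothetical incomes in thousands of dollars.
incomes = [20, 30, 40, 50, 60]
z = standardize(incomes)
print([round(v, 3) for v in z])
```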
VII.
Cross-Sectional Analysis: Explanatory Analysis
A.
Statistical Controls
1.
The ceteris paribus assumption is embodied in our
independent variables.
a)
A regression coefficient tells us how much our dependent
variable will respond to a change in one independent variable holding all other
independent variables constant.
b)
One common procedure is to set the other independent variables
to their mean values, then look at the response of the dependent
variable. For example, elasticities are
often calculated at the means of the variables.
2.
“… it is not possible to control for every possible variable,
so the possibility always remains that any … differences could be due to these
uncontrolled variables.” (deVaus, p. 203)
a)
This corresponds exactly to the “omitted variable” problem in
statistics.
b)
While it is important to discuss this problem, if you worry
about it too much you’ll never get any research done.
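The elasticity-at-means calculation mentioned in item 1(b) can be sketched as follows. With a linear model the slope is the same everywhere, so the elasticity at the means is the slope times the ratio of the mean of x to the mean of y. The example uses a single regressor to stay short; the function name and the price and quantity figures are hypothetical.

```python
import statistics


def elasticity_at_means(x, y):
    """Slope of a simple OLS regression of y on x, scaled by mean(x) / mean(y)."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx  # OLS slope coefficient
    return slope * x_bar / y_bar


# Hypothetical price and quantity observations.
price = [1.0, 2.0, 3.0, 4.0, 5.0]
quantity = [10.0, 8.0, 6.0, 4.0, 2.0]
print(elasticity_at_means(price, quantity))
```

In a multiple regression the same scaling applies to each coefficient in turn, with the remaining regressors held at their means.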
VIII.
Some Statistical Concerns
A.
Heteroscedasticity
1.
This is often a problem in cross-sectional analysis.
2.
Three common tests; use groupwise heteroscedastic model (see
William H. Greene, Econometric Analysis, Macmillan, 1993, sections
14.3.1, 14.3.4, and 16.3.1):

In other words, allow $s^2$ to vary across $i$, constructing groups. In the example
shown above there are four groups. Then perform one of the following tests.
a)
Lagrange multiplier test

where T is the number of time periods (equal to 1 for pure cross-sectional
data). The test is a chi-squared test with n-1 degrees of freedom, since equal
variances across n groups impose n-1 restrictions.
b)
White's test
Regress the squared OLS residual on a constant and the independent variables,
their squares, and their cross-products, viz., $X_1$, $X_2$, $X_1^2$, $X_2^2$,
and $X_1 X_2$. Calculate the chi-squared statistic as $(nT)R^2$ and perform the
usual chi-squared test. Note that the null hypothesis being tested is that the
data is homoscedastic. Rejecting the null hypothesis means you have a
heteroscedasticity problem.
While White's test is very robust and doesn't assume normality (which the LM test
does), it chews up degrees of freedom rather rapidly.
c)
Approximate likelihood ratio test
The
approximate likelihood ratio statistic is

where
and 
This chi-squared statistic has n-1 degrees of freedom. If only least squares
estimates are available, $s^2$ and $s_i^2$ may be used. However, you will lose
some power of the test, particularly if your sample is small.
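The displayed statistics are not reproduced in these notes. As a rough sketch, assume the forms usually given for the groupwise model: LM = (T/2) Σ (s_i²/s² − 1)² and an approximate likelihood ratio of nT ln s² − T Σ ln s_i². Here s_i² is the mean squared residual in group i and s² is the pooled value. Both formulas are an assumption about what the missing displays contained and should be checked against Greene before use. The function name and the residuals are hypothetical.

```python
import math


def groupwise_heteroscedasticity_stats(residuals_by_group):
    """Return (LM, LR) statistics for equal-sized groups of regression residuals."""
    n = len(residuals_by_group)        # number of groups
    T = len(residuals_by_group[0])     # observations per group
    if any(len(g) != T for g in residuals_by_group):
        raise ValueError("this sketch assumes every group has the same size T")

    # Mean squared residual within each group, and the pooled value.
    group_var = [sum(e * e for e in g) / T for g in residuals_by_group]
    pooled_var = sum(group_var) / n    # equals total SSR / (nT) when groups are balanced

    lm = (T / 2.0) * sum((v / pooled_var - 1.0) ** 2 for v in group_var)
    lr = n * T * math.log(pooled_var) - T * sum(math.log(v) for v in group_var)
    return lm, lr


# Hypothetical residuals: the second group is visibly more dispersed.
lm, lr = groupwise_heteroscedasticity_stats([[1.0, -1.0], [3.0, -3.0]])
print(lm, lr)
```

Both statistics are zero when every group has the same variance and grow as the group variances diverge; each is compared with a chi-squared critical value.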
B.
Sample size is always a concern, particularly when working
with primary data. Remember, you lose
one degree of freedom for each independent variable. If you have k independent variables, you should have at least 50+k
observations.
IX.
Issues in Questionnaire Design
A.
The standard (and only) reference is William Foddy, Constructing
Questions for Interviews and Questionnaires: Theory and Practice in Social
Research, Cambridge University Press, 1993.
B.
Causes of error in gathering data using surveys (p. 2)
1.
Respondents' failure to understand the questions as intended.
2.
A lack of effort, or interest, on the part of respondents.
3.
Respondents' unwillingness to admit to certain attitudes or
behaviors.
4.
Failure of respondents' memory or comprehension processes in
the stressed conditions of the interview.
5.
Interviewer failures of various kinds (e.g. the tendency to
change wording, failures in presentation procedures and the adoption of faulty
recording procedures).
C.
Examples of causes of errors
1.
Factual questions sometimes elicit invalid answers.
2.
The relationship between what respondents say they do and what
they actually do is not always very strong.
3.
Respondents' attitudes, beliefs, opinions, habits, interests
often seem to be extraordinarily unstable.
4.
Small changes in wording sometimes produce major changes in
the distribution of responses.
5.
Respondents commonly misinterpret questions.
6.
Answers to earlier questions can affect respondents' answers
to later questions.
7.
Changing the order in which response options are presented
sometimes affects respondents' answers.
8.
Respondents' answers are sometimes affected by the question
format per se.
9.
Respondents often answer questions even when it appears that
they know very little about the topic.
10.
The cultural context in which a question is presented often
has an impact on the way respondents interpret and answer questions.
D.
The key issue: the comparability of answers (p. 17)
1.
The researcher must be clear about the nature of the
information required and encode a request for this information.
2.
The respondent must decode this request in the way the
researcher intends it to be decoded.
3.
The respondent must encode an answer that contains the
information the researcher has requested.
4.
The researcher must decode the answer as the respondent
intended it to be decoded.
E.
Symbolic interactionist theory (Blumer, summarized in Foddy
pp. 19 ff.)
1.
Human beings interpret and define each other's actions.
2.
Human beings can be the objects of their own attention. In other words they can act toward
themselves as they act toward others.
3.
Conscious social behavior is intentional behavior.
4.
Interpreting, planning and acting are ongoing processes which
begin anew at every stage of a social interaction. Both parties in a dyadic interaction engage in these processes.
5.
Human intelligence is, in part, reflexive in character.
6.
These processes occur in all social situations (although they
will be most obvious in newly formed situations as the interactants struggle to
align their behaviors with one another).

Figure 2.2 in Foddy (p. 22)
X.
Conclusion
A.
Cross-sectional data analysis avoids some (but not all)
statistical problems.
B.
Make sure you have a large enough sample size.
C.
Design your questionnaire correctly.