Health Services Research Methods 3rd Edition by Leiyu Shi – Answer Key
1. Why is sampling used?
Sampling is used to study a large population of interest efficiently and accurately through examination of a carefully selected subset or sample of the population. It allows researchers to make generalizations about the whole population of interest, that is, to make estimates and test hypotheses about population characteristics based on data from the sample.
2. What are the differences between probability and nonprobability sampling? When is one more appropriate than the other?
Probability sampling requires the specification of the probability that each sample element will be included in the sample. The process of random sampling consists of using a sampling frame and some random procedure of selection that makes probability estimation and the use of inferential statistics possible. Nonprobability sampling does not require the specification of the probability that each sample element will be included in the sample. Typically, nonrandom procedures are used to select sampling elements. A nonprobability sample may not be representative of the population of interest, and its findings therefore may not be generalizable to the population. Nonprobability sampling is much more convenient, less expensive, and less time-consuming than probability sampling and may be useful when probability sampling methods cannot be used.
Probability sampling methods are generally used at later rather than exploratory phases of research, when accuracy of samples is critical so that sample findings may be validly generalized to the population. Nonprobability sampling is often used in the early or exploratory stage of a study, where the purpose is to find out more information about the topic, discover interesting patterns, and generate hypotheses for later, more formal investigation. It is also used when data accuracy is not very important, when resources such as time and money are very limited, or when certain subjects are difficult to locate or access. Nonprobability sampling may also be used when there are very few population elements, when random sampling may not generate a representative sample, or when sampling based on expert judgment may be more reliable.
3. Among probability sampling methods, what kinds of research conditions are more appropriate for each sampling method?
Commonly used probability sampling methods include simple random sampling, systematic sampling, cluster sampling, and stratified sampling. Simple random and systematic sampling methods are more likely to be used if the population is homogeneous, while cluster sampling is the preferred method if the population is scattered. Resource constraints in terms of time and money favor cluster over simple random or systematic sampling when face-to-face interviews are planned. The availability of a sampling frame also affects the choice of a particular method.
The feasibility of simple random sampling depends largely on whether accurate and complete lists exist of the population elements from which the sample is to be drawn. Where no such list exists, composing a sampling frame may be prohibitively expensive and time-consuming.
Systematic sampling is easier to draw than simple random sampling, so it is often used in lieu of simple random sampling, particularly if the sampling list is long or the desired sample size is large. Systematic sampling is commonly used when choosing a sample from city or telephone directories or other preexisting but unnumbered lists.
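The difference between the two methods can be sketched in a few lines of Python. This is a minimal illustration, not a procedure from the text: the function names, the `seed` parameter, and the choice of a whole-number sampling interval are assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements from the sampling frame, each with equal probability."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def systematic_sample(frame, n, seed=None):
    """Pick a random start, then take every k-th element of the list."""
    k = len(frame) // n  # sampling interval (assumed to be a whole number)
    if k < 1:
        raise ValueError("sample size exceeds the length of the frame")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start within the first interval
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame: an unnumbered list of 100 directory entries.
frame = [f"entry_{i}" for i in range(100)]
print(simple_random_sample(frame, 10, seed=1))
print(systematic_sample(frame, 10, seed=1))
```

Systematic sampling needs only one random draw (the start), which is why it is easier to carry out on a long list.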
Cluster sampling is commonly used in survey practice, and clusters are usually natural groupings, such as organizations or associations, or geographic units. Deciding whether to study all the elements within the cluster or to randomly select the elements for study usually depends on the heterogeneity of the elements within the clusters. The more heterogeneous the elements, the greater the proportion of them that should be studied. This method is more expensive in terms of sample frame preparation and data collection, but considerable amounts of money and time can be saved overall because cluster sampling does not require the complete lists of each and every population unit to be constructed. Travel-related time and costs for the purpose of conducting interviews are also greatly reduced.
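The choice between studying every element in a selected cluster and subsampling within it can be sketched as follows. The clinic and patient names are hypothetical, and the function is only an illustration of the idea under the assumption that clusters are selected with equal probability.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """Select clusters at random, then take all or some elements of each.

    clusters: dict mapping a cluster name to the list of its elements.
    n_per_cluster: None means study every element in each chosen cluster.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: clusters
    sample = {}
    for name in chosen:
        members = clusters[name]
        if n_per_cluster is None or n_per_cluster >= len(members):
            sample[name] = list(members)  # take the whole cluster
        else:
            sample[name] = rng.sample(members, n_per_cluster)  # stage 2
    return sample

# Hypothetical population: 8 clinics with 20 patients each.
clinics = {f"clinic_{i}": [f"patient_{i}_{j}" for j in range(20)]
           for i in range(8)}
print(cluster_sample(clinics, n_clusters=3, n_per_cluster=5, seed=0))
```

Only the list of clusters and the lists for the chosen clusters are needed, not a complete list of every patient.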
Stratified sampling may be proportional or disproportional, depending on the probability that each sampling element will be selected. In proportional stratified sampling, all population strata are sampled in proportion to their share of the population. Disproportional sampling is used when one or more strata within the population are small and would not otherwise appear in sufficient numbers in simple random sampling. In general, disproportional sampling is used whenever a simple random sample would not produce enough cases of a certain type to support the intended analysis.
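A small numerical sketch may help contrast the two allocations. The strata sizes and the disproportional allocation below are invented for illustration. The weight calculation (stratum size divided by stratum sample size) is a standard adjustment added here as an assumption; the answer above does not discuss weighting.

```python
def proportional_allocation(strata_sizes, n):
    """Allocate n cases to strata in proportion to their population share.

    Rounding can make the total differ slightly from n for other inputs.
    """
    total = sum(strata_sizes.values())
    return {name: round(n * size / total)
            for name, size in strata_sizes.items()}

def design_weights(strata_sizes, allocation):
    """Number of population elements each sampled case stands for."""
    return {name: strata_sizes[name] / allocation[name]
            for name in strata_sizes}

# Hypothetical population of 10,000 with a small rural stratum.
strata = {"urban": 7000, "suburban": 2500, "rural": 500}

proportional = proportional_allocation(strata, 200)
print(proportional)  # {'urban': 140, 'suburban': 50, 'rural': 10}

# Disproportional: oversample the rural stratum to support its analysis.
disproportional = {"urban": 100, "suburban": 50, "rural": 50}
print(design_weights(strata, disproportional))
```

With proportional allocation only 10 rural cases are drawn, too few for a separate analysis; the disproportional design raises that to 50 at the cost of unequal selection probabilities.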
4. Among nonprobability sampling methods, what kinds of research conditions are more appropriate for each sampling method?
Examples of nonprobability sampling methods include convenience sampling, quota sampling, purposive sampling, and snowball sampling. These methods may not be representative of the population of interest and may not be generalizable to the population, but they are more convenient, less expensive, and less time-consuming than probability sampling, and may be useful when probability sampling methods cannot be used.
Convenience sampling relies on available subjects for inclusion in a sample. It is quick and easy, but the resulting sample may not represent the population of interest. This method may be used at an early stage of research when a mere feel for the subject matter is needed.
Quota sampling divides the population into relevant strata, but does not provide all population elements an equal or known probability for being selected. This method is cheap, easy, and convenient, and saves time with respect to data collection, but its representativeness is often questionable, particularly when investigators select subjects who are more conveniently available.
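The source of the representativeness problem is visible in a short sketch: subjects are accepted in the order they happen to be available until each stratum's quota is full. The subject list, the quotas, and the `stratum_of` helper are hypothetical.

```python
def quota_sample(subjects, quotas, stratum_of):
    """Accept subjects in arrival order until every quota is filled.

    subjects: iterable of subjects in the order they become available.
    quotas: dict mapping a stratum to the number of cases wanted.
    stratum_of: function returning the stratum of a subject.
    """
    counts = {stratum: 0 for stratum in quotas}
    sample = []
    for subject in subjects:
        stratum = stratum_of(subject)
        if stratum in counts and counts[stratum] < quotas[stratum]:
            sample.append(subject)
            counts[stratum] += 1
        if counts == quotas:  # all quotas met, stop recruiting
            break
    return sample

# Hypothetical walk-in order at a clinic: (id, sex).
arrivals = [("p1", "F"), ("p2", "F"), ("p3", "M"),
            ("p4", "F"), ("p5", "M"), ("p6", "M")]
picked = quota_sample(arrivals, {"F": 2, "M": 2}, lambda s: s[1])
print(picked)  # [('p1', 'F'), ('p2', 'F'), ('p3', 'M'), ('p5', 'M')]
```

The strata are filled correctly, but within each stratum selection depends entirely on who was available first, so no selection probability is known.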
Purposive sampling is a nonprobability sampling method that depends on the personal judgment of the researcher selecting the sample. It may be used when the sample size is small and simple random sampling may not select the most representative elements. It is economical, but requires considerable prior knowledge of the population before selecting the sample.
Snowball sampling relies on informants to identify other relevant subjects for study inclusion. It is particularly useful for studying populations who are difficult to identify or access, but its representativeness is limited to the investigators’ network of informants.
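The dependence on the informants' network can be illustrated by treating referrals as a graph and following them outward from the initial informants. The referral data and names below are invented, and the breadth-first order is an assumption made for the sketch rather than a rule of the method.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Follow referrals outward from the seed informants.

    referrals: dict mapping a subject to the subjects he or she names.
    seeds: initial informants (assumed to contain no duplicates).
    """
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:  # recruit each person only once
                seen.add(contact)
                queue.append(contact)
    return sample

# Hypothetical referral network; F and G are not connected to A.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "F": ["G"]}
print(snowball_sample(referrals, ["A"], max_size=10))
# ['A', 'B', 'C', 'D', 'E']
```

Subjects F and G belong to the same population but can never enter the sample, because no one reachable from the seed informant names them.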
5. How does a researcher decide on the appropriate sample size for a given study?
Factors taken into consideration when determining the sample size include the characteristics of the population, the nature of the analysis to be conducted, the desired precision of the estimates, the resources available, the study design, and the anticipated response rate.
Population characteristics such as heterogeneity and size have a significant impact on the size of a representative sample. In general, the more heterogeneous a population, the larger the sample size required, and the less heterogeneous the population, the smaller the sample size required. When all population elements are different, a census of every element is required. When there is no heterogeneity among population elements, a sample of one is sufficient. The accuracy of a sample estimate may be indicated by its standard error, which reflects the magnitude of differences in the measured variable among study subjects. The more heterogeneous the population is, the larger the sample size must be to minimize the standard error. Given the same heterogeneity, a larger population requires a larger sample size than a smaller population. However, sample size does not need to increase in proportion to population size. Only when the population is small does the sampling fraction have a significant impact on the standard error.
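These relationships can be checked numerically. The sketch below uses the usual standard error of a sample mean with the finite population correction; the formula is standard but is not given in the answer above, and the function name and example figures are assumptions for illustration.

```python
import math

def standard_error(sd, n, population_size=None):
    """Standard error of a sample mean.

    sd: standard deviation of the measured variable (heterogeneity).
    n: sample size.
    population_size: if given, apply the finite population correction
        sqrt((N - n) / (N - 1)).
    """
    se = sd / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se

# More heterogeneity (larger sd) means a larger standard error at the same n.
print(standard_error(sd=10, n=100))
print(standard_error(sd=20, n=100))

# The sampling fraction matters only when the population is small.
print(standard_error(sd=10, n=100, population_size=200))
print(standard_error(sd=10, n=100, population_size=1_000_000))
```

A sample of 100 drawn from a population of 200 has a noticeably smaller standard error than the uncorrected value, while the same sample drawn from a population of one million is almost unaffected by the correction.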
Sample size is also determined by the nature of the analysis to be performed. The types of analyses to be conducted, the number of comparisons that will be made, and the number of variables that have to be examined simultaneously have a significant influence on sample size. In general, the more comparisons or subgroup analyses to be performed, the larger the sample size should be. The number of variables to be analyzed at one time also influences sample size. Typically, in quasi-experimental research, relevant variables have to be controlled statistically because groups differ by factors other than chance. The more variables that need to be analyzed simultaneously, the larger the sample size should be to make sure the investigator will have sufficient cases representing the variables considered. Before deciding on the sample size, researchers should know the type of analysis they are going to conduct with the data. A rule of thumb is to include at least 30 to 50 cases for each subcategory.
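The rule of thumb translates into simple arithmetic once the subcategories are counted. The sketch below assumes, for illustration only, that cases are spread evenly across subcategories; with uneven subgroups the required total would be larger. The function name and example variables are hypothetical.

```python
import math

def min_sample_for_subgroups(levels_per_variable, cases_per_cell=30):
    """Minimum sample size so that each subcategory has enough cases.

    levels_per_variable: number of categories of each variable analyzed
        simultaneously, e.g. [2, 3] for sex by three age groups.
    Assumes cases fall evenly across all subcategories.
    """
    cells = math.prod(levels_per_variable)  # number of subcategories
    return cells * cases_per_cell

# Sex (2) by age group (3): 6 subcategories.
print(min_sample_for_subgroups([2, 3]))                        # 180
# Adding insurance status (4) and using 50 cases per subcategory.
print(min_sample_for_subgroups([2, 3, 4], cases_per_cell=50))  # 1200
```

Each variable added to the simultaneous analysis multiplies the number of subcategories, which is why sample size requirements grow quickly.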
The more precise the estimates must be, the larger the sample size required. Generally, the level of accuracy of estimates hinges on the importance of the research findings. If important decisions are going to be based on research findings, then decision makers demand a very high level of confidence in the data and estimates. In such cases, a larger sample size would be needed. If there are few, if any, major decisions to be based on the research findings or only rough estimates are required by the sponsor, then the sample size would be correspondingly small.
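The trade-off between precision and sample size can be made concrete with the common formula for estimating a proportion from a simple random sample, n = z^2 p(1 - p) / e^2. The formula is a standard one brought in here as an assumption; it is not part of the answer above, and the function and parameter names are illustrative.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin_of_error, confidence=0.95,
                               expected_proportion=0.5):
    """Sample size needed to estimate a proportion within a given margin.

    Assumes simple random sampling from a large population;
    expected_proportion=0.5 is the most conservative choice.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided z value
    p = expected_proportion
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)

print(sample_size_for_proportion(0.05))  # 385
print(sample_size_for_proportion(0.03))  # 1068
```

Tightening the margin of error from 5 to 3 percentage points nearly triples the required sample, which is why the precision demanded should reflect the importance of the decisions at stake.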
Important resources, such as time, money, and staff support, can also influence sample size, and the sample size may be prespecified by the sponsor through available funding. The amount of the budget may dictate the upper limit of a sample because the budgeted research funding is needed not just for data collection, but also research preparation, data analysis, and reporting. The time element is important if decisions based on the research have to be made at a certain time. Then research activities have to be planned around this deadline, and a smaller sample size may be necessary. Staff support is particularly important in interview surveys where the number of interviewers available is directly correlated with the number of subjects that can be studied given a particular time period, or the speed at which data can be collected given the number of interviews to be conducted.
Different study designs tend to have different demands for sample size. If the variables in an experiment are controlled, researchers can use a relatively small sample size. In quasi-experimental designs, a larger sample size is generally required to statistically control for extraneous factors. For the same reason, stratified, cluster, and quota sampling methods generally require a smaller sample size than simple random or systematic sampling methods.
Less-than-perfect response rates can cause the ideal and actual sample sizes to differ. The actual sample size is usually smaller than initially planned because of incomplete, unusable, or missing questionnaires. Researchers need to anticipate these factors and make necessary adjustments at an early stage. Issues such as relevance of the topic to respondents, number of and personal relationships with contacts, time of survey administration, question complexity, and questionnaire design all influence response rates. Response rate also affects the validity of the research: if a systematic bias influences who responds, then the results of the study may not be generalizable to the whole population.
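One simple way to make this adjustment is to inflate the number of subjects contacted by the anticipated response rate. The sketch below is an illustration under assumed rates; the function name, the `usable_rate` parameter, and the example figures are hypothetical.

```python
import math

def initial_sample_size(target_n, response_rate, usable_rate=1.0):
    """Number of subjects to contact so that target_n usable cases remain.

    response_rate: anticipated share of contacted subjects who respond.
    usable_rate: anticipated share of returned questionnaires that are
        complete and usable.
    """
    if not (0 < response_rate <= 1 and 0 < usable_rate <= 1):
        raise ValueError("rates must be greater than 0 and at most 1")
    return math.ceil(target_n / (response_rate * usable_rate))

# 385 usable cases wanted, 60% expected to respond.
print(initial_sample_size(385, response_rate=0.60))  # 642
# Same target, allowing for 5% unusable questionnaires as well.
print(initial_sample_size(385, response_rate=0.60, usable_rate=0.95))
```

Inflating the initial sample protects the planned sample size, but it does not remove bias if nonrespondents differ systematically from respondents.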