Health research method in Medicine

<span class="text_page_counter">Trang 1</span><div class="page_container" data-page="1">

<b>HEALTH RESEARCH METHODS </b>

(TRAINING MODULE 2)

The Health Department of the

Ethiopian Science and Technology Commission

</div><span class="text_page_counter">Trang 2</span><div class="page_container" data-page="2">

</div><span class="text_page_counter">Trang 3</span><div class="page_container" data-page="3">

©2005 Ethiopian Science and Technology Commission, Health Department. Material from this module may be reproduced without prior permission for non-profit-making purposes only.

</div><span class="text_page_counter">Trang 4</span><div class="page_container" data-page="4">

1.3. Selection of study design

1.4. Observational versus Experimental (Intervention) studies

1.5. Observational studies

<i>1.5.1. Descriptive studies</i>

<i>1.5.2. Analytical studies</i>

1.6. Experimental Studies

<i>1.6.1. The randomized clinical trial (RCT)</i>

<i>1.6.2. Community intervention trials (CITs)</i>

<b>2. Sampling Methods and Sample Size</b>

3.2. Data collection techniques

<i>3.2.1. Utilization of available data</i>

<i>3.2.2. Observing</i>

<i>3.2.3. Interviewing</i>

<i>3.2.4. Self administering written questionnaires</i>

<i>3.2.5. Focus Group Discussion</i>

3.3. Bias in Information Collection and its possible causes

<i>3.3.1. Defective instruments</i>

<i>3.3.2. Observer Bias</i>

</div><span class="text_page_counter">Trang 5</span><div class="page_container" data-page="5">

<i>3.3.3. Selection bias</i>

<i>3.3.4. Information Bias</i>

<i>3.3.5. Effect of the Interview (er) on the Informant</i>

3.4. Importance of Combining Different Data Collection Techniques

<b>4. Variables and measurement errors</b>

4.1. Learning objectives

4.2. What is a variable?

4.3. Defining variables and indicators of variables

4.4. Dependent and Independent variables

4.5. What is validity and reliability?

5.3. Why Use Qualitative Research?

5.4. How is Qualitative research used?

5.5. Characteristics of qualitative research

5.6. Design questions in qualitative research

<i>5.6.1. Defining an area of inquiry</i>

<i>5.6.2. Stating the research problem</i>

<i>5.6.3. Developing conceptual framework</i>

<i>5.6.4. Formulating qualitative research questions</i>

5.7. Relevant concepts for designing qualitative research

<b>Answers to self-assessment questions</b>

<b>References and further reading</b>

<b>Annexes</b>

</div><span class="text_page_counter">Trang 6</span><div class="page_container" data-page="6">

<b>PREFACE </b>

Over the last century, no more than 10,000 health research and development publications appeared about Ethiopia. Until the launch and expansion of postgraduate programs at Addis Ababa University and the issuance of the Health Science and Technology policy, this work was financially and/or technically driven by foreign researchers and donors.

Health research is essential in developing evidence-based interventions that will make a difference in mitigating health problems, promoting health and ultimately improving the quality of life of the Ethiopian People.

Generally, however, health research and development output in terms of quality, volume and implementation has been very low compared with other developing countries in Africa and elsewhere.

To date, health research capacity in terms of clear and transparent human, institutional and financial systems is not well established. Linkages, accountabilities, responsibilities and networking among and between research and development institutions lack clarity, and much remains to be done. Conducive, implementable and comprehensive policy, strategy and legal frameworks are not properly entrenched. This calls for serious attention and consideration by all concerned, including the public, business and private sectors as well as professional and civic societies.

To address this problem and its impact on the quality and volume of health research and development output the Health Department of ESTC in collaboration with the Ethiopian Public Health Association (EPHA), Regional States Health Bureaus and CDC Ethiopia has prepared this modular National Health Research Capacity Building Project. As a package the project has 6 modules with evaluation instrument tools and training guides respectively, prepared separately for easy reference and application.

The project aims to demystify the design and conduct of relevant, fundable, methodologically and ethically sound health research projects, and to promote the presentation and publication of results in scientific fora and reputable journals. These ends are to be achieved by promoting basic health research skills through a modular training approach for mid- and high-level Federal and Regional health professionals drawn from government and non-governmental organizations. This project is, therefore, believed to lay the cornerstone of a national health research capacity.

The project has an in-built monitoring and evaluation mechanism to measure its outcome and impact as well as the process of its implementation in terms of countrywide fundable health projects, and publishable and presentable health research outputs in the coming three years.

</div><span class="text_page_counter">Trang 7</span><div class="page_container" data-page="7">

<small>This module on Health Research Methods is part of the modular package addressing basic issues related to research methods and their applications in undertaking health research focusing on: </small>

· Different types of epidemiological research study designs, their uses, and limitations;

· How to identify the most appropriate study designs for a particular question.

· Sampling methods and sample size determination.

· The use of qualitative research methods in health research.

This venture is a testimony of fruitful collaborative research capacity building based on an excellent public-private partnership between the Ethiopian Science and Technology Commission and Regional States’ Health Bureaus on one hand and the Ethiopian Public Health Association and CDC on the other.

Finally, it gives me pleasure to express my heart-felt gratitude to the project management committee (PMC) members, namely Dr. Teferi Gedif (ESTC), Dr. Mahadi Bekri (EPHA), Dr. Frihiwot Berhane (EPHA) and Dr. Shabir Ismail (CDC), and to the consultants, reviewers, institutions, and personalities directly or indirectly involved in the realization of this module. I strongly recommend that beneficiaries make the best use of this module and the training opportunity.

Last but not the least, my appreciation goes to the secretariat staff at the Health Department of the ESTC and EPHA for their contribution in making this project a

</div><span class="text_page_counter">Trang 8</span><div class="page_container" data-page="8">

<b>ACRONYMS </b>

<b>AIDS </b> Acquired Immunodeficiency Syndrome

<b>CIT </b> Community Intervention Trial

<b>FGD </b> Focus Group Discussion

<b>HIV </b> Human Immunodeficiency Virus

<b>INH </b> Isoniazid

<b>KAP </b> Knowledge, Attitude and Practice

<b>MCH </b> Maternal and Child Health

<b>RCT </b> Randomized Clinical Trial

<b>STD </b> Sexually Transmitted Diseases

<b>TB </b> Tuberculosis

<b>VCT </b> Voluntary Counseling and Testing

</div><span class="text_page_counter">Trang 9</span><div class="page_container" data-page="9">

<b>GLOSSARY </b>

<i><b>Blinding (masking) </b></i> A technique by which observers and/or subjects are kept ignorant of the group to which the subjects are assigned.

<i><b>Morbidity </b></i> Any departure, subjective or objective, from a state of physiological or psychological well-being. In this sense sickness, illness and morbid conditions are similarly defined and synonymous.

<i><b>Mortality rate </b></i> An estimate of the proportion of a population that dies during a specified period.

<i><b>Prevalence rate </b></i> The total number of individuals who have an attribute or disease at a particular time or during a particular period divided by the population at risk.

<i><b>Point prevalence </b></i> The number of persons with a disease or an attribute at a specified point in time.

<i><b>Period prevalence </b></i> The total number of persons known to have had the disease or attribute at any time during a specified period.

<i><b>Rate </b></i> A measure of the frequency of occurrence of a phenomenon. In epidemiology, demography, and vital statistics a rate is an expression of the frequency with which an event occurs in a defined population in a specified period of time.

<i><b>Ratio </b></i> The value obtained by dividing one quantity by another: a general term of which rate, proportion and percentage are subsets.

<i><b>Relative Risk </b></i> The ratio of the risk of disease or death among the exposed to the risk among the unexposed.

<i><b>Quasi experiment </b></i> A situation in which the investigator lacks full control over the allocation and/or timing of intervention but nonetheless conducts the study as if it were an experiment, allocating subjects to groups.
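The prevalence and relative risk definitions above can be illustrated with a short calculation. The following is a minimal sketch in Python; the function names and all counts are hypothetical and are not taken from this module.

```python
def point_prevalence(existing_cases: int, population_at_risk: int) -> float:
    """Existing cases at one point in time divided by the population at risk."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    return existing_cases / population_at_risk


def relative_risk(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> float:
    """Risk among the exposed divided by risk among the unexposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    if risk_unexposed == 0:
        raise ValueError("risk among the unexposed is zero; ratio undefined")
    return risk_exposed / risk_unexposed


# Hypothetical survey: 50 existing cases among 1,000 people at risk.
prevalence = point_prevalence(50, 1000)

# Hypothetical follow-up: 30 of 200 exposed and 10 of 200 unexposed fall ill.
rr = relative_risk(30, 200, 10, 200)

print(f"Point prevalence: {prevalence:.1%}")
print(f"Relative risk: {rr:.1f}")
```

With these made-up figures the exposed group has three times the risk of the unexposed group. Note that the denominator in both measures is the population at risk, not the total number of cases.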

</div><span class="text_page_counter">Trang 10</span><div class="page_container" data-page="10">

<b>Chapter 1. Research Methods Introduction </b>

<b>Definition </b>

Research is a quest for knowledge through diligent search or investigation or experimentation aimed at the discovery and interpretation of new knowledge. Research is a systematic body of procedures and techniques applied in carrying out investigation or experimentation.

The health of any community depends on the interaction and balance between the health needs of the community, the health resources that are available and the selection and application of health technologies and strategies and health related interventions. It is evident that the experience and skills of the disease control managers are important to apply the available technology in an optimal manner, within the limited resources available, in order to serve the health needs of the community.

To direct investment towards interventions in areas which yield the highest social returns, disease control managers require accurate and scientific information on needs, possibilities and consequences of recommended actions. This module aims to equip disease control managers and other related professionals at regional level with basic knowledge of the different health research methodologies.

<b>Categories of research </b>

<i><b>1. Empirical and theoretical research </b></i>

The philosophical approach to research is basically of two types: empirical and theoretical. Health research mainly follows the empirical approach, i.e. it is based upon observation and experience more than upon theory and abstraction. Epidemiological research, for example, depends upon the systematic collection of observations on the health related phenomena of interest in defined populations. Empirical and theoretical research complement each other in developing an understanding of the phenomena, in predicting future events, and in the prevention of events harmful to the general welfare of the population of interest.

Empirical research in the health sciences can be qualitative or quantitative in nature. In most cases, health science research deals with information of a quantitative nature. This module mainly deals with the quantitative type of research. However, the qualitative research method is also briefly discussed in the last chapter of this module.

</div><span class="text_page_counter">Trang 11</span><div class="page_container" data-page="11">

<i><b>2. Basic and applied </b></i>

Research can be functionally divided into basic (or pure) research and applied research. Basic research is a search for knowledge without a specific, immediate practical purpose. Applied research is problem-oriented, and is directed towards the solution of an existing problem. It is generally recognized that there needs to be a healthy balance between the two types of research, with the more affluent and technologically advanced societies able to support a greater proportion of basic research than those with fewer resources to spare.

<i><b>3. Health research triangle </b></i>

Yet another way of classifying health research, be it empirical or theoretical, basic or applied, is to describe it under three interlinked operational categories: biomedical, health services and behavioral research, the so-called health research triangle. Biomedical research deals primarily with basic research involving processes at the cellular level; health services research deals with issues in the environment surrounding man, which promote changes at the cellular level; and behavioral research deals with the interaction of man and the environment in a manner reflecting the beliefs, attitudes and practices of the individual in society.

<b>Objectives of this module </b>

At the end of the session, participants should:

· Know the different types of epidemiological study designs, their uses and limitations,

· Know how to identify the most appropriate study design for a particular research question,

· Know sampling methods and sample size determination,

· Know the use of qualitative research methods in health research.

</div><span class="text_page_counter">Trang 12</span><div class="page_container" data-page="12">

<b>1. Study designs </b>

<b>1.1. Learning objectives </b>

At the end of this session, participants should be able to:

· recognize and list the various types of descriptive studies

· understand the advantages and disadvantages of cross sectional studies

· know and understand the principles of planning and implementing a

· describe a cohort study design and indicate its strengths and weaknesses

· given a research question, design an appropriate cohort study to investigate the problem

· describe an RCT design and indicate its strengths and weaknesses

· describe potential sources of bias in RCTs

<b>1.2. Introduction </b>

A study may involve different study designs. Study design characteristics include the type of data (qualitative vs quantitative), the type of comparisons (with or without control group), the type of setting or unit of analysis chosen, etc. Therefore, the selection of a research strategy is the core of a research design and is probably the single most important decision the investigator has to make. This section deals with the different types of epidemiological research designs.

<b>1.3. Selection of study design </b>

Depending on the existing state of knowledge about a problem that is being studied, different types of questions may be asked which require different study designs. Some examples are given in the following table:

</div><span class="text_page_counter">Trang 13</span><div class="page_container" data-page="13">

<b>Table 1: Research questions and study designs </b>

<table>
<tr><th>State of knowledge of the problem</th><th>Types of research questions</th><th>Types of study design</th></tr>
<tr><td>Knowing that a problem exists but knowing little about its</td><td>· What do they know, believe, and think about the problem?</td><td></td></tr>
<tr><td>Suspecting that certain factors contribute to the problem</td><td>Are certain factors associated with the problem? (e.g. Is lack of school sex education related to high incidence of</td><td></td></tr>
<tr><td>Having established that certain factors are associated with the problem, desiring to establish the extent to which a particular factor causes or contributes to the problem</td><td>What is the cause of the problem? Will the removal of a particular factor prevent or reduce the problem (e.g. stopping Khat, stopping smoking, providing safe water)?</td><td>Experimental or quasi-experimental study designs</td></tr>
<tr><td>Having sufficient knowledge about cause to develop and assess an intervention which would prevent, control or solve the problem</td><td>What is the effect of a particular intervention/strategy? (e.g. new drug, special educational programme)</td><td>Experimental or quasi-experimental study designs</td></tr>
</table>

The type of study design chosen depends on (see examples in Table 1):

· The type of problem

· The knowledge already available about the problem and

· Resources available for the study

<b>1.4. Observational versus Experimental (Intervention) studies </b>

Observational study design is the more common approach in public health for testing hypotheses. The investigator can only observe the occurrence of disease in people who are already segregated into groups on the basis of some exposure. In this kind of study, allocation into groups on the basis of exposure to a factor is not under the control of the investigator.

The experimental (intervention) study is an epidemiologic design that can provide data of high quality. The distinguishing characteristic of experimental study design is that the investigators themselves allocate the exposure.

Although an experiment is an important step in establishing causality, it is often neither feasible nor ethical to subject human beings to risk factors in etiological studies. Therefore, experimental studies are not commonly done.

</div><span class="text_page_counter">Trang 14</span><div class="page_container" data-page="14">

<b>1.5. Observational studies </b>

Observational studies are classified into two types: descriptive and analytical. The following sections describe each in detail.

<i><b>1.5.1. Descriptive studies </b></i>

When an epidemiological study is not structured formally as an analytical or experimental study, i.e. when it is not aimed specifically to test a hypothesis, it is called a descriptive study.

Descriptive studies characterize the occurrence and distribution of problems by time, place and person. The wealth of material obtained in most descriptive studies allows the generation of hypotheses, which can then be tested by analytical or experimental designs.

A descriptive study assesses morbidity or mortality in a population and the occurrence and distribution in population groups according to (1) characteristics of persons, (2) characteristics of place, and (3) characteristics of time.

The numbers of events (mortality or morbidity) are enumerated and the population at risk identified. Rates, ratios and proportions are calculated as measures of the probability of events. One must be careful to use the right measurements and the right ‘denominators’ when assessing these measures of probability.

<i><b>Types of descriptive studies </b></i>

<b>The case report</b> is the type of descriptive study that gives a detailed report of a single patient.

Classical example: In 1941 Gregg (an Australian ophthalmologist) reported a new syndrome of congenital cataract linked to rubella in the mother during pregnancy.

Clinical observation such as this can give the first clues in the identification of a new disease and the effect of an exposure.

<b>A case series</b> is a descriptive study that reports a series of cases of a specific condition, or a series of treated cases. These represent the numerator of disease occurrence, and should not be used to estimate risks.

Example: In the 1940s, Alton Ochsner, USA, observed that virtually all of the patients on whom he was operating for lung cancer gave a history of cigarette

</div><span class="text_page_counter">Trang 15</span><div class="page_container" data-page="15">

smoking. Based on his case series observation he hypothesized that cigarette smoking was linked with lung cancer.

In classical infectious disease epidemiology, a case series is often used as an early means of identifying the presence of an epidemic.

<b>Ecological descriptive studies:</b> when the unit of observation is an aggregate (e.g. family, clan or school) or an ecological unit (a village, town or country) the study becomes an ecological descriptive study.

As mentioned earlier, hypothesis testing is not generally an objective of the descriptive study. However, in some cross-sectional surveys and ecological studies, some hypothesis testing may be appropriate.

<b>Descriptive cross-sectional studies or community (population) surveys:</b> cross-sectional studies entail the collection of data on, as the term implies, a cross-section of the population, which may comprise the whole population or a proportion (sample) of it. Many cross-sectional studies do not aim at testing a hypothesis about an association, and are thus descriptive. They provide a prevalence rate at a particular point in time (point prevalence) or over a period of time (period prevalence). The study population at risk is the denominator for these prevalence rates. Included in this type of descriptive study are surveys in which the distribution of a disease, disability, or nutritional status is assessed. This design may also be used in health systems research to describe ‘prevalence’ by certain characteristics – pattern of health service utilization and compliance – or in opinion surveys. A common procedure used in family planning and in other services is the KAP survey (survey of knowledge, attitudes and practice).

<b>Trend studies:</b> data may be collected at different points in time, and changes in the pattern are analyzed. Though different study subjects are studied at each time, each sample can represent the same type of population.

It should be noted that trend studies often involve a rather long period of data collection. In most cases, the same researcher does not personally collect the data used in a trend study, but instead conducts a secondary analysis of data collected over time by several other observers or routinely collected data.

</div><span class="text_page_counter">Trang 16</span><div class="page_container" data-page="16">

<b>Table 2: Advantages and disadvantages of cross sectional studies </b>

<small>Advantages:</small>

<small>· Provide prevalence information.</small>

<small>· Researcher has control over the selection of study subjects.</small>

<small>· Researcher has control over the measurements used.</small>

<small>· Can study several factors or outcomes at the one time.</small>

<small>· Often provides early clues for hypothesis generation.</small>

<small>Disadvantages:</small>

<small>· Does not allow the true temporal sequence of exposure and outcome to be ascertained, therefore unable to shed light on cause and effect associations.</small>

<small>· Potential bias in measuring exposure.</small>

<small>· Potential sampling and/or survivor bias.</small>

<small>· Not feasible for rare conditions.</small>

<small>· Does not yield incidence or true relative risk.</small>

<b>An example of a cross-sectional study </b>

<b><small>An indigenous malaria transmission in the outskirts of Addis Ababa, Akaki Town and its environs </small></b>

<small>Adugna Woyessa</small><sup>1</sup><small>, Teshome Gebre-Micheal</small><sup>2</sup><small>, Ahmed Ali</small><sup>3 </sup>

<b><small>Abstract </small></b>

<small><b>Background:</b> In recent years malaria is becoming endemic in highland areas beyond its previously known upper limit of transmission. Assessment of the situation of the disease in such areas is necessary in order to institute appropriate control activities.</small>

<small><b>Objectives:</b> The objectives of the study were to determine the prevalence of malaria, the parasite species involved and <i>Anopheles</i> species responsible in local malaria transmission.</small>

<small><b>Methods:</b> A systematic sampling technique was used to select survey households. Blood films were collected monthly between October and December 1999 from all household members by a trained and experienced laboratory technician. Larval and adult mosquitoes were monthly collected using different methods from September 1999 to October 2000.</small>

<small><b>Results:</b> Among 2136 examined blood films, 78 (3.7%) of them were malaria positive of which 54 (69%) were due to <i>Plasmodium vivax</i> and 24 (31%) due to <i>P. falciparum</i>. <i>Anopheles gambiae s. l.</i> (presumably <i>An. arabiensis</i>) and <i>An. christyi</i> were the dominant man-biting species, with the former being the major vector in the area. Both these species were found to be more of exophagic and active in the early evening, unlike <i>An. pharoensis</i>, which showed an endophagic tendency.</small>

<small><b>Conclusion:</b> This study indicated that indigenous transmission of malaria occurs in the study area. Transmission is reckoned to be maintained by low density of vector species for short period of time under favorable conditions. Therefore, the acquisition of communal immunity is interrupted by long duration of non-malaria season leading to the occurrence of recurrent malaria epidemics. <i>[Ethiop.J.Health Dev. 2004;18(1):2-7]</i></small>
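The percentages reported in the Results of this abstract follow directly from the counts it gives. As a small check, the sketch below recomputes them in Python; only the four counts come from the abstract, and the variable names are illustrative.

```python
# Counts reported in the abstract's Results section.
examined_films = 2136
positive_films = 78
vivax_films = 54
falciparum_films = 24

# Prevalence: positive films as a share of all films examined.
prevalence_pct = 100 * positive_films / examined_films

# Species split: each species as a share of the positive films.
vivax_pct = 100 * vivax_films / positive_films
falciparum_pct = 100 * falciparum_films / positive_films

print(f"Malaria positive: {prevalence_pct:.1f}% of examined films")
print(f"P. vivax: {vivax_pct:.0f}% of positives")
print(f"P. falciparum: {falciparum_pct:.0f}% of positives")
```

Note the change of denominator: the prevalence uses all examined films, while the species percentages use only the positive films.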

</div><span class="text_page_counter">Trang 17</span><div class="page_container" data-page="17">

<i><b>1.5.2. Analytical studies </b></i>

An observational study whose primary goal is to establish a relationship (association) between a ‘risk factor’ (etiological agent) and an outcome (disease) is termed analytical. Analytical studies always require a comparison group.

The basic approach in analytical studies is to develop a specific, testable hypothesis, and to design the study to control any extraneous variables that could potentially confound the observed relationship between the studied factor and the disease. The approach varies according to the specific strategy used as described below for case-control and cohort studies.

<i><b>1.5.2.1. Case-control studies </b></i>

<b>A case-control study is a design in which people diagnosed as having a disease (cases) are compared with persons who do not have the disease (controls) to determine if the two groups differ in the proportion of persons exposed to a specific factor or factors. </b>

The case-control design is the most commonly used analytical strategy in epidemiology. It is well suited to the clinical setting, because the hallmark of a case-control study is that it begins with people who have the disease of interest (cases) and compares them with people free of that disease (controls). The design is relatively simple, but it is backward-looking (retrospective): it relies on the exposure histories of cases and controls.

Data are analyzed to determine whether exposure was different for cases and for controls. The risk factor is something that happened or began in the past, presumably before disease onset. Information about the exposure is obtained by taking a history and/or from records. Occasionally, the suspected factor or attribute is a permanent one, such as blood group, which can be ascertained by clinical or laboratory investigation.

</div><span class="text_page_counter">Trang 18</span><div class="page_container" data-page="18">

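<b>Figure 1. Schematic diagram of time factor in case-control study </b> (retrospective study: cases and controls are selected in the present, and past exposure to the factor is then looked for in both groups)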

A higher frequency of risk factor among cases than among controls is indicative of its association that may be of etiological significance. In other words, if a greater proportion of cases than controls give a history of exposure, or have records or indications of exposure in the past, the factor or attribute can be suspected of being a causative factor, but this does not represent proof for causation.
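The comparison described above can be sketched numerically. The example below, in Python, compares the proportion exposed among cases and controls and also computes the odds ratio. The odds ratio is the usual summary measure for case-control data, although the text here does not name it. The function names and the counts are hypothetical.

```python
def exposure_proportion(exposed: int, unexposed: int) -> float:
    """Share of a group (cases or controls) with a history of exposure."""
    total = exposed + unexposed
    if total == 0:
        raise ValueError("group is empty")
    return exposed / total


def odds_ratio(cases_exposed: int, cases_unexposed: int,
               controls_exposed: int, controls_unexposed: int) -> float:
    """Odds of exposure among cases divided by odds of exposure among controls."""
    if cases_unexposed == 0 or controls_exposed == 0:
        raise ValueError("odds ratio undefined when a denominator cell is zero")
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)


# Hypothetical study: 100 cases and 100 controls.
cases_exposed, cases_unexposed = 40, 60
controls_exposed, controls_unexposed = 20, 80

p_cases = exposure_proportion(cases_exposed, cases_unexposed)
p_controls = exposure_proportion(controls_exposed, controls_unexposed)
or_estimate = odds_ratio(cases_exposed, cases_unexposed,
                         controls_exposed, controls_unexposed)

print(f"Exposed among cases: {p_cases:.0%}")
print(f"Exposed among controls: {p_controls:.0%}")
print(f"Odds ratio: {or_estimate:.2f}")
```

An odds ratio above 1 indicates that exposure was more frequent among cases than among controls. As stated above, such an association does not by itself prove causation.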

<i><b>Selection of cases </b></i>

What constitutes a case in the study should be clearly defined with regard to specific characteristics of the disease. Cases that do not fit these criteria should be excluded from the study.

This design is particularly efficient for rare diseases, because all cases that fit the study criteria in a particular setting within a specific period are usually included. This allows for a reasonable number of cases to be included in the study without waiting for the occurrence of new cases of the disease, which might take a long time.

For reasons of convenience and completeness of case records, the cases identified for case-control studies are often those from a hospital setting, from physicians’ private practices, or from disease registries. Newly diagnosed cases within a specific period (incident cases) are preferred to prevalent cases, since such a choice may eliminate the possibility that long-term survivors of a disease were exposed to the investigated risk factor after the onset of the disease.

<i><b>Selection of controls </b></i>

It is crucial to set up one or more control groups of people who do not have the specified disease or condition in order to obtain estimates of the frequency of the attribute or risk factor for comparison with its frequency among cases. This is the most important aspect of the case-control study, as biases in the selection of controls may invalidate the study results, and bias in the selection of controls is

</div><span class="text_page_counter">Trang 19</span><div class="page_container" data-page="19">

often the greatest cause for concern when analyzing data from case-control studies.

The sources of control groups may be:

· a probability sample of a defined population, if the cases are drawn from that defined population;

· a sample of patients admitted to, or attending the same institution as the cases;

· a sample of relatives or associates of the cases (neighborhood controls);

· a group of persons selected from the same source population as the cases, and matched with the cases on potentially confounding variables (risk factors other than the one under consideration).

The selection of controls may involve matching on other risk factors:

Matching means that controls are selected such that cases and controls have the same (or very similar) characteristics other than the disease and the risk factor being investigated. The characteristics are those that would confound the effect of the putative risk factor, i.e. these characteristics are known to have an association with the disease, and may be associated with the risk factor being studied.

The purpose of the matching is to ensure comparability of these characteristics for the two groups, so that any observed association between the putative risk factor and the disease is not affected by differential distribution of these other characteristics. It is common to match for age, sex, race and socioeconomic status in case-control studies on diseases, as we know all of these factors affect the incidence of most of the diseases.
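As an illustration of the matching idea, the following Python sketch pairs each case with one control of the same sex and a similar age. It is a simplified, hypothetical example: the `Person` class, the `match_controls` function, the two-year age tolerance and all records are assumptions made for illustration, and real studies often match on more characteristics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    pid: str   # study identifier (hypothetical)
    age: int   # age in years
    sex: str   # "F" or "M"


def match_controls(cases, pool, age_tolerance=2):
    """Pair each case with the first unused control of the same sex
    whose age is within age_tolerance years (simple 1:1 matching)."""
    available = list(pool)
    pairs = {}
    for case in cases:
        for candidate in available:
            same_sex = candidate.sex == case.sex
            close_age = abs(candidate.age - case.age) <= age_tolerance
            if same_sex and close_age:
                pairs[case.pid] = candidate.pid
                available.remove(candidate)  # each control is used once
                break
    return pairs


cases = [Person("case-1", 34, "F"), Person("case-2", 51, "M")]
pool = [
    Person("ctrl-1", 50, "M"),
    Person("ctrl-2", 20, "F"),
    Person("ctrl-3", 35, "F"),
]

matched = match_controls(cases, pool)
print(matched)
```

Cases for which no suitable control exists are simply left out of the result, so the investigator can see which cases remain unmatched.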

<i><b>Collection of data on exposure and other factors </b></i>

Often data are collected through interviews, questionnaires and/or examination of

· The investigator or interviewer should not know whether a subject is in the case or control group (blinding);

· The same procedures, e.g. interview and setting, should be used for all groups.

</div><span class="text_page_counter">Trang 20</span><div class="page_container" data-page="20">

<b>Table 3 Advantages and disadvantages of case-control studies </b>

<b><small>Advantages </small></b>

<small>• feasible when the disease being studied occurs only rarely, e.g. cancer of a specific organ; </small>

<small>• relatively efficient, requiring a smaller sample than a cohort study; </small>

<small>• little problem with attrition; </small>

<small>• sometimes they are the earliest practical observational strategy for determining an association; </small>

<small>• can examine multiple etiologic factors for a single disease. </small>

<b><small>Disadvantages </small></b>

<small>• the absence of epidemiological denominators (population at risk) makes the calculation of incidence rates, and hence of attributable risks, impossible; </small>

<small>• temporality is a serious problem in many case-control studies where it is not possible to determine whether the attribute led to the disease/condition, or vice versa; </small>

<small>• particularly prone to bias compared with other analytic designs, in particular selection and recall bias; </small>

<small>• inefficient for the evaluation of rare exposures. </small>

<b>An example of a case-control study </b>

<b>Malnutrition and Mental Development: Is There a Sensitive Period? A Nested Case-control Study </b>

<small>Robert Drewett, Dieter Wolke, Makonnen Asefa, Mirgissa Kaba, Fasil Tessema </small>

<b><small>Abstract </small></b>

<small>To examine the possibility that there is an early sensitive period for the effects of malnutrition on cognitive development, three groups of children (N = 197) were recruited from a birth cohort with known growth characteristics in south-west Ethiopia (N = 1563). All had initial weights ≥ 2500 g. </small>

<small>Early growth falterers dropped in weight below the third centile (z < -1.88) of the NCHS/WHO reference population in the first 4 months. Late growth falterers were children not in the first group whose weights were below the third centile at 10 and 12 months. Controls were a stratified random sample with weights above the third centile throughout the first year. All children were tested blind at 2 years using the Bayley Scales of Infant Development, adapted for use in Ethiopia. Mean (SD) scores on the psychomotor scale were 10.2 (3.7) in the controls, 6.6 (4.2) in the early growth falterers, and 8.5 (4.3) in the late growth falterers. For the mental scale they were 28.9 (5.8), 22.6 (6.2), and 26.6 (6.1) respectively. Both overall differences were statistically significant at p <0.001, and planned comparisons between the control and the combined growth faltering groups, and between the early and later growth faltering groups, showed that each difference was statistically significant for both scales. However, early weight faltering was associated with weight at the time of testing (r = .33), which was associated with scores both on the psychomotor (r = .53) and the mental scale (r = .49). After taking weight at the time of testing into account there was no additional effect attributable to the timing of growth faltering. In this population, therefore, early malnutrition does not have a specific adverse effect beyond the contribution that it makes to enduring malnutrition over the first 2 years. </small>

<i><small>[Ethiop.J.Health Dev. 2002;16(Special Issue):7 </small></i>

</div><span class="text_page_counter">Trang 21</span><div class="page_container" data-page="21">

<b>Exercise 1.1 </b>

Now do exercise 1.1 on case-control study design. This exercise should be done individually. When everyone is ready, there will be a group discussion.

a) When is a case-control study most appropriate?

b) Outline the limitations of a case-control study.

<i><b>1.5.2.2. Cohort study design </b></i>

<b>A cohort study is an observational research design in which a group of people (a cohort), initially free of the disease (outcome of interest), is classified according to a given exposure and then followed up over time. The researcher compares whether the subsequent development of new cases of the disease (or other outcome of interest) differs between the exposed and non-exposed groups. </b>

<b>Types of cohort studies </b>

There are basically two types of cohort studies: prospective and retrospective (historical). The difference between the two lies in when the study is deemed to start. In a prospective cohort study the starting point of observation (time zero) is ‘now’, and the population is followed into the future. The exposure of interest may or may not have occurred when the study was initiated, but the outcome has not.

<small>[Figure: design of a prospective cohort study. Select the cohort and classify by exposure, then follow to see the frequency with which disease occurs.] </small>

</div><span class="text_page_counter">Trang 22</span><div class="page_container" data-page="22">

In a retrospective cohort study the ‘starting point’, that is, the point of initial exposure, occurred some time in the past, and the experience of the population is followed up to the present time. At the time the study is initiated, both the exposure and the outcome have occurred.

A historical cohort study depends upon the availability of data or records that allow reconstruction of the exposure of cohorts to a suspected risk factor and follow-up of their mortality or morbidity over time. In other words, although the investigator was not present when the exposure was first identified, he reconstructs exposed and unexposed populations from records, and then proceeds as though he had been present throughout the study. Historically constructed cohorts share several advantages of the prospective cohort. If all requirements are satisfied, a historical cohort may suffer less from the disadvantages of time and expense. A classic example of such a study design is the study of cancer associated with the bombing of Hiroshima and Nagasaki. Historical cohort studies have, however, the following disadvantages:

• All of the relevant variables may not be available in the original records.

• It may be difficult to ascertain that the study population was free from the condition at the start of the comparison. This problem does not exist if we are concerned with deaths as indicators of disease.

• Attrition problems may be serious due to loss of records, incomplete records, or difficulties in tracing or locating all of the original population for further study.

• These studies require ingenuity in identifying suitable populations and in obtaining reliable information concerning exposure and other relevant factors. Examples of such population groups include military personnel, industrial groups (such as miners), professional groups, etc.

<b>Data to be collected: </b>

• Data on the exposure of interest to the study hypotheses;

• Data on the outcome of interest to the study hypotheses;

• Characteristics of the cohort that might confound the association under study.

<b>Methods of data collection </b>

Several methods are used to obtain the above data, which should be on a longitudinal basis. These methods include:

• Interview surveys with follow-up procedures;

• Medical records monitored over time;

• Medical examinations and laboratory testing;

</div><span class="text_page_counter">Trang 23</span><div class="page_container" data-page="23">

• Record linkage of sets with exposure data and sets with outcome data, e.g. work history data in underground mines with mortality data from national mortality files.

In a conventional cohort study, an initial cross-sectional study is often performed to exclude persons with the outcome of interest (disease) and to identify the cohort that is free from the disease.

<b>Table 4: Advantages and Disadvantages of cohort studies </b>

<b><small>Advantages </small></b>

<small>• Relative risk can be calculated. </small>

<small>• Allows conclusions about a cause-effect relationship. </small>

<small>• No chance of bias being introduced due to awareness of being sick, as is encountered in case-control studies. </small>

<small>• Less chance for the problem of selective survival or selective recall. </small>

<small>• Cohort studies are capable of identifying other diseases that may be related to the same risk factor. </small>

<small>• Allows estimating attributable risks, thus indicating the absolute magnitude of disease attributable to the risk factor. </small>

<small>• If a probability sample is taken from the reference population, it is possible to generalize from the sample to the reference population with a known degree of precision. </small>

<b><small>Disadvantages </small></b>

<small>• Cohort studies are long-term and are thus not always feasible; they are relatively inefficient for studying rare conditions. </small>

<small>• Costly in time, personnel, space and patient follow-up. </small>

<small>• Sample sizes required are large, especially for infrequent conditions. </small>

<small>• Attrition or loss of people from the sample or control during the study is the major problem. The higher the proportion lost (say beyond 10-15%) the more serious the potential bias. </small>

<small>• There may also be attrition among investigators. </small>

<small>• Over a long period, many changes may occur in the environment, among individuals or in the type of intervention, and these may confuse the issue of association and attributable risk. </small>

<small>• Over a long period, study procedures may influence the behavior of the persons investigated in such a way that the development of the disease may be influenced accordingly (Hawthorne effect). </small>

<small>• A serious ethical problem may arise when it becomes apparent that the exposed population is manifesting significant disease excess before the follow-up period is completed. </small>
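The relative risk and attributable risk mentioned in Table 4 can be computed directly from cohort counts. The sketch below uses the standard definitions (relative risk as the ratio of the two incidences, attributable risk as their difference); the function name `cohort_measures` and the counts are hypothetical and chosen only to illustrate the arithmetic.

```python
def cohort_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return incidence in each group, relative risk and attributable risk."""
    risk_exposed = exposed_cases / exposed_total        # incidence among the exposed
    risk_unexposed = unexposed_cases / unexposed_total  # incidence among the unexposed
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "relative_risk": risk_exposed / risk_unexposed,      # ratio of incidences
        "attributable_risk": risk_exposed - risk_unexposed,  # excess incidence in the exposed
    }

# Hypothetical cohort: 30 new cases among 1000 exposed, 10 among 1000 unexposed
measures = cohort_measures(30, 1000, 10, 1000)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts the exposed group has about three times the risk of the unexposed group, and about 20 extra cases per 1000 persons can be attributed to the exposure. Both measures require the denominators (population at risk), which is why they cannot be obtained in the same way from a case-control study.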

</div><span class="text_page_counter">Trang 24</span><div class="page_container" data-page="24">

<b>Example of Cohort study design </b>

<b><small>Alcohol consumption and mortality from all causes, coronary heart disease, and stroke: results from a prospective cohort study of Scottish men with 21 years of follow up </small></b>

<small>Carole L Hart, George Davey Smith, David J Hole, Victor M Hawthorne </small>

<b><small>Abstract </small></b>

<b><small>Objectives: </small></b><small>To relate alcohol consumption to mortality. </small>

<b><small>Design: </small></b><small>Prospective cohort study. </small>

<b><small>Setting: </small></b><small>27 workplaces in the west of Scotland. </small>

<b><small>Participants: </small></b><small>5766 men aged 35-64 when screened in 1970-3 who answered questions on their usual weekly alcohol consumption. </small>

<b><small>Main outcome measures: </small></b><small>Mortality from all causes, coronary heart disease, stroke, and alcohol related causes over 21 years of follow up related to units of alcohol consumed per week. </small>

<b><small>Results: </small></b><small>Risk for all cause mortality was similar for non-drinkers and men drinking up to 14 units a week. Mortality risk then showed a graded association with alcohol consumption (relative rate compared with non-drinkers 1.34 (95% confidence interval 1.14 to 1.58) for 15-21 units a week, 1.49 (1.27 to 1.75) for 22-34 units, 1.74 (1.47 to 2.06) for 35 or more units). Adjustment for risk factors attenuated the increased relative risks, but they remained significantly above 1 for men drinking 22 or more units a week. There was no strong relation between alcohol consumption and mortality from coronary heart disease after adjustment. A strong positive relation was seen between alcohol consumption and risk of mortality from stroke, with men drinking 35 or more units having double the risk of non-drinkers, even after adjustment. </small>

<b><small>Conclusions: </small></b><small>The overall association between alcohol consumption and mortality is unfavorable for men drinking over 22 units a week, and there is no clear evidence of any protective effect for men drinking less than this. </small>

<b>Exercise 1.2. </b>

Now do Exercise 1.2. Do this exercise individually. When everyone is ready, there will be a group discussion.

a. What are the major differences between a cohort study and a case-control study?

b. What are the major limitations of a cohort study?

</div><span class="text_page_counter">Trang 25</span><div class="page_container" data-page="25">

<b>1.6. Experimental Studies </b>

The experimental study, or clinical trial, is an epidemiologic design that can provide data of high quality. As in a cohort study, individuals are enrolled on the basis of their exposure status; however, the distinguishing characteristic of an experimental study design is that the investigators themselves allocate the exposure.

The experimental study is the best epidemiological study design to prove causation. It can be viewed as the final or definitive step in the research process. The experimenter (investigator) controls the subjects, the intervention and the outcome measurements, and sets the conditions under which the experiment is conducted. In particular, the investigator determines who will be exposed to the intervention and who will not. This selection is done in such a way that the comparison of the outcome measure between the exposed and unexposed groups is as free of bias as possible.
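One common way for the investigator to allocate the exposure while keeping the comparison free of selection bias is random allocation. The sketch below assumes simple (unrestricted) randomization into two equal-sized arms; the function name `randomize` and the participant identifiers are hypothetical, and real trials often use blocked or stratified schemes instead.

```python
import random

def randomize(participants, seed=None):
    """Randomly split participants into an intervention arm and a control arm."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible for auditing
    shuffled = list(participants)
    rng.shuffle(shuffled)      # every ordering is equally likely
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}

# Hypothetical participant identifiers 1..20
groups = randomize(range(1, 21), seed=42)
print("intervention:", sorted(groups["intervention"]))
print("control:", sorted(groups["control"]))
```

Because chance alone decides the arm, known and unknown characteristics of the participants tend to be balanced between the groups as the sample grows.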

In health research, we are often interested in comparative experiments, in which one or more groups receiving specific interventions are compared with a group unexposed to the interventions (clinical trials) or exposed to the best treatment currently available. The effect of the new interventions on one or more outcome variables is compared between the groups by the use of statistical procedures. Two types of comparative experiments, the randomized clinical trial (RCT) and the community intervention trial (CIT), are discussed in this section.

<b><small>FIGURE 3. FLOW CHART OF AN EXPERIMENT </small></b>

</div><span class="text_page_counter">Trang 26</span><div class="page_container" data-page="26">

<i><b>1.6.1. The randomized clinical trial (RCT) </b></i>

The most commonly encountered experiment in health science research, and the research strategy by which evidence of effectiveness is measured, is the randomized, controlled, double blind clinical trial, commonly known as the RCT. Clinical trials may be done for various purposes. Some of the common types of clinical trial (according to purpose) are:

<b>a. prophylactic trials, </b>e.g. immunization, contraception;

<b>b. therapeutic trials, </b>e.g. drug treatment, surgical procedure;

<b>c. safety trials, </b>e.g. side effects of oral contraceptives and injectables;

<b>d. risk-factor trials, </b>e.g. proving the etiology of a disease by inducing it with the putative agent in animals, or withdrawing the agent (e.g. smoking) through cessation.

Therapeutic trials may be conducted to test efficacy (e.g. does a therapeutic agent work in an ideal, controlled situation?) or to test effectiveness (e.g. after having established efficacy, if the therapy is introduced to the population at large, will it be effective when having to deal with other co-interventions, confounding, contamination, etc.?)

<b>The intervention in a clinical trial may include: </b>

• drugs for prevention, treatment or palliation;

• clinical devices, such as intrauterine devices;

• surgical procedures, rehabilitation procedures;

• medical counseling;

• diet, exercise, change of other lifestyle habits;

• hospital services, e.g. integrated versus non-integrated, acute versus chronic care;

• risk factors;

• communication approaches, e.g. face-to-face communication versus pamphlets;

• different categories of health personnel, e.g. doctors versus nurses;

• treatment regimens, e.g. once-a-day dispensation versus three times a day.

<i><b>1.6.2. Community intervention trials (CITs) </b></i>

The major difference between Randomized Clinical Trials and Community Intervention Trials is that the randomization is done on communities rather than individuals. The classic example of a community intervention trial would be that of testing a vaccine. Some communities will be randomly assigned to receive the

</div><span class="text_page_counter">Trang 27</span><div class="page_container" data-page="27">

vaccine, while other communities will either not be vaccinated or will be given a placebo. Another example would be a test of whether the introduction of iron-fortified salt would reduce the incidence of anemia in the community. Communities selected for entry to the study have to be as similar as possible, especially since only a small number of communities will be entered.

Very often, blinding is not possible in these types of studies, and contamination and co-interventions become serious problems. Contamination occurs when individuals from one of the experimental groups receive the intervention from the other experimental group. For example, in the study of iron-fortified salt, some of the members of the community receiving non-fortified salt might hear about the fortified salt, and may acquire it from the other community. (The reverse is also possible). This is particularly so if the communities are geographically close.

<b>Table 5: Advantages and Disadvantages of the experimental approach </b>

<b><small>Advantages </small></b>

<small>- The ability to manipulate or assign the exposure. </small>

<small>- The ability to randomize subjects to experimental and control groups. </small>

<small>- The ability to control confounding and eliminate </small>

<b><small>Disadvantages </small></b>

<small>- Lack of reality. In most human situations, it is impossible to randomize all risk factors except those under examination. </small>

<small>- Difficulties in extrapolation. </small>

<small>- Ethical problems. In human experimentation, people are either deliberately exposed to risk factors (in etiological studies) or treatment is deliberately withheld from cases (intervention trials). </small>

<small>- Difficulties in manipulating the independent variable. </small>

<small>- Non-representativeness of samples. Many experiments are carried out on captive populations or volunteers, who are not necessarily representative of the population at large. </small>

<small>- Experiments in hospitals (where the experimental approach is most feasible and is frequently used) suffer from several sources of selection bias. </small>

</div><span class="text_page_counter">Trang 28</span><div class="page_container" data-page="28">

<b>Example of a Randomized Clinical Trial </b>

<b><small>Clinical efficacy of three common treatments in acute otitis externa in primary care: randomized controlled trial. </small></b>

<small>Frank A M van Balen, W Martijn Smit, Nicolaas P A Zuithoff, Theo J M Verheij </small>

<b><small>Abstract </small></b>

<b><small>Objective: </small></b><small>To compare the clinical efficacy of ear drops containing acetic acid, corticosteroid and acetic acid, and steroid and antibiotic in acute otitis externa in primary care. </small>

<b><small>Design: </small></b><small>Randomised controlled trial. </small>

<b><small>Setting: </small></b><small>79 general practices, Netherlands. </small>

<b><small>Participants: </small></b><small>213 adults with acute otitis externa. </small>

<b><small>Main outcome measures: </small></b><small>Primary outcome: duration of symptoms (days) according to patient diaries. Secondary outcome: cure rate according to general practitioner completed questionnaires and recurrence of symptoms between days 21 and 42. </small>

<b><small>Results: </small></b><small>Symptoms lasted for a median of 8.0 days (95% confidence interval 7.0 to 9.0) in the acetic acid group, 7.0 days (5.8 to 8.3) in the steroid and acetic acid group, and 6.0 days (5.1 to 6.9) in the steroid and antibiotic group. The overall cure rates at seven, 14, and 21 days were 38%, 68%, and 75%, respectively. Compared with the acetic acid group, significantly more patients were cured in the steroid and acetic acid group and steroid and antibiotic group at day 14 (odds ratio 2.4, 1.1 to 5.3, and 3.5, 1.6 to 7.7, respectively) and day 21 (5.3, 2.0 to 13.7, and 3.9, 1.7 to 9.1, respectively). Recurrence of symptoms between days 21 and 42 occurred in 29% (50/172) of patients and was seen significantly less in the steroid and acetic acid group (0.3, 0.1 to 0.7) and steroid and antibiotic group (0.4, 0.2 to 1.0) than in the acetic acid group. </small>

<b><small>Conclusions: </small></b><small>Ear drops containing corticosteroids are more effective than acetic acid ear drops in the treatment of acute otitis externa in primary care. Steroid and acetic acid or steroid and antibiotic ear drops are equally effective. </small>

<b>Exercise 1.3 </b>

Do this exercise in groups. One group will then present its answer, followed by a discussion.

You are asked to design an RCT to evaluate the effect of a new anti-AIDS drug. Discuss the issues involved in selecting the study sample and implementing the study.

</div><span class="text_page_counter">Trang 29</span><div class="page_container" data-page="29">

<b>Summary points on study designs </b>

<small>There are two types of epidemiological research designs: </small>

<small>· </small> <b><small>Observational </small></b>

<small>- Descriptive studies </small>

<small>- Analytical studies </small>

<small>· </small> <b><small>Experimental </small></b>

<small>- The randomized clinical trial (RCT) </small>

<small>- Community intervention trials (CITs) </small>

<b><small>Descriptive studies </small></b>

<small>The distinctive feature of the descriptive study design is that its primary concern is with description rather than with the testing of hypotheses or proving causality. </small>

<small>Descriptive studies include: </small>

<small>· Case reports </small>

<small>· Case series </small>

<small>· Cross sectional studies or community surveys </small>

<small>· Ecological descriptive studies </small>

<small>Observational studies, where establishing a relationship (association) between a ‘risk factor’ (etiological agent) and an outcome (disease) is the primary goal, are termed analytical. In this type of study, hypothesis testing is the primary tool of inference. </small>

<b><small>Types of Analytical studies </small></b>

<small>· </small> <b><small>Case-control study </small></b>

<small>Case-control study design is a retrospective design whereby people diagnosed as having a disease (cases) are compared with persons who do not have the disease (controls) to determine if the two groups differ in the proportion of persons who have been exposed to a specific factor or factors. </small>

<small>· </small> <b><small>Cohort study (prospective and historical) </small></b>

<small>A cohort study is an observational research design in which a group of people (a cohort), initially free of disease, is classified according to a given exposure and then followed up over time. The object of the exercise is to ascertain whether the subsequent development of new cases of a disease (or other outcome of interest) differs between the exposed and non-exposed groups. </small>

<b><small>The experimental study, </small></b><small>or clinical trial, is an epidemiologic design that can provide data of high quality. As in a cohort study, individuals are enrolled on the basis of their exposure status; however, the distinguishing characteristic of an experimental study design is that the investigators themselves allocate the exposure. </small>

</div><span class="text_page_counter">Trang 30</span><div class="page_container" data-page="30">

<b>Chapter 2. Sampling Methods and Sample Size </b>

<b>2.1. Learning Objectives </b>

At the end of this session participants should be able to:

• identify and define the population to be studied;

• identify and describe common methods of sampling;

• discuss problems of bias that should be avoided when selecting a sample;

• list the factors to consider when deciding on sample size;

• decide on the sampling methods and sample size most appropriate for the research design they are developing.

<b>2.2. What is sampling? </b>

Most research studies involve the observation of a sample from some predefined population of interest. The conclusions drawn from the study are often based on generalizing the results observed in the sample to the entire population from which the sample was drawn. Therefore, the accuracy of the conclusions will depend on how well the samples have been collected, and especially on how representative the sample is of the population. In this chapter, we will discuss the major issues that a researcher has to face in selecting an appropriate sample.

Sampling is a process of choosing a section of the population for observation and study.

<b>2.3. Why sampling? </b>

There are several reasons why samples are chosen for a study, rather than studying the entire population. First and foremost, a researcher wants to minimize the costs (financial and otherwise) of collecting the data, processing and reporting on the results. If a reasonable picture of a population can be obtained by observing only a section of it, the researcher economizes by choosing such a section of the population. Obviously, when a sample is observed, the total information will be less than if one were to observe the entire population.

A major advantage of sampling over complete enumeration is the fact that the available resources can be better spent in refining the measuring instruments and methods so that the information collected is accurate (valid and reliable). Some information, such as monitoring of the body burden of toxic metals in the

</div><span class="text_page_counter">Trang 31</span><div class="page_container" data-page="31">

population, which may require specialized equipment and staff, cannot be collected from the entire population. A sample in such cases would provide a reasonable picture of the population status.

When we draw a sample from a population we will be confronted with the following questions:

- What is the group of people (study population) from which we want to draw a sample?

- How many people do we need in our sample?

- How will these people be selected?

The study population has to be clearly defined, for example according to age, sex, and residence. Apart from persons, a study population may consist of villages, institutions, records, etc.

<b>Example of Study Population and Study Units </b>

High dropout rates in primary schools in District

The primary concern in selecting an appropriate sample is that the sample should be representative of the population. Every variable of interest should ideally have the same distribution in the sample as in the population from which the sample is chosen. This requires knowledge of the variables and their distribution in the population, which of course is why we are doing the study in the first place! Therefore, it is often not possible to ensure the representativeness of the sample. However, statisticians have come up with ways in which we can give a reasonable guarantee of representativeness. We will discuss some of these methods briefly in this section.

<b>A REPRESENTATIVE SAMPLE </b>has all the important characteristics of the population from which it is drawn.

</div><span class="text_page_counter">Trang 32</span><div class="page_container" data-page="32">

<b>2.4. Sampling methods </b>

There are two types of sampling methods: non-probability (convenience, quota sampling) and probability sampling methods. The non-probability sampling methods are inappropriate if the aim is to measure variables and generalize findings obtained from a sample to the total study population. For this purpose probability sampling methods should be used.

<b>Non-Probability sampling Methods </b>

<i><b>1. Convenience Sampling </b></i>

Many clinic-based studies use convenience samples.

CONVENIENCE SAMPLING is a method in which, for convenience's sake, the study units that happen to be available at the time of data collection are selected for the sample.

For example, a researcher wants to study the attitudes of villagers towards family planning services provided by an MCH clinic. He decides to interview all adult patients who visit the outpatient clinic during one particular day. This is more convenient than taking a random sample of people in the village, and it gives a useful first impression.

A drawback of convenience sampling is that the sample may be quite unrepresentative of the population you want to study. Some units may be over-selected, others under-selected or missed altogether. It is impossible to adjust for such a distortion; if you need a representative sample, you have to use another sampling method.

<i><b>2. Quota Sampling </b></i>

Quota Sampling is a method that ensures that a certain number of sample units from different categories with specific characteristics appear in the sample so that all these characteristics are represented.

In this method the investigator interviews as many people in each category of study unit as he can find until he has filled his quota.

<b>Probability sampling methods </b>

If a sampling frame does exist or can be compiled, probability sampling methods can be used. With these methods, each study unit has an equal or at least a

</div><span class="text_page_counter">Trang 33</span><div class="page_container" data-page="33">

known probability of being selected in the sample. The following probability sampling methods will be discussed:

ă Simple random sampling

ă Systematic sampling

ă Stratified sampling

ă Cluster sampling

ă Multi-stage sampling

<b>PROBABILITY SAMPLING </b>involves random selection procedures to ensure that each unit of the sample is chosen on the basis of chance. All units of the study population should have an equal, or at least a known, chance of being included in the sample.

<i><b>2.4.1. Simple random sampling </b></i>

This is the most common and the simplest of the sampling methods. In this method, the subjects are chosen from the population with equal probability of selection. One may use a random number table (see ANNEX 1 and 2), or use techniques such as putting the names of people into a hat and selecting the appropriate number of names blindly. Computer programs have also been developed to draw simple random samples from a given population; this will be dealt with in Module 3. The simple random sample has the advantages that it is easy to administer, is representative of the population in the long run, and the analysis of data using such a sampling scheme is straightforward.
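Drawing names from a hat can be imitated with a few lines of code. The following is a minimal sketch using Python's standard `random` module; the sampling frame of 500 invented names and the function name `simple_random_sample` are assumptions made for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit has the same chance of selection."""
    rng = random.Random(seed)  # fixing the seed lets the same sample be re-drawn later
    return rng.sample(list(frame), n)

# Hypothetical sampling frame of 500 people
frame = [f"person_{i}" for i in range(1, 501)]
sample = simple_random_sample(frame, 50, seed=1)
print(sample[:5])
```

The essential requirement is a complete sampling frame: the list passed to the function must contain every unit of the study population exactly once.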

<i><b>2.4.2. Systematic sampling </b></i>

In <b>SYSTEMATIC sampling</b>, individuals are chosen at regular intervals (for example, every fifth) from the sampling frame. Ideally we randomly select a number to tell us where to start selecting individuals from the list.

<i>A systematic sample is to be selected from 1200 students of a school. The sample size selected is 100. The sampling fraction is: </i>

<i>100 (sample size) / 1200 (study population) = 1/12 </i>

<i>The sampling interval is, therefore, 12. The number of the first student to be included in the sample is chosen randomly, for example by blindly picking one out of twelve pieces of paper, numbered 1-12. If number 6 is picked, then every twelfth student will be included in the sample, starting with student number 6, until 100 students are selected: the numbers selected would be 6, 18, 30, 42, etc. </i>
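The school example can be reproduced in a short sketch. It assumes the students are numbered 1 to 1200 in the sampling frame and that the population size is an exact multiple of the sample size; the function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the list positions selected by systematic sampling (1-based)."""
    interval = population_size // sample_size  # sampling interval, 1200 // 100 = 12 here
    if start is None:
        # random start between 1 and the interval, like picking one of twelve slips of paper
        start = random.Random(seed).randint(1, interval)
    return [start + i * interval for i in range(sample_size)]

# Reproduce the example above: 1200 students, sample of 100, random start of 6
selected = systematic_sample(1200, 100, start=6)
print(selected[:4])   # [6, 18, 30, 42]
print(selected[-1])   # 1194
```

Leaving `start` unset draws the starting number at random, which is the recommended practice; it is fixed at 6 here only to match the worked example.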

</div><span class="text_page_counter">Trang 34</span><div class="page_container" data-page="34">

Systematic sampling is usually less time consuming and easier to perform than simple random sampling. However, there is a risk of bias, as the sampling interval may coincide with a systematic variation in the sampling frame. For instance, if we want to select a random sample of days on which to count clinic attendance, systematic sampling with a sampling interval of 7 days would be inappropriate, as all study days would fall on the same day of the week, which might, for example, be market day.

<i><b>2.4.3. Stratified sampling </b></i>

When the size of the sample is small and we have some information about the distribution of a particular variable (e.g. gender: 50% male, 50% female), it may be advantageous to select simple random samples from within each of the subgroups defined by that variable. By choosing half the sample from males and half from females, we ensure that the sample is representative of the population with respect to gender. When confounding is an important issue (such as in case-control studies), stratified sampling will reduce potential confounding by selecting homogeneous subgroups.

If it is important that the sample includes representative groups of study units with specific characteristics (for example, residents from urban and rural areas, or different age groups), then the sampling frame must be divided into groups or strata, according to these characteristics. Random or systematic samples of predetermined size will then have to be obtained from each group (stratum). This is called Stratified Sampling.

Stratified sampling is only possible when we know what proportion of the study population belongs to each group we are interested in.

An advantage of stratified sampling is that we can take a relatively large sample from a small group in our study population. This allows us to get a sample that is big enough to enable us to draw valid conclusions about a relatively small group without having to collect an unnecessarily large (and hence expensive) sample of the other, larger groups. However, in doing so, we are using unequal sampling fractions, and it is important to correct for this when generalizing our findings to the whole study population.

<i>A survey is conducted on household water supply in a district comprising 20,000 households, of which 20% are urban and 80% rural. It is suspected that in urban areas the access to safe water sources is much more satisfactory. A decision is made to include 100 urban households (out of 4,000, which gives a 1 in 40 sample) and 200 rural households (out of 16,000, which gives a 1 in 80 sample). Because we know the sampling fraction for both strata, the access to safe water for all the district households can be calculated. </i>
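The correction for unequal sampling fractions can be sketched as follows. The stratum sizes come from the example, taking the rural stratum as 16,000 households (80% of 20,000). The numbers of sampled households with safe water (`x`) are hypothetical values invented purely to show the arithmetic, and the function name `weighted_proportion` is an assumption of this sketch.

```python
def weighted_proportion(strata):
    """Estimate an overall proportion from strata sampled at unequal fractions.

    Each stratum gives its population size N, its sample size n, and the
    number x of sampled units that have the characteristic of interest.
    """
    total_population = sum(s["N"] for s in strata)
    # Scale each stratum's sample proportion up to its population size.
    estimated_with_characteristic = sum(s["N"] * s["x"] / s["n"] for s in strata)
    return estimated_with_characteristic / total_population


strata = [
    {"name": "urban", "N": 4000, "n": 100, "x": 80},    # x is hypothetical
    {"name": "rural", "N": 16000, "n": 200, "x": 60},   # x is hypothetical
]

weighted = weighted_proportion(strata)
unweighted = sum(s["x"] for s in strata) / sum(s["n"] for s in strata)
print(round(weighted, 3))      # 0.4
print(round(unweighted, 3))    # 0.467
```

With these made-up counts, simply pooling the 300 sampled households would overstate access to safe water, because urban households are over-represented in the sample.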

</div><span class="text_page_counter">Trang 35</span><div class="page_container" data-page="35">

<i><b>3.4.4. Cluster sampling </b></i>

The selection of groups of study units (clusters) instead of the selection of study units individually is called <i><b>CLUSTER SAMPLING</b></i>.

In many administrative surveys, studies are done on large populations which may be geographically quite dispersed. Obtaining the required number of subjects by simple random sampling would then be costly and inconvenient. In such cases, clusters may be identified (e.g. households) and a random sample of clusters included in the study; every member of each selected cluster then becomes part of the study. This introduces two types of variation in the data, between clusters and within clusters, and this must be taken into account when analyzing the data.

<i>In a study of knowledge, attitudes, and practices related to family planning in rural communities of a region, a list is made of all the villages. Using this list, a random sample of villages is chosen and all the adults in the selected villages are interviewed. </i>
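A minimal Python sketch of the village example is given below. The village names, the adults listed, and the function name `cluster_sample` are all hypothetical. The point is only that the random draw is made on clusters, after which every member of a chosen cluster is included.

```python
import random


def cluster_sample(clusters, n_clusters, rng=random):
    """Randomly choose whole clusters and keep every member of each one.

    `clusters` maps a cluster name (e.g. a village) to a list of its members.
    """
    chosen = rng.sample(sorted(clusters), n_clusters)   # sample clusters, not people
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical sampling frame of villages and their adult residents.
villages = {
    "village_A": ["adult_1", "adult_2", "adult_3"],
    "village_B": ["adult_4", "adult_5"],
    "village_C": ["adult_6", "adult_7", "adult_8", "adult_9"],
    "village_D": ["adult_10"],
}

selected = cluster_sample(villages, 2, random.Random(0))
for name, adults in selected.items():
    print(name, adults)
```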

<i><b>3.4.5. Multi-stage sampling </b></i>

Many studies, especially large nationwide surveys, will incorporate different sampling methods for different groups, and may be done in several stages. In experiments, or common epidemiological studies such as case-control or cohort studies, this is not a common practice.

<i>In a study of utilization of pit latrines in a district, 150 homesteads are to be visited for interviews with family members as well as for observations on types and cleanliness of latrines. The district is composed of six wards and each ward has between six and nine villages. The following four-stage sampling procedure could be performed: </i>

<i>1. Select three wards out of the six by simple random sampling. </i>

<i>2. For each ward, select five villages by simple random sampling (15 villages in total). </i>

<i>3. For each village, select ten households. Because simply choosing households in the center of the village would produce a biased sample, the following systematic sampling procedure is proposed: </i>

• <i>Go to the center of the village. </i>

• <i>Choose a direction in a random way: spin a bottle on the ground and choose the direction the bottleneck indicates. </i>

• <i>Walk in the chosen direction and select every third or every fifth household (depending on the size of the village) until you have the ten you need. If you </i>

</div><span class="text_page_counter">Trang 36</span><div class="page_container" data-page="36">

<i>reach the boundary of the village and you still do not have ten households, return to the center of the village, walk in the opposite direction and continue to select your sample in the same way until you get ten. If there is nobody in a chosen household, take the next nearest one. </i>

<i>4. Decide beforehand whom to interview (for example the head of the household, if present, or the oldest adult who lives there and who is available). </i>
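The first three stages of the procedure can be sketched in Python. Two things here are assumptions and not part of the procedure described: the district structure is generated artificially (six wards, six to nine villages each, 40 listed households per village), and stage 3 draws households at random from a list instead of using the walk from the village center. The names `make_district` and `multi_stage_sample` are hypothetical.

```python
import random


def make_district(rng, n_wards=6, households_per_village=40):
    """Build a hypothetical district: {ward: {village: [household ids]}}."""
    district = {}
    for w in range(1, n_wards + 1):
        villages = {}
        for v in range(1, rng.randint(6, 9) + 1):     # six to nine villages
            villages[f"w{w}_village{v}"] = [
                f"w{w}_v{v}_hh{h}" for h in range(1, households_per_village + 1)
            ]
        district[f"ward{w}"] = villages
    return district


def multi_stage_sample(district, n_wards=3, n_villages=5, n_households=10,
                       rng=random):
    """Stage 1: wards; stage 2: villages per ward; stage 3: households."""
    selected = []
    for ward in rng.sample(sorted(district), n_wards):
        villages = district[ward]
        for village in rng.sample(sorted(villages), n_villages):
            # Assumption: random draw from a household list, not the walk.
            for household in rng.sample(villages[village], n_households):
                selected.append((ward, village, household))
    return selected


district = make_district(random.Random(1))
sample = multi_stage_sample(district, rng=random.Random(2))
print(len(sample))    # 3 wards x 5 villages x 10 households = 150
```

Stage 4 (deciding beforehand whom to interview) is a fieldwork rule and is not modelled here.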

<b>Table 6: The main advantages and disadvantages of cluster and multi-stage sampling </b>

<small><b>Advantages</b></small>

<small>• A sampling frame of individual units is not required for the whole population. Initially a sampling frame of clusters is sufficient. Only within the clusters that are finally selected do we need to list and sample the individual units. </small>

<small>• The sample is easier to select than a simple random sample of similar size, because the individual units in the sample are physically together in groups instead of scattered all over the study population. </small>

<small><b>Disadvantages</b></small>

<small>• Compared to simple random sampling, there is a larger probability that the final sample will not be representative of the total study population. The likelihood of the sample not being representative depends mainly on the number of clusters selected in the first stage. The larger the number of clusters, the greater the likelihood that the sample will be representative. </small>

<b>1.6. How large a sample? </b>

The main determinant of the sample size is how accurate the results need to be. This depends on the purpose of the study (descriptive study to determine a summary measure of a characteristic, or an analytical study where specific sets of hypotheses are being tested). It is a widespread belief among researchers that the bigger the sample, the better the study becomes. This is not necessarily true. In general it is much better to increase the accuracy of data collection (for example by improving the training of interviewers or by better pre-testing of the data collection tools) than to increase sample size after a certain point. Also it is better to make extra efforts to get a representative sample rather than to get a very large sample.

For cross-sectional surveys and analytical studies, precise calculations can be made which indicate the desirable sample size (see <b>Annex 3</b>). Nowadays computers have made the calculation of sample size easier, and this will be introduced in <b>Module 3</b>.
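As a rough illustration of the kind of calculation involved, the sketch below uses a widely used textbook formula for estimating a single proportion under simple random sampling, n = z² p (1 − p) / d². This formula is not taken from Annex 3, which is not reproduced in this section, and the function name and default values are assumptions. Here p is the expected proportion, d the acceptable margin of error, and z the standard normal value for the chosen confidence level (1.96 for 95%).

```python
import math


def sample_size_for_proportion(p, d, z=1.96):
    """Sample size needed to estimate a proportion p to within +/- d.

    Textbook formula for simple random sampling; it ignores corrections
    for finite populations and for cluster designs.
    """
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)


# Expected proportion 50%, margin of error 5 percentage points, 95% confidence.
print(sample_size_for_proportion(0.5, 0.05))    # 385
```

Halving the margin of error roughly quadruples the required sample, which is one reason the required accuracy is the main determinant of sample size.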

</div><span class="text_page_counter">Trang 37</span><div class="page_container" data-page="37">

<b>Exercise 2 </b>

Now do Exercise 2 on sampling. This is also a group exercise. At the end there will be a group discussion.

Describe the sampling method you would use in each of the following cross-sectional surveys and indicate the potential sources of sampling bias in each case:

- Immunization coverage of infants in Nazareth Town

- Prevalence of HIV in pregnant women in your region

<b>Summary points on sampling </b>

Sampling: choosing a section of the population for observation and study.

Why sample?

• To minimize costs (financial and otherwise) and time

• To make the observations more reliable

• Some information cannot be collected from the entire population.

The sample should be representative of (have all of the important characteristics of) the population from which it is drawn.

The main determinant of the sample size is how accurate the results need to be.

The following are the main sampling methods:

• Simple random sampling

• Systematic sampling

• Stratified sampling

• Cluster sampling

• Multi-stage sampling

</div><span class="text_page_counter">Trang 38</span><div class="page_container" data-page="38">

<b>Chapter 3. Data Collection </b>

<b>3.1. Learning objectives </b>

At the end of this section participants should be able to:

• identify the sources of health data in the community where they work,

• describe various data collection techniques and state their uses and limitations,

• identify the limitations and strengths of routine data sources,

• state the benefits of using a combination of different data collection techniques,

• state various sources of bias in data collection and ways of preventing bias,

• promote the collection of accurate data by members of their health team.

Data collection techniques allow us to systematically collect information about our subjects of study and about the settings in which they occur.

In the collection of data, we have to be systematic. If data are collected haphazardly, it will be difficult to answer research questions in any conclusive way.

<b>3.2. Data collection techniques: </b>

• Using available information (record review)

• Observing

• Interviewing

• Administering written questionnaires

• Focus group discussions

• Other data collection techniques

<i><b>3.2.1. Utilization of available data </b></i>

There is a large amount of data that has already been collected by others. Locating these sources and retrieving the information is a good starting point in any data collection effort. Some sources of such data are listed below:

• Mortality reports

• Morbidity reports

• Epidemic reports

• Reports of laboratory utilization (including laboratory test results)

• Reports of individual case investigations

• Reports of epidemic investigations

</div><span class="text_page_counter">Trang 39</span><div class="page_container" data-page="39">

• Special surveys (e.g., hospital admissions, disease registers, and serologic surveys)

• Demographic data

Analysis of health services data, census data, unpublished reports, publications in libraries or in offices at the various levels of health and health related services, may be a study in itself. In order to retrieve the data from available sources, the researcher will have to design an instrument such as a checklist or compilation sheet. In designing such instruments, it is important to inspect the layout of the source documents from which the data is to be extracted and design the data compilation sheet so that the items of data can be transferred in the order in which the items appear in the source document. This will save time and reduce error.

The assessment of the health status of the community is the basis for planning and evaluation of the health services. Useful information needed for making decisions can often be obtained from routinely available data, even though these are not accurate or complete enough for detailed or elaborate analysis. In this section we shall consider what information you can obtain from routine sources on the frequency and distribution of morbidity, mortality and their causes.

<b>Now do Exercise 3.1 below on the uses and limitations of routine data. (It can be done in a group or individually, with a general discussion at the end.) </b>

<b>Exercise 3.1 </b>

What do you think are the uses and limitations of the hospital-related sources of information shown in the table below? Think in terms of the people served and the levels of health care provided and then write down your ideas in the spaces provided in the table. When you have done this, turn over the page and compare with the explanation provided on the next page.

Health center and hospital returns

In-patient and outpatient records

Immunization reports

Childhood diseases

</div><span class="text_page_counter">Trang 40</span><div class="page_container" data-page="40">

<b>Health center and hospital returns:</b> Health center and hospital returns are likely to be accurate with respect to disease diagnosis, but the data may only relate to the area served by the hospital. Time-based data, such as length of stay, and organizational information, such as staffing or the distance patients travel to the hospital, can also be used.

<b>In-patient and outpatient records:</b> Analysis of hospital records can provide high-quality information on the most important causes of major illness in a community. But to be useful as an indicator of the health status of the population, you must make allowances for the fact that patients treated in hospital are not representative of the general population in the area. People from remote areas, infants and the elderly, for example, will be under-represented. In some countries many, if not most, seriously ill patients never reach hospital.

<b>Outpatient records:</b> Outpatient records from hospitals, health centers, health posts and clinics often provide much ill-defined data. Diagnostic data are usually given in terms of the chief complaint. Those coming for immunizations or other preventive services may be included with those who come because of illness. The patients who are seen are again probably not representative of the general population: although coverage of the population may be greater than with a hospital because of greater geographical distribution, the people who live near a facility or who can afford the time to come will be over-represented. However, these records do provide information about the usage of outpatient facilities and the most frequent complaints and may help you to understand the pattern of disease in your community.

<b>Immunization:</b> It is useful to compare the number of births with the number of children immunized; this can give an indication of the coverage of any immunization programme.
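The comparison of children immunized with births amounts to a simple ratio. The sketch below uses made-up figures, and the function name `immunization_coverage` is hypothetical. It assumes the births and the immunizations refer to the same area and period.

```python
def immunization_coverage(children_immunized, births):
    """Crude coverage (%): children immunized per 100 births."""
    if births <= 0:
        raise ValueError("number of births must be positive")
    return 100.0 * children_immunized / births


# Hypothetical figures: 600 registered births, 420 children immunized.
print(immunization_coverage(420, 600))    # 70.0
```

If many births go unregistered, the denominator is too small and the coverage figure will be overstated.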

<b>Childhood diseases:</b> MCH clinics are one of the best sources of data on childhood diseases such as measles and malnutrition and, over a period of months or years, are reasonably accurate. MCH records alone are not enough, as they are only a source of data on births and on deaths in children under five years. Use other sources of data to obtain a more representative picture. MCH records can also be used to measure the workload of the MCH workers.

<b>Routine data </b>

• Fail to include a great deal of important illness and disability. In particular, much of the chronic illness due to tropical diseases such as schistosomiasis, leprosy, blindness, undernutrition and crippling due to birth trauma or polio will not be detected from routine records.

• Relate only to numerator data.

</div>
