3. Collecting Data Through Experiments

In this section you will learn how to appropriately design, set up, and implement a designed experiment to collect data. Recall that data can be collected in two main ways: (1) through sample surveys or (2) through designed experiments. Sample surveys lead to observational studies, whereas designed experiments let researchers control variables, which can support conclusions about cause and effect.

A designed experiment is a controlled study whose purpose is to control as many factors as possible to isolate the effects of a particular factor. Designed experiments must be carefully set up to achieve their purposes.

The variables in a designed experiment that are controlled are called the explanatory variables or are sometimes called the factors. Factors have values that can be changed by the researcher and are considered as possible causes. Examples of factors are:

  • The dosage of a drug in a medical experiment
  • The type of teaching method in an education experiment
  • One drug by itself compared with that drug used in conjunction with another

A designed experiment analyzes the effects of the factors on the response variable. Response variables are not set by the researcher; their values are measured. Examples of response variables are:

  • The blood pressures of the patients
  • The test scores for a class
  • The sizes of patients' cancerous tumors

A treatment is the specific combination of the values of the factors. Examples of treatments are:

  • Giving one medication to one group of patients and a different medication to another
  • Using one type of fertilizer on a set of plots of corn and a different type of fertilizer on a different set of plots
  • Playing country music to one group of mice and rap music to another
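
Because a treatment is a combination of factor levels, the full set of treatments can be listed mechanically once the factors and their levels are written down. The following is a minimal sketch; the factor names and levels ("Drug A", "Drug B", and the dosages) are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical factors and levels, for illustration only.
factors = {
    "drug": ["Drug A", "Drug B"],
    "dosage_mg": [10, 20, 40],
}

def enumerate_treatments(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]

treatments = enumerate_treatments(factors)
for treatment in treatments:
    print(treatment)
```

With two levels of one factor and three of the other, this lists 2 × 3 = 6 treatments. An experiment with a single factor, such as the drug trial later in this section, has one treatment per level of that factor.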

A treatment is applied to “experimental units” (people, plants, materials, or other objects). When experimental units are people, we refer to them as subjects. Subjects in an experiment correspond to individuals in a survey.

Here is an example of a designed experiment. As you read it, think about who the subjects are, what the factors and treatments are, and what the response variable is.

Example 3: Drug Trials

Suppose you want to determine whether a new drug, Drug N, is more effective at treating high blood pressure than the existing drug, Drug E. Patients with high blood pressure are given either Drug N or Drug E, and the blood pressures are measured one month later.


For this experiment, the subjects are the patients selected to receive either Drug N or Drug E; the factor is the type of drug that a subject receives; the treatment is the specific drug administered (Drug N or Drug E); and the response variable is the subject’s blood pressure after one month. If patients given Drug N have significantly lower blood pressures than patients given Drug E, we would like to conclude that Drug N is more effective. However, it is not that easy to draw such conclusions immediately.

A carefully designed experiment ensures that the behavior of the researcher and/or subjects does not influence the outcome of the experiment. It is important for subjects not to know which treatment they get. In addition, many experiments will have a group of subjects that are not given any medication. These subjects are given a placebo (e.g., a sugar tablet) to control against the possibility that subjects imagine a change in their response variable because they know they are receiving “medication.” It is also important for the researchers not to know which group of patients is given which medication or placebo. An experiment where neither the experimenter nor the experimental unit knows what treatment is being administered is called a double-blind experiment.
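
One common way to keep both sides blind is to label the groups with opaque codes and let a third party hold the key that maps each code to a treatment. The sketch below illustrates that idea under stated assumptions: the function name `blind_assignments`, the "Group A/B/C" labels, and the nine subject IDs are all hypothetical, and a real trial would follow a formal randomization protocol.

```python
import random

def blind_assignments(subject_ids, treatments, seed=None):
    """Randomly assign subjects to coded groups.

    Returns (schedule, key):
      schedule maps subject id -> opaque group code (seen by staff and subjects)
      key maps group code -> actual treatment (held by a third party)
    """
    rng = random.Random(seed)

    # Hide each treatment behind a code; which code hides which is random.
    codes = [f"Group {chr(ord('A') + i)}" for i in range(len(treatments))]
    hidden = list(treatments)
    rng.shuffle(hidden)
    key = dict(zip(codes, hidden))

    # Shuffle the subjects, then deal the codes out in turn for equal groups.
    order = list(subject_ids)
    rng.shuffle(order)
    schedule = {s: codes[i % len(codes)] for i, s in enumerate(order)}
    return schedule, key

# Hypothetical subject IDs 1..9 and the treatments from the drug example.
schedule, key = blind_assignments(range(1, 10), ["Drug N", "Drug E", "placebo"], seed=0)
print(schedule)   # what researchers and subjects see: codes only
# `key` would stay sealed until the data have been collected.
```

Only `schedule` is shared during the experiment, so neither the subjects nor the researchers measuring blood pressure can tell which group received which drug or the placebo.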

Conducting an experiment involves considerable planning. Here are some steps to consider:

  1. Identify the problem. The first step in planning an experiment (or almost any project) is to identify the problem. This includes identifying the general purpose of the experiment, the response variable of interest, and the population. The identified problem is often referred to as a claim about the population of interest.
  2. Determine the factors. The second step in planning an experiment is to determine the factors to be studied. Factors can be identified by experts in the field, by the overall purpose of the experiment, or by using results from previous studies. Factors must be identified as either fixed at some predetermined level, controlled (those that will be manipulated in the experiment), or uncontrolled.
  3. Determine the number of experimental units (i.e., the sample size). In general, the more experimental units there are, the more effective the experiment. However, the number of experimental units may be limited by time or money. We will learn some techniques later in the semester to calculate an appropriate number of experimental units.
  4. Determine the level of each factor. There are three ways to deal with the factors:
    • Control – Fix the levels at a constant level (for factors not of interest)
    • Manipulate – Set the levels at predetermined levels (for factors of interest)
    • Randomize – Randomize the experimental units (for uncontrolled factors not of interest). Randomization decreases (or averages out) the effects of uncontrolled factors, even ones not identified or thought about in advance.
  5. Conduct the experiment. Subjects must be assigned at random to a treatment group. There are several sound methods for assigning treatments to experimental units: completely random, matched-pairs (see below), and randomized blocks. If a treatment is applied to more than one experimental unit, this is called replication, which can be useful for experimental accuracy and to further decrease the effects of uncontrolled factors. In this step, the experimenter then collects and processes the data.
  6. Test the claim. In the final step, we conduct inferential statistics, which will be studied in detail in sub-competencies 8 through 12.

In a completely randomized design, each experimental unit is assigned to a treatment completely at random.
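
A completely randomized assignment can be sketched in a few lines: shuffle the experimental units, then deal them out to the treatments. The function name and the twelve unit IDs below are hypothetical. Dealing in turn to keep the groups the same size is a design choice made for this sketch, not a requirement of the definition.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle the units and split them across treatments in equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # every ordering of the units is equally likely

    assignment = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        # Deal units out in turn so group sizes differ by at most one.
        assignment[treatments[i % len(treatments)]].append(unit)
    return assignment

# Hypothetical patient IDs 1..12, using the two drugs from Example 3.
groups = completely_randomized(range(1, 13), ["Drug N", "Drug E"], seed=1)
for treatment, members in groups.items():
    print(treatment, sorted(members))
```

Passing a fixed `seed` makes the assignment reproducible, which is useful for documenting how the groups were formed.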

Another type of experimental design is the matched-pairs design. In a matched-pairs design, the experimental units are paired up (e.g., twins, the same person before and after the treatment, a husband and wife) and each member of the pair is assigned to a different treatment. There are only two levels of treatment (one for each member of the pair). For example, a researcher would collect and compare information from the same subject before receiving a certain medication and then after receiving the medication.
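
When the pair consists of two distinct units (twins, or a husband and wife), chance can decide which member receives which treatment. The sketch below illustrates this under stated assumptions: the function name, the pair labels, and the "treatment"/"control" labels are hypothetical. It does not apply to before-and-after pairs, where the order is fixed by time.

```python
import random

def matched_pairs(pairs, treatments=("treatment", "control"), seed=None):
    """Within each pair, randomly decide which member gets which treatment."""
    rng = random.Random(seed)
    assignment = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # the equivalent of a coin flip for this pair
        assignment[first] = order[0]
        assignment[second] = order[1]
    return assignment

# Hypothetical pairs of twins.
pairs = [("twin 1a", "twin 1b"), ("twin 2a", "twin 2b"), ("twin 3a", "twin 3b")]
assignment = matched_pairs(pairs, seed=7)
for member, treatment in assignment.items():
    print(member, "->", treatment)
```

Every pair contributes one unit to each treatment, so the two groups are matched on whatever characteristic was used to form the pairs.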

Finally, we cannot always control every factor whose effect does not interest us but that we suspect might affect the response variable or the factors affecting it. For example, customers with young children have different purchasing habits than those without, and men and women may respond differently to a treatment. However, these are not characteristics that the researcher can assign to subjects. Factors like these may account for some of the variation in the response, because subjects at different levels may respond differently. So we deal with them by grouping, or blocking, our subjects together and, in effect, analyzing the experiment separately for each block. Such factors are called blocking factors, and their levels are called blocks. Blocking an experiment is like stratifying in survey design.
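
In a randomized block design, subjects are first grouped by the blocking factor and the random assignment is then carried out separately inside each block. The following is a minimal sketch of that idea; the customer records, the "offer A"/"offer B" treatments, and the function name are hypothetical and serve only as an illustration.

```python
import random
from collections import defaultdict

def randomized_block(units, block_of, treatments, seed=None):
    """Group units into blocks, then randomize to treatments within each block."""
    rng = random.Random(seed)

    # Step 1: sort the units into blocks using the blocking factor.
    blocks = defaultdict(list)
    for unit in units:
        blocks[block_of(unit)].append(unit)

    # Step 2: shuffle inside each block and deal the units out to the treatments.
    k = len(treatments)
    design = {}
    for block, members in blocks.items():
        rng.shuffle(members)
        design[block] = {t: members[i::k] for i, t in enumerate(treatments)}
    return design

# Hypothetical customers: (customer id, has young children).
customers = [(cid, cid % 2 == 0) for cid in range(1, 13)]
design = randomized_block(
    customers,
    block_of=lambda c: c[1],            # blocking factor: has young children
    treatments=["offer A", "offer B"],  # hypothetical treatments
    seed=42,
)
for block, groups in design.items():
    print("has young children:", block, groups)
```

Because each block receives every treatment, differences between blocks cannot be mistaken for differences between treatments.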

Example 4

An Internet sales site randomly sent customers to one of three versions of its welcome page and recorded how long each visitor stayed on the site. Additionally, analysts want to know whether customers who came directly to the site (by typing in the URL) behave differently from those who were referred to the site from other sources (such as search engines). They decide to block by how the customers arrived. Draw a diagram of their experimental design.

