I have been working in the program evaluation / causal inference space for about 9 years. Recently, though, I came across an experimental design that, strangely, I hadn't thought much about before.
Here is an example of this design:
Students are randomized to receive one of two types of tutoring (Tutoring A vs. Tutoring B). Randomization happens at the student level. However, tutoring is administered in groups by tutors whose skills are aligned with each type of tutoring. Ten tutors with advanced skills will administer Tutoring A, and seven different tutors with less advanced skills will administer Tutoring B. Each tutor, under condition A or B, will teach, say, 20 students. The point of the study is to compare the performance of students who receive Tutoring A vs. Tutoring B.
Even though students are individually randomized, this is not a simple individually randomized test because students are nested in tutors and receive treatment in clusters.
But it is also NOT a cluster randomized trial (CRT), which I am much more familiar with. In such designs, randomization happens at the cluster level. Tutors would be assigned to one treatment or the other, and the students they work with would receive whichever treatment their tutor was assigned.
The type of design that I am dealing with is apparently called an Individually Randomized Group Treatment (IRGT) trial. I found a couple of articles (Pals et al., 2008; Roberts & Roberts, 2005) and also reached out to my forever advisor James Pustejovsky for his thoughts about this design. I learned the following:
Independence Assumption
A t-test or linear regression, which we would normally use to analyze data from a simple individually randomized two-group comparison, assumes that the errors are independent. The grouping in CRTs and IRGTs violates that assumption, which can lead to underestimated standard errors and, thus, inflated Type I error rates and confidence intervals that are too narrow.
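To see the inflation for myself, here is a minimal simulation sketch of the tutoring example with no true treatment effect, analyzed with a naive pooled t-test that ignores tutors. The function names, the ICC value of 0.1, and the normal outcome model are my own assumptions for illustration, not something taken from the articles above.

```python
import random
import statistics


def simulate_null_trial(rng, tutors_a=10, tutors_b=7, students_per_tutor=20, icc=0.1):
    """Simulate one trial with NO true treatment effect; return the naive t statistic."""
    # Assumption: total outcome variance is 1, split between tutors and students.
    tutor_sd = icc ** 0.5
    student_sd = (1 - icc) ** 0.5

    def arm_outcomes(n_tutors):
        outcomes = []
        for _ in range(n_tutors):
            # One random effect per tutor, shared by all of that tutor's students.
            tutor_effect = rng.gauss(0, tutor_sd)
            outcomes += [tutor_effect + rng.gauss(0, student_sd)
                         for _ in range(students_per_tutor)]
        return outcomes

    a = arm_outcomes(tutors_a)
    b = arm_outcomes(tutors_b)

    # Pooled-variance two-sample t statistic that treats students as independent.
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def naive_rejection_rate(n_sims=1000, icc=0.1, seed=1):
    """Share of null trials where the naive test rejects at (approximately) the 5% level."""
    rng = random.Random(seed)
    critical = 1.96  # normal approximation to the two-sided 5% cutoff
    rejections = sum(abs(simulate_null_trial(rng, icc=icc)) > critical
                     for _ in range(n_sims))
    return rejections / n_sims


if __name__ == "__main__":
    print("ICC = 0.0:", naive_rejection_rate(icc=0.0))
    print("ICC = 0.1:", naive_rejection_rate(icc=0.1))
```

With an ICC of zero the rejection rate should sit near the nominal 5%, and with a positive ICC it should come out clearly higher, which is the Type I error inflation described above.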
In cluster randomized trials, both the treatment and control groups contain clusters. In the example I described above, both groups also have clustering. BUT, there is another version of this design where students assigned to treatment are tutored by different tutors, whereas students assigned to control do not receive any tutoring. Hence, clustering only happens in the treatment group.
The extent of the impact of clustering is measured by the intraclass correlation coefficient (ICC), also called the intracluster correlation coefficient.
Tutors may be assigned based on availability or location, in which case the groups will differ even more.
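For intuition, the intraclass correlation (ICC) is usually described as the share of total outcome variance that sits between clusters (here, between tutors). Below is a rough sketch of the classic one-way ANOVA estimator for equal-sized groups. The function name `anova_icc` is my own, and in a real analysis I would expect to estimate this from a mixed model instead.

```python
import statistics


def anova_icc(groups):
    """One-way ANOVA estimate of the ICC; this sketch assumes equal group sizes."""
    k = len(groups)          # number of clusters (e.g., tutors)
    m = len(groups[0])       # students per cluster
    if any(len(g) != m for g in groups):
        raise ValueError("this sketch assumes equal group sizes")

    grand_mean = statistics.mean([x for g in groups for x in g])
    group_means = [statistics.mean(g) for g in groups]

    # Mean square between clusters and mean square within clusters.
    msb = m * sum((gm - grand_mean) ** 2 for gm in group_means) / (k - 1)
    msw = sum((x - gm) ** 2
              for g, gm in zip(groups, group_means)
              for x in g) / (k * (m - 1))

    return (msb - msw) / (msb + (m - 1) * msw)
```

Note that this estimator can come out negative in small samples when the group means are more alike than chance would suggest.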
Treatment differences
ICCs can differ across treatment groups.
Cluster sizes can differ across treatment groups.
Both of these go into the design effect.
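As a quick worked sketch, the standard design effect for equal cluster sizes is 1 + (m - 1) × ICC, where m is the cluster size. The code below applies it separately to each arm of the tutoring example. The ICC values of 0.05 and 0.10 are hypothetical numbers I made up for illustration, and the function names are mine.

```python
def design_effect(cluster_size, icc):
    """Standard design effect for equal cluster sizes: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations the clustered sample is 'worth'."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Tutoring A: 10 tutors x 20 students, hypothetical ICC of 0.05.
    # Tutoring B: 7 tutors x 20 students, hypothetical ICC of 0.10.
    for label, n_tutors, icc in [("A", 10, 0.05), ("B", 7, 0.10)]:
        deff = design_effect(20, icc)
        n_eff = effective_sample_size(n_tutors, 20, icc)
        print(f"Tutoring {label}: design effect = {deff:.2f}, "
              f"effective n = {n_eff:.1f}")
```

Under these made-up ICCs, the 200 students in Tutoring A carry about as much information as roughly 103 independent students, and the 140 students in Tutoring B about as much as roughly 48, so a different ICC in each arm can change the picture quite a bit.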
SUTVA
As in cluster randomized trials, we need to think about the possibility of multiple treatments being deployed.
In cluster randomized trials, people select themselves into groups. In IRGTs, individuals are assigned to groups. They can be assigned randomly or, more realistically, based on location, time, or availability (for example, of tutors).