The Reformed Teaching Observation Protocol (RTOP) was developed as
an observation instrument to provide a standardized means for detecting
the degree to which K-20 classroom instruction in mathematics or science
|RTOP draws on five major sources for its validity|
|Development of the Instrument|
a five-point Likert scale, the 39 items were used by five members of the
EFG (Benford, Falconer, Turley, Piburn and Sawada) to observe three videotaped
lessons. Detailed discussion of each item resulted in discarding 14 items,
leaving 25 items with five items in each category.
The EFG submitted its instrument to two members of the ACEPT ASU Mathematics
Cluster, Matt Isom and Apple Bloom. At this point in the development,
Isom and Bloom were skeptical that the instrument, with its focus on both
science and mathematics, would be sufficiently sensitive to the mathematics
standards. The Mathematics Cluster informed the EFG of two major problems
with the instrument:
changes were necessary. Sawada revisited all the items. As a result, several
items from the 25 were deleted or drastically reworded, and new ones were created.
These changes were modified by the EFG as a whole after considerable argument.
The modified instrument met with the approval of the Mathematics Cluster.
The EFG began piloting the RTOP in various university and college classrooms
during Spring 1999. Analysis and discussion of these ratings led
to further modifications of the items that had produced questionable results.
At the same time, Sawada began preparing an "Annotated RTOP Guide".
The Guide documented the growing inter-rater consensus about how each
item should be interpreted. The Guide was also being developed to facilitate
the training of new observers. Informal calculation of inter-rater correlation
coefficients produced estimates of 0.50 to 0.85. These were deemed sufficiently
high to incorporate the RTOP into the evaluation plans for the ACEPT summer
1999 workshops. It was hoped that from May 1999 onwards the changes to
RTOP would be minimal, which was largely the case.
|Psychometric Properties of RTOP|
was used on all the courses included in the Fall 1999 evaluation of ACEPT.
Each of the courses was observed at least two times. In order to get an
early reading of inter-rater reliability, observers agreed to work in
pairs for some of the initial observations. As a part of the plan, Kathleen
Falconer and Daiyo Sawada paired up to do a set of observations on the
same classes. The first 16 such pairs (a total of 32 independent observations)
were used to calculate estimates of reliability.
Figure 1 shows a scatter plot of the 32 data points (some data points coincide). The equation for the best-fit line and the proportion of variance accounted for by that line (R² = 0.954) are shown. This estimate of reliability is very high.
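A reliability estimate of this kind can be sketched in a few lines of Python. The example below is illustrative only: it fits an ordinary least-squares line to the total scores given by two observers and reports the proportion of variance accounted for by that line. The function names and the scores in `observer_a` and `observer_b` are hypothetical and are not the ACEPT data.

```python
from statistics import mean


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def r_squared(x, y):
    """Proportion of the variance in y accounted for by the best-fit line on x."""
    slope, intercept = fit_line(x, y)
    my = mean(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Hypothetical total scores from two observers rating the same six lessons.
observer_a = [20, 35, 48, 62, 75, 90]
observer_b = [22, 33, 50, 60, 78, 88]

slope, intercept = fit_line(observer_a, observer_b)
print(f"best-fit line: y = {slope:.3f}x + {intercept:.3f}")
print(f"R^2 = {r_squared(observer_a, observer_b):.3f}")
```

For a simple linear fit, this R² equals the square of the Pearson correlation between the two observers' scores, so it is directly comparable with an inter-rater correlation coefficient.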
In a similar manner, reliabilities were also estimated for the five subscales that constitute RTOP. Because each subscale consists of only five items, it was anticipated that their reliability would be substantially lower than that of the total score. While this was true for Subscale Two, it was not true for the others, as shown in Table 1.
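The relationship between items, subscales and total score can be made concrete with a short sketch. It rests on assumptions not stated in the text: that each item rating is coded 0 to 4, that scores are simple sums, and that the 25 items are ordered so that each consecutive block of five forms one subscale. The function names are hypothetical.

```python
ITEMS_PER_SUBSCALE = 5
NUM_SUBSCALES = 5


def subscale_scores(item_ratings):
    """Sum 25 item ratings into five subscale scores (assumes consecutive blocks of five)."""
    expected = ITEMS_PER_SUBSCALE * NUM_SUBSCALES
    if len(item_ratings) != expected:
        raise ValueError(f"expected {expected} item ratings, got {len(item_ratings)}")
    return [
        sum(item_ratings[start:start + ITEMS_PER_SUBSCALE])
        for start in range(0, expected, ITEMS_PER_SUBSCALE)
    ]


def total_score(item_ratings):
    """Total score as the sum of the five subscale scores."""
    return sum(subscale_scores(item_ratings))


# Hypothetical ratings for one observed lesson, coded 0 to 4 (an assumed coding).
ratings = [4] * 5 + [0] * 5 + [2] * 5 + [1] * 5 + [3] * 5
print(subscale_scores(ratings))
print(total_score(ratings))
```

Because each subscale score is built from only five ratings, a disagreement between observers on a single item shifts it proportionally more than it shifts the 25-item total, which is why lower subscale reliabilities were anticipated.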
indicated in the introduction, the Face Validity of RTOP is established
by the credibility of the sources consulted.
Construct Validity refers to the theoretical integrity of an instrument. Because the RTOP is a quantitative measure of the degree to which a classroom is in accord with the science and mathematics reforms as embodied in the ACEPT project, the theoretical relationships of interest are those underlying the ACEPT reform.
there are a large number of individual mathematics and science standards,
the ACEPT has taken "Inquiry" as a major integrating orientation:
Learners as inquirers in the classroom. On this basis, it would be expected
that RTOP would span many standards, but underlying these standards would
be a single dimension of "inquiry-orientation."
|What the Data Say|
The graph shows an example of using RTOP to verify that the experimental and the reform groups differ significantly from each other with regard to reform. Being able to make such elemental distinctions has been important in understanding the nature of ACEPT reform.