OTA 1999 Posters
Reliability and General Applicability of Classification of Fractures of the Tibial Plafond According to a Rank Order Method
Douglas R. Dirschl, MD; J. Larry Marsh, MD; Thomas A. DeCoster, MD; Shepard R. Hurwitz, MD, University of North Carolina School of Medicine, Raleigh, NC
Purpose: Fractures of the tibial plafond are complex injuries, and previous studies have shown fair to poor interobserver reliability in their classification. To explore ways of improving that reliability, the authors evaluated the rank order method of classifying tibial plafond fractures by means of a poster presentation at the 1998 Orthopaedic Trauma Association (OTA) Annual Meeting. This study reports the reliability and general applicability of the rank order method, as assessed through the participation of attendees at that meeting.
Methods: High-quality photographs of the mortise and lateral radiographs of ten fractures of the tibial plafond were numbered and presented on a single poster at the 1998 OTA annual meeting. Attendees at the meeting served as observers, ranking the ten cases from least severe (#1) to most severe (#10) and recording their rankings on scoring sheets included with the poster presentation. Observers were instructed only to rank the ten cases, taking into consideration all factors they believed important in making such a ranking; they were not told to consider any specific factors. Attendees were also permitted to provide comments on the study. Thirty-nine completed rankings were collected at the meeting. Data were analyzed to determine the intraclass correlation coefficient (ICC) for the thirty-nine individual rankings of the ten cases.
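The abstract does not state which form of the ICC was computed. As a minimal sketch only, the following assumes the two-way random-effects, single-rater form often written ICC(2,1), applied to a table with one row per case and one column per observer. The function name `icc_two_way` and the three-observer, five-case rankings are hypothetical illustrations, not the study's data or the authors' analysis code.

```python
def icc_two_way(ratings):
    """Return ICC(2,1) for ratings[i][j] = score given to case i by rater j.

    Assumption: two-way random-effects, single-rater, absolute-agreement
    form. The abstract does not say which ICC form the authors used.
    """
    n = len(ratings)         # number of cases
    k = len(ratings[0])      # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    case_means = [sum(row) / k for row in ratings]
    rater_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]

    # Sums of squares for a two-way layout without replication.
    ss_cases = k * sum((m - grand) ** 2 for m in case_means)
    ss_raters = n * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_cases - ss_raters

    ms_cases = ss_cases / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_cases - ms_error) / (
        ms_cases + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )


# Hypothetical rankings: rows are cases, columns are three observers.
# Each observer assigns ranks 1 (least severe) to 5 (most severe).
example_rankings = [
    [1, 2, 1],
    [2, 1, 2],
    [3, 3, 3],
    [4, 4, 5],
    [5, 5, 4],
]
example_icc = icc_two_way(example_rankings)
print(f"ICC for the hypothetical rankings: {example_icc:.2f}")
```

Because every observer uses the same set of ranks, the rater means are identical and the between-rater term is zero; disagreement appears only in the residual term.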
Results: The overall ICC for the thirty-nine observers was 0.62. Of the thirty-nine observers providing rankings, 33 were orthopaedic traumatologists and 6 were general orthopaedists. Twenty-four observers were in academic practice, 10 were in private practice, and 5 were orthopaedic trauma fellows. Sixteen observers were members of the OTA and 23 were nonmembers. There was no difference in the ICC between traumatologists and general orthopaedists (p>0.5), between academic orthopaedists and private practitioners (p>0.5), or between OTA members and nonmembers (p>0.5). Of the thirty-nine observers, 10 reported that they treated 5 or fewer tibial plafond fractures per year, 10 treated 6-10 per year, 11 treated 11-15 per year, 2 treated 16-20 per year, and 6 treated more than 20 per year. The ICC was slightly greater for observers who treated more than 15 tibial plafond fractures per year than for those who treated 15 or fewer per year (0.70 vs. 0.61), but the difference did not reach statistical significance (p=0.11). Ten of the thirty-nine observers commented that it was difficult to rank the radiographs, while 4 observers commented that the photographs of the radiographs were difficult to see due to glare. Eight observers commented that the radiographs did not truly represent the full spectrum of injury severity for tibial plafond fractures, representing instead the mid-to-high range of injury severity.
Discussion: The rank order method of classification has only rarely been used in clinical orthopaedic research, and its use has generally been limited to instances where objective measures to stratify data were not available, such as the quality of resident performance or the relative importance of various surgical procedures. One recent clinical investigation, however, with five observers ranking 25 tibial plafond fractures, reported an impressive ICC of 0.94. The disappointing ICC of 0.62 in the present study may be related to several factors. First, using a relatively large number of observers (39) makes near perfect agreement inherently difficult to achieve on any measure of fracture classification, including ranking the severity of injury. Second, the fracture ranking was done on photographs presented on a poster rather than on radiographs presented on a light box, which may have affected the results somewhat, as orthopaedists are not accustomed to viewing radiographs as photographs. Finally, the images presented did not represent the full spectrum of injury severity of tibial plafond fractures, which also limits the reliability that can be expected from the study. The concept of range restriction, where the sample studied represents only a portion of the true population range of a variable, is well known in statistical circles to limit the predictive value of an investigation. The authors anticipate that altering the selected cases to more fully represent the spectrum of injury severity would lead to improved ICC results.
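The effect of range restriction on an ICC can be illustrated with a small simulation. The sketch below uses entirely synthetic data and is not a reanalysis of the study: it assumes a hypothetical "true severity" score on a 0 to 10 scale, five hypothetical observers whose scores differ from the true value by Gaussian noise, and the one-way form of the ICC. All names (`icc_one_way`, `simulate_icc`) and parameter values are assumptions chosen for illustration.

```python
import random


def icc_one_way(scores):
    """One-way random-effects ICC(1,1); scores[i][j] = rater j's score for case i."""
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    case_means = [sum(row) / k for row in scores]
    # Between-case and within-case mean squares.
    ms_between = k * sum((m - grand) ** 2 for m in case_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, case_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def simulate_icc(low, high, n_cases=200, n_raters=5, noise_sd=1.0, seed=0):
    """Simulate raters scoring cases whose true severity lies in [low, high]."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    scores = []
    for _ in range(n_cases):
        true_severity = rng.uniform(low, high)
        scores.append(
            [true_severity + rng.gauss(0.0, noise_sd) for _ in range(n_raters)]
        )
    return icc_one_way(scores)


# Same rater noise in both runs; only the spread of case severity changes.
icc_full_range = simulate_icc(0.0, 10.0)    # cases span the whole scale
icc_restricted = simulate_icc(5.0, 10.0)    # cases span only the upper half
print(f"full range ICC:       {icc_full_range:.2f}")
print(f"restricted range ICC: {icc_restricted:.2f}")
```

With rater noise held constant, narrowing the spread of true severity reduces the between-case variance, so the ICC falls even though the observers are no less consistent.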
Conclusions: Although the interobserver reliability of the rank order method for the classification of fractures of the tibial plafond in this study was not as great as in a previous study, the reliability was actually quite good, given the large number of observers and the narrow range of injury severity used in this study. It remains for further studies to identify and validate a series of cases of tibial plafond fractures that represents a full spectrum of injury severity and that can be ranked with near perfect interobserver reliability. Such a series of cases may then serve as a measurement standard for severity of injury against which individual cases of tibial plafond fractures may be easily and reliably compared. Although the authors do not advocate that a rank order method of classification replace the currently used methods for classifying these severe injuries, they believe that a rank order system may someday become a useful adjunct to more traditional methods of fracture classification.