Research Article Open Access

A Permutation Test for Comparing Two Correlated Receiver Operating Characteristic Curves

Okeh Uchechukwu Marius1 and Onyeagu Sidney2
  • 1 Ebonyi State University, Abakaliki, Nigeria
  • 2 Nnamdi Azikiwe University Awka, Nigeria
Journal of Mathematics and Statistics
Volume 16 No. 1, 2020, 62-75


Submitted On: 7 November 2019
Published On: 24 May 2020

How to Cite: Marius, O. U. & Sidney, O. (2020). A Permutation Test for Comparing Two Correlated Receiver Operating Characteristic Curves. Journal of Mathematics and Statistics, 16(1), 62-75.


The area under the Receiver Operating Characteristic (ROC) curve (AUC) is a common summary measure for comparing two ROC curves. However, it is less informative when the two ROC curves cross and have the same AUC. To detect differences between ROC curves, and to address the problem of exchangeability of the labels between two diagnostic tests within a subject, an alternative permutation test is proposed, based on between-subject permutations of the subjects' labels within each diagnostic test. The test assesses a change in the AUCs for continuous matched-pair data from two diagnostic test procedures, each of which includes both non-diseased and diseased subjects. The Wilcoxon signed rank test statistic was modified for use as a permutation test under the null hypothesis of equality of AUCs. An algorithm was developed that carries out complete enumeration of all the distinct permutations of the paired test results and so provides exact p-values. In simulations, the proposed test is comparable in statistical power to the modified sign test proposed by Braun and Alonzo, but it has better operating characteristics: it has greater power to detect a crossing alternative and is less conservative in test size when the AUCs average at least 0.8, the correlation is at least 0.4 and the sample sizes are small to moderately large. Applied to real data, the proposed test likewise yields a smaller p-value (0.0312) than Braun and Alonzo’s test (0.0387), so it is more likely to reject the null hypothesis of equality of AUC1 and AUC2 at the nominal level of 0.05. This is because the proposed test is modified to adjust for the presence of zero differences in values and considers the signs of the values as well as their absolute ranks.
The estimates of AUC1 and AUC2 for the two diagnostic tests are 0.668 and 0.887 respectively, indicating that the 2-hour 100 g Oral Glucose Tolerance Test (OGTT), corresponding to AUC2, is superior to the 2-hour 70 g OGTT (AUC1) when the specificity is greater than 0.7.
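
As a rough illustration of what "complete enumeration" with "exact p-values" can look like for a signed-rank statistic, the sketch below enumerates every sign assignment of the ranked paired differences. It is a generic textbook-style construction, not the authors' algorithm: the abstract does not specify how the paired quantities are formed from the two diagnostic tests or how the between-subject permutations are generated. The function names `midranks` and `exact_signed_rank_pvalue` are hypothetical, and the handling of zero differences (zeros are ranked with the rest but contribute no sign) is an assumption made only for this example.

```python
from itertools import product


def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def exact_signed_rank_pvalue(x, y):
    """Two-sided exact p-value for a signed-rank statistic on pairs (x, y).

    Assumption (not from the paper): zero differences are included when
    ranking absolute differences but carry sign 0 in the statistic.
    """
    diffs = [b - a for a, b in zip(x, y)]
    ranks = midranks([abs(d) for d in diffs])

    # Ranks attached to the nonzero differences only.
    nonzero_ranks = [r for r, d in zip(ranks, diffs) if d != 0]
    observed = sum(r if d > 0 else -r for r, d in zip(ranks, diffs) if d != 0)

    # Complete enumeration: every +/- assignment is equally likely under H0.
    extreme = 0
    total = 0
    for signs in product((-1, 1), repeat=len(nonzero_ranks)):
        stat = sum(s * r for s, r in zip(signs, nonzero_ranks))
        total += 1
        if abs(stat) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / total


if __name__ == "__main__":
    # Hypothetical paired measurements, for illustration only.
    test_1 = [1, 2, 3, 4, 5]
    test_2 = [2, 4, 6, 8, 10]
    print(exact_signed_rank_pvalue(test_1, test_2))
```

Enumerating all 2^n sign assignments is only practical for small n, which matches the abstract's focus on small to moderately large samples; larger samples would need Monte Carlo sampling of permutations or an asymptotic approximation.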



  • Permutation Test
  • Exchangeability
  • Asymptotic Approximation
  • Algorithm
  • Two Diagnostic Test Procedures
  • Area Under the ROC Curve (AUC)
  • Modified Wilcoxon Signed Rank Test
  • Receiver Operating Characteristic (ROC) Curve