Applying Fuzzy Similarity Index and Ground Truth Fuzzy Contour to Evaluate the Segmentation Accuracy
H Kim1*, J Monroe2, M Machtay3,4, R Ellis3, S Lo3, M Yao3, J Sohn3,4, (1) Case Western Reserve University, Cleveland, OH, (2) St. Anthony's Cancer Center, Webster Groves, MO, (3) University Hospitals Case Medical Center, Cleveland, OH, (4) Case Western Reserve University, Cleveland, OH
SU-E-J-105 Sunday 3:00PM - 6:00PM Room: Exhibit Hall
To evaluate segmentation accuracy using our novel Fuzzy Similarity Index (FSI) and Ground Truth Fuzzy Contour (GTFC) while accounting for inter- and intra-observer variation.
We developed the GTFC to build a consensus ground-truth segmentation (contour) and the FSI to score segmentations, enabling objective and quantitative evaluation of in-vivo medical image segmentation. The GTFC is built by applying fuzzy theory to account for inter- and intra-observer variation. Unlike STAPLE, the GTFC has a Fuzzy Membership Function (FMFn) that can assign a weight to each expert according to their experience. Using the GTFC, we formulate a quantitative scoring index to evaluate segmentation accuracy.
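The abstract does not give the form of the FMFn, so the following is only a minimal sketch of one way experience-weighted expert contours could be fused into a fuzzy membership map. It assumes each expert contour is rasterized to a binary mask and that membership at a pixel is the weight-normalized sum of the masks. The function name `fuzzy_membership_map`, the toy masks, and the weights are all hypothetical and are not taken from the authors' method.

```python
def fuzzy_membership_map(masks, weights):
    """Fuse binary expert masks into per-pixel membership values in [0, 1].

    ASSUMPTION: membership = sum(weight * mask) / sum(weights).
    The abstract does not specify the actual membership function.
    """
    if not masks or len(masks) != len(weights):
        raise ValueError("need one weight per expert mask")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [sum(w * m[r][c] for m, w in zip(masks, weights)) / total
         for c in range(cols)]
        for r in range(rows)
    ]


# Three hypothetical experts on a 1x4 strip of pixels;
# the first expert is given double weight (e.g. more experience).
masks = [
    [[1, 1, 1, 0]],
    [[0, 1, 1, 0]],
    [[0, 1, 0, 0]],
]
weights = [2.0, 1.0, 1.0]

membership = fuzzy_membership_map(masks, weights)
print(membership)  # [[0.5, 1.0, 0.75, 0.0]]
```

In this toy case the pixel marked only by the heavily weighted expert still receives a membership of 0.5, which illustrates how per-expert weights change the consensus compared with an unweighted vote.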
When a test segmentation is evaluated, we calculate the FMFn membership value at every point in the test segmentation (contour). The distribution of these membership values forms the Membership Score Histogram (MSH). We enhanced the FSI to make it more responsive: when reducing the MSH to a single-value index (the FSI), we adopt a strategy of penalizing lower membership values. The resulting FSI equation combines the MSH with a penalty constant.
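The FSI equation itself is not stated in the abstract. The sketch below is therefore a stand-in rather than the authors' formula: it bins membership values into a histogram and collapses the histogram into one score by raising each bin center to a penalty exponent, so that low-membership points contribute disproportionately little. The names `membership_score_histogram` and `penalized_index`, the bin count, and the exponent form of the penalty are assumptions made for illustration.

```python
def membership_score_histogram(values, n_bins=10):
    """Count membership values in [0, 1] into n_bins equal-width bins."""
    counts = [0] * n_bins
    for v in values:
        if not 0.0 <= v <= 1.0:
            raise ValueError("membership values must lie in [0, 1]")
        # A value of exactly 1.0 falls into the last bin.
        counts[min(int(v * n_bins), n_bins - 1)] += 1
    return counts


def penalized_index(counts, penalty=2.0):
    """Collapse a histogram into a single score in [0, 1].

    ASSUMPTION: score = sum(count_i * center_i ** penalty) / sum(count_i).
    With penalty > 1, low-membership bins are down-weighted more strongly.
    This is a hypothetical stand-in, not the published FSI equation.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    n_bins = len(counts)
    return sum(
        c * ((i + 0.5) / n_bins) ** penalty for i, c in enumerate(counts)
    ) / total


# Hypothetical membership values sampled along one test contour.
values = [0.95, 0.95, 0.55, 0.05]
msh = membership_score_histogram(values)

plain = penalized_index(msh, penalty=1.0)      # plain mean of bin centers
penalized = penalized_index(msh, penalty=2.0)  # low memberships penalized
print(msh, plain, penalized)
```

Under these assumptions, raising the penalty lowers the score of any contour that strays into low-membership regions while leaving a contour with full membership close to 1, which is the qualitative behavior the text describes.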
We tested the FSI on a brain case. Ten experts segmented a region in the brain, and six non-experts independently delineated the same region. The GTFC was created from the experts' segmentations and used to evaluate the accuracy of the non-experts' segmentations. We then generated an MSH for each test segmentation and calculated its FSI.
In decreasing order of similarity to the GTFC, the non-experts ranked 6, 1, 5, 2, 3, and 4, with FSIs of 0.845, 0.476, 0.125, 0.085, 0.078, and 0.005, respectively. Non-experts 2, 3, and 4 deviated significantly from the GTFC.
The FSI sensitively reflects the accuracy of a test segmentation. It can be used to develop unbiased educational tools or credentialing processes for clinicians. It can also be used to evaluate the performance of automated segmentation tools.