Outlier detection using iterative adaptive mini-minimum spanning tree generation with applications on medical data
[Abstract] As an important data pre-processing technique, outlier detection plays a crucial role in many real-world applications and has attracted substantial attention, especially in medicine. Despite this importance, many existing methods are sensitive to the distribution of outliers and require prior knowledge, such as the outlier proportion. To address this problem to some extent, this article proposes an adaptive mini-minimum spanning tree-based outlier detection (MMOD) method, which uses a novel distance measure obtained by scaling the Euclidean distance. For datasets with varying densities and shapes, the method can identify outliers without prior knowledge of the outlier percentage. Results on both real-world medical data corpora and intuitive synthetic datasets demonstrate the effectiveness of the proposed method compared with state-of-the-art methods.
[Publication date] 2023-10-13 [Publishing institution]
[Validity level] [Subject classification]
[Keywords] minimum spanning tree; outlier detection; cluster-based outlier detection; data mining; medical data [Timeliness]