ECTI TRANSACTIONS ON COMPUTER INFORMATION TECHNOLOGY, Volume 15, No. 02, August 2021, Pages 258-266
Outlier detection in wellness data using probabilistic mapped mean-shift algorithms
Siriwan Phongsasiri, Suwanna Rasmequan
Abstract
In this paper, the Probabilistic Mapped Mean-Shift Algorithm is proposed to detect anomalous data in public datasets and in the database of a local hospital’s children’s wellness clinic. The proposed framework consists of two main parts. First, the Probabilistic Mapping step consists of k-NN instance acquisition, data distribution calculation, and data point repositioning. A Truncated Gaussian Distribution (TGD) is used to control the boundary of the mapped points. Second, the Outlier Detection step consists of outlier score calculation and outlier selection. Experimental results show that the proposed algorithm outperformed existing algorithms on real-world benchmark datasets and on a Children’s Wellness Clinic dataset (CWD). The outlier detection accuracy of the proposed algorithm on the Wellness, Stamps, Arrhythmia, Pima, and Parkinson datasets was 93%, 94%, 80%, 75%, and 72%, respectively.
Keywords: Outlier detection, k-NN, Truncated Gaussian Distribution, Probabilistic Mapped, Mean shift
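
The two-part framework described in the abstract can be illustrated with a minimal Python sketch. The abstract names the steps but gives no formulas, so the details here are assumptions made for illustration and not the authors' method: the neighbourhood distribution is taken as a per-dimension Gaussian fitted to the k nearest neighbours, the truncation bounds are set at mean ± `width` standard deviations, the outlier score is the distance between a point and its mapped position, and selection keeps the top-scoring points. All function and parameter names (`knn_indices`, `probabilistic_map`, `width`, `n_outliers`, and so on) are hypothetical.

```python
import numpy as np


def knn_indices(X, k):
    """k-NN instance acquisition: indices of the k nearest neighbours of each row."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    return np.argsort(dist, axis=1)[:, :k]


def truncated_gaussian(rng, mu, sigma, low, high, max_tries=100):
    """Draw from N(mu, sigma) restricted to [low, high] by rejection sampling."""
    sample = rng.normal(mu, sigma)
    for _ in range(max_tries):
        outside = (sample < low) | (sample > high)
        if not outside.any():
            break
        # Redraw only the coordinates that fell outside the bounds.
        sample = np.where(outside, rng.normal(mu, sigma), sample)
    return np.clip(sample, low, high)  # safety net if max_tries is exhausted


def probabilistic_map(X, k=5, width=1.0, seed=0):
    """Reposition each point using the distribution of its k neighbours.

    Assumption: bounds are mean +/- width * std of the neighbours, per dimension.
    """
    rng = np.random.default_rng(seed)
    neighbours = X[knn_indices(X, k)]   # shape (n, k, d)
    mu = neighbours.mean(axis=1)        # data distribution calculation
    sigma = neighbours.std(axis=1)
    return truncated_gaussian(rng, mu, sigma, mu - width * sigma, mu + width * sigma)


def outlier_scores(X, mapped):
    """Assumed score: how far each point moved when it was mapped."""
    return np.linalg.norm(X - mapped, axis=1)


def select_outliers(scores, n_outliers):
    """Outlier selection: indices of the n_outliers largest scores."""
    return np.argsort(scores)[::-1][:n_outliers]
```

In this sketch a point inside a dense region is mapped to a position near itself and receives a small score, while an isolated point is pulled toward its distant neighbours and receives a large one. A threshold on the score could replace the top-n rule; the abstract does not say which selection rule the authors use.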