

ECTI TRANSACTIONS ON COMPUTER INFORMATION TECHNOLOGY


Volume 15, No. 2, August 2021, Pages 258-266


Outlier detection in wellness data using probabilistic mapped mean-shift algorithms

Siriwan Phongsasiri, Suwanna Rasmequan


Abstract

This paper proposes the Probabilistic Mapped Mean-Shift Algorithm for detecting anomalous data in public datasets and in children’s wellness clinic databases from a local hospital. The proposed framework has two main parts. The first, the Probabilistic Mapping step, consists of k-NN instance acquisition, data distribution calculation, and data point repositioning; a Truncated Gaussian Distribution (TGD) is used to control the boundary of the mapped points. The second, the Outlier Detection step, consists of outlier score calculation and outlier selection. Experimental results show that the proposed algorithm outperformed existing algorithms on real-world benchmark datasets and on a Children’s Wellness Clinic dataset (CWD). The outlier detection accuracy of the proposed algorithm on the Wellness, Stamps, Arrhythmia, Pima, and Parkinson datasets was 93%, 94%, 80%, 75%, and 72%, respectively.
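
The abstract names the stages of the framework but not their formulas, so the following Python sketch is only an illustration of how such a two-part pipeline could be wired together. It is not the authors' implementation. Everything specific in it is an assumption: the function names, the parameters `k` and `bound`, the use of per-dimension neighbour mean and standard deviation as the "data distribution", clipping to mean ± `bound`·sigma as a simple stand-in for the truncated Gaussian boundary, a single repositioning pass with no mean-shift iteration, and the displacement length as the outlier score.

```python
import numpy as np


def knn_indices(X, k):
    """Return, for each row of X, the indices of its k nearest other rows."""
    # Brute-force pairwise Euclidean distances (fine for small datasets).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    return np.argsort(dist, axis=1)[:, :k]


def map_points(X, k=5, bound=2.0):
    """Reposition each point into a bounded box around its neighbours.

    Assumption: the box mean +/- bound * sigma replaces the truncated
    Gaussian boundary mentioned in the abstract.
    """
    neigh = X[knn_indices(X, k)]        # shape (n, k, d)
    mu = neigh.mean(axis=1)             # neighbourhood mean per dimension
    sigma = neigh.std(axis=1) + 1e-12   # neighbourhood spread per dimension
    return np.clip(X, mu - bound * sigma, mu + bound * sigma)


def outlier_scores(X, k=5, bound=2.0):
    """Score each point by how far the mapping step had to move it (assumed)."""
    return np.linalg.norm(X - map_points(X, k, bound), axis=1)


def select_outliers(scores, n_outliers):
    """Return the indices of the n_outliers highest-scoring points."""
    return np.argsort(scores)[::-1][:n_outliers]
```

As a usage example, building a 5 x 5 grid of points plus one distant point at (100, 100) and calling `select_outliers(outlier_scores(X), 1)` should return the index of the distant point, since it is the only one that lies far outside the box spanned by its nearest neighbours.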


Keywords

Outlier detection, k-NN, Truncated Gaussian Distribution, Probabilistic Mapped, Mean shift





Published by: ECTI Association
Contributions welcome at: http://www.ecti-thailand.org/paper/journal/ECTI-CIT