Outlier detection in test samples and supervised training set selection

Mohseni, Navid; Nematzadeh, Hossein; Akbari, Ebrahim

doi:10.22075/ijnaa.2021.4878

Semnan University Press


International Journal of Nonlinear Analysis and Applications
Article 54, Volume 12, Issue 1, Mordad (July-August) 2021, Pages 701-712
Article type: Research Paper
DOI: 10.22075/ijnaa.2021.4878
Authors
Navid Mohseni¹; Hossein Nematzadeh²*; Ebrahim Akbari²
¹Department of Computer Engineering, Babol Branch, Islamic Azad University, Babol, Iran
²Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran
Received: 05 Mordad 1399 (26 July 2020); Revised: 18 Azar 1399 (8 December 2020); Accepted: 22 Dey 1399 (11 January 2021)
Abstract
Outlier detection identifies samples that lie outside the main population of a data set. Outliers degrade classification, so detected outliers are generally removed to improve classifier performance. This paper proposes a method for detecting outliers among test samples, together with supervised training set selection. The training set is selected from the intersection of three well-known similarity measures: Jaccard, cosine, and Dice. Each test sample is evaluated against the selected training set to decide whether it is an outlier. The selected training set is then used for a two-stage classification. Classifier accuracy increases after outlier deletion, and a majority voting function is used to improve the classifiers further.
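
The abstract does not say how samples are encoded, what cut-offs are used, or what rule marks a test sample as an outlier. The following is only a minimal sketch of the general idea, under stated assumptions: samples are binary feature vectors, each measure has its own hypothetical threshold (`DEFAULT_THRESHOLDS`), a training sample is kept only if it passes all three measures, and a test sample with an empty selection is treated as an outlier. The function names are illustrative and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def _counts(a, b):
    """Return (|A and B|, |A|, |B|) for two binary vectors."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    return inter, sum(1 for x in a if x), sum(1 for y in b if y)


def jaccard(a, b):
    inter, na, nb = _counts(a, b)
    union = na + nb - inter
    return inter / union if union else 0.0


def cosine(a, b):
    inter, na, nb = _counts(a, b)
    return inter / sqrt(na * nb) if na and nb else 0.0


def dice(a, b):
    inter, na, nb = _counts(a, b)
    return 2 * inter / (na + nb) if na + nb else 0.0


MEASURES = {"jaccard": jaccard, "cosine": cosine, "dice": dice}

# Hypothetical cut-offs chosen for illustration only (not from the paper).
DEFAULT_THRESHOLDS = {"jaccard": 0.5, "cosine": 0.7, "dice": 0.6}


def select_training_set(test, train, thresholds=DEFAULT_THRESHOLDS):
    """Indices of training samples that pass all three similarity measures."""
    selected = set(range(len(train)))
    for name, measure in MEASURES.items():
        close = {i for i, row in enumerate(train)
                 if measure(test, row) >= thresholds[name]}
        selected &= close  # intersection across measures
    return sorted(selected)


def is_outlier(test, train, thresholds=DEFAULT_THRESHOLDS):
    """Assumed rule: a test sample with no similar training sample is an outlier."""
    return not select_training_set(test, train, thresholds)


def majority_vote(predictions):
    """Most frequent label among the predictions of several classifiers."""
    return Counter(predictions).most_common(1)[0][0]


train = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
print(select_training_set([1, 1, 0, 0], train))  # [0, 1]
print(is_outlier([1, 0, 0, 1], train))           # True
print(majority_vote(["a", "b", "a"]))            # a
```

For binary vectors the three measures always satisfy Jaccard ≤ Dice ≤ cosine, so a single shared threshold would reduce the intersection to the Jaccard selection alone. The sketch therefore gives each measure its own cut-off.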
Keywords
Outlier detection; Training set selection; Similarity measures
