
Outlier detection in test samples and supervised training set selection | ||
International Journal of Nonlinear Analysis and Applications | ||
Article 54, Volume 12, Issue 1, Mordad 2021, Pages 701-712 | ||
Article type: Research Paper | ||
DOI: 10.22075/ijnaa.2021.4878 | ||
Authors | ||
Navid Mohseni 1; Hossein Nematzadeh* 2; Ebrahim Akbari 2 | ||
1 Department of Computer Engineering, Babol Branch, Islamic Azad University, Babol, Iran | ||
2 Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran | ||
Received: 05 Mordad 1399; Revised: 18 Azar 1399; Accepted: 22 Dey 1399 | ||
Abstract | ||
Outlier detection is a technique for recognizing samples that lie outside the main population of a data set. Outliers have a negative impact on classification, so recognized outliers are generally deleted to improve classification power. This paper proposes a method for outlier detection in test samples together with supervised training set selection. Training set selection is based on the intersection of three well-known similarity measures, namely Jaccard, cosine, and Dice. Each test sample is evaluated against the selected training set for possible outlier detection. The selected training set is then used for a two-stage classification. The accuracy of the classifiers increases after outlier deletion, and a majority voting function is used to improve the classifiers further. | ||
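The abstract does not give the selection rule in detail, so the following is only a minimal sketch of how an intersection over the three similarity measures and a majority vote might look. The real-valued vector forms of Jaccard and Dice, the per-measure thresholds, the rule "flag a test sample when no training sample passes all three measures", and every function name here are assumptions made for illustration, not the authors' method.

```python
from collections import Counter

import numpy as np


def jaccard(x, y):
    # Real-valued (Tanimoto) form of the Jaccard similarity (assumed form).
    dot = float(np.dot(x, y))
    denom = float(np.dot(x, x) + np.dot(y, y)) - dot
    return dot / denom if denom else 0.0


def cosine(x, y):
    denom = float(np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.dot(x, y)) / denom if denom else 0.0


def dice(x, y):
    # Real-valued form of the Dice similarity (assumed form).
    denom = float(np.dot(x, x) + np.dot(y, y))
    return 2.0 * float(np.dot(x, y)) / denom if denom else 0.0


# Hypothetical thresholds; the source does not state any values.
THRESHOLDS = {jaccard: 0.5, cosine: 0.7, dice: 0.6}


def select_training_indices(test_sample, train_X, thresholds=THRESHOLDS):
    """Indices of training rows that pass the threshold of all three measures."""
    selected = None
    for measure, threshold in thresholds.items():
        scores = np.array([measure(test_sample, row) for row in train_X])
        passed = set(np.flatnonzero(scores >= threshold).tolist())
        # Intersect the per-measure selections.
        selected = passed if selected is None else selected & passed
    return sorted(selected)


def is_possible_outlier(test_sample, train_X, thresholds=THRESHOLDS):
    # Assumed rule: an empty intersection marks the test sample as a possible outlier.
    return not select_training_indices(test_sample, train_X, thresholds)


def majority_vote(predictions):
    """Most frequent label among the classifier predictions."""
    return Counter(predictions).most_common(1)[0][0]
```

Under these assumptions, a test sample would first be screened with `is_possible_outlier`, the rows returned by `select_training_indices` would be used to train each classifier, and `majority_vote` would combine the classifier outputs.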
Keywords | ||
Outlier detection; Training set selection; Similarity measures | ||