
A hybrid semi-supervised boosting to sentiment analysis | ||
International Journal of Nonlinear Analysis and Applications | ||
Volume 12, Issue 2, Bahman 2021, Pages 1769-1784 | ||
Article type: Research Paper | ||
DOI: 10.22075/ijnaa.2021.23334.2522 | ||
Authors | ||
Jafar Tanha*1; Solmaz Mahmudyan2; Ahmad Farahi2 | ||
1Electrical and Computer Engineering Department, Tabriz University, Tabriz, Iran | ||
2Computer Engineering Department, Payame-Noor University, Tehran, Iran | ||
Received: 15 Farvardin 1400; Accepted: 05 Tir 1400 | ||
Abstract | ||
In this article, we propose a hybrid semi-supervised boosting algorithm for sentiment analysis. Semi-supervised learning is the task of learning from a limited amount of labeled data together with plenty of unlabeled data, which is the situation in our dataset. The proposed approach uses classifier predictions along with similarity information to assign labels to unlabeled examples. We propose a hybrid model, built on the boosting framework, that assigns a final label to each unlabeled example based on the agreement among the different constructed classification models. The approach employs several different similarity measures in its loss function to show the role of the similarity function. We further address the main preprocessing steps for the dataset. Our experimental results on real-world microblog data from a commercial website show that the proposed approach can effectively exploit information from the unlabeled data and significantly improves classification performance. | ||
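The idea of combining classifier predictions with similarity information can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes binary labels (+1 / -1), cosine similarity as the only similarity measure, a caller-supplied scoring function standing in for the boosted ensemble, and a hypothetical confidence threshold. The names `cosine`, `similarity_vote`, and `pseudo_label` are illustrative only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_vote(x, labeled):
    """Similarity-weighted vote of the labeled examples (labels are +1 or -1)."""
    return sum(cosine(x, xi) * yi for xi, yi in labeled)


def pseudo_label(unlabeled, labeled, classify, threshold=0.5):
    """Label an unlabeled example only when both signals agree and are confident.

    classify: assumed stand-in for an ensemble, returning a signed score
              (positive means class +1, negative means class -1).
    threshold: hypothetical minimum magnitude required of both signals.
    """
    accepted = []
    for x in unlabeled:
        clf_score = classify(x)
        sim_score = similarity_vote(x, labeled)
        agree = clf_score * sim_score > 0
        confident = min(abs(clf_score), abs(sim_score)) >= threshold
        if agree and confident:
            accepted.append((x, 1 if clf_score > 0 else -1))
    return accepted
```

In a self-training loop, the accepted pairs would be appended to the labeled set and the classifier retrained. Examples on which the two signals disagree, or on which either is weak, stay unlabeled for a later round.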
Keywords | ||
Semi-supervised learning; Sentiment Analysis; Persian Language; Boosting; Similarity Function | ||
Statistics: Article views: 15,718; Full-text downloads: 661 |