Improving the Speech Emotion Recognition Rate Using Gender Separation
Journal of Modeling in Engineering
Article 15, Volume 15, Issue 48, Khordad 1396 (May–June 2017), Pages 183-200. Full text PDF (1.35 MB)
Article type: Applied
DOI: 10.22075/jme.2017.2444
Authors
Ali Harimi*¹; Khashayar Yaghmaie²
¹ Islamic Azad University, Shahrood Branch
² Semnan University
Received: 14 Shahrivar 1391 (4 September 2012); Revised: 27 Ordibehesht 1393 (17 May 2014); Accepted: 23 Azar 1394 (14 December 2015)
Abstract
Emotion recognition from the speech signal is a relatively new branch of speech processing that can play an important role in human-robot interaction. In this paper, two new types of spectral features are used to increase the recognition rate, and the effect of speaker gender on emotion recognition is investigated. These features are extracted from the spectrogram image of the speech signal using image processing techniques. A hierarchical classifier is used to separate the different emotions from one another. To optimize the structure of this classifier, the most separable classes are split apart first, so that the error introduced in the early stages of classification is minimized and does not propagate through the rest of the algorithm. The proposed system was tested on the Berlin database of German emotional speech. Based on the results, the recognition rate for mixed-gender speakers is 43.4%; after separating speakers by gender, this rises to 82.86%. The recognition rate is 83.05% for female speakers and 82.61% for male speakers.
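No code accompanies this abstract page. As a rough illustration of the kind of pipeline the abstract describes, the sketch below treats a log-spectrogram as an image and derives a few summary features from it with standard Python tools (numpy, scipy). The function name, window sizes, and specific statistics are illustrative assumptions, not the authors' published features.

```python
# Illustrative sketch only (not the authors' code): extract simple
# "spectral pattern" features from the spectrogram image of a speech
# signal, as the abstract describes. All parameter choices here are
# assumptions made for the example.
import numpy as np
from scipy.signal import spectrogram

def spectral_pattern_features(signal, fs=16000):
    """Treat the log-spectrogram as an image and summarize its structure."""
    f, _, Sxx = spectrogram(signal, fs=fs, nperseg=512, noverlap=384)
    img = 10.0 * np.log10(Sxx + 1e-10)               # log-power "image"
    img = (img - img.min()) / (np.ptp(img) + 1e-10)  # normalize to [0, 1]

    row_energy = img.mean(axis=1)   # energy profile along frequency (rows)
    col_energy = img.mean(axis=0)   # energy profile along time (columns)
    # Spectral centroid per frame, a simple harmonic-content descriptor.
    centroid = (f[:, None] * Sxx).sum(axis=0) / (Sxx.sum(axis=0) + 1e-10)

    return np.concatenate([
        row_energy[:32],                          # coarse spectral shape
        [col_energy.mean(), col_energy.std()],    # temporal dynamics
        [centroid.mean(), centroid.std()],        # centroid statistics
    ])
```

A 16 kHz mono signal would be passed as, e.g., `spectral_pattern_features(audio, fs=16000)`; the paper's actual harmonic-energy and spectral-pattern descriptors are not specified here, so this is only a crude stand-in.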
Keywords
Emotion recognition; emotion in males and females; spectral patterns; harmonic features
Article Title [English]
Improving Speech Emotion Recognition via Gender Classification
Authors [English]
Ali Harimi¹; Khashayar Yaghmaie²
Abstract [English]
Speech emotion recognition is a relatively new field of research that could play an important role in man-machine interaction. In this paper we use two new spectral features for the automatic recognition of human affective information from speech. These features are extracted from the spectrogram of the speech signal by image processing techniques. We also study the effect of gender information on speech emotion recognition. Hierarchical SVM-based classifiers are designed to classify speech signals according to their emotional states. The classifiers are optimized by the Fisher Discriminant Ratio (FDR) so that the most separable classes are classified at the upper nodes, which reduces the classification error. The proposed algorithm was tested on the well-known Berlin database for male and female speakers, both separately and in combination. An overall recognition rate of 43.4% is obtained for mixed-gender speakers. The results show a 39.46% improvement when gender information is used.
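As a companion illustration of the FDR-driven hierarchy the abstract outlines, here is a minimal sketch: a greedy chain of one-vs-rest SVMs that splits off the most separable class first. The chain structure, the per-feature FDR formula, and the scikit-learn classifier choice are assumptions made for this example, not the published design.

```python
# Minimal sketch, not the authors' implementation: order a hierarchical
# chain of SVMs so the class most separable by Fisher Discriminant Ratio
# (FDR) is split off first, as the abstract describes.
import numpy as np
from sklearn.svm import SVC

def fdr(Xa, Xb):
    """Mean per-feature FDR: (mu_a - mu_b)^2 / (var_a + var_b)."""
    return np.mean((Xa.mean(0) - Xb.mean(0)) ** 2 /
                   (Xa.var(0) + Xb.var(0) + 1e-10))

def build_hierarchy(X, y):
    """Greedily peel off the class most separable from the remaining ones."""
    remaining = list(np.unique(y))
    nodes = []
    while len(remaining) > 1:
        mask = np.isin(y, remaining)
        Xr, yr = X[mask], y[mask]
        # Score each candidate class by its FDR against all other classes.
        best = max(remaining, key=lambda c: fdr(Xr[yr == c], Xr[yr != c]))
        clf = SVC(kernel="rbf").fit(Xr, (yr == best).astype(int))
        nodes.append((best, clf))
        remaining.remove(best)
    return nodes, remaining[0]

def predict_one(nodes, leftover, x):
    """Walk the chain top-down; the first positive node decides the label."""
    for label, clf in nodes:
        if clf.predict(x.reshape(1, -1))[0] == 1:
            return label
    return leftover
```

Splitting the most separable class at the top node means the decisions made early, whose errors would propagate to every node below, are the ones least likely to be wrong, matching the motivation given in both abstracts.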
Keywords [English]
Emotion recognition; speech processing; emotion in males and females; spectral patterns; harmonic energy features