TY - GEN
T1 - Voice and choice
T2 - 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL 2024
AU - Székely, Éva
AU - Higginbotham, Jeff
AU - Possemato, Francesco
N1 - Publisher Copyright: © 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
N2 - As conversational Text-to-Speech (TTS) technologies become increasingly realistic and expressive, understanding the impact of prosodic variation on speech perception and social dynamics is crucial for enhancing conversational systems. This study explores the influence of prosodic features on listener responses to indirect requests using a specifically designed conversational TTS engine capable of controlling prosody and generating speech across three different speaker profiles: female, male, and gender-ambiguous. We conducted two experiments to analyse how naturalistic variations in speech rate and vocal effort impact the likelihood of request compliance and perceived politeness. In the first experiment, we examined how prosodic modifications affect the perception of politeness in permission and action requests. In the second experiment, participants compared pairs of spoken requests, each rendered with different prosodic features, and chose which they were more likely to grant. Results indicate that both faster speech rate and higher vocal effort increased the willingness to comply, though the extent of this influence varied by speaker gender. Higher vocal effort increases the chance of a request being granted more in action requests than in permission requests. Politeness has a demonstrated positive impact on the likelihood of requests being granted; this effect is stronger for the male voice than for the female and gender-ambiguous voices.
UR - https://www.scopus.com/pages/publications/105017673199
U2 - 10.18653/v1/2024.sigdial-1.40
DO - 10.18653/v1/2024.sigdial-1.40
M3 - Conference contribution
AN - SCOPUS:105017673199
T3 - SIGDIAL 2024 - 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Proceedings of the Conference
SP - 466
EP - 476
BT - SIGDIAL 2024 - 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Proceedings of the Conference
A2 - Kawahara, Tatsuya
A2 - Demberg, Vera
A2 - Ultes, Stefan
A2 - Inoue, Koji
A2 - Mehri, Shikib
A2 - Howcroft, David
A2 - Komatani, Kazunori
PB - Association for Computational Linguistics (ACL)
Y2 - 18 September 2024 through 20 September 2024
ER -