TOWARD NATURAL EMOTIONAL TEXT-TO-SPEECH SYSTEM WITH FINE-GRAINED NON-VERBAL EXPRESSION CONTROL

Wangzixi Zhou, Bagus Tris Atmaja, Sakriani Sakti

Nara Institute of Science and Technology, Japan

Comparison between (1) Only Verbal, (2) Verbal + Coarse-grained Non-Verbal (NV), and (3) Verbal + Fine-grained NV

Happy

Description Audio Player
Only verbal
Verbal + Coarse-grained NV
Verbal + Fine-grained NV (Proposed)

Preference Evaluation

Happy

Non-verbal expression Audio Player
(cheering) Wo ho
(cheering) Yo
(laughter-open) ha ha
(laughter-closed) ha ha

Sad

Non-verbal expression Audio Player
(crying) whep
(crying) sneeze
(crying) wuuuuuuu whep
(crying) wuuuuuuu