New framework syncs robot lip movements with speech, supporting 11+ languages and enhancing humanlike interaction.
To match lip movements with speech, they designed a "learning pipeline" that collects visual data on lip movements. An AI model trains on this data, then generates reference points for ...