This repository accompanies the research paper "Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis" by Gupta, Akshita; Likhomanenko, Tatiana; Yang, Karren; Bai, He; Aldeneh, ...