Abstract: Audio-visual zero-shot learning (ZSL) leverages both video and audio information for model training, aiming to classify new video categories that were not seen during training. However, ...
The 2025 SANS SOC Survey shows AI use is rising, but many SOCs lack integration, customization, and clear validation ...
Based Detection, Linguistic Biomarkers, Machine Learning, Explainable AI, Cognitive Decline Monitoring. Cite: de Filippis, R. and Al Foysal, A. (2025) Early Alzheimer’s Disease Detection from ...
Abstract: Affective Video Facial Analysis (AVFA) is important for advancing emotion-aware AI, yet persistent data scarcity in AVFA remains a challenge. Recently, the self-supervised learning (SSL) ...
Abstract: This letter proposes using similarities between audio captions to estimate audio-caption relevance for training text-based audio retrieval systems. Current audio-caption datasets ...
GLM-TTS is a high-quality text-to-speech (TTS) synthesis system based on large language models, supporting zero-shot voice cloning and streaming inference. This system adopts a two-stage architecture: ...