
Speech Foundation Models Generalize to Time Series Tasks from Wearable Sensor Data

This paper was accepted at the Learning from Time Series for Health workshop at NeurIPS 2025.
Both speech and sensor time series encode information in the time and frequency domains, such as spectral powers and waveform shapelets. We show that speech foundation models learn representations that generalize beyond the speech domain and achieve state-of-the-art performance on diverse time-series tasks from wearable sensors. Probes trained on features extracted from HuBERT and wav2vec 2.0 outperform probes trained on features from self-supervised models trained directly on modality-specific datasets…
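The probing setup the abstract describes — a frozen pretrained encoder whose pooled features feed a lightweight linear classifier — can be sketched as follows. This is a minimal illustration, not the paper's method: the random-projection "encoder" stands in for a frozen speech model such as HuBERT or wav2vec 2.0, and the synthetic signals stand in for wearable-sensor data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen speech encoder (hypothetical; in the paper's setup
# the features come from HuBERT / wav2vec 2.0 hidden states, pooled over time).
W_enc = rng.normal(size=(1000, 64))

def extract_features(x):
    """Pool a length-1000 time series into a 64-d feature vector."""
    return np.tanh(x @ W_enc)

# Toy binary task: two classes of synthetic "sensor" signals that differ in
# dominant frequency (a frequency-domain cue, as in the abstract).
t = np.arange(1000) / 100.0
def make_signal(freq):
    return np.sin(2 * np.pi * freq * t) + 0.3 * rng.normal(size=1000)

X = np.stack([extract_features(make_signal(f))
              for f in [1.0] * 50 + [3.0] * 50])
y = np.array([0] * 50 + [1] * 50)

# Linear probe: logistic regression trained by gradient descent on the
# frozen features (the encoder itself is never updated).
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))  # predicted class probabilities
    g = p - y                           # gradient of the log-loss
    w -= 0.1 * (X.T @ g) / len(y)
    b -= 0.1 * g.mean()

acc = ((1 / (1 + np.exp(-(X @ w + b))) > 0.5) == y).mean()
print(f"probe training accuracy: {acc:.2f}")
```

Only the probe's weights are trained; swapping the random projection for real pooled hidden states from a pretrained speech model is what turns this sketch into the paper's evaluation protocol.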

