
Speech Foundation Models Generalize to Time Series Tasks from Wearable Sensor Data

This paper was accepted at the Learning from Time Series for Health workshop at NeurIPS 2025.
Both speech and sensor time series encode information in the time and frequency domains, such as spectral powers and waveform shapelets. We show that speech foundation models learn representations that generalize beyond the speech domain and achieve state-of-the-art performance on diverse time-series tasks from wearable sensors. Probes trained on features extracted from HuBERT and wav2vec 2.0 outperform probes trained on features from self-supervised models trained directly on modality-specific datasets…
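The probing setup the abstract describes — a frozen pretrained encoder whose pooled features feed a lightweight linear classifier — can be sketched as follows. This is a minimal illustration, not the paper's method: the random-projection "encoder" stands in for a frozen speech model such as HuBERT or wav2vec 2.0, and the synthetic signals stand in for wearable-sensor data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen speech encoder (hypothetical; in the paper's setup
# the features come from HuBERT / wav2vec 2.0 hidden states, pooled over time).
W_enc = rng.normal(size=(1000, 64))

def extract_features(x):
    """Pool a length-1000 time series into a 64-d feature vector."""
    return np.tanh(x @ W_enc)

# Toy binary task: two classes of synthetic "sensor" signals that differ in
# dominant frequency (a frequency-domain cue, as in the abstract).
t = np.arange(1000) / 100.0
def make_signal(freq):
    return np.sin(2 * np.pi * freq * t) + 0.3 * rng.normal(size=1000)

X = np.stack([extract_features(make_signal(f))
              for f in [1.0] * 50 + [3.0] * 50])
y = np.array([0] * 50 + [1] * 50)

# Linear probe: logistic regression trained by gradient descent on the
# frozen features (the encoder itself is never updated).
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))  # predicted class probabilities
    g = p - y                           # gradient of the log-loss
    w -= 0.1 * (X.T @ g) / len(y)
    b -= 0.1 * g.mean()

acc = ((1 / (1 + np.exp(-(X @ w + b))) > 0.5) == y).mean()
print(f"probe training accuracy: {acc:.2f}")
```

Only the probe's weights are trained; swapping the random projection for real pooled hidden states from a pretrained speech model is what turns this sketch into the paper's evaluation protocol.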

