E-Branchformer-Based E2E SLU Toward STOP on-Device Challenge

Research Area

AI & Machine Learning

Author

Yosuke Kashiwagi, Siddhant Arora*, Hayato Futami, Jessica Huynh*, Shih-Lun Wu*, Yifan Peng*, Brian Yan*, Emiru Tsunoo, Shinji Watanabe*
* External authors

Company

Sony Group Corporation

Venue

ICASSP

Date

2023

Abstract

In this paper, we report our team’s study on Track 2 of the Spoken Language Understanding Grand Challenge, which is part of the ICASSP Signal Processing Grand Challenge 2023. The task targets on-device processing and involves estimating semantic parse labels from speech using a model limited to 15 million parameters. We use an end-to-end (E2E) E-Branchformer-based spoken language understanding model, whose parameter count is easier to control than that of cascade models, and reduce its size through sequential distillation and tensor decomposition. On the STOP dataset, we achieve an exact match accuracy of 70.9% under the tight constraint of 15 million parameters.
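
As a rough illustration of why tensor decomposition helps under a fixed parameter budget, the sketch below counts weights for a dense projection and for a rank-r factorization of it. The abstract does not say which decomposition was used, so plain low-rank matrix factorization is an assumption here, and the layer sizes and function names are hypothetical.

```python
# Assumption: low-rank factorization W ~= A @ B stands in for the
# (unspecified) tensor decomposition mentioned in the abstract.

def dense_params(d_in: int, d_out: int) -> int:
    """Weights in a dense d_in x d_out matrix (bias ignored)."""
    return d_in * d_out


def low_rank_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights after factorizing W into A (d_in x rank) and B (rank x d_out)."""
    return rank * (d_in + d_out)


def max_rank_within(d_in: int, d_out: int, budget: int) -> int:
    """Largest rank whose factorized weight count still fits the budget."""
    return budget // (d_in + d_out)


# Hypothetical feed-forward projection; these sizes are illustrative only.
d_in, d_out = 512, 2048
full = dense_params(d_in, d_out)
compact = low_rank_params(d_in, d_out, rank=64)
best_rank = max_rank_within(d_in, d_out, budget=full // 4)

print(f"dense: {full}, rank-64: {compact}, max rank at 1/4 budget: {best_rank}")
```

The factorized layer is smaller only when the rank is below d_in * d_out / (d_in + d_out), so the rank acts as a direct knob for trading accuracy against parameter count.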
