Author

  • Yosuke Kashiwagi, Siddhant Arora*, Hayato Futami, Jessica Huynh*, Shih-Lun Wu*, Yifan Peng*, Brian Yan*, Emiru Tsunoo, Shinji Watanabe*
  • * External authors

Company

  • Sony Group Corporation

Venue

  • ICASSP

Date

  • 2023

E-Branchformer-Based E2E SLU Toward STOP On-Device Challenge

Abstract

In this paper, we report our team's study on Track 2 of the Spoken Language Understanding Grand Challenge, which is part of the ICASSP Signal Processing Grand Challenge 2023. The task targets on-device processing: semantic parse labels must be estimated from speech using a model limited to 15 million parameters. We use an end-to-end (E2E) E-Branchformer-based spoken language understanding (SLU) model, whose parameter count is easier to control than that of cascade models, and reduce its size through sequential distillation and tensor decomposition. On the STOP dataset, we achieve an exact match accuracy of 70.9% under the tight constraint of 15 million parameters.
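
As a rough illustration of two ideas named in the abstract, the sketch below shows how factorizing a weight matrix can cut its parameter count, and how an exact match score can be computed. The abstract does not say which decomposition was used, so truncated SVD stands in here as one common low-rank technique, not necessarily the authors' method. The layer shape, the rank, the function names, and the parse strings are all hypothetical.

```python
import numpy as np


def low_rank_factorize(weight, rank):
    """Approximate weight (out_dim x in_dim) as a @ b via truncated SVD.

    Assumption: truncated SVD is used only as an example of a low-rank
    decomposition; the paper's actual technique may differ.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # (out_dim, rank), singular values folded in
    b = vt[:rank, :]            # (rank, in_dim)
    return a, b


def exact_match_accuracy(predictions, references):
    """Fraction of predicted parses that equal their reference exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(references)


# Hypothetical 256 x 1024 layer, factorized at rank 32.
rng = np.random.default_rng(0)
weight = rng.standard_normal((256, 1024))
a, b = low_rank_factorize(weight, rank=32)

original_params = weight.size      # 256 * 1024
factored_params = a.size + b.size  # 256 * 32 + 32 * 1024

# Hypothetical semantic parses, for illustration only.
predictions = ["[IN:GET_WEATHER [SL:LOCATION boston ] ]", "[IN:CREATE_ALARM ]"]
references = ["[IN:GET_WEATHER [SL:LOCATION boston ] ]", "[IN:DELETE_ALARM ]"]
score = exact_match_accuracy(predictions, references)
```

With these hypothetical sizes, the two factors hold 40,960 values in place of the original 262,144, at the cost of an approximation error that grows as the rank shrinks. Exact match gives no partial credit: a parse counts only if it is identical to the reference.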
