At EMNLP 2023, the Top Conference in the Field of Natural Language Processing, Sony’s R&D activities were presented.

We had a sponsor booth at the Conference on Empirical Methods in Natural Language Processing (EMNLP), one of the top conferences in the field of natural language processing, where we presented technologies on multimodal NLP, persona commonsense knowledge construction, and a knowledge-grounded dialog system challenge.

Research Area

Audio, Speech & NLP

Keyword

EMNLP
NLP

Author

Mengjie Zhao

We presented our R&D activities in the field of natural language processing (NLP) at the Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023, held in Singapore from December 6th to 10th. We presented four themes during the conference.

Firstly, recruitment information was presented during an HR session. Large language models now have a major impact on people’s daily and working lives. Sony also conducts R&D in research fields related to NLP and is currently hiring. During the presentation, HR introduced Sony’s history, the current activities of Sony R&D, and detailed job descriptions.

Secondly, R&D activities on multimodal NLP were presented. We presented our new method for improving CLIP and CyCLIP, which have potential applications such as zero-shot image search. Our new models are dubbed CLIPs and CyCLIPs, and the main idea is to improve the uniformity of the representation space of the language encoder in CLIP and CyCLIP (Figure 1). The new models improve on CLIP and CyCLIP in text-image retrieval as well as in sentence classification tasks. More information can be found here.
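As a rough illustration of what "uniformity" of a representation space means, the sketch below computes one widely used measure, the uniformity loss of Wang and Isola (2020): the log of the average Gaussian potential exp(-t * ||u - v||^2) over all pairs of L2-normalized embeddings. This is an assumption made for illustration only; the measure and training objective actually used for CLIPs and CyCLIPs are not described in this post, and the function names and toy vectors below are hypothetical.

```python
import math
import random


def normalize(vector):
    """Scale a vector to unit L2 norm so it lies on the hypersphere."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def uniformity(embeddings, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs.

    Lower (more negative) values mean the embeddings are spread
    more evenly over the unit hypersphere.
    """
    points = [normalize(v) for v in embeddings]
    total, count = 0.0, 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            total += math.exp(-t * sq_dist)
            count += 1
    return math.log(total / count)


# Toy data (hypothetical, not real text embeddings).
rng = random.Random(0)
# Directions drawn isotropically: well spread over the sphere.
spread = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(50)]
# Small perturbations of a single direction: tightly clustered.
clustered = [
    [1.0 + 0.05 * rng.gauss(0, 1)] + [0.05 * rng.gauss(0, 1) for _ in range(7)]
    for _ in range(50)
]

print("spread:   ", uniformity(spread))
print("clustered:", uniformity(clustered))
```

With these toy vectors, the well-spread set yields a lower value than the clustered set. In this sense, a more uniform text embedding space is one where sentence representations do not collapse into a narrow region.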

We also presented some of Sony’s published papers on deep generative models for multimedia generation.
[published papers]
・Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion
・SAN: Inducing Metrizability of GAN with Discriminative Normalized Linear Layer
・SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization

Figure 1: Visualizing cross-modal alignment with respect to text space uniformity of trained vision-language models

The third session presented PeaCoK, a joint work between Sony and EPFL (École Polytechnique Fédérale de Lausanne), published at ACL 2023 (the annual meeting of the Association for Computational Linguistics). PeaCoK contributes a high-quality commonsense knowledge graph about personas. We believe that such structured data will be an important resource for persona-grounded dialog generation in NLP, a field that has so far relied on unstructured resources. PeaCoK received an outstanding paper award at ACL 2023.

Please see below for the paper.
PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives

The last session presented our ongoing challenge on persona-grounded dialog generation and knowledge linking. Given the success of PeaCoK, we are now organizing a grand challenge on utilizing PeaCoK to develop AI models that can respond to users in a fluent, consistent, and engaging manner.

More information about the challenge is available here: CPDC2023

Our presentations at Sony’s booth attracted many students and experienced researchers, and we had insightful discussions with them. They also showed strong interest in our ongoing research and asked about our job openings and the grand challenge. It was a great opportunity not only to promote our technologies, but also to foster new ideas for future development.

The organizers and presenters of Sony’s Booth at EMNLP 2023
