HRI-JP Honda Research Institute Japan (HRI-JP) – Research and development of advanced technologies

January 2021

Multi-Channel Environmental Sound Segmentation Utilizing Sound Source Localization and Separation U-Net

  • Y. Sudo, K. Itoyama, K. Nishida, K. Nakadai,
  • in Proceedings of the 2021 IEEE/SICE International Symposium on System Integration (SII 2021),
  • IEEE,
  • 2021,
  • pp. 382-387,
  • Conference paper

This paper proposes a multi-channel environmental sound segmentation method. Environmental sound segmentation is an integrated approach that handles sound source localization, sound source separation, and class identification together. When multiple microphones are available, spatial features can be used to improve the separation accuracy of signals arriving from different directions. However, conventional methods have two drawbacks: (a) Because sound source localization and separation, which use spatial features, and class identification, which uses spectral features, are trained in the same neural network, the network overfits to the relationship between the direction of arrival and the class. (b) Although the permutation invariant training used in speech recognition could be extended, it is not practical for environmental sounds because of its limit on the maximum number of speakers. The proposed method combines a U-Net, which simultaneously performs sound source localization and sound source separation, with a convolutional neural network that classifies the separated sounds. This design prevents overfitting to the relationship between the direction of arrival and the class. Simulation experiments on datasets created for this study, containing 75 classes of environmental sounds, showed that the root mean squared error of the proposed method was lower than that of the conventional method.
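The abstract reports root mean squared error (RMSE) as the evaluation metric but does not say exactly which quantities are compared. As a minimal sketch, assuming the comparison is made bin by bin between a predicted and a reference segmentation map (a frequency-by-time grid for one class), the metric could be computed as follows. The function name `rmse` and the toy grids are hypothetical and are not taken from the paper.

```python
import math


def rmse(predicted, reference):
    """Root mean squared error between two equally sized 2-D grids.

    Assumption: each grid is a list of rows (frequency bins), and each
    row is a list of values over time frames for a single sound class.
    """
    if len(predicted) != len(reference):
        raise ValueError("grids must have the same number of rows")
    squared_error = 0.0
    count = 0
    for pred_row, ref_row in zip(predicted, reference):
        if len(pred_row) != len(ref_row):
            raise ValueError("rows must have the same length")
        for p, r in zip(pred_row, ref_row):
            squared_error += (p - r) ** 2
            count += 1
    if count == 0:
        raise ValueError("grids must not be empty")
    return math.sqrt(squared_error / count)


# Hypothetical 2x3 (frequency x time) maps for one class, for illustration only.
reference = [[0.0, 1.0, 1.0], [0.0, 0.0, 1.0]]
predicted = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
score = rmse(predicted, reference)
```

In this toy case one of the six bins is wrong by 1, so the score equals the square root of 1/6. A lower value means the predicted map is closer to the reference, which is the sense in which the abstract compares the proposed and conventional methods.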
