Honda Research Institute Japan – Advanced Technology Research and Development

September 2011

SLAM-based Online Calibration of Asynchronous Microphone Array for Robot Audition

  • H. Miura, T. Yoshida, K. Nakamura, K. Nakadai
  • In Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2011)
  • IEEE, 2011, pp. 524-529
  • Conference paper
  • Received IROS 2011 Best Paper Finalist (6/790)
  • Copyright (C) IEEE, 2011. The copyright of this material is retained by IEEE. This material is published on this web site by permission of IEEE for your personal use. Not for redistribution.

This paper addresses the online calibration of an asynchronous microphone array for robots. Conventional microphone array technologies require many transfer function measurements to calibrate microphone locations, and a multi-channel A/D converter to synchronize the microphones. We solve these two problems with a framework that combines Simultaneous Localization and Mapping (SLAM) and beamforming in an online manner. To do this, we assume that the estimation of microphone locations, of a sound source location, and of microphone clock differences corresponds to mapping, self-localization, and observation errors in SLAM, respectively. In our framework, the SLAM process calibrates the locations and clock differences of the microphones every time the microphone array observes a sound such as a human handclap, and a beamforming process works as a cost function that decides whether the calibration has converged by localizing the sound with the estimated locations and clock differences. After calibration, beamforming is used for sound source localization. We implemented a prototype system using Extended Kalman Filter (EKF) based SLAM and Delay-and-Sum Beamforming (DS-BF). The experimental results showed that microphone locations and clock differences were estimated properly with 10-15 sound events (handclaps), and that the sound source localization error with the estimated information was less than the grid size of beamforming; that is, the theoretically lowest error was attained.
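
The localization step that follows calibration can be illustrated with a small sketch. This is not the authors' implementation: it omits the EKF-based SLAM stage entirely and simply assumes that the microphone locations and clock differences are already known. The 2D free-field geometry, the Gaussian pulse standing in for a handclap, the sampling rate, the grid, and every function and variable name below are assumptions made for illustration only. The sketch steers a frequency-domain delay-and-sum beamformer over a grid and picks the grid point with the highest output power.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s (assumed room-temperature value)


def simulate_clap(mic_xy, clock_offset, src_xy, fs, n_samples,
                  t_emit=0.01, width=2e-4):
    """Hypothetical recording model: each microphone sees one Gaussian pulse,
    delayed by propagation time plus that microphone's clock offset."""
    t = np.arange(n_samples) / fs
    dist = np.linalg.norm(mic_xy - src_xy, axis=1)
    arrival = t_emit + dist / SPEED_OF_SOUND + clock_offset
    return np.exp(-0.5 * ((t[None, :] - arrival[:, None]) / width) ** 2)


def ds_beamform_localize(signals, mic_xy, clock_offset, grid_xy, fs):
    """Delay-and-sum beamforming over candidate grid points.

    For each grid point, every channel is advanced by its expected delay
    (propagation plus clock offset), the channels are summed, and the
    output power is measured. Returns the best grid point and all powers.
    """
    n_samples = signals.shape[1]
    spectra = np.fft.rfft(signals, axis=1)            # (n_mics, n_freqs)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)    # (n_freqs,)
    dist = np.linalg.norm(grid_xy[:, None, :] - mic_xy[None, :, :], axis=2)
    delay = dist / SPEED_OF_SOUND + clock_offset[None, :]  # (n_grid, n_mics)
    # A time advance by `delay` is a phase factor exp(+j*2*pi*f*delay).
    steer = np.exp(2j * np.pi * freqs[None, None, :] * delay[:, :, None])
    summed = (steer * spectra[None, :, :]).sum(axis=1)     # (n_grid, n_freqs)
    power = (np.abs(summed) ** 2).sum(axis=1)
    return grid_xy[np.argmax(power)], power


# Assumed setup: four microphones at the corners of a 3 m x 3 m area,
# each with its own clock offset in seconds (treated as already calibrated).
fs = 16000
mic_xy = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0], [3.0, 3.0]])
clock_offset = np.array([0.0, 0.8e-3, -0.5e-3, 1.2e-3])
source_xy = np.array([1.0, 2.0])

grid_step = 0.1
axis = np.linspace(0.0, 3.0, 31)
gx, gy = np.meshgrid(axis, axis)
grid_xy = np.column_stack([gx.ravel(), gy.ravel()])

signals = simulate_clap(mic_xy, clock_offset, source_xy, fs, n_samples=512)
estimate, power = ds_beamform_localize(signals, mic_xy, clock_offset,
                                       grid_xy, fs)
error = np.linalg.norm(estimate - source_xy)
print("estimated source:", estimate, "error [m]:", error)
```

Because the candidate positions are restricted to the grid, the resolution of this kind of search is bounded by the grid spacing, which is the sense in which the abstract compares the localization error with the beamforming grid size.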
