Difference between revisions of "AUDIBLE"

From IntRoLab
(Video)
 
(12 intermediate revisions by 3 users not shown)
Line 1: Line 1:
AUDIBLE - Artificial audition for mobile robots
 +
= AUDIBLE - Artificial Audition for Mobile Robots =
 +
 
 +
<center>
 +
[[Image:AudibleSpartacus.jpg|200px]]  [[Image:AudiblePioneer.jpg|200px]] [[Image:AudibleCub.jpg|200px]]
 +
</center>
 +
 
 +
[[Image:ManyEarsGUI.png|center|600px]]
  
{|
 
|
 
[[Image:AudibleSpartacus.jpg|300px]]
 
|
 
[[Image:AudiblePioneer.jpg|250px]]
 
|
 
[[Image:AudibleCub.jpg|300px]]
 
|}
 
  
 
AUDIBLE is an artificial auditory system that gives a robot the ability to locate and track sounds, to separate simultaneous sound sources, and to recognise simultaneous speech. We demonstrate that it is possible to implement these capabilities using an array of microphones, without trying to imitate the human auditory system. The sound source localisation and tracking algorithm uses a steered beamformer to locate sources, which are then tracked using a multi-source particle filter. Separation of simultaneous sound sources is achieved using a variant of the Geometric Source Separation (GSS) algorithm, combined with a multi-source post-filter that further reduces noise, interference and reverberation. Speech recognition is performed on separated sources, either directly or by using Missing Feature Theory (MFT) to estimate the reliability of the speech features. The results show that it is possible to track up to four simultaneous sound sources, even in noisy and reverberant environments. Real-time control of the robot following a sound source is also demonstrated. The proposed sound source separation approach achieves a 13.7 dB improvement in signal-to-noise ratio compared to a single microphone when three speakers are present. In these conditions, the system demonstrates more than 80% accuracy on digit recognition, higher than most human listeners could obtain in our evaluation when recognising only one of these sources. All these new capabilities make it possible for humans to interact more naturally with a mobile robot in real-life settings.
 
 
----
 
 +
 
= Équipe / Team =
 
 +
*David Brodeur
 
*François Grondin
 
 
*Jean-Marc Valin
 
Line 29: Line 30:
  
 
= Video =  
 
 +
<center>
 +
<code>{{#ev:youtube|Acfxl3oqg90}}</code>
 +
</center>
 +
* [http://www.willowgarage.com/blog/2013/04/12/giving-ears-pr2-8sounds-and-manyears ManyEars on a PR2 (March 2013)]
 +
 
* [[Media:AUDIBLE_Localization.mov|AUDIBLE Localization video (2005)]]
 
 
* [[Media:AUDIBLE_PIONEER.mpg|AUDIBLE Localization video (2003)]]
 
  
 
= Installation =
 
 
[[Image:ManyEarsGUI.png|center|600px]]
 
 
 
* [http://manyears.sourceforge.net ManyEars : GPL 'C' implementation of AUDIBLE]
 
 
* [[ManyEarsPackages | Localization Packages (FlowDesigner old version)]]
 
Line 50: Line 53:
 
= Publications =
 
  
#Brière, S., Valin, J.-M., Michaud, F., Létourneau, D. (2008) “Embedded auditory system for small mobile robots,” to be presented at IEEE International Conference on Robotics and Automation. (pdf)
+
#Grondin, F., Létourneau, D., Ferland, F., and Michaud, F. (2013), "An open hardware and software microphone array system for robotic applications," Demonstration session IEEE International Conference on Human-Robot Interaction. ([[Media:HRI2013demo.pdf|pdf]])
#Valin, J.-M., Yamamoto, S., Rouat, J., Michaud, F., Nakadai, K., Okuno, H. (2007), “Robust recognition of simultaneous speech by a mobile robot,” IEEE Transactions on Robotics, 23(4):742-752. (pdf)
+
#Grondin, F., Létourneau, D., Ferland, F., Rousseau, V., and Michaud, F. (2013), "The ManyEars Open Framework - Microphone array open software and open hardware system for robotic applications," ''Autonomous Robots'', 34:217-232. [http://link.springer.com/article/10.1007/s10514-012-9316-x]
#Valin, J.-M., Michaud, F., Rouat, J. (2007), “Robust localization and tracking of simultaneous moving sound sources using beamforming and particle filtering,” Robotics and Autonomous Systems Journal, 55: 216-228. (pdf)
+
#Ferland, F., Létourneau, D., Aumont, A., Frémy, J., Legault, M.-A., Lauria, M., Michaud, F. (2012), "Natural interaction design of a humanoid robot," Journal of Human-Robot Interaction, 1 (2), 118-134, [http://www.humanrobotinteraction.org/journal/index.php/HRI/article/view/65].
#Brière, S., Létourneau, D., Fréchette, M., Valin, J.-M., Michaud, F. (2006), “Embedded and integration audition for a mobile robot,” Proceedings AAAI Fall Symposium Workshop Aurally Informed Performance: Integrating Machine Listening and Auditory Presentation in Robotic Systems, FS-06-01, 6-10. (pdf)
+
#Fréchette, M., Létourneau, D., Valin, J.-M., Michaud, F. (2012), “Integration of sound source localization and separation to improve dialogue management on a robot,” IEEE/RSJ International Conference on Intelligent Robots and Systems. NTF Award for Entertainment Robots and Systems. ([[Media:IROS2012.pdf|pdf]])
#Valin, J.-M., Michaud, F., Rouat, J., (2006), “Robust 3D localization and tracking of sound sources using beamforming and particle filtering”, Proceedings International Conference on Acoustics, Speech, and Signal Processing, 841-844.
+
#Grondin, F., Michaud, F. (2012), "WISS, a Speaker Identification System for Mobile Robots," Proceedings of the International Conference on Robotics and Automation: 1817-1822 ([[Media:Grondin2012wiss.pdf|pdf]]) ([[Media:ICRA2012.mpg|mpg]])
#Valin, J.-M.. (2005), "Auditory system for a mobile robot", Ph.D. Thesis, Department of Electrical Engineering and Computer Engineering, Université de Sherbrooke, August. (pdf)
+
#Grondin, F., "Reconnaissance de locuteurs pour robot mobile" (Speaker recognition for a mobile robot), Master's thesis, Department of Electrical Engineering and Computer Engineering, Université de Sherbrooke. ([[Media:MemoireGrondin.pdf|pdf]])
#Yamamoto, S., Nakadai, K., Valin, J.M., Rouat, J., Michaud, F., Komatani, K., Ogata, T., Okuno, H. (2005), “Making a robot recognize three simultaneous sentences in real-time,” Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems, 897-902. (pdf)
+
#Badali, A., Valin, J.-M., Michaud, F., Aarabi, P. (2009), “Evaluating real-time audio localization algorithms for artificial audition on mobile robots,” to be presented at IEEE International Conference on Intelligent Robots and Systems, October. ([http://introlab.3it.usherbrooke.ca/papers/IROS2009.pdf pdf])
#Yamamoto, S., Valin, J.M., Nakadai, K., Rouat, J., Michaud, F., Ogata, T., Okuno, H. (2005), “Enhanced robot speech recognition based on microphone array source separation and missing feature theory,” IEEE International Conference on Robotics and Automation, 1489-1494.
+
#Brière, S., Valin, J.-M., Michaud, F., Létourneau, D. (2008) “Embedded auditory system for small mobile robots,” ''Proceedings IEEE International Conference on Robotics and Automation''. ([http://introlab.3it.usherbrooke.ca/papers/ICRA2008Briere.pdf pdf])
#Valin, J.-M., Rouat, J., Michaud, F. (2004), "Enhanced robot audition based on microphone array source separation with post-filter", Proceedings IEEE/RSJ International Conference on Robots and Intelligent Systems, 2123-2128. (pdf)
+
#Valin, J.-M., Yamamoto, S., Rouat, J., Michaud, F., Nakadai, K., Okuno, H. (2007), “Robust recognition of simultaneous speech by a mobile robot,” ''IEEE Transactions on Robotics'', 23(4):742-752. ([http://introlab.3it.usherbrooke.ca/papers/TRO2007.pdf pdf])
#Valin, J.-M., Michaud, F., Hadjou, B., Rouat, J. (2004), "Localization of simultaneous moving sound sources for mobile robot using a frequency-domain steered beamformer approach", Proceedings IEEE International Conference on Robotics and Automation, 1033-1038. (pdf)
+
#Michaud, F., Côté, C., Létourneau, D., Brosseau, Y., Valin, J.-M., Beaudry, É., Raïevsky, C., Ponchon, A., Moisan, P., Lepage, P., Morin, Y., Gagnon, F., Giguère, P., Roux, M.-A., Caron, S., Frenette, P., Kabanza, F. (2007), “Spartacus attending the 2005 AAAI Conference,” to be published in ''Autonomous Robots'', Special Issue on the AAAI Mobile Robot Competitions and Exhibition. ([http://introlab.3it.usherbrooke.ca/papers/AR2007.pdf pdf])
#Valin, J.-M., Rouat, J., Michaud, F. (2004), "Microphone array post-filter for separation of simultaneous non-stationary sources", accepted ICASSP. (pdf)
+
#Valin, J.-M., Michaud, F., Rouat, J. (2007), “Robust localization and tracking of simultaneous moving sound sources using beamforming and particle filtering,” ''Robotics and Autonomous Systems Journal'', 55: 216-228. ([http://introlab.3it.usherbrooke.ca/papers/RAS2007.pdf pdf])  
#Valin, J.-M., Michaud, F., Létourneau, D., Rouat, J. (2003), "Robust sound source localization using a microphone array on a mobile robot", Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems, 1228-1233. (pdf)
+
#Côté, C., Brosseau, Y., Létourneau, D., Raïevsky, C., Michaud, F. (2006), "Using MARIE in software development and integration for autonomous mobile robotics", ''International Journal of Advanced Robotic Systems'', Special Issue on Software Development and Integration in Robotics, 3(1):55-60. ([http://introlab.3it.usherbrooke.ca/papers/IJARS2006.pdf pdf])  
#Robust localization and tracking of simultaneous moving sound sources using beamforming and particle filtering. Valin, J.M., Michaud, F., Rouat, J. Patent pending, 27 April 2005, US 11/116,117.
+
#Valin, J.-M., Michaud, F., Rouat, J., (2006), “Robust 3D localization and tracking of sound sources using beamforming and particle filtering”, ''Proceedings'' ''International Conference on Acoustics, Speech, and Signal Processing'', 841-844. ([http://introlab.3it.usherbrooke.ca/papers/ICASSP2006.pdf pdf])
 +
#Létourneau, D., Valin, J.-M., Côté, C., Michaud, F. (2005), “FlowDesigner: the free data-flow oriented development environment”, ''Software 2.0'', vol. 3. ([[Media:Software2005.pdf|pdf]])  
 +
#Yamamoto, S., Nakadai, K., Valin, J.M., Rouat, J., Michaud, F., Komatani, K., Ogata, T., Okuno, H. (2005), “Making a robot recognize three simultaneous sentences in real-time,” ''Proceedings'' ''IEEE/RSJ International Conference on Intelligent Robots and Systems'', 897-902. ([http://introlab.3it.usherbrooke.ca/papers/Interspeech2005_Yamamoto.pdf pdf])  
 +
#Yamamoto, S., Valin, J.M., Nakadai, K., Rouat, J., Michaud, F., Ogata, T., Okuno, H. (2005), “Enhanced robot speech recognition based on microphone array source separation and missing feature theory,” ''IEEE International Conference on Robotics and Automation'', 1489-1494.  
 +
#Michaud, F., Létourneau, D., Lepage, P., Morin, Y., Gagnon, F., Giguère, P., Beaudry, É., Brosseau, Y., Côté, C., Duquette, A., Laplante, J.-F., Legault, M.-A., Moisan, P., Ponchon, A., Raïevsky, C., Roux, M.-A., Salter, T., Valin, J.-M., Caron, S., Frenette, P., Masson, P., Kabanza, F., Lauria, M. (2005), “Socially interactive robots for real life use,” ''Proceedings Workshop on Mobile Robot Competition, American Association for Artificial Intelligence Conference (AAAI)'', Pittsburgh USA. ([http://introlab.3it.usherbrooke.ca/papers/AAAI2005workshop.pdf pdf])
 +
#Valin, J.-M. (2005), "Auditory system for a mobile robot", Ph.D. Thesis, Department of Electrical Engineering and Computer Engineering, Université de Sherbrooke, August. ([http://introlab.3it.usherbrooke.ca/papers/PhDValin.pdf pdf])
 +
#Valin, J.-M., Rouat, J., Michaud, F. (2004), "Enhanced robot audition based on microphone array source separation with post-filter", ''Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems'', 2123-2128. ([http://introlab.3it.usherbrooke.ca/papers/IROS2004_Valin.pdf pdf])
 +
#Valin, J.-M., Michaud, F., Hadjou, B., Rouat, J. (2004), "Localization of simultaneous moving sound sources for mobile robot using a frequency-domain steered beamformer approach", ''Proceedings IEEE International Conference on Robotics and Automation'', 1033-1038. ([http://introlab.3it.usherbrooke.ca/papers/ICRA2004audible.pdf pdf])  
 +
#Valin, J.-M., Rouat, J., Michaud, F. (2004), "Microphone array post-filter for separation of simultaneous non-stationary sources", ''ICASSP'', Montréal. ([http://introlab.3it.usherbrooke.ca/papers/ICASSP2004.pdf pdf])  
 +
#Valin, J.-M., Michaud, F., Létourneau, D., Rouat, J. (2003), "Robust sound source localization using a microphone array on a mobile robot", ''Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems'', p. 1228-1233. ([http://introlab.3it.usherbrooke.ca/papers/IROS2003_Valin.pdf pdf])

Latest revision as of 15:52, 13 April 2013

AUDIBLE - Artificial Audition for Mobile Robots



AUDIBLE is an artificial auditory system that gives a robot the ability to locate and track sounds, to separate simultaneous sound sources, and to recognise simultaneous speech. We demonstrate that it is possible to implement these capabilities using an array of microphones, without trying to imitate the human auditory system. The sound source localisation and tracking algorithm uses a steered beamformer to locate sources, which are then tracked using a multi-source particle filter. Separation of simultaneous sound sources is achieved using a variant of the Geometric Source Separation (GSS) algorithm, combined with a multi-source post-filter that further reduces noise, interference and reverberation. Speech recognition is performed on separated sources, either directly or by using Missing Feature Theory (MFT) to estimate the reliability of the speech features. The results show that it is possible to track up to four simultaneous sound sources, even in noisy and reverberant environments. Real-time control of the robot following a sound source is also demonstrated. The proposed sound source separation approach achieves a 13.7 dB improvement in signal-to-noise ratio compared to a single microphone when three speakers are present. In these conditions, the system demonstrates more than 80% accuracy on digit recognition, higher than most human listeners could obtain in our evaluation when recognising only one of these sources. All these new capabilities make it possible for humans to interact more naturally with a mobile robot in real-life settings.
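
To make the localisation step more concrete, here is a minimal sketch of a steered delay-and-sum beamformer that scans candidate directions and keeps the one with the highest output energy. It is an illustration only, not the AUDIBLE or ManyEars implementation: the publications listed below describe a frequency-domain steered beamformer, whereas this sketch works in the time domain with integer sample delays. The array geometry (8 microphones on a 0.16 m circle), the 48 kHz sample rate, the far-field single-source assumption, the 5 degree search grid and all function names are assumptions chosen for the example.

```python
import math
import random

SPEED_OF_SOUND = 343.0   # m/s, assumed room-temperature value
SAMPLE_RATE = 48000      # Hz, assumed
NUM_MICS = 8             # assumed array size
RADIUS = 0.16            # m, assumed radius of a circular array


def mic_positions():
    """Microphones evenly spaced on a circle in the horizontal plane."""
    return [(RADIUS * math.cos(2 * math.pi * m / NUM_MICS),
             RADIUS * math.sin(2 * math.pi * m / NUM_MICS))
            for m in range(NUM_MICS)]


def delays_in_samples(angle_deg, mics):
    """Far-field arrival delay of each microphone relative to the array centre."""
    ux = math.cos(math.radians(angle_deg))
    uy = math.sin(math.radians(angle_deg))
    # A microphone closer to the source (positive projection) hears it earlier.
    return [round(-(x * ux + y * uy) / SPEED_OF_SOUND * SAMPLE_RATE)
            for x, y in mics]


def simulate(source, angle_deg, mics):
    """Delay the source per microphone; samples outside the signal are zero."""
    n = len(source)
    channels = []
    for d in delays_in_samples(angle_deg, mics):
        channels.append([source[i - d] if 0 <= i - d < n else 0.0
                         for i in range(n)])
    return channels


def steered_energy(channels, angle_deg, mics, margin):
    """Output energy of a delay-and-sum beamformer steered toward angle_deg."""
    delays = delays_in_samples(angle_deg, mics)
    n = len(channels[0])
    energy = 0.0
    for i in range(margin, n - margin):
        # Undo each candidate delay so a source at angle_deg adds coherently.
        y = sum(ch[i + d] for ch, d in zip(channels, delays))
        energy += y * y
    return energy


def localise(channels, mics, step_deg=5):
    """Return the candidate azimuth (degrees) with the highest steered energy."""
    margin = 32  # larger than the maximum delay for this geometry (about 22 samples)
    return max(range(0, 360, step_deg),
               key=lambda a: steered_energy(channels, a, mics, margin))


rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(2048)]  # broadband test signal
mics = mic_positions()
channels = simulate(source, 60, mics)                # source placed at 60 degrees
estimate = localise(channels, mics)
print("estimated azimuth (degrees):", estimate)
```

When the steering direction matches the source direction, the eight delayed copies line up and add coherently, so the output energy peaks there. A tracker such as the particle filter mentioned above would then smooth these per-frame estimates over time; that stage is not shown here.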


Équipe / Team

  • David Brodeur
  • François Grondin
  • Jean-Marc Valin
  • François Michaud
  • Jean Rouat
  • Simon Brière
  • Dominic Létourneau

Nouvelles / News

December 2009: Congratulations to Ben Passow (PhD student) and Mario Gongora, who won the Annual Machine Intelligence Competition run by the British Computer Society with their entry called 'Fly By Ear'. This is the second year in a row that a team from the CCI has won this competition. They are using the ManyEars package for sound source localization.

Video

Installation

Separation

Publications

  1. Grondin, F., Létourneau, D., Ferland, F., and Michaud, F. (2013), "An open hardware and software microphone array system for robotic applications," Demonstration session IEEE International Conference on Human-Robot Interaction. (pdf)
  2. Grondin, F., Létourneau, D., Ferland, F., Rousseau, V., and Michaud, F. (2013), "The ManyEars Open Framework - Microphone array open software and open hardware system for robotic applications," Autonomous Robots, 34:217-232. http://link.springer.com/article/10.1007/s10514-012-9316-x
  3. Ferland, F., Létourneau, D., Aumont, A., Frémy, J., Legault, M.-A., Lauria, M., Michaud, F. (2012), "Natural interaction design of a humanoid robot," Journal of Human-Robot Interaction, 1 (2), 118-134, http://www.humanrobotinteraction.org/journal/index.php/HRI/article/view/65
  4. Fréchette, M., Létourneau, D., Valin, J.-M., Michaud, F. (2012), “Integration of sound source localization and separation to improve dialogue management on a robot,” IEEE/RSJ International Conference on Intelligent Robots and Systems. NTF Award for Entertainment Robots and Systems. (pdf)
  5. Grondin, F., Michaud, F. (2012), "WISS, a Speaker Identification System for Mobile Robots," Proceedings of the International Conference on Robotics and Automation: 1817-1822 (pdf) (mpg)
  6. Grondin, F., "Reconnaissance de locuteurs pour robot mobile" (Speaker recognition for a mobile robot), Master's thesis, Department of Electrical Engineering and Computer Engineering, Université de Sherbrooke. (pdf)
  7. Badali, A., Valin, J.-M., Michaud, F., Aarabi, P. (2009), “Evaluating real-time audio localization algorithms for artificial audition on mobile robots,” to be presented at IEEE International Conference on Intelligent Robots and Systems, October. (pdf)
  8. Brière, S., Valin, J.-M., Michaud, F., Létourneau, D. (2008) “Embedded auditory system for small mobile robots,” Proceedings IEEE International Conference on Robotics and Automation. (pdf)
  9. Valin, J.-M., Yamamoto, S., Rouat, J., Michaud, F., Nakadai, K., Okuno, H. (2007), “Robust recognition of simultaneous speech by a mobile robot,” IEEE Transactions on Robotics, 23(4):742-752. (pdf)
  10. Michaud, F., Côté, C., Létourneau, D., Brosseau, Y., Valin, J.-M., Beaudry, É., Raïevsky, C., Ponchon, A., Moisan, P., Lepage, P., Morin, Y., Gagnon, F., Giguère, P., Roux, M.-A., Caron, S., Frenette, P., Kabanza, F. (2007), “Spartacus attending the 2005 AAAI Conference,” to be published in Autonomous Robots, Special Issue on the AAAI Mobile Robot Competitions and Exhibition. (pdf)
  11. Valin, J.-M., Michaud, F., Rouat, J. (2007), “Robust localization and tracking of simultaneous moving sound sources using beamforming and particle filtering,” Robotics and Autonomous Systems Journal, 55: 216-228. (pdf)
  12. Côté, C., Brosseau, Y., Létourneau, D., Raïevsky, C., Michaud, F. (2006), "Using MARIE in software development and integration for autonomous mobile robotics", International Journal of Advanced Robotic Systems, Special Issue on Software Development and Integration in Robotics, 3(1):55-60. (pdf)
  13. Valin, J.-M., Michaud, F., Rouat, J., (2006), “Robust 3D localization and tracking of sound sources using beamforming and particle filtering”, Proceedings International Conference on Acoustics, Speech, and Signal Processing, 841-844. (pdf)
  14. Létourneau, D., Valin, J.-M., Côté, C., Michaud, F. (2005), “FlowDesigner: the free data-flow oriented development environment”, Software 2.0, vol. 3. (pdf)
  15. Yamamoto, S., Nakadai, K., Valin, J.M., Rouat, J., Michaud, F., Komatani, K., Ogata, T., Okuno, H. (2005), “Making a robot recognize three simultaneous sentences in real-time,” Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems, 897-902. (pdf)
  16. Yamamoto, S., Valin, J.M., Nakadai, K., Rouat, J., Michaud, F., Ogata, T., Okuno, H. (2005), “Enhanced robot speech recognition based on microphone array source separation and missing feature theory,” IEEE International Conference on Robotics and Automation, 1489-1494.
  17. Michaud, F., Létourneau, D., Lepage, P., Morin, Y., Gagnon, F., Giguère, P., Beaudry, É., Brosseau, Y., Côté, C., Duquette, A., Laplante, J.-F., Legault, M.-A., Moisan, P., Ponchon, A., Raïevsky, C., Roux, M.-A., Salter, T., Valin, J.-M., Caron, S., Frenette, P., Masson, P., Kabanza, F., Lauria, M. (2005), “Socially interactive robots for real life use,” Proceedings Workshop on Mobile Robot Competition, American Association for Artificial Intelligence Conference (AAAI), Pittsburgh USA. (pdf)
  18. Valin, J.-M. (2005), "Auditory system for a mobile robot", Ph.D. Thesis, Department of Electrical Engineering and Computer Engineering, Université de Sherbrooke, August. (pdf)
  19. Valin, J.-M., Rouat, J., Michaud, F. (2004), "Enhanced robot audition based on microphone array source separation with post-filter", Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems, 2123-2128. (pdf)
  20. Valin, J.-M., Michaud, F., Hadjou, B., Rouat, J. (2004), "Localization of simultaneous moving sound sources for mobile robot using a frequency-domain steered beamformer approach", Proceedings IEEE International Conference on Robotics and Automation, 1033-1038. (pdf)
  21. Valin, J.-M., Rouat, J., Michaud, F. (2004), "Microphone array post-filter for separation of simultaneous non-stationary sources", ICASSP, Montréal. (pdf)
  22. Valin, J.-M., Michaud, F., Létourneau, D., Rouat, J. (2003), "Robust sound source localization using a microphone array on a mobile robot", Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems, p. 1228-1233. (pdf)