Project information

Sound source localization is a fundamental task, especially in reverberant and multi-source environments. In this project we focus on sound event localization and detection (SELD), which includes recognizing the temporal onset and offset of each sound event, classifying the event into a known set of classes, and localizing the event in space while it is active by estimating its direction of arrival (DOA). We work with 3D audio captured by a first-order Ambisonic microphone; the recordings are represented by their spherical harmonic decomposition in the quaternion domain. The project aims to detect the temporal activity of a known set of sound event classes and to locate the active events in space using quaternion-valued data processing. To do this, we use a given Quaternion Convolutional Neural Network extended with recurrent layers (QCRNN) for the joint 3D sound event localization and detection task.
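To make the quaternion representation more concrete, the following minimal Python sketch shows the two building blocks mentioned above: the Hamilton product, which quaternion-valued layers typically use in place of real multiplication, and the conversion of a DOA given as azimuth and elevation into a Cartesian unit vector. It is an illustration only, not the project's implementation. The function names, the sample values, the mapping of the four Ambisonic channels (W, X, Y, Z) onto the four quaternion components, and the angle convention are all assumptions made for this example.

```python
import math


def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,  # real part
        pw * qx + px * qw + py * qz - pz * qy,  # i component
        pw * qy - px * qz + py * qw + pz * qx,  # j component
        pw * qz + px * qy - py * qx + pz * qw,  # k component
    )


def doa_to_unit_vector(azimuth_deg, elevation_deg):
    """Convert a DOA in degrees to a Cartesian unit vector (x, y, z).

    Assumed convention: azimuth is measured in the horizontal plane from
    the x axis towards the y axis, elevation upwards from that plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (
        math.cos(el) * math.cos(az),
        math.cos(el) * math.sin(az),
        math.sin(el),
    )


# Hypothetical values for one time-frequency bin of a first-order
# Ambisonic signal, with channels (W, X, Y, Z) packed into one quaternion.
foa_bin = (0.5, 0.1, -0.2, 0.3)

# Hypothetical quaternion weight; the identity quaternion leaves the
# input unchanged, which makes the operation easy to check by hand.
identity_weight = (1.0, 0.0, 0.0, 0.0)
filtered = hamilton_product(identity_weight, foa_bin)

# Example DOA target: a source to the left on the horizontal plane.
target = doa_to_unit_vector(90.0, 0.0)
```

Because the Hamilton product mixes all four components of both operands, a single quaternion weight couples the four Ambisonic channels, rather than treating them as independent real-valued inputs.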