Audio and Speech Processing Lab

Senior researchers: Prof. Dr. Barış Bozkurt, Assoc. Prof. Osman Büyük

The Audio and Speech Processing Lab focuses on applied research problems and aims to develop new software tools applicable to real-life scenarios.

Our main research topics are:

Audio and Speech Signal Processing
Computational Musicology
Music Information Retrieval
Automatic Speech Recognition
Speaker Verification
Natural Language Processing

Selected projects led by our lab:

Automatic melodic analysis of Turkish makam music, TÜBİTAK ARDEB 1001. 2012-2014.
Automatic Transcription of Turkish music, TÜBİTAK ARDEB 1001. 2007-2010.
Speech and Speaker Recognition for Mobile Devices, TÜBİTAK ARDEB 3001. 2014-2016.

Selected projects we took part in:

CompMusic, EU (ERC-267583) [PI: Prof. Xavier Serra], Computational models for the discovery of the World’s Music. 2011-2017.
Automation of corporate support services using chat-bot, supported by TÜBİTAK TEYDEB and carried out at Sestek Conversational Solutions Technology Company. 2018-2021.
Analysis and optimization of user experience for human-machine interaction, supported by TÜBİTAK TEYDEB and carried out at Sestek Conversational Solutions Technology Company. 2016-2017.

Selected publications of our lab:

B Bozkurt, I Germanakis, Y Stylianou, 2018, A study of time-frequency features for CNN-based automatic heart sound classification for pathology detection, Computers in Biology and Medicine 100, 132-143.
T Drugman, B Bozkurt, T Dutoit, 2012, A comparative study of glottal source estimation techniques, Computer Speech & Language 26 (1), 20-34.
AC Gedik, B Bozkurt, 2010, Pitch-frequency histogram-based music information retrieval for Turkish music, Signal Processing 90 (4), 1049-1063.
A Holzapfel, Y Stylianou, AC Gedik, B Bozkurt, 2010, Three dimensions of pitched instrument onset detection, IEEE Transactions on Audio, Speech, and Language Processing 18 (6), 1517-1527.
B Bozkurt, R Ayangil, A Holzapfel, 2009, Computational analysis of Turkish makam music: Review of state-of-the-art and challenges, Journal of New Music Research 43 (1), 3-23.
B Bozkurt, L Couvreur, T Dutoit, 2007, Chirp group delay analysis of speech signals, Speech Communication 49 (3), 159-176.
O Buyuk, 2020, Context-dependent sequence-to-sequence Turkish spelling correction, ACM Transactions on Asian and Low-Resource Language Information Processing 19 (4), article 56, 1-16.
C Demiroglu, O Buyuk, A Khodabakhsh, R Maia, 2017, Post-processing synthetic speech with a complex cepstrum vocoder for spoofing phase-based synthetic speech detectors, IEEE Journal of Selected Topics in Signal Processing 11 (4), 671-683.
K Korucu, O Kaplan, O Buyuk, K Gullu, 2016, An investigation of usability of sound recognition for source separation of packaging waste in reverse vending machines, Waste Management 56, 46-52.
O Buyuk, 2016, Sentence-HMM state based i-vector/PLDA modeling for improved performance in text dependent single utterance speaker verification, IET Signal Processing 10 (8), 918-923.

AIDA

Artificial Intelligence and Data Analytics Application & Research Center
