Our research investigates how humans recognize speech sounds and spoken words in diverse listening situations, with a particular emphasis on face-to-face communication.
Three larger themes in our research are:
Speech recognition in face-to-face communication
We investigate the underlying behavioral and neural processes both in situations in which we only hear a speaker and in face-to-face communication, in which we hear and see a speaker. Whenever listeners see a speaker, they process and integrate the information obtained from the two modalities. In this theme, we examine multisensory processing in speech perception and how it is used effectively.
Perceptual learning and plasticity
The perceptual system is updated based on new experiences, yet it maintains a certain stability. In this theme, we focus on perceptual learning and plasticity in speech. One larger line of work examines the mechanisms underlying learning about talker idiosyncrasies in speech production. Listeners adjust their mental auditory and visual representations to the way of speaking of an individual talker, or of a group of talkers (e.g., talkers who share the same dialect or foreign accent), in order to optimize future speech recognition. We also investigate how listeners form new representations, in particular how representations of a talker's dynamic facial motion signatures are learned. Other perceptual learning projects address multisensory learning, temporal recalibration, and phonological learning.
Perceptual and cognitive factors in speech recognition
Listeners vary in their ability to recognize speech. We investigate whether and how perceptual factors (e.g., hearing) and cognitive factors (e.g., attention, working memory) play a role in auditory and audiovisual speech recognition and can thus explain individual differences. In this work, we often investigate speech recognition in more naturalistic situations, that is, with different types and levels of background noise. A special focus of our work is on how the decline in perceptual and cognitive abilities that comes with aging may affect speech recognition. Our studies on middle-aged adults examine the factors that contribute to healthy aging.
Methods and Facilities
We use psycholinguistic measures (e.g., reaction times) to investigate the behavioral mechanisms underlying speech recognition. In particular, we often use eye tracking with the visual world paradigm as well as pupillometry. We also investigate neural mechanisms by analyzing EEG, in particular event-related potentials (ERPs). We use state-of-the-art psycholinguistic statistical methods (e.g., linear mixed-effects modeling) and computational modeling.
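To give a flavor of this kind of analysis, here is a minimal sketch of a linear mixed-effects model of reaction times in R with the lme4 package; the data frame dat and its columns logRT, condition, participant, and item are hypothetical placeholders, not data or variable names from our studies.

library(lme4)

# Model log-transformed reaction times with a fixed effect of condition
# (e.g., audio-only vs. audiovisual), by-participant intercepts and
# condition slopes, and by-item intercepts.
m <- lmer(logRT ~ condition + (1 + condition | participant) + (1 | item),
          data = dat)

summary(m)  # fixed-effect estimates, standard errors, and t-values

Crossed random effects for participants and items are standard in psycholinguistic designs, because the same items are typically presented to all participants.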
Our lab facilities consist of four state-of-the-art IAC sound-attenuating testing booths equipped for psychophysical research. Behavioral experiments are run in Psychtoolbox under Octave. We also run online versions of our experiments. Two IAC booths host our own SR Research 2000 eye tracker and an EyeLink Duo eye tracker, which we use to track eye fixations as well as pupil size. In addition, we have a multimedia recording lab and a workspace with workstations for high-end video/sound editing and data analyses with R. The multimedia lab also supports motion tracking. Hearing tests are conducted with our Grason-Stadler audiometer and tympanometer. Our EEG experiments are run in our own EEG laboratory, which hosts a 64-channel BrainVision actiCHamp data acquisition system. We use EEGLAB, ERPLAB, and FieldTrip for our data analyses.