ECE Student Pramit Saha leads Imagine Speech Recognition Project

UBC Electrical and Computer Engineering master’s student Pramit Saha is working at the forefront of developing speech-related brain-computer interfaces. The Imagine Speech Recognition Project, led by Pramit and directed by ECE Professor Sidney Fels in the Human Communication Technologies (HCT) Lab, aims to detect speech tokens from speech imagery brain signals. The project has revealed the possible existence of a brain imagery footprint related to the articulatory movements underlying imagined speech production.

Speech imagery is the representation of speech as unspoken words inside the human brain, without any vocalization. The researchers hypothesize that the thoughts underlying covert speech leave a footprint in the brain even though the person is not vocalizing. They further propose that imagined words can be detected by understanding the intended involvement of the vocal tract and vocal folds, which is internally encoded in the brain signals. Their deep neural network architecture is able to capture information that the brain sends to the tongue, vocal folds and other articulators, even when there is no vocal communication.

Interpreting active thoughts from EEG signals can be very challenging. Their methodology for learning the EEG manifold includes computing a cross-covariance embedding from high-dimensional EEG data that captures the joint variability of the electrodes. This allows the phonological attributes of the imagined words to be classified based on the presence or absence of different parts of the articulatory system. These categories are then used to identify the imagined speech.

Pramit Saha’s and Professor Sidney Fels’ work has the potential to advance the field of speech-related brain-computer interfaces aimed at providing neuro-prosthetic help to those with speech-related disabilities and disorders. It can give users a way to express their thoughts, which can greatly help in rehabilitation.
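For readers curious about what a cross-covariance embedding of EEG data might look like in practice, the following is a minimal sketch and not the authors' implementation. It assumes that one trial is stored as a list of electrode channels of equal length and that the embedding is the channel-by-channel sample covariance matrix; the function name `channel_cross_covariance` and the toy values are hypothetical and chosen only for illustration.

```python
def channel_cross_covariance(eeg):
    """Return the channel-by-channel covariance matrix of one EEG trial.

    eeg: list of channels, each a list of samples of equal length.
    This is an illustrative assumption about the embedding, not the
    published method.
    """
    n_channels = len(eeg)
    n_samples = len(eeg[0])
    if n_samples < 2 or any(len(ch) != n_samples for ch in eeg):
        raise ValueError("need equal-length channels with at least 2 samples")

    # Remove each channel's mean so the products measure co-variation.
    centered = []
    for ch in eeg:
        mean = sum(ch) / n_samples
        centered.append([x - mean for x in ch])

    # Entry (i, j) is the unbiased sample covariance of channels i and j.
    cov = [[0.0] * n_channels for _ in range(n_channels)]
    for i in range(n_channels):
        for j in range(i, n_channels):
            c = sum(a * b for a, b in zip(centered[i], centered[j]))
            c /= n_samples - 1
            cov[i][j] = c
            cov[j][i] = c  # the matrix is symmetric
    return cov


# Toy trial with 3 hypothetical channels and 4 time samples each.
toy_trial = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]
embedding = channel_cross_covariance(toy_trial)
```

In this toy trial the first two channels rise together, so their covariance is positive, while the third channel falls as the first rises, so that entry is negative. A matrix of this kind summarizes how the electrodes vary jointly, and it could then be passed to a classifier; how the researchers feed it into their deep network is not described here.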

Below, Pramit Saha speaks about the importance and potential impact of the work that is being done:

What motivates you to pursue research in this topic?

Speech is the most basic and natural means of communication. However, the neuro-muscular mechanism underlying the production of articulatory speech is extremely complicated, which makes decoding imagined speech from noisy brain signals a highly challenging problem. The primary objective of this research is to understand the discriminative brain signal manifold corresponding to imagined speech, so that we can relate the brain signals to the underlying articulation mechanism, which is crucial in designing speech-related brain-computer interfaces. Such interfaces are intended to provide neuro-prosthetic help for more than 70 million people worldwide who are living with speaking disabilities and speech-related neuro-muscular disorders. Decoding their imagined speech will provide them with effective communication strategies, including controlling external devices through speech commands interpreted from brain signals. The idea of being able to contribute towards potentially providing people with a better means to communicate and express thoughts without needing to vocalize, thereby increasing their quality of life, keeps me strongly motivated to pursue research in this field.

How does this project align with your professional goals?

My research goal primarily centers on investigating the neural pathways behind expressive communication abilities, including speech and gestures. Human speech production is one of the most complex processes within the human motor repertoire and requires precise coordination of different speech articulators. Such refined control of the articulators appears difficult to master. However, it is quite astonishing to me that we can perform such complex articulation spontaneously and without considerable effort, by establishing the connection between the speech production sites in our brain and the articulators involved in vocalization. The neuro-computational bases behind such articulation are still not well understood, and how speech intent is related to the intended motion of these articulators is an open question in the domain of imagined speech research. In this work, we endeavor to address the issue by developing a hierarchical deep learning-based model that leverages phonological information (involving the intended activity of different articulators) embedded in the brain signals to decode the intended speech token.

More details on their work can be found here:

Hierarchical Deep Feature Learning for Decoding Imagined Speech from EEG

Deep Learning the EEG Manifold for Phonological Categorization from Active Thoughts

Towards Imagined Speech Recognition With Hierarchical Deep Learning

To find out more about Pramit Saha:

Personal website

UBC HCT Profile