About Me

Karan Thakkar

Hello! I’m Karan Thakkar, a Ph.D. candidate in Electrical and Computer Engineering at Johns Hopkins University. Advised by Professor Mounya Elhilali at the Laboratory of Computational Audio Perception, I develop AI and brain-inspired models to study how we hear and process sound.

Beyond the lab, I paint and mix music, blending creativity with computation. You can explore my technical journey in my CV, and feel free to reach out; I’m always up for new ideas and collaborations!

Milestones in My Ph.D. Journey

  • Aug 2025: On-Device Audio Reasoning Models for Multimodal Interaction. Internship, Apple.
  • Jun 2025: Foundation Models for Heart Rate Prediction from Speech. INTERSPEECH 2025 (Apple AIML).
  • May 2025: CapSpeech: Scaling Style Captions for Text-to-Speech to 10M Pairs. NeurIPS 2025.
  • Apr 2025: Computational Modeling of Auditory Attention via Temporal Coherence. PhD Thesis Proposal.
  • Feb 2025: SoloAudio: Audio editing with language. ICASSP 2025.
  • Oct 2024: A Comparison of Ferret Brain Auditory Processing and Artificial Neural Networks. Nature Communications Biology.
  • Aug 2024: Research on Audio Foundation Models. Internship, Apple.
  • Apr 2024: DreamVoice: Text-Based Voice Conversion. INTERSPEECH 2024.
  • Dec 2023: Two papers on diffusion-based source extraction and self-supervised learning. ICASSP 2024.