Introduction to Head-Related Transfer Function (HRTF) 

2025-10-23

In our tech blog series, we aim to introduce fundamental concepts in audio research in a way that makes them approachable.  

Head-Related Transfer Functions (HRTFs) are a foundation of spatial audio. They capture how sound waves are shaped by a listener's physical features (the head, the torso, and the outer ear, or pinna) before reaching the ear canal. Those subtle changes are exactly what the auditory system uses to infer where a sound came from, even with eyes closed.

In technical terms, an HRTF is the direction-dependent transfer function from a free-field source to a point in, or near, the ear canal; in the time domain, the corresponding filters are called Head-Related Impulse Responses (HRIRs). Convolving a mono signal with a left/right HRIR pair results in a binaural signal that, when played over headphones, can create a convincing impression of a sound at a specific location in three-dimensional space. 
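For readers who like to see the mechanics, here is a minimal sketch of that convolution step in Python, using NumPy and SciPy. The function name and the HRIR pair are ours for illustration; the "HRIRs" below are synthetic placeholders, not measured filters.

```python
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(mono: np.ndarray, hrir_left: np.ndarray,
                    hrir_right: np.ndarray) -> np.ndarray:
    """Convolve a mono signal with a left/right HRIR pair.

    Returns a (num_samples, 2) binaural signal for headphone playback.
    The HRIRs are assumed to share the mono signal's sample rate.
    """
    left = fftconvolve(mono, hrir_left)
    right = fftconvolve(mono, hrir_right)
    return np.stack([left, right], axis=-1)

# Hypothetical usage: one second of noise through a placeholder HRIR pair.
fs = 48_000
noise = np.random.default_rng(0).standard_normal(fs)
hrir_l = np.zeros(256); hrir_l[0] = 1.0   # placeholder: unit impulse
hrir_r = np.zeros(256); hrir_r[24] = 0.5  # placeholder: delayed, attenuated
binaural = render_binaural(noise, hrir_l, hrir_r)
```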

HRTFs can be combined with Room Impulse Responses (RIRs) to create Binaural Room Impulse Responses (BRIRs). BRIRs incorporate both the filtering effects of the listener’s head and ears and the reflections and reverberation of the environment, producing a more realistic spatial audio experience.  
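As a rough illustration (and only a rough one), a BRIR can be approximated by filtering a mono room impulse response through the direct-path HRIR of each ear. Measured or simulated BRIRs instead apply a direction-dependent HRTF to every reflection, so treat this as a conceptual sketch rather than a faithful model:

```python
import numpy as np
from scipy.signal import fftconvolve

def approximate_brir(hrir: np.ndarray, rir: np.ndarray) -> np.ndarray:
    """Crude one-ear BRIR approximation: pass a mono room response
    through the direct-path HRIR. A real BRIR filters each reflection
    with the HRTF of its own incidence direction; this sketch collapses
    all reflections onto the direct-path direction."""
    return fftconvolve(hrir, rir)
```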

What HRTFs Capture 

The ear is not a simple microphone. The folds of the pinna create frequency-dependent notches and peaks that change with elevation and front-versus-back angles. The head and torso add interaural time and level differences, shadowing high frequencies more than low ones. HRTFs bundle all of these effects as a function of direction, and, in near-field scenarios, distance, so the same sound is filtered differently when it arrives from above, behind, or to the side. Classic measurement studies and reviews agree on this formal definition and on the practical distinction between frequency-domain HRTFs and time-domain HRIRs.  
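The interaural cues mentioned above can be read directly off a measured HRIR pair. Below is a small sketch, assuming NumPy arrays at a known sample rate, that estimates the interaural time difference (ITD) from the peak of the interaural cross-correlation and the interaural level difference (ILD) from the broadband energy ratio:

```python
import numpy as np

def itd_ild(hrir_l: np.ndarray, hrir_r: np.ndarray, fs: int):
    """Estimate interaural time and level differences from an HRIR pair.

    ITD: lag of the cross-correlation peak, in seconds. With this
    argument order, a positive lag means the left ear lags the right,
    i.e. the source sits toward the right.
    ILD: broadband left-to-right energy ratio, in dB.
    """
    xcorr = np.correlate(hrir_l, hrir_r, mode="full")
    lag = np.argmax(np.abs(xcorr)) - (len(hrir_r) - 1)
    itd = lag / fs
    ild = 10 * np.log10(np.sum(hrir_l**2) / np.sum(hrir_r**2))
    return itd, ild
```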

How HRTFs Are Measured 

The reference method remains direct measurement in an anechoic room. Miniature microphones are placed at or just inside the ear canal while loudspeakers emit test signals from many angles on a spherical grid. The result is a dense set of HRIRs or HRTFs for each ear. This procedure is accurate but time-consuming and sensitive to probe placement. For that reason, several research groups have released high-quality public databases: for example, the CIPIC database (UC Davis) measured 45 subjects at 1,250 directions and includes anthropometric data, while the 3D3A Lab (Princeton) paired hundreds of HRTFs per subject with 3D scans of the head and torso in the open SOFA format. At Brandenburg Labs, we use the dataset of measurements based on the Neumann KU100 dummy head from the SADIE II database (University of York). These resources underpin a great deal of contemporary work in spatial audio. 
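Since SOFA files are netCDF-4 containers, they can be opened with a generic netCDF reader. The variable names below come from the AES69 (SOFA) convention; the file name in the usage comment is a placeholder for whichever SADIE II or CIPIC file you download:

```python
import numpy as np
from netCDF4 import Dataset  # SOFA files are netCDF-4 containers

def load_sofa_hrirs(path: str):
    """Read HRIRs and source directions from a SOFA file.

    Returns (hrirs, positions, fs): hrirs has shape
    (num_directions, num_ears, filter_length); positions holds
    (azimuth_deg, elevation_deg, distance_m) per measurement.
    """
    with Dataset(path) as sofa:
        hrirs = np.asarray(sofa.variables["Data.IR"][:])
        positions = np.asarray(sofa.variables["SourcePosition"][:])
        fs = float(np.ravel(sofa.variables["Data.SamplingRate"][:])[0])
    return hrirs, positions, fs

# Hypothetical usage; substitute the actual file you downloaded:
# hrirs, positions, fs = load_sofa_hrirs("KU100_HRIR_measured.sofa")
```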

Estimating HRTFs When Direct Measurement Is Impractical  

Because such anechoic measurements are not feasible for most listeners, estimation methods have grown rapidly. One family of approaches maps a small set of head-and-ear measurements or photographs to “closest-match” HRTFs drawn from a database. Another uses 3D scans plus numerical acoustics to compute individualized HRTFs directly from geometry. More recent work leverages machine learning to predict HRTFs from sparse cues such as a few anthropometric features or images of the ear, reducing the capture burden while improving accuracy relative to generic data. Surveys of the field summarize these trends and the trade-offs involved in balancing speed, convenience, and perceptual fidelity.
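As a toy version of the first family of approaches, the sketch below z-scores a table of anthropometric features and returns the database subject nearest to a target listener. The feature set and the Euclidean distance are illustrative choices, not a validated selection method:

```python
import numpy as np

def closest_match(features: np.ndarray, database: np.ndarray) -> int:
    """Pick the database subject whose anthropometric features are
    nearest to the target listener's, after z-scoring each feature.

    features: (num_features,) for the target listener.
    database: (num_subjects, num_features), e.g. head and pinna
    dimensions such as those shipped with the CIPIC database.
    """
    mu, sigma = database.mean(axis=0), database.std(axis=0)
    z_db = (database - mu) / sigma
    z_target = (features - mu) / sigma
    return int(np.argmin(np.linalg.norm(z_db - z_target, axis=1)))
```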

Personalization: How Much Does It Matter? 

Small changes in pinna shape can shift spectral notches that are critical for elevation perception and for telling whether a sound is in front or behind. Controlled listening studies generally find that individualized HRTFs improve localization accuracy, externalization (hearing the sound outside the head), and naturalness. However, the magnitude of benefit depends on the task and on the availability of additional cues such as head movements and consistent room acoustics. In some conditions, well-chosen non-individual HRTFs combined with head tracking and brief training can provide convincing spatial impressions, though front–back confusions and elevation errors tend to increase relative to individualized data. Recent research also shows that adaptation can improve performance over time with non-individual HRTFs.  

Beyond the HRTF: Dynamic Cues and the Listening Room 

HRTFs are necessary but not sufficient for perceptual plausibility. The brain expects the room response to match what is seen and felt. When a binaural rendering simulates a room that diverges from the actual listening space, externalization can suffer; this has been demonstrated experimentally and discussed under the “room divergence” effect. Head tracking also matters: allowing the listener to move and updating the binaural rendering accordingly enhances stability and externalization, because the auditory system uses those dynamic cues to anchor sound sources in space. Research from Technische Universität Ilmenau and Brandenburg Labs has explored these factors in depth for auditory augmented reality and dynamic binaural synthesis.  
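In a head-tracked renderer, the tracker's yaw is subtracted from the world-fixed source azimuth before the HRIR lookup, so the virtual source stays anchored while the head turns. Here is a simplified block-based sketch (the function names are ours; production renderers also interpolate between neighboring filters and crossfade across blocks to avoid artifacts):

```python
import numpy as np
from scipy.signal import fftconvolve

def world_to_head_azimuth(source_az_deg: float, head_yaw_deg: float) -> float:
    """Convert a world-fixed source azimuth into head-relative
    coordinates, so the source stays put when the head turns."""
    return (source_az_deg - head_yaw_deg) % 360.0

def render_block(block, hrirs, azimuths_deg, source_az, head_yaw):
    """Render one audio block with the HRIR pair whose measured
    azimuth is nearest to the head-relative source direction.

    hrirs: (num_directions, 2, filter_length) array of HRIR pairs.
    azimuths_deg: measured azimuth of each HRIR pair, in degrees.
    """
    rel_az = world_to_head_azimuth(source_az, head_yaw)
    diffs = np.abs((azimuths_deg - rel_az + 180.0) % 360.0 - 180.0)
    i = int(np.argmin(diffs))
    left = fftconvolve(block, hrirs[i, 0])
    right = fftconvolve(block, hrirs[i, 1])
    return np.stack([left, right], axis=-1)
```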

Where HRTFs Are Used Today 

HRTFs support most headphone-based spatial audio. In virtual and augmented reality, they enable interactive audio scenes that remain stable during head motion. In media production, binaural rendering simulates multichannel loudspeaker layouts for monitoring over headphones, aiding content creation and review. In telepresence, placing remote voices at distinct positions can reduce cognitive load and improve speaker separation. HRTFs are also essential in hearing research, for controlled studies of spatial hearing, and in some hearing-health applications that aim to restore or train spatial perception. These applications draw on the same signal-processing building blocks (HRIR convolution, head tracking, and room modeling) but differ in how much individualization and environmental congruence are practical or necessary.

Our contributions at Brandenburg Labs  

At Brandenburg Labs, advancing spatial audio research goes hand in hand with creating technology that people can experience and enjoy. A recent peer-reviewed contribution by our colleagues presented a proof-of-concept binaural renderer that achieved high plausibility in listening tests, even when using a generic HRTF dataset, provided that head tracking and room cues were carefully integrated. In this study, virtualized loudspeakers played over headphones were directly compared with their physical counterparts in the same room, and listeners gave both strikingly similar plausibility ratings.

Building on these insights, we continue to bridge research and application. Our headphone-based demos, which allow listeners to experience the comparison between real loudspeakers and their virtualized versions, have been presented at international conferences and conventions. These showcases are not only a testament to our scientific foundation but also to our commitment to making immersive audio over headphones a reality. Learn more about our products and services.  

If you are interested in exploring related concepts, our previous blog posts on binaural audio and binaural room impulse responses (BRIRs) offer a great companion read. 

Resources  

Begault, D. R., and Trejo, L. J., 3-D sound for virtual reality and multimedia (No. NASA/TM-2000-209606), 2000. 

Blauert, J., Spatial hearing: The psychophysics of human sound localization, MIT Press, 1997.

Møller, H., Sørensen, M. F., Jensen, C. B., & Hammershøi, D., “Binaural technique: Do we need individual recordings?” in Journal of the Audio Engineering Society, 44(6), 451–469, 1996.

Sloma, U., Merten, N., Thron, T., Brandenburg, K., Wollwert, F., Profeta, R., & Rodriguez, C., “Proof of concept of a binaural renderer with increased plausibility,” in Proceedings of the 49th Annual Conference on Acoustics (DAGA), Hamburg, 2023. https://pub.dega-akustik.de/DAGA_2023/data/articles/000185.pdf

Wenzel, E. M., Arruda, M., Kistler, D. J., & Wightman, F. L., “Localization using nonindividualized head-related transfer functions.” in Journal of the Acoustical Society of America, 94(1), 111–123, 1993. https://doi.org/10.1121/1.407089 

Ziegelwanger, H., Majdak, P., & Kreuzer, W., “Numerical calculation of listener-specific head-related transfer functions and sound localization: Microphone model and mesh discretization.” in Journal of the Acoustical Society of America, 138(1), 208–222, 2015. https://doi.org/10.1121/1.4922518
