Comparison and Clustering of Head-Related Transfer Functions (HRTFs) Using Post-Processing and Dimensionality Reduction Methods

Authors

Konstantinos Bakogiannis, Areti Andreopoulou

Abstract

The measurement setup used to record Head-Related Transfer Function (HRTF) datasets strongly affects how those datasets cluster and, consequently, how accurately they can be compared. Datasets originating from the same database tend to cluster together because of the coloration imparted by their recording process, which complicates the personalization of spatial audio. This presentation outlines our approach to mitigating these effects through post-processing and data analysis.
HRTFs are crucial for binaural listening and spatial perception, which makes their personalization important. HRTF database matching is one way to obtain personalized HRTFs: the aim is to find the user’s best match in a repository of HRTFs, which we build by collecting publicly available HRTF databases. We use the SOFA repository because all HRTF datasets there share a common file format. We have compiled 13 reliable databases comprising about 1,000 HRTF datasets in total. The technique is tailored to the application type, meaning that it takes into account how important spatial perception is for the specific application. We choose relevant metrics to quantify acoustic characteristics, such as spectral smoothness and localization accuracy. These metrics are applied either to the measurement positions that a pair of datasets have in common or to the entire dataset.
The next step is to focus on the region of the dataset universe where we believe the best match is located. To do so, we cluster the universe and derive one representative dataset per cluster, taken to be the dataset closest to the cluster’s centroid. We then create tasks tailored to the application type and its needs in order to find the user’s preferred representative, either through a rating process or by evaluating the user’s performance. Once a representative is selected, we zoom in to the winning cluster and repeat the same procedure: we split it into further clusters, run the tasks with the new representative HRTFs, and identify the winning cluster.
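The representative-selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the datasets have already been embedded as points in a low-dimensional space and assigned cluster labels, and the function name `cluster_representatives` and the toy coordinates are hypothetical.

```python
import numpy as np

def cluster_representatives(points, labels):
    """Map each cluster label to the index of the member nearest that cluster's centroid."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    reps = {}
    for lab in np.unique(labels):
        members = np.flatnonzero(labels == lab)          # indices of datasets in this cluster
        centroid = points[members].mean(axis=0)          # centroid in the embedding space
        dists = np.linalg.norm(points[members] - centroid, axis=1)
        reps[int(lab)] = int(members[np.argmin(dists)])  # nearest member represents the cluster
    return reps

# Toy embedding (assumed data): six datasets in a 2-D space, two clusters.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.4, 0.1],
                   [5.0, 5.0], [6.0, 5.0], [5.6, 5.1]])
labels = np.array([0, 0, 0, 1, 1, 1])
reps = cluster_representatives(points, labels)
print(reps)
```

In the iterative procedure, the same function could be called again on the members of the winning cluster after re-clustering them.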
The final step of this iterative process is selecting the best-matching HRTF datasets. However, we have observed that datasets tend to cluster according to their database of origin, an observation supported by the relevant literature. The reason is that each measurement setup imposes its own coloration. To mitigate this, we have designed the methodology presented in this study.
For consistency, we first put the datasets through a post-processing pipeline. This includes converting HRTFs to DTFs, resampling to a common sampling rate, truncating impulse responses, adjusting the frequency range, removing DC content, and normalizing gain levels. This standardization aims to minimize the influence of measurement artifacts. With it, clusters are less tied to database origin than they are without post-processing, but the problem remains.
To mitigate it further, we add another step. Until now, when a metric was computed at every common position between two datasets, we averaged the results to obtain a single value describing the pairwise comparison. This produced a two-dimensional distance matrix, which we then passed to multidimensional scaling (MDS) to build the HRTF universe. We have modified this methodology slightly: instead of averaging the metric outcome to obtain a 2D matrix for clustering, we now aim to cluster the multidimensional matrix before averaging, in order to resolve discrepancies introduced by the averaging. For this, we use dimensionality reduction, specifically principal component analysis (PCA). Applying PCA at this step lets us simplify the data while retaining the characteristics most important for clustering.
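The difference between the two routes can be illustrated with a small NumPy sketch. Everything here is an assumption made for illustration: the array `per_position` holds random stand-in values for a metric evaluated per dataset pair and per common position, the classical (Torgerson) variant of MDS is used, and unfolding the array into one feature row per dataset is only one possible reading of "clustering the multidimensional matrix before averaging".

```python
import numpy as np

rng = np.random.default_rng(0)
n_datasets, n_positions = 6, 4

# Hypothetical input: per_position[i, j, p] is a metric value comparing
# datasets i and j at common measurement position p (symmetric, zero diagonal).
raw = rng.random((n_datasets, n_datasets, n_positions))
per_position = raw + raw.transpose(1, 0, 2)
per_position[np.arange(n_datasets), np.arange(n_datasets), :] = 0.0

# Route 1: average over positions to get one N x N matrix, then classical MDS.
dist = per_position.mean(axis=2)

def classical_mds(d, n_components=2):
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering   # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip negative eigenvalues, which appear when d is not Euclidean.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

mds_coords = classical_mds(dist)

# Route 2: keep the per-position values, unfold to N x (N * P), apply PCA via SVD.
def pca(features, n_components=2):
    centered = features - features.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)              # variance ratio of every component
    return u[:, :n_components] * s[:n_components], vt[:n_components], explained

features = per_position.reshape(n_datasets, -1)
pca_coords, components, explained = pca(features)
```

Either set of coordinates (`mds_coords` or `pca_coords`) could then be handed to a clustering algorithm; the second route keeps position-level differences that the mean in the first route collapses.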
By focusing on the variance captured by the principal components, we can identify the features that most affect the clustering of HRTF datasets. This approach reveals latent structure that was previously obscured by averaging. Initial results from this additional step indicate a more comprehensive and reliable dataset comparison, which is promising for our ultimate goal of mitigating the coloration introduced by the recording setup. By analyzing the impact of each principal component, we gain insight into which dimensions contribute most to the clustering. For instance, the data may show that certain frequency ranges or spatial positions play a larger role than others. This deeper understanding helps us identify the characteristics that are influenced by the recording setup and, ultimately, improve the personalization of spatial audio. Our future work involves refining the PCA and MDS approach, with the aim of further reducing coloration biases and supporting applications in gaming, auralization, and virtual reality.
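As a hedged illustration of reading component loadings, the sketch below builds synthetic features in which one hypothetical database carries an offset in a few frequency bins, then checks which bins dominate the first principal component. The data, the bin layout, and the size of the offset are all invented for the example and do not come from the study.

```python
import numpy as np

rng = np.random.default_rng(1)
n_per_db, n_bins = 10, 8

# Hypothetical features: one row per dataset, one column per frequency bin.
noise = 0.05 * rng.standard_normal((2 * n_per_db, n_bins))
coloration = np.zeros(n_bins)
coloration[5:] = 1.0                 # second database is "colored" in bins 5..7 only
features = noise.copy()
features[n_per_db:] += coloration    # rows 10..19 belong to the colored database

# PCA via SVD of the column-centered feature matrix.
centered = features - features.mean(axis=0)
_, s, vt = np.linalg.svd(centered, full_matrices=False)
explained = s ** 2 / np.sum(s ** 2)

# Squared loadings of PC1 sum to 1, so each entry is that bin's share of the component.
pc1_share = vt[0] ** 2
top_bins = sorted(int(i) for i in np.argsort(pc1_share)[::-1][:3])
print(top_bins, round(float(explained[0]), 3))
```

In this toy setup the bins carrying the offset account for nearly all of the first component, which is the kind of signature one would look for when asking which frequency ranges or positions drive a database-origin cluster.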