Master Thesis "Hierarchy-Aware Classification Loss"

Posted 3 weeks ago


Ilmenau, Germany · Fraunhofer-Gesellschaft · Full-time

The Fraunhofer Institute for Digital Media Technology IDMT is part of the Fraunhofer-Gesellschaft. Headquartered in Ilmenau, Germany, the institute is internationally recognized for its expertise in applied electroacoustics and audio engineering, AI-based signal analysis and machine learning, and data privacy and security.

At the headquarters, on the campus of Technische Universität Ilmenau, researchers work on technologies for robust, trustworthy AI-based analysis and classification of audio and video data. These technologies are used, among other things, to monitor industrial production processes and traffic, and in the media context, for example for automatic metadata extraction and audio manipulation detection. Another focus is the development of algorithms for virtual product development, intelligent actuator-sensor systems and audio for the automotive sector.

There are currently around 70 employees working at Fraunhofer IDMT in Ilmenau.

**What you will do**

In recent years, deep learning has become more and more powerful for classification tasks in many domains, replacing traditional statistical methods as more data becomes available. In music information retrieval (MIR), these tasks include, for instance, instrument classification and genre detection, where deep learning methods such as Convolutional Neural Networks achieve state-of-the-art accuracy by a wide margin.

However, evaluation metrics such as accuracy do not tell the whole story. It has been shown that while these systems achieve high accuracy scores, the errors they do make can be particularly nonsensical [1]. In fields such as Computer Vision, such errors can be catastrophic (imagine a self-driving car mistaking a pedestrian for a road marking). Some research has therefore aimed at reducing the severity of errors, either through an altered loss function or through a hierarchical treatment of class labels [2].

We therefore propose to build upon an existing model for either instrument detection or genre classification, and investigate ways to measure and reduce its error severity for its classification task.
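To make "error severity" concrete, the following minimal sketch scores each mistake by how far up a class hierarchy one has to climb from the true label before reaching a node that also contains the predicted label. The four-instrument taxonomy, the function names and the choice of this particular measure are illustrative assumptions, not part of the thesis specification.

```python
# Hypothetical toy taxonomy: child -> parent. The root "instrument" has no parent.
PARENT = {
    "violin": "strings",
    "cello": "strings",
    "trumpet": "brass",
    "trombone": "brass",
    "strings": "instrument",
    "brass": "instrument",
}


def ancestors(label):
    """Return the path from a label up to the root, the label itself included."""
    path = [label]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def lca_height(true_label, predicted_label):
    """Steps from the true label up to the lowest common ancestor of both labels."""
    predicted_path = set(ancestors(predicted_label))
    for height, node in enumerate(ancestors(true_label)):
        if node in predicted_path:
            return height
    raise ValueError("labels do not share a common root")


def mean_mistake_severity(true_labels, predicted_labels):
    """Average LCA height over misclassified examples only (0.0 if there are none)."""
    heights = [
        lca_height(t, p)
        for t, p in zip(true_labels, predicted_labels)
        if t != p
    ]
    return sum(heights) / len(heights) if heights else 0.0
```

Under this measure, confusing a violin with a cello (both strings) counts as a milder mistake than confusing a violin with a trumpet, although plain accuracy treats the two errors identically.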

Specifically, in this Master's Thesis, the following objectives should be accomplished:
(1) A literature review of existing methods for error severity reduction, and existing methods for error severity measurement, drawn especially from Computer Vision.
(2) A literature review of hierarchical taxonomies for musical instruments or genres, either from existing work (e.g., [3]), from music theoretic principles, or using an unsupervised approach (e.g., clustering).
(3) The re-implementation of a simple baseline model for the chosen task (genre classification or instrument detection).
(4) The implementation of a suitable metric for measuring error severity (from (1)).
(5) The implementation and evaluation of at least 2 strategies for minimizing error severity using the baseline model from (3) and the evaluation metric from (4), compared against the model's baseline performance. At least one method should involve an implemented hierarchical classification strategy.
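As one possible illustration of objective (5), the sketch below replaces the usual one-hot target with soft labels that give more probability mass to classes close to the true class in the hierarchy, and then computes a cross-entropy against those soft labels. The toy taxonomy, the `beta` temperature parameter and all function names are assumptions made for this example. It is only one of several conceivable strategies, and it is not claimed to be the method of any cited reference.

```python
import math

# Hypothetical leaf classes and toy taxonomy (child -> parent).
CLASSES = ["violin", "cello", "trumpet", "trombone"]
PARENT = {
    "violin": "strings",
    "cello": "strings",
    "trumpet": "brass",
    "trombone": "brass",
    "strings": "instrument",
    "brass": "instrument",
}


def ancestors(label):
    """Return the path from a label up to the root, the label itself included."""
    path = [label]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def lca_height(label_a, label_b):
    """Steps from label_a up to the lowest common ancestor of both labels."""
    path_b = set(ancestors(label_b))
    for height, node in enumerate(ancestors(label_a)):
        if node in path_b:
            return height
    raise ValueError("labels do not share a common root")


def soft_targets(true_label, beta=2.0):
    """Target distribution proportional to exp(-beta * hierarchical distance)."""
    weights = [math.exp(-beta * lca_height(true_label, c)) for c in CLASSES]
    total = sum(weights)
    return [w / total for w in weights]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def hierarchical_cross_entropy(logits, true_label, beta=2.0):
    """Cross-entropy between hierarchy-derived soft targets and the model output."""
    targets = soft_targets(true_label, beta)
    return -sum(q * lp for q, lp in zip(targets, log_softmax(logits)))
```

With a true label of `"violin"`, a model that confidently predicts `"cello"` receives a lower loss than one that confidently predicts `"trumpet"`. A larger `beta` moves the targets back toward one-hot labels, so the parameter trades off standard accuracy against mistake severity.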

Finally, the student should write a final thesis document.

References:
[1] Jeanneret, G., Pérez, J. C., & Arbeláez, P. (2021). A Hierarchical Assessment of Adversarial Severity. IEEE/CVF International Conference on Computer Vision Workshops, 61-70.
[4] Bogdanov, D., Won, M., Tovstogan, P., Porter, A., & Serra, X. (2019). The MTG-Jamendo Dataset for Automatic Music Tagging. Machine Learning for Music Discovery Workshop, International Conference on Machine Learning (ICML 2019).

**What you bring to the table**

Prerequisites for this topic are knowledge of machine learning and deep learning, as well as a passion for (and some knowledge of) music.

**What you can expect**
- exciting market-related topics with complex issues to be solved - you can be actively involved in shaping the future
- challenges at a high level - on top of that, we offer you excellent opportunities for professional and technical training
- space to also implement your own ideas, such as in our quarterly open-topic idea contest
- an excellent technical infrastructure
- renowned partners and customers who work closely with you to develop the technologies of tomorrow
- a very good work-life balance thanks to flexible working hours, a co-child office, the option of digital childcare in case of daycare shortages, and the possibility of mobile working, because family comes first - we know that
- an open-minded and interested team, a tolerant and friendly atmosphere as well as regular team events
- good transport connections and proximity to the state capital Erfurt
- attractive special offers as part of Fraunhofer corporate benefits with numerous enterprise partners
- new work and diversity are not just empty buzzwords, but an integral part of our corporate culture

With its focus on developing key technologies that are vital for the future and enabling the commercial


