Surely more than once we have picked up our phone to find out what song is playing in a bar, or in a series we are watching on television. Nowadays it is a gesture we make without being aware of the technology behind applications such as Shazam. That technology is known as "audio fingerprinting", and it consists of identifying an audio recording by extracting unique characteristics from its signal and converting them into a compact dataset, the fingerprint, that uniquely identifies the recording. The process involves several steps: extracting audio features, creating the fingerprint, and searching for matches in a database of known recordings.
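The steps just described can be sketched in code. The following is a minimal, illustrative version of the classic spectral-peak approach popularised by Shazam-style systems; it is not BMAT's actual algorithm, and all function names and parameters (frame size, peaks per frame, fan-out) are hypothetical choices for the sketch:

```python
# Illustrative audio-fingerprinting sketch: spectrogram -> peak landmarks -> hashes.
# Parameters and structure are assumptions for demonstration, not a production system.
import numpy as np
from hashlib import sha1

def spectrogram(signal, frame_size=256, hop=128):
    """Magnitude spectrogram via a windowed short-time FFT."""
    window = np.hanning(frame_size)
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames)  # shape: (num_frames, num_bins)

def extract_peaks(spec, peaks_per_frame=3):
    """Keep the strongest frequency bins of each frame as (time, bin) landmarks."""
    peaks = []
    for t, frame in enumerate(spec):
        for b in np.argsort(frame)[-peaks_per_frame:]:
            peaks.append((t, int(b)))
    return peaks

def fingerprint(peaks, fan_out=5):
    """Hash pairs of nearby peaks into compact (code, anchor_time) fingerprints."""
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:i + 1 + fan_out]:
            code = sha1(f"{f1}|{f2}|{t2 - t1}".encode()).hexdigest()[:10]
            hashes.append((code, t1))
    return hashes
```

Hashing peak *pairs* rather than single peaks is what makes such fingerprints distinctive yet tolerant of noise: a few surviving landmark pairs are enough to identify a recording.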
Sound recognition has become an important task in many fields, such as security, music, and advertising, but it remains challenging because of the complexity of sounds and the variety of sound sources in the environment. One of the most common applications is identifying songs playing on the radio or in an online playlist: audio fingerprinting recognises a song from the unique characteristics of its recording, automatically providing information such as the title, artist and album. It can also be very useful for detecting songs that have been uploaded illegally, by comparing their fingerprints with those of legal recordings in a database.
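The database comparison mentioned above is typically done with an inverted index and offset voting: every query hash is looked up, and matches that agree on the same song *and* the same time offset accumulate votes. This is a simplified sketch under those assumptions (the data layout and function names are hypothetical):

```python
# Sketch of fingerprint matching: inverted index + time-offset voting.
# A genuine match produces many hashes agreeing on one (song, offset) pair.
from collections import Counter, defaultdict

def build_index(songs):
    """songs: {song_id: [(hash_code, time), ...]} -> {hash_code: [(song_id, time), ...]}"""
    index = defaultdict(list)
    for song_id, hashes in songs.items():
        for code, t in hashes:
            index[code].append((song_id, t))
    return index

def match(index, query_hashes):
    """Return the (song_id, vote_count) with the most consistent time alignment."""
    votes = Counter()
    for code, q_t in query_hashes:
        for song_id, db_t in index.get(code, []):
            # Votes only pile up when hashes agree on the same relative offset.
            votes[(song_id, db_t - q_t)] += 1
    if not votes:
        return None
    (song_id, _offset), count = votes.most_common(1)[0]
    return song_id, count
```

For example, a query whose hashes all line up with one song at a constant offset wins decisively, even if a few stray hashes collide with other songs; that robustness to spurious matches is why offset voting is standard in fingerprint lookup.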
«Audio Fingerprinting has been studied for a long time, but there are certain scenarios where it does not work well. I am referring to noisy environments such as concerts, bars and outdoor shows, or environments where music plays in the background, as often happens on radio and television»
Guillem Cortès Sebastià
Developing a music monitoring system is a technological challenge, especially when you have to consider the noisy scenarios in which music is often heard, such as concerts, bars or outdoor shows. The NextCore project of the company BMAT was born to address this problem, creating an innovative technology for real-time music monitoring. BMAT was founded in 2005 as a spin-off of the Music Technology Group (MTG) at Pompeu Fabra University (UPF). Research has always been present in the company and has played a decisive role in its success, as its various patents and publications confirm. BMAT has already launched two Industrial Doctorates: the first project, carried out by industrial doctor Blai Meléndez Catalán, successfully developed a technology that has already been incorporated into the services the company offers.
Guillem Cortès Sebastià is the doctoral student of the second Industrial Doctorates project carried out at BMAT in collaboration with UPF, under the supervision of Professor Xavier Serra and Dr. Emilio Molina. The objective of the project is to investigate how to improve music monitoring with deep learning algorithms. Cortès has always had an intense relationship with music, which eventually led him to discover the importance of the relationship between music and mathematics. The two are closely related: many of the fundamental aspects of music, such as rhythm, melody, harmony and form, can be described and understood through mathematical concepts. Cortès was surprised to discover how the sounds we hear connect music and mathematics. For example, the note "La3", often used as a reference for tuning, produces a sound wave that vibrates at a frequency of 440 Hz. Playing the same note an octave higher, the so-called "La4", gives a frequency of 880 Hz, twice that of the first. The same happens with chords that sound good to our ears: the frequencies of the notes that form them have simple mathematical relationships.
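The octave relationship described above generalises neatly: in twelve-tone equal temperament, each octave doubles the frequency and is divided into 12 equal semitone steps. A few lines of code make this concrete (the function name is ours, and the 440 Hz reference is the standard tuning pitch mentioned in the text):

```python
# Equal-temperament note frequencies relative to the 440 Hz reference note.
# Each semitone multiplies the frequency by 2**(1/12); 12 semitones double it.
def note_frequency(semitones_from_reference: int) -> float:
    """Frequency of a note n semitones above (or below) the 440 Hz reference."""
    return 440.0 * 2 ** (semitones_from_reference / 12)
```

So `note_frequency(12)` gives 880.0 Hz, the octave doubling from the article, and intermediate values give the notes in between.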
«The great challenge that BMAT faces is to use Audio Fingerprinting in noisy scenarios and in situations where music plays in the background, as often happens on radio and television»
Guillem Cortès Sebastià
Cortès explains that the aim of the project is to improve music monitoring systems and make them more robust in multiple settings, as well as to create tools to encourage research in this field: "Audio Fingerprinting has been studied for some time, but there are certain scenarios where it does not work well. I am referring to noisy environments such as concerts, bars and outdoor shows, or environments where music plays in the background, as often happens on radio and television." The monitoring system of the NextCore project is based on the Audio Fingerprinting technology explained at the beginning of this article, and the great challenge BMAT faces is to make it work in noisy scenarios and in situations where music plays in the background, as often happens on radio and television.
Thanks to the collaboration between BMAT and UPF, the NextCore project can change the way music monitoring systems adapt to noisy environments. This research, which combines music and mathematics, has the potential to encourage the development of new tools in the field of audio identification.