Content-based sound retrieval

Content-based sound retrieval refers to techniques for searching sound files by features of their content, using specialist software. It is particularly helpful when working with large databases. It is often preferable to searches that rely on metadata, since metadata can be expensive and time-consuming to produce: humans must describe each individual item in the database.

For music files, MIR (Music Information Retrieval) searches can be performed by providing an example of the sound, e.g. using a sample sound file, whistling into the computer’s microphone, or tapping the computer’s keyboard. This process is called ‘query by example’. Music files can also be searched using the Parsons code, which roughly describes the melodic contour of a piece by indicating whether the pitch of each note goes up (U), goes down (D) or repeats (R) relative to the previous note.
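
As an illustration of the contour encoding described above, the following is a minimal sketch in Python. It assumes the melody is already available as a list of pitch values (MIDI note numbers here), which is an assumption of this example; extracting pitches from audio is a separate step. The function name parsons_code is hypothetical, and the leading "*" follows the common convention of marking the first note, which has no previous note to compare against.

```python
def parsons_code(pitches):
    """Return the Parsons code for a sequence of pitch values.

    `pitches` is assumed to be a list of numbers (e.g. MIDI note numbers).
    """
    if not pitches:
        return ""
    symbols = ["*"]  # conventional marker for the first note
    # Compare each note with the one before it.
    for previous, current in zip(pitches, pitches[1:]):
        if current > previous:
            symbols.append("U")  # pitch goes up
        elif current < previous:
            symbols.append("D")  # pitch goes down
        else:
            symbols.append("R")  # pitch repeats
    return "".join(symbols)


# Opening of "Twinkle, Twinkle, Little Star": C C G G A A G
twinkle = [60, 60, 67, 67, 69, 69, 67]
print(parsons_code(twinkle))  # prints *RURURD
```

Because the code records only the direction of each step, two performances in different keys or with slightly inaccurate intervals yield the same string, which is why it suits rough queries such as whistled melodies.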

Spoken-word files can be searched using ‘semantic retrieval’, i.e. by entering keywords. The software then identifies the part of the file where these words are spoken.
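
The keyword lookup described above can be sketched as follows. This is only an illustration under a strong assumption: that the recording has already been converted into time-stamped transcript segments (for example by a speech recogniser), represented here as hypothetical (start, end, text) tuples with times in seconds. The function name find_keyword and the sample data are invented for the example and do not describe any particular retrieval system.

```python
import re


def find_keyword(segments, keyword):
    """Return the (start, end, text) segments in which `keyword` is spoken.

    `segments` is assumed to be a list of (start_seconds, end_seconds, text)
    tuples taken from a time-stamped transcript.
    """
    # Match the keyword as a whole word, ignoring upper/lower case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [segment for segment in segments if pattern.search(segment[2])]


# Hypothetical transcript of a short recording.
segments = [
    (0.0, 4.2, "Welcome to the sound archive."),
    (4.2, 9.8, "This recording was made in the archive reading room."),
    (9.8, 13.5, "Thank you for listening."),
]

for start, end, text in find_keyword(segments, "archive"):
    print(f"{start:.1f}s to {end:.1f}s: {text}")
```

The start and end times of each matching segment indicate which part of the file to play back.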

Related methods include: Content analysis, Content-based image retrieval and Searching and querying.

