507: Automatic Inference of Humpback Whalesong Grammar

Proposer: Ashley Walker and Robert Fisher, 650-3098, rbf@aifh

Suggested Supervisors: Fisher, Hallam, Walker

Principal goal of the project: Previous work in this department has resulted in a self-organising network capable of robustly inferring phonemes (or 'whalesong units') from unsegmented acoustic signals. The primary goal of the present project is to develop a methodology and program for analysing the structural relationships amongst units classified by this network. If successful, the program will be used to determine whether the structure of songs in our database conforms to the currently accepted grammar. Time permitting, the student will use the program to address fundamental research questions including:


Humpback whales (Megaptera novaeangliae) emit long, complex, patterned vocalisations, or "songs". Humpback whales form a number of discrete populations, each of which, at any point in time, can be characterised by a unique song shared by all of its singing members. The songs of all populations studied to date appear to share a complex hierarchical structure. Over a series of years, the characteristic song of each humpback population changes extensively and irreversibly (within the confines of this grammar) and, like the songs themselves, song evolution appears to be governed by syntax-like rules.

While humpback whale songs demonstrate a remarkable amount of regular high-level structure, they are composed of a variety of complex and transient elemental phonological "units". Reliable analysis of song structure requires robust unit classification -- a requirement which has made the process difficult to automate. Recent research in this department on humpback whale phonology [Mitsakakis 1996; Mitsakakis et al. 1996; Walker et al. 1997] has demonstrated that it is possible to reliably detect and classify units using a hierarchical classification space generated by a topographic mapping algorithm. The success of this approach arises in part because the complete acoustic structure of song units is analysed, rather than the extracted fundamental pitch frequencies used in previous studies. This technique is robust to random variations in sound due to the variability inherent in the marine mammal sound production mechanism and, consequently, is also robust to the low signal-to-noise conditions of many hydrophone recordings.
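
For illustration only, the topographic-mapping idea can be sketched as a minimal one-dimensional self-organising map over toy feature vectors. This is a generic SOM sketch in Python, not the department's actual classifier; all names, parameters, and data here are invented:

```python
import numpy as np

def train_som(data, n_nodes=8, epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    """Train a minimal 1-D self-organising map: each node's weight vector
    drifts toward the inputs that it (and its line neighbours) win, so
    similar inputs land on nearby nodes -- a topographic classification
    space in miniature."""
    rng = np.random.default_rng(seed)
    # Initialise node weights from randomly chosen data points.
    weights = data[rng.choice(len(data), size=n_nodes, replace=False)].astype(float)
    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)              # decaying learning rate
        sigma = sigma0 * (1.0 - epoch / epochs) + 0.5  # shrinking neighbourhood
        for x in rng.permutation(data):
            # Best-matching unit: the node nearest the input.
            bmu = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
            for j in range(n_nodes):
                # Gaussian neighbourhood kernel over node indices.
                h = np.exp(-((j - bmu) ** 2) / (2.0 * sigma**2))
                weights[j] += lr * h * (x - weights[j])
    return weights

def classify(weights, x):
    """Label a feature vector by its nearest map node."""
    return int(np.argmin(np.linalg.norm(weights - x, axis=1)))
```

The actual system operates on the complete acoustic structure of each unit rather than toy vectors, but the principle -- nearest-node labelling in a learned topographic space -- is the same.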

During development of the unit classification algorithm, the structure of songs in our database was observed to vary in characteristic ways from the hypothesised whalesong grammar. To quantify these variations precisely and assess their significance, much larger sets of recordings must be processed. As the world-wide corpus of recorded whalesong is on the order of several hundred hours, an automated tool is clearly required. The present project proposes to create that tool. Specifically, the student will expand and use the previously developed sub-symbolic unit detector and classifier as the first processing layer in a hierarchical program -- the second layer of which will infer song structure from the transitions between classified units. It remains to be determined (by the student in collaboration with the supervisors) what form this second-layer program should take (e.g., a string-matching algorithm or a connectionist network) and where (if at all) the boundary between the symbolic and sub-symbolic layers should lie.
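
As a flavour of what the second, structural layer might compute, the sketch below tabulates first-order transition probabilities between classified units. The unit labels are hypothetical, and -- as noted above -- the actual form of this layer is still to be determined; a simple transition-count model is just one candidate:

```python
from collections import Counter, defaultdict

def transition_model(unit_sequences):
    """Estimate first-order transition probabilities between song units
    from a collection of (possibly fragmentary) label sequences."""
    counts = defaultdict(Counter)
    for seq in unit_sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    model = {}
    for a, nexts in counts.items():
        total = sum(nexts.values())
        model[a] = {b: n / total for b, n in nexts.items()}
    return model

# Hypothetical unit-label fragments emitted by the classifier layer:
fragments = [list("ABABCD"), list("BCDAB"), list("ABCDA")]
model = transition_model(fragments)
```

Strong transition regularities in such a table would be evidence of grammatical structure; systematic departures from the expected table would flag the kinds of variation from the hypothesised grammar mentioned above.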

The following three further factors (one resulting from the nature of the humpback whale songs themselves, and two from the way they are collected) must be taken into account in the construction of the structural inference layer:

  1. Recording fragmentation. Because whalesongs are recorded in the whale's environment (rather than in conditions better suited to the recordist), it is rarely possible to track and record a single singer from the beginning to the end of his song. Therefore, algorithms which seek to infer song structure from transitions between units must be capable of operating on song fragments -- i.e., they must be insensitive to absolute song starting points.
  2. Redundant repetitions. The signals themselves are further complicated by the fact that the duration of songs varies, owing to stutter-like repetitions of units. This phenomenon is currently regarded as behaviourally insignificant and, therefore, it is desirable that multiple occurrences of sound material repeated in this way be ignored.
  3. Rhythm. The information contained in a whalesong lies in the types and ordering of its units, as well as in the timing of their delivery. Therefore, whalesong grammar inference algorithms must take into account where in a song's rhythmic structure particular strings of units occur.
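
The first two factors can be illustrated with small sketches, assuming (hypothetically) that songs arrive as sequences of unit labels; factor 3 would additionally require per-unit timing information and is not shown:

```python
from itertools import groupby

def collapse_repeats(units):
    """Factor 2: stutter-like repetitions are currently regarded as
    behaviourally insignificant, so collapse consecutive duplicate
    units to a single occurrence."""
    return [u for u, _ in groupby(units)]

def cyclic_equal(a, b):
    """Factor 1: a recording rarely starts at the true beginning of a
    song, so two complete song cycles should compare equal regardless
    of rotation (i.e., of the absolute starting point)."""
    if len(a) != len(b):
        return False
    doubled = a + a
    return any(doubled[i:i + len(b)] == b for i in range(len(a)))
```

For example, collapse_repeats(list("AABBBCA")) yields ['A', 'B', 'C', 'A'], and cyclic_equal(list("ABCD"), list("CDAB")) is True because the second fragment is simply the same cycle recorded from a different starting point.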

This project should lead to a publishable article.

Note: Bob Fisher will be on sabbatical during the MSc project period, so: 1) he will select only a small number of projects to supervise, and 2) he will be in Sweden from June to mid-August, so a second academic supervisor will be involved.

Resources Required: C/C++, MATLAB (Signal processing toolboxes)

Degree of Difficulty: Medium conceptual difficulty. The full research project is hard, and it is unclear whether convincing results can be obtained, given the previously unexplored nature of the problem.

Background Needed: Some knowledge of speech or signal processing and of connectionist computing. Strong interdisciplinary interest in acoustic communication and sensory ecology. Eagerness to explore basic research issues.

Degree Programmes Suitable: MSC AI/CS4 AI/M4