PyData Global 2023

VocalPy: a core Python package for acoustic communication research
12-06, 17:00–17:30 (UTC), General Track

Almost all animals communicate with sound, but as far as we know only humans speak languages. How did speech evolve? How do animals like birds, bats, and dolphins learn their songs, and is it similar to how we learn to speak? Questions like these are answered by the study of acoustic communication. This talk will get you acquainted with this exciting research. Along the way you'll hear many different animal sounds, and find out how researchers in this area are using neural network models. You'll learn why there is a need for a core package for researchers in this area (think AstroPy for astronomy). We will present VocalPy, a package we've developed to meet that need, and give a demo of its features. Then we'll present some results we've obtained with VocalPy on evaluating methods for segmenting audio into sequences of animal sounds. Finally we'll share our development roadmap, and tell you how you can get involved with the VocalPy community.


Outline

Introduction (10 minutes)

  • This section will introduce acoustic communication research, and motivate the need for a core Python package for researchers in this area.
  • We'll explain why we see a need for VocalPy, based on our previous experience developing TweetyNet, a neural network model for automatic annotation of animal vocalizations, and the vak and crowsetta libraries.
  • Takeaway: acoustic communication research needs a core package that is:
      • expressive enough for researchers to write readable analyses with high-level code that they can share, and that other groups can easily reproduce
      • robust enough for research software engineers to use in the back end of applications such as a GUI

Features of VocalPy In Depth (~10 minutes)

  • This section will include a demo of the core features of VocalPy (see the sketch after this list), including:
      • Data types: audio, spectrograms, annotations
      • Classes for common steps in acoustic communication workflows: segmenting audio, generating spectrograms
      • Datasets with metadata captured automatically
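
To give a flavor of what the demo covers, here is a minimal sketch of a VocalPy workflow. The names used (Sound.read, spectrogram, Segmenter, segment.meansquared) follow VocalPy's documentation, but the API has evolved across versions, so treat this as illustrative rather than a verbatim excerpt from the talk; the audio path is a placeholder.

    import vocalpy as voc

    # Data type: load a recording into a Sound object
    # (replace the path with one of your own .wav files)
    sound = voc.Sound.read("path/to/recording.wav")

    # Common workflow step: generate a spectrogram from the sound
    spect = voc.spectrogram(sound)

    # Common workflow step: segment audio into a sequence of animal sounds,
    # here with a signal-processing method (mean squared energy)
    segmenter = voc.Segmenter(callback=voc.segment.meansquared)
    segments = segmenter.segment(sound)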

Results (~7.5 minutes)

  • This section will present results benchmarking different methods for segmenting audio.
  • We'll compare standard signal-processing methods with neural network models, focusing on metrics used in information retrieval, such as precision and recall (illustrated in the sketch below).
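
To make these metrics concrete, here is a generic sketch (not the talk's actual benchmarking code) of how precision and recall can be computed for boundary detection: a predicted boundary time counts as a hit when it falls within a small tolerance of a reference boundary that hasn't already been matched.

    import numpy as np

    def boundary_precision_recall(reference, predicted, tol=0.01):
        """Precision and recall for predicted boundary times (in seconds).

        A predicted boundary counts as a true positive if it lies within
        ``tol`` seconds of a reference boundary that has not already been
        matched (greedy one-to-one matching).
        """
        reference = np.sort(np.asarray(reference, dtype=float))
        predicted = np.sort(np.asarray(predicted, dtype=float))
        matched = np.zeros(len(reference), dtype=bool)
        hits = 0
        for t in predicted:
            if len(reference) == 0:
                break
            dists = np.abs(reference - t)
            dists[matched] = np.inf  # each reference boundary matches at most once
            i = int(np.argmin(dists))
            if dists[i] <= tol:
                matched[i] = True
                hits += 1
        precision = hits / len(predicted) if len(predicted) else 0.0
        recall = hits / len(reference) if len(reference) else 0.0
        return precision, recall

    # Toy example: two of three predicted boundaries land within 10 ms
    # of a reference boundary, so precision = recall = 2/3
    ref = [0.10, 0.55, 1.20]
    pred = [0.105, 0.80, 1.195]
    print(boundary_precision_recall(ref, pred))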

Conclusion (2.5 minutes)

  • We'll close with our development roadmap, and an invitation to participate in the VocalPy community.

Prior Knowledge Expected

No previous knowledge expected