{% extends "base-with-sidebar.html" %} {% load static %} {% block content-center %}
Koe (Japanese for 'voice') is open-source software for segmenting, measuring, classifying, filtering and exporting acoustic units. It is a complete acoustic database solution, suitable for any animal species with distinct acoustic units.
You can use it on the web from any device at koe.io.ac.nz, which makes it ideal for collaboration, education, and citizen science. (In the future the code will also be available to download and deploy on your own local network or intranet, requiring minimal knowledge of Python and command line tools.)
Acoustic communication is fundamental to the behaviour of many species. If we want to understand animal behaviour we need to get to grips with their vocalisations. What information are they sharing, and how is it encoded?
Very often, acoustic communication is structured as a temporal sequence of distinct acoustic units (syllables), where information is encoded in the types of units and sometimes their temporal arrangement (syntax). In such cases, this is the flowchart for acoustic analysis:
The classification of syllables into types is a key step, because once you have a dataset of labelled syllables, you can analyse repertoire size, syntax and meaning and compare between individuals, sexes, sites, and seasons.
Manual classification by human eye and ear remains the primary and most reliable method for most species, but is hindered by a lack of tools, especially for large and diverse datasets.
That’s where Koe comes in. By facilitating the processing of acoustic units in a streamlined workflow, Koe makes large-scale classification practicable and expands the possibilities for bioacoustic research.
The workflow of Koe is intuitive and flexible. Here is a suggested workflow:
Raw recordings are imported and segmented into “songs” (vocalisation bouts), which are in turn segmented into “syllables” (acoustic units). Segmented syllables populate a table in Syllables view with one row per syllable. The user may first run one of Koe’s algorithms to pre-sort the syllable rows by acoustic similarity, thus expediting classification. The user then classifies syllables into types by eye and ear. (Each syllable row has a spectrogram that plays when clicked.)
To classify, the user selects one or more rows and applies a label to the selection. Every new label type is automatically added to Exemplars view, which shows one row per type and serves as a helpful reference during labelling. To check the accuracy of syllable labelling, the user sorts the data by label to look for ‘odd ones out’, and compares types in Exemplars view to see if any need to be merged or split.
Once the user is satisfied with the labelling, the classification should be validated. The user can produce a test database to be classified by other independent observers. The labels of participants can then be collated to examine concordance. Syllable measurements can also be extracted and exported to clustering software to examine statistical separation of types.
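As an illustration of examining concordance, here is a minimal sketch that compares the labels of two observers using Cohen's kappa. The choice of Cohen's kappa is an assumption made for this example (Koe does not prescribe a statistic), and the function name `cohen_kappa` and the label "trill" are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two observers who labelled the same syllables."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Proportion of syllables on which the two observers agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each observer's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for six syllables, in the same order for both observers.
owner = ["upsqueak", "upsqueak", "alarmy", "alarmy", "trill", "trill"]
observer = ["upsqueak", "upsqueak", "alarmy", "trill", "trill", "trill"]
kappa = cohen_kappa(owner, observer)
print(round(kappa, 2))
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance. With more than two observers, a multi-rater statistic would be needed instead.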
The validated database can then be used to analyse repertoires, syntax, and meaning, and to compare across individuals, sites, or times.
Go to koe.io.ac.nz, click on Guest (top right), then register an account. Use invitation code FRIEND-OF-KOE.
Click Songs > Using label on the navigation bar to enter Koe. Create a new database by clicking on the gears icon (top right) > Database > Create new database.
You can request access to an existing database by clicking Database > Request access and choosing a database from the list. Once the database owner approves your request, you will be able to view and edit the database (depending on your permission level).
Your new database is blank, of course, so you need to add recordings to it. If you have already-segmented song files on your computer, you can upload these directly. In Songs view, click Upload > Audio recording (WAV). Otherwise...
To upload a raw recording, in Songs view, click Upload > Raw recording on the bottom bar. Select a WAV file to upload and click Open.
Once your raw recording loads, play the sound, looking for songs to segment. Change the playback speed and adjust the spectrogram contrast with the sliders (top left).
To segment a song, simply drag over the spectrogram to create a selection box. Click the selection box to play it. Adjust the selection box endpoints by holding Shift and dragging the box handles that appear. For each song you segment, a row appears in the table beneath. You can type in annotation columns in this table.
Once you’ve finished segmenting, fill in track name and record date, check all checkboxes and hit Save to upload the songs you’ve segmented.
Go to Songs > Using label on the navigation bar. You will see a grid with one song per row. To segment a song into units, click its filename in the Filename column. This will take you to segmentation view.
Drag over syllables on the spectrogram to segment. Mouse over a selection box to highlight that syllable’s row in the table. Click a selection box for playback. Fill in annotation columns as desired. Then hit Save segmentation to upload your syllables to the database.
You can alter the segmentation at any time by going to Songs view and clicking a song filename in the Filename column to return to the segmentation view.
Koe takes a vast array of measurements on segmented syllables. There are frame-by-frame measures, taking a measurement for every timestep of every unit, as well as whole-unit measures. Because they are computationally expensive, currently the measurements cannot be run on the server; you will need to download Koe to run on your local machine.
Librosa features:
spectral flatness
spectral bandwidth
spectral centroid
spectral contrast
tonnetz
spectral rolloff
chroma stft
chroma cqt
chroma cens
mfcc
zero crossing rate
Raven features:
total energy
aggregate entropy
average entropy
average power
max power
max frequency
Multitaper features:
frequency modulation
amplitude modulation
goodness of pitch
amplitude
normalise
entropy
mean frequency
spectral continuity
Other features:
duration
frame entropy
average frame power
max frame power
dominant frequency
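To illustrate the difference between frame-by-frame and whole-unit measures, here is a minimal sketch using only the Python standard library. It is not Koe's implementation, nor that of Librosa or Raven: the helper names (`frame_signal`, `frame_power`, `zero_crossing_rate`), the frame length, and the hop size are all assumptions, and the definitions are deliberately simplified.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames of equal length."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def frame_power(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum((a >= 0) != (b >= 0) for a, b in zip(frame, frame[1:]))
    return crossings / (len(frame) - 1)


sample_rate = 8000  # Hz (assumed)
# A synthetic 0.1 s, 1 kHz tone standing in for one segmented syllable.
syllable = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(800)]

# Frame-by-frame measures: one value per timestep of the unit.
frames = frame_signal(syllable, frame_len=256, hop=128)
powers = [frame_power(f) for f in frames]
zcrs = [zero_crossing_rate(f) for f in frames]

# Whole-unit measures: one value per syllable.
duration = len(syllable) / sample_rate
average_frame_power = sum(powers) / len(powers)
max_frame_power = max(powers)
```

The frame-by-frame lists grow with the length of the unit, whereas each whole-unit measure is a single number per syllable, which is why the latter are easier to export to clustering software.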
Go to Syllables > Label them. Here in Syllables view, all your segmented units are in a table and you can start classifying them by applying labels!
The first thing you want to do is run a sort algorithm, to pre-sort your syllables by spectral similarity. This pre-sorting provides a major ‘leg up’ for rapid manual classification, by grouping syllables in clusters that can be labelled in bulk.
Running the algorithms is a little involved – please contact Y.Fukuzawa@massey.ac.nz for instructions tailored to your data.
Once pre-sort algorithms have been run, you can choose them from the list at any time from the gears icon (top right) > Sort algorithm. Selecting the algorithm populates the Dendrogram Index column. If you sort the table by this column (by clicking the column header), Koe will group units by similarity.
Much like in Excel, you can click on a cell to select it, or whizz around the grid with the arrow keys. Page Up and Page Down will page through one screen-height at a time.
Sort by a column by clicking on its header (e.g. Duration). To sort by multiple columns at once, Shift+click subsequent headers.
To hear a syllable, simply click on its spectrogram, or hit spacebar with the spectrogram cell selected. There is a speed slider at the top left of the screen for slowing down playback; this helps our human ears hear details not apparent at full speed.
The filter box at the top of the screen is a quick and powerful way to filter your data, using regex. See https://www.computerhope.com/jargon/r/regex.htm for a crash course on using regex.
To filter rows with a certain label, right-click the Label column and click Filter by this column.
Then type a string, e.g.
label: crazysqueak
You can filter most columns in this way. You can also filter by multiple columns at once.
label: crazysqueak song: BobtheBat_1.wav
To filter numerical columns, such as ID, use '==' for 'exactly', '>' and '<' for 'greater than' and 'less than', respectively. '^$' is a handy way of finding blanks; label:^$ will find all rows with blank labels.
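To make the regex behaviour of text filters concrete, here is a minimal sketch of how a `column: pattern` filter string could be applied to rows of data. This is an assumption about the general idea, not Koe's actual filter code: `parse_filter`, `filter_rows`, and the sample rows are hypothetical, and the numerical operators ('==', '>', '<') are not handled.

```python
import re


def parse_filter(text):
    """Turn 'label: x song: y' into [('label', 'x'), ('song', 'y')]."""
    return re.findall(r"(\w+):\s*(\S*)", text)


def filter_rows(rows, text):
    """Keep only rows in which every named column matches its regex."""
    conditions = parse_filter(text)
    return [row for row in rows
            if all(re.search(pattern, row.get(column, ""))
                   for column, pattern in conditions)]


# Hypothetical rows, as they might appear in Syllables view.
rows = [
    {"id": "1", "label": "crazysqueak", "song": "BobtheBat_1.wav"},
    {"id": "2", "label": "crazysqueak", "song": "BobtheBat_2.wav"},
    {"id": "3", "label": "", "song": "BobtheBat_1.wav"},
]

by_label = filter_rows(rows, "label: crazysqueak")
by_label_and_song = filter_rows(rows, "label: crazysqueak song: BobtheBat_1.wav")
blanks = filter_rows(rows, "label:^$")
```

Because '^' anchors the start of the text and '$' the end, the pattern '^$' can only match an empty value, which is why it finds unlabelled rows.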
Labelling can be done in two ways: individually, or in bulk.
Simply click in the label cell and start typing the label. If the label already exists you will see it in the dropdown list. Click the list item you want, or use Alt+Down to navigate down the list and Enter to select.
Select multiple rows of the same type by clicking their checkboxes or hitting spacebar with the checkbox cell selected. When you have finished selecting rows, hit Ctrl+Shift+L to bulk label all selected rows.
Ctrl+Shift+F bulk labels the Family, and Ctrl+Shift+S bulk labels the Subfamily.
After choosing the right label from the dropdown list, click Set (or stick to the keyboard with Tab followed by Enter).
Deselect all selected rows with Ctrl+` (backtick).
Every new label type you create in Syllables view is automatically added to Exemplars view, which shows one row per type with up to 10 exemplars in each row. Go to Exemplars > By label on the navigation bar to see your exemplars, and click on spectrograms to play the sound, just as in other views.
You can have Koe open in as many tabs as you want, so it’s easy to have exemplars in one tab and syllables in another, to help you label by comparison. If you make changes in one tab, refresh your other tabs to see the changes.
Click on Songs > By label to access Songs view. Here you can see all your songs, with one row per song. You will see the song syntax as a sequence of unit labels. Click on a unit to play it. Click the play button at the start of a sequence to play the entire song. Use the filter to search for songs of a particular unit sequence, e.g.
sequence:"upsqueak"-"alarmy"
This will filter for all songs with an "upsqueak" syllable immediately followed by an "alarmy" syllable. Useful for exploring syntax.
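The idea behind a sequence filter can be sketched as a search for a run of consecutive labels within each song. The following is a minimal illustration under stated assumptions, not Koe's implementation: `parse_sequence`, `contains_run`, the song names, and the "intro" label are hypothetical.

```python
import re


def parse_sequence(query):
    """Extract the quoted labels from a query such as '"a"-"b"'."""
    return re.findall(r'"([^"]*)"', query)


def contains_run(song_labels, wanted):
    """True if `wanted` occurs as consecutive items in `song_labels`."""
    n = len(wanted)
    return any(song_labels[i:i + n] == wanted
               for i in range(len(song_labels) - n + 1))


# Hypothetical songs, each given as its sequence of unit labels.
songs = {
    "song_a": ["intro", "upsqueak", "alarmy"],
    "song_b": ["upsqueak", "intro", "alarmy"],
}

wanted = parse_sequence('"upsqueak"-"alarmy"')
matches = [name for name, labels in songs.items()
           if contains_run(labels, wanted)]
```

Here only song_a matches: song_b contains both labels, but another unit sits between them, so they are not immediately adjacent.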
You can export the data from each view as a CSV file by clicking Export on the bottom bar. You will see two options: export the entire dataset, or export a filtered subset (see Filtering section above).
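Once exported, the CSV can be analysed with any tool you like. As a minimal sketch, the following counts syllables per label and the number of distinct labels per song using Python's standard library. The column names (`id`, `label`, `song`) and the sample rows are assumptions for illustration; check the header of your own export before adapting it.

```python
import csv
import io
from collections import Counter

# Stand-in for an exported file; replace with open("your_export.csv", newline="").
exported = io.StringIO(
    "id,label,song\n"
    "1,upsqueak,BobtheBat_1.wav\n"
    "2,alarmy,BobtheBat_1.wav\n"
    "3,upsqueak,BobtheBat_1.wav\n"
    "4,crazysqueak,BobtheBat_2.wav\n"
)

rows = list(csv.DictReader(exported))

# How many syllables carry each label.
label_counts = Counter(row["label"] for row in rows)

# How many distinct labels occur in each song.
labels_by_song = {}
for row in rows:
    labels_by_song.setdefault(row["song"], set()).add(row["label"])
repertoire_size = {song: len(labels) for song, labels in labels_by_song.items()}
```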