Catherine George is an M.Sc. Data Science student at the University of Salford on a hybrid internship in the Internet of Things. She is on the Nurture programme, which accelerates the graduate journey into the world of data science. Our Ethos Wilder Sensing project is a programme partner. The project aims to quantify biodiversity changes by applying Artificial Intelligence to data gathered from sensors in the natural environment. The collaboration provided students with valuable data they can use to complete their M.Sc.

Catherine tells her story from joining the Ethos platform to her solutions for audio data classification.

Catherine George

“From the interview to submitting the dissertation, the entire journey was a great experience. When Lucy Lynch from Eden Smith mentioned Ethos Wilder I was excited because they stood for environmental conservation.

The project assigned was to classify birds from their respective sounds. Ethos Wilder co-founder Geoff Carss highlighted a sensor which would record bird sounds in a landscape.

We were asked to create a Machine Learning model that classifies birds according to their sounds. The data for training and testing the model was obtained from the xeno-canto database (https://xeno-canto.org/), a large collection of bird sound recordings.

If we take a single one-minute recording of a particular bird, we get a single data point. But audio data holds a lot of information and is multi-dimensional. If we chop this one-minute recording into small windows, we get many data points, say 100. This process is called windowing. If the windows overlap with each other, a bird song that is cut by the edge of one window is still captured in the neighbouring window. So what do we do with all this data now?
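The windowing step described above can be sketched in a few lines of NumPy. This is only an illustration: the sample rate, the one-second window, the 50% overlap and the random stand-in signal are all assumptions, not the settings used in the project.

```python
import numpy as np


def window_signal(signal, window_size, hop_size):
    """Split a 1-D signal into fixed-length windows.

    A hop_size smaller than window_size makes consecutive windows overlap.
    """
    if len(signal) < window_size:
        return np.empty((0, window_size), dtype=signal.dtype)
    n_windows = 1 + (len(signal) - window_size) // hop_size
    return np.stack(
        [signal[i * hop_size : i * hop_size + window_size] for i in range(n_windows)]
    )


sample_rate = 22050                # assumed sample rate in Hz
rng = np.random.default_rng(0)
recording = rng.standard_normal(60 * sample_rate)  # stand-in for a one-minute recording

window_size = sample_rate          # assumed window length: 1 second
hop_size = window_size // 2        # assumed 50% overlap between windows

windows = window_signal(recording, window_size, hop_size)
print(windows.shape)               # (number of windows, samples per window)
```

With these assumed settings one recording yields 119 windows, in the same range as the 100 data points mentioned above. The second half of each window is identical to the first half of the next one, which is what keeps a song near a window edge intact in at least one data point.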

Suppose we have gathered 100 recordings for each of three different birds. At around 100 windows per recording, that gives roughly 3 × 100 × 100 = 30,000 data points for analysis. We applied several visualisation techniques, such as Principal Component Analysis (PCA) and t-SNE, to this data and obtained three distinct clusters, one for each of the three birds.
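As a rough illustration of how PCA can reveal such clusters, the sketch below projects synthetic feature vectors onto their first two principal components using NumPy only. The features here are made-up Gaussian blobs standing in for whatever was extracted from each audio window; the feature count, the number of points per bird and the helper name `pca_project` are all hypothetical.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Project the rows of `features` onto their top principal components."""
    centred = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T


rng = np.random.default_rng(0)
n_per_bird, n_features = 200, 20   # assumed sizes, for illustration only

# Synthetic stand-in data: one Gaussian blob of feature vectors per bird.
centres = rng.normal(scale=5.0, size=(3, n_features))
features = np.vstack(
    [centre + rng.normal(size=(n_per_bird, n_features)) for centre in centres]
)
labels = np.repeat(np.arange(3), n_per_bird)

projected = pca_project(features)

# Print where each bird's points land in the 2-D projection.
for bird in range(3):
    print(bird, projected[labels == bird].mean(axis=0))
```

Plotting `projected` coloured by `labels` would give the kind of scatter plot in which separate clusters become visible. t-SNE is a non-linear alternative that usually needs a dedicated library.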

We tried a number of classification algorithms on these data points, but the accuracy was lower than that of existing commercially available models. Then we came across the app developed by Cornell University, which converts audio recordings into the image domain: the recordings are turned into spectrograms.

Ethos had created Amazon Web Services (AWS) accounts with 2000 credits for the three M.Sc. students working on the project. Geoff arranged an Immersion Day class with the AWS team, where we introduced ourselves and our project. They guided us on the different approaches that could be taken and shared courses and links for learning Amazon SageMaker. This session was very informative and equipped us to use SageMaker to implement the model.

One challenge we initially faced was converting the mp3 recordings from xeno-canto into spectrograms. A spectrogram is an image which shows how the strength of each frequency in a sound varies over time.

Spectrogram of a bird's (common whitethroat) audio

Spectrograms were created for each recording using the librosa library, a Python package for music and audio analysis. In this way an audio classification problem was turned into an image classification problem. AWS has a feature called Amazon Rekognition, which could automatically assign labels to the spectrograms based on how they are stored in an S3 bucket. A model was built for ten birds using Rekognition. This model could be applied to identify birds in a particular region and would help in rewilding the environment. The birds considered for the model were from the UK only, but it could be extended to other birds.”
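To show what a spectrogram computes, here is a minimal sketch built on a short-time Fourier transform. It deliberately uses plain NumPy rather than librosa so that it runs without extra dependencies; in librosa the corresponding calls would be `librosa.stft` and `librosa.amplitude_to_db`. The synthetic 3 kHz tone, the sample rate and the FFT settings are assumptions for illustration, not the project's actual parameters.

```python
import numpy as np


def spectrogram_db(signal, n_fft=1024, hop_length=256):
    """Return a magnitude spectrogram in dB relative to its peak.

    Output shape is (frequency bins, time frames).
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop_length
    frames = np.stack(
        [
            signal[i * hop_length : i * hop_length + n_fft] * window
            for i in range(n_frames)
        ]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    peak = max(magnitude.max(), 1e-10)  # guard against an all-zero signal
    return 20.0 * np.log10(np.maximum(magnitude, 1e-10) / peak)


sample_rate = 22050                        # assumed sample rate in Hz
n_fft = 1024
t = np.arange(sample_rate) / sample_rate   # one second of audio
tone = np.sin(2 * np.pi * 3000.0 * t)      # synthetic stand-in for a bird call

spec = spectrogram_db(tone, n_fft=n_fft)
freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
peak_freq = freqs[spec.mean(axis=1).argmax()]

print(spec.shape)   # (frequency bins, time frames)
print(peak_freq)    # frequency of the strongest bin, close to 3000 Hz
```

Rendering `spec` as an image (time on the horizontal axis, frequency on the vertical axis) gives a picture like the one above, which an image classifier can then take as input.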

Geoff Carss added:

“We are working with two more groups of M.Sc. students exploring options for other aspects of Wilder Sensing. When you have millions of records for birds, animals, insects etc. over three to four years, you can ask questions of the data that were never possible before.”

If you are interested in new ways to understand how biodiversity is changing, please contact us. If you are a landowner, a data scientist, a charity or a potential funder, please get in touch with the Ethos Wilder Sensing project to discuss how we could partner with you! Contact partner Geoff Carss.

Background Information

The sensing project at EthosVO is working closely with Big Data Specialists at Eden Smith. As part of their ‘Nurture’ programme, we’ve been working with three Big Data M.Sc. students at the University of Salford: Jennifer Ikeoha, Catherine George and Mohammad Chowdhory. This collaboration has been made possible by the work of Lucy Lynch, Head of Graduate Partnerships at Eden Smith. The trio worked with us for a number of months and embraced collaborative opportunities along the way. Ethos has organised training in the use of Amazon Web Services, which has been delivered to the team and a number of Ethos Young Leaders. A huge thank you to Mohamed Ezzat, Ehab Azad and Niall Dryburgh for some great training and mentoring and for being so generous with their knowledge and insights.