Discover how Google researchers repurposed AI models trained on bird songs to detect whale calls and track marine health. Explore the future of bioacoustic monitoring.
We describe how Perch 2.0, Google DeepMind's bioacoustics foundation model, trained on birds and other terrestrial animal vocalizations, transfers ‘whale’ to underwater acoustics challenges with ‘killer’ performance.

If a pre-trained classification model, such as our multi-species whale model, already has the necessary labels and works well on a researcher's dataset, it can be used directly to produce scores and labels for their audio data. However, to create a custom classifier for newly discovered sounds, or to improve accuracy on new data, researchers can use transfer learning instead of building a new model from scratch. This approach drastically reduces the computation and experimentation needed to create a new custom classifier.

Figure: tSNE plots of the embeddings from each model on the DCLDE 2026 Ecotype dataset, which contains five ecotype variants of the killer whale (orca) species. Plots were generated with the scikit-learn PCA and tSNE libraries, with embeddings first projected to 32-dimensional vectors before tSNE was applied.
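The PCA-then-tSNE projection described in the caption can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the random arrays stand in for real model embeddings and ecotype labels, and the dimensions are placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(1)
embeddings = rng.normal(size=(150, 1536))   # stand-in for model embeddings
eco_labels = rng.integers(0, 5, size=150)   # five ecotype classes, as in DCLDE 2026

# First project the high-dimensional embeddings down to 32 dimensions with PCA,
# then run tSNE on the reduced vectors to get 2-D coordinates for plotting.
reduced = PCA(n_components=32).fit_transform(embeddings)
coords = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(reduced)
# coords[:, 0] and coords[:, 1] can now be scattered, colored by eco_labels.
```

Running tSNE on a 32-dimensional PCA projection, rather than on the raw embeddings, is a common way to denoise the input and speed up the tSNE optimization.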
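The transfer-learning workflow described above, freezing the foundation model and fitting only a small classifier on its embeddings, can be sketched like this. The embedding dimension, labels, and random vectors are all illustrative stand-ins; in practice the embeddings would come from running a pre-trained model such as Perch 2.0 over audio windows.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder data: random vectors standing in for embeddings produced by a
# frozen pre-trained bioacoustics model, with binary labels supplied by a
# researcher (e.g. "target call" vs. "background").
rng = np.random.default_rng(0)
n_clips, embed_dim = 200, 1536          # embed_dim is an assumption
embeddings = rng.normal(size=(n_clips, embed_dim))
labels = rng.integers(0, 2, size=n_clips)

# Transfer learning as a shallow probe: only this small linear classifier is
# trained; the embedding model itself is left untouched.
clf = LogisticRegression(max_iter=1000)
clf.fit(embeddings, labels)
scores = clf.predict_proba(embeddings)[:, 1]  # per-clip detection scores
```

Because only the linear layer is fit, training takes seconds on a laptop even for datasets where training a full audio model from scratch would be infeasible.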