What Our Models Listen For
H-ear uses machine learning models to identify sounds in your audio. Each model specialises in different types of sounds — from everyday noise to wildlife species. When you upload or record, you choose which model analyses your audio.
YAMNet
General sound classification
Identifies everyday sounds across 521 categories organised into 6 broad groups: Animal, Human sounds, Sounds of things, Music, Natural sounds, and Source-ambiguous sounds. Think of it as a reliable all-rounder that can tell the difference between a dog barking, traffic noise, construction, or an alarm.
Dogs, cats, birds, and other animal sounds
Traffic, sirens, construction, and machinery
Music, speech, and human activity
Weather sounds like rain, thunder, and wind
Coverage: 521 sound categories across 6 top-level tiers
Best for: General noise monitoring — residential complaints, neighbourhood disputes, baseline assessments
PANNs
High-fidelity audio analysis
Provides the most detailed sound analysis with 527 categories. PANNs excels at distinguishing similar sounds and providing nuanced confidence breakdowns, making it particularly useful in complex environments where multiple noise sources overlap.
Fine-grained distinction between similar sounds
Detailed confidence scores per match
Handles overlapping noise sources well
Deeper analysis for ambiguous environments
Coverage: 527 sound categories with detailed confidence scoring
Best for: Complex environments — mixed traffic and construction, multi-source noise corridors, detailed evidence gathering
BirdNET
Wildlife and species identification
Specialises in recognising over 6,500 bird and animal species by their sounds alone. If your monitoring area is near parks, waterways, bushland, or rural areas, BirdNET can identify which species are present and when they are active.
Over 6,500 bird and animal species
Taxonomic classification (birds, mammals, insects, amphibians)
Seasonal and time-of-day activity patterns
Environmental and biodiversity monitoring
Coverage: 6,500+ species across multiple taxonomic groups
Best for: Environmental monitoring — near parks, waterways, rural properties, wildlife corridors
Community
Your models, your voice
H-ear believes models should be broadly and easily accessible to everyone. We want to empower communities, researchers, and organisations to deepen and extend the reach of their work.
Bring your own trained model
Specialised classification for your needs
Community-driven sound libraries
Open collaboration with researchers
Coverage: Unlimited — defined by your model
Best for: Specialised needs — industry-specific sounds, research projects, community-driven monitoring
How Matching Works
Models break your audio into short chunks and classify each one. Results are organised into a tier system — from broad categories down to specific sounds — so you can see exactly what was detected and how confident the match is.
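The chunking step above can be sketched in a few lines. This is an illustration only: the chunk length here (0.96 s, the window YAMNet uses) and the function name are assumptions, not H-ear's actual implementation.

```python
def chunk_audio(samples, sample_rate, chunk_seconds=0.96):
    """Split a 1-D list of samples into fixed-length chunks for classification."""
    chunk_len = int(sample_rate * chunk_seconds)
    chunks = []
    for start in range(0, len(samples), chunk_len):
        chunk = samples[start:start + chunk_len]
        if len(chunk) == chunk_len:  # drop a trailing partial chunk
            chunks.append(chunk)
    return chunks

# Ten seconds of audio at 16 kHz yields ten full ~0.96 s chunks,
# and the model would score each chunk independently.
silence = [0.0] * (16000 * 10)
print(len(chunk_audio(silence, 16000)))
```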
Sounds are classified into tiers
Each sound is placed into a hierarchy. For example: Tier 1 "Animal" → Tier 2 "Domestic animals" → Tier 3 "Dog" → Tier 4 "Bark". The deeper the tier, the more specific the identification.
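One way to picture a tier path is as an ordered list, mirroring the Animal → Domestic animals → Dog → Bark example above. The field names and helper below are hypothetical, chosen only to make the hierarchy concrete.

```python
# Hypothetical shape of a single detection with its tier path.
detection = {
    "tiers": ["Animal", "Domestic animals", "Dog", "Bark"],
    "confidence": 0.92,
}

def tier(detection, level):
    """Return the label at a given tier (1-indexed), or None if the
    detection was not classified that deep."""
    tiers = detection["tiers"]
    return tiers[level - 1] if level <= len(tiers) else None

print(tier(detection, 1))  # broadest group: Animal
print(tier(detection, 4))  # most specific sound: Bark
```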
Confidence scores show certainty
Every match comes with a confidence score. A score of 0.92 for "Dog bark" means the model is very confident. Lower scores may indicate background noise or ambiguous sounds — you can filter by confidence to focus on clear events.
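Filtering by confidence might look like the sketch below. The threshold value and field names are assumptions for illustration, not H-ear's API.

```python
# Example matches as a model might return them for one recording.
matches = [
    {"label": "Dog bark", "confidence": 0.92},
    {"label": "Wind", "confidence": 0.31},
    {"label": "Siren", "confidence": 0.74},
]

def clear_events(matches, threshold=0.5):
    """Keep only matches at or above the confidence threshold."""
    return [m for m in matches if m["confidence"] >= threshold]

# Low-confidence background noise (Wind, 0.31) is dropped.
for m in clear_events(matches):
    print(m["label"], m["confidence"])
```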
Multiple sounds detected simultaneously
Real environments have overlapping sounds. Models can detect a dog barking and traffic noise in the same audio chunk, each with their own tier classification and confidence score.
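A per-chunk result with overlapping sounds could be shaped like this, each detection carrying its own tier path and confidence. The structure is illustrative only.

```python
# Hypothetical result for one audio chunk containing two overlapping sounds.
chunk_result = {
    "chunk_start_s": 12.0,
    "detections": [
        {"tiers": ["Animal", "Domestic animals", "Dog", "Bark"], "confidence": 0.88},
        {"tiers": ["Sounds of things", "Vehicle", "Traffic noise"], "confidence": 0.61},
    ],
}

# Report every sound detected in the same chunk.
for d in chunk_result["detections"]:
    print(" > ".join(d["tiers"]), d["confidence"])
```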
Choose your model when you upload or record — H-ear your environment as words