What Our Models Listen For
H-ear uses machine learning models to identify sounds in your audio. Each model specialises in different types of sounds — from everyday noise to wildlife species. When you upload or record, you choose which model analyses your audio.
YAMNet
General sound classification
Identifies everyday sounds across 521 categories organised into 6 broad groups: Animal, Human sounds, Sounds of things, Music, Natural sounds, and Source-ambiguous sounds. Think of it as a reliable all-rounder that can tell the difference between a dog barking, traffic noise, construction, or an alarm.
Dogs, cats, birds, and other animal sounds
Traffic, sirens, construction, and machinery
Music, speech, and human activity
Weather sounds like rain, thunder, and wind
Coverage: 521 sound categories across 6 top-level tiers
Best for: General noise monitoring — residential complaints, neighbourhood disputes, baseline assessments
PANNs
High-fidelity audio analysis
Provides the most detailed sound analysis with 527 categories. PANNs excels at distinguishing similar sounds and providing nuanced confidence breakdowns, making it particularly useful in complex environments where multiple noise sources overlap.
Fine-grained distinction between similar sounds
Detailed confidence scores per match
Handles overlapping noise sources well
Deeper analysis for ambiguous environments
Coverage: 527 sound categories with detailed confidence scoring
Best for: Complex environments — mixed traffic and construction, multi-source noise corridors, detailed evidence gathering
BirdNET
Wildlife and species identification
Specialises in recognising over 6,500 bird and animal species by their sounds alone. If your monitoring area is near parks, waterways, bushland, or rural areas, BirdNET can identify which species are present and when they are active.
Over 6,500 bird and animal species
Taxonomic classification (birds, mammals, insects, amphibians)
Seasonal and time-of-day activity patterns
Environmental and biodiversity monitoring
Coverage: 6,500+ species across multiple taxonomic groups
Best for: Environmental monitoring — near parks, waterways, rural properties, wildlife corridors
Community
Your models, your voice
H-ear believes models should be broadly and easily accessible to everyone. We want to empower communities, researchers, and organisations to deepen and extend the reach of their work.
Bring your own trained model
Specialised classification for your needs
Community-driven sound libraries
Open collaboration with researchers
Coverage: Unlimited — defined by your model
Best for: Specialised needs — industry-specific sounds, research projects, community-driven monitoring
How Matching Works
Models break your audio into short chunks and classify each one. Results are organised into a tier system — from broad categories down to specific sounds — so you can see exactly what was detected and how confident the match is.
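The chunking step above can be sketched in a few lines. This is an illustration only: the chunk length here (0.96 s, the window YAMNet uses) and the function name are assumptions, not H-ear's actual implementation.

```python
def chunk_audio(samples, sample_rate, chunk_seconds=0.96):
    """Split a 1-D list of samples into fixed-length chunks for classification."""
    chunk_len = int(sample_rate * chunk_seconds)
    chunks = []
    for start in range(0, len(samples), chunk_len):
        chunk = samples[start:start + chunk_len]
        if len(chunk) == chunk_len:  # drop a trailing partial chunk
            chunks.append(chunk)
    return chunks

# Ten seconds of audio at 16 kHz yields ten full ~0.96 s chunks,
# and the model would score each chunk independently.
silence = [0.0] * (16000 * 10)
print(len(chunk_audio(silence, 16000)))
```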
Sounds are classified into tiers
Each sound is placed into a hierarchy. For example: Tier 1 "Animal" → Tier 2 "Domestic animals" → Tier 3 "Dog" → Tier 4 "Bark". The deeper the tier, the more specific the identification.
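One way to picture a tier path is as an ordered list, mirroring the Animal → Domestic animals → Dog → Bark example above. The field names and helper below are hypothetical, chosen only to make the hierarchy concrete.

```python
# Hypothetical shape of a single detection with its tier path.
detection = {
    "tiers": ["Animal", "Domestic animals", "Dog", "Bark"],
    "confidence": 0.92,
}

def tier(detection, level):
    """Return the label at a given tier (1-indexed), or None if the
    detection was not classified that deep."""
    tiers = detection["tiers"]
    return tiers[level - 1] if level <= len(tiers) else None

print(tier(detection, 1))  # broadest group: Animal
print(tier(detection, 4))  # most specific sound: Bark
```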
Confidence scores show certainty
Every match comes with a confidence score. A score of 0.92 for "Dog bark" means the model is very confident. Lower scores may indicate background noise or ambiguous sounds — you can filter by confidence to focus on clear events.
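Filtering by confidence might look like the sketch below. The threshold value and field names are assumptions for illustration, not H-ear's API.

```python
# Example matches as a model might return them for one recording.
matches = [
    {"label": "Dog bark", "confidence": 0.92},
    {"label": "Wind", "confidence": 0.31},
    {"label": "Siren", "confidence": 0.74},
]

def clear_events(matches, threshold=0.5):
    """Keep only matches at or above the confidence threshold."""
    return [m for m in matches if m["confidence"] >= threshold]

# Low-confidence background noise (Wind, 0.31) is dropped.
for m in clear_events(matches):
    print(m["label"], m["confidence"])
```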
Multiple sounds detected simultaneously
Real environments have overlapping sounds. Models can detect a dog barking and traffic noise in the same audio chunk, each with their own tier classification and confidence score.
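A per-chunk result with overlapping sounds could be shaped like this, each detection carrying its own tier path and confidence. The structure is illustrative only.

```python
# Hypothetical result for one audio chunk containing two overlapping sounds.
chunk_result = {
    "chunk_start_s": 12.0,
    "detections": [
        {"tiers": ["Animal", "Domestic animals", "Dog", "Bark"], "confidence": 0.88},
        {"tiers": ["Sounds of things", "Vehicle", "Traffic noise"], "confidence": 0.61},
    ],
}

# Report every sound detected in the same chunk.
for d in chunk_result["detections"]:
    print(" > ".join(d["tiers"]), d["confidence"])
```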
Choose your model when you upload or record — H-ear your environment as words