How It Works

H-ear is audio classification at enterprise scale: near real time and enriched for the consumer. H-ear does not teach or train. H-ear uses community ML classification models to parse your audio and give you annotations, with a special temporal H‑ear twist.
This is not speech-to-text. H-ear focuses on what is happening, not what is being said.
Acquire Audio

Try Record for free, upload media or ask an AI (MCP/API).

Choose Model

Select an ML model: YAMNet, BirdNET, or PANNs.

Get Cost Calculation

Receive instant pricing based on your file duration and complexity.

Login & Pay

Trusted login with Google or Microsoft. Secure payment via Stripe.

Get Your Report

H-ear Analysis, Notifications and spatiotemporal, annotated UX.

H-ear your Environment

Play the audio. Interact with the annotation timeline. Download 100% real output and compare H-ear noiseEvents versus ML rawPredictions (we give you both).

[Interactive demo: audio player and annotation timeline, 0 to 1m 1s (02:31:49 AM to 02:32:51 AM), with Standard and Detail views and a map marker]
Job ID: demo-job
Total Events: 26
Avg dB: 40.3
Max dB: 62.0
Avg Confidence: 70%
Model: YAMNet
Detected Sounds
Animal: 7
Human sounds: 5
Source-ambiguous sounds: 4
Sounds of things: 6
Music: 4
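
Category totals like these can be reproduced by grouping events on the first segment of their ontology path. The following is a minimal sketch, assuming each event carries a path string in the "A > B > C" form used under Top Noise Sources. The `path` field name and the sample events are hypothetical, not H-ear's documented schema.

```python
from collections import Counter

def count_by_top_level(events):
    """Count events by the first segment of an 'A > B > C' ontology path."""
    return Counter(e["path"].split(" > ")[0] for e in events)

# Hypothetical sample events; the "path" field name is an assumption.
events = [
    {"path": "Animal > Livestock, farm animals, working animals > Fowl"},
    {"path": "Animal > Wild animals > Frog"},
    {"path": "Human sounds > Respiratory sounds > Breathing"},
]

counts = count_by_top_level(events)
for category, n in counts.most_common():
    print(f"{category}: {n}")
```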
Top Noise Sources

1. Animal > Livestock, farm animals, working animals > Fowl

2 events · 5.8s · 100% conf
Fowl_15

2. Human sounds > Respiratory sounds > Breathing

1 event · 3.8s · 100% conf
Breathing_1

3. Animal > Wild animals > Frog

1 event · 2.9s · 100% conf
Frog_0
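
The ranking rule behind this list is not documented here. The order shown is consistent with sorting by total duration, so the sketch below groups events by label and ranks on that basis. The field names are hypothetical, and the per-event durations are invented so that the group totals match the demo.

```python
from collections import defaultdict

def top_sources(events, limit=3):
    """Group events by label, then rank groups by total duration (longest first)."""
    groups = defaultdict(list)
    for e in events:
        groups[e["label"]].append(e)
    rows = []
    for label, group in groups.items():
        rows.append({
            "label": label,
            "events": len(group),
            "total_s": sum(e["duration_s"] for e in group),
            "mean_conf": sum(e["confidence"] for e in group) / len(group),
        })
    rows.sort(key=lambda r: r["total_s"], reverse=True)
    return rows[:limit]

# Hypothetical events: only the group totals (5.8s, 3.8s, 2.9s) come from the demo.
events = [
    {"label": "Frog", "duration_s": 2.9, "confidence": 1.0},
    {"label": "Fowl", "duration_s": 3.0, "confidence": 1.0},
    {"label": "Breathing", "duration_s": 3.8, "confidence": 1.0},
    {"label": "Fowl", "duration_s": 2.8, "confidence": 1.0},
]

ranked = top_sources(events)
for rank, row in enumerate(ranked, start=1):
    print(f"{rank}. {row['label']}: {row['events']} events, "
          f"{row['total_s']:.1f}s, {row['mean_conf']:.0%} conf")
```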
Snippet Details
Snippet ID: demo-snippet
Original Filename: demo-60s-fixture-1.mp3
Duration: 62.277s (1m 2s)
File Size: 973.9 KB
Source Type: Upload

GPS Location
Latitude: -35.250830
Longitude: 149.049271
Accuracy: 212m
GPS Timestamp: 7 Apr 2:31 am
GPS Source: browser
Timezone: Australia/Sydney

Timestamps
Recording Started: 7 Apr 2:31 am
Recording Ended: 7 Apr 2:32 am
Created At: 10 Apr 7:11 pm
Updated At: 10 Apr 7:11 pm


How It Really Works

Behind the simplicity lies sophisticated technology. Our cloud-native, fully encrypted ML processing pipeline keeps your data private while delivering reliable, enterprise-grade analysis.

ML Processing Pipeline

Browser Check

Format, Size, Duration

Enterprise API

Stream, Upload, Notify

MCP Agent

Openclaw, Claude

Upload Stream

Chunked Transfer

Firewall

Security Gateway

Blob Storage

Security scanning

Upload

SAS Token Auth

Queue

Await ML Capacity

Container

Isolated Instance

Preprocess

Optimisation & Filtering

ML Analysis

Model Parsing & Transform

Reports

Human XLSX, Agentic JSON

Notifications

Edge Device Real-time

MCP

Openclaw, Claude

API

Notifications & Realtime

Delivery

Email + Download

Secure Archive

Encrypted, PAYG Storage

Semantic Compression: ~100x to 400x!

TBH, this is so new that it is hard for the Replicators in the backend to keep up. Our H-ear output is not just more semantically readable and useful, it is roughly 100 to 400 times smaller! This makes compression of large temporal datasets a fascinating proposition for many industries, especially monitoring services. The H‑ear temporal algorithm changes your perspective and empowers how you act. H-ear unlocks a totally new digital sense...
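
The H-ear temporal algorithm itself is not described on this page. As a rough illustration of why event-level output can be far smaller than frame-level predictions, the sketch below merges runs of consecutive frames that share a label into single events and compares the serialized sizes. The frame layout, field names, and merge rule are all assumptions. The 0.48 s hop is YAMNet's default frame hop.

```python
import json
from itertools import groupby

HOP_S = 0.48  # YAMNet's default frame hop; other models use different values

def merge_frames(frames, hop_s=HOP_S):
    """Collapse runs of consecutive frames with the same label into events."""
    events = []
    index = 0  # index of the first frame in the current run
    for label, run in groupby(frames, key=lambda f: f["label"]):
        run = list(run)
        events.append({
            "label": label,
            "start_s": round(index * hop_s, 2),
            # Approximate duration: frame count times hop, ignoring window overlap.
            "duration_s": round(len(run) * hop_s, 2),
            "confidence": round(sum(f["score"] for f in run) / len(run), 2),
        })
        index += len(run)
    return events

# Hypothetical frame-level predictions: 100 "Frog" frames, then 50 "Dog" frames.
frames = [{"label": "Frog", "score": 0.9}] * 100 + [{"label": "Dog", "score": 0.8}] * 50
events = merge_frames(frames)

raw_bytes = len(json.dumps(frames))
event_bytes = len(json.dumps(events))
print(len(frames), "frames ->", len(events), "events")
print(f"size ratio: {raw_bytes / event_bytes:.1f}x")
```

The ratio printed by this toy input says nothing about H-ear's real figures. It only shows the mechanism: long runs of identical frames collapse into one record each.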

Privacy-First Architecture

Zero speech-to-text libraries exist in our codebase—transcription is architecturally impossible. Audio is encrypted at rest (Azure-managed keys) and in transit (TLS 1.2+). Microsoft Defender scans every upload. GDPR data deletion is supported, and PCI DSS SAQ A compliance is maintained via Stripe Elements—no card data touches our infrastructure.

Isolated Container Processing

Each analysis runs in a dedicated Container Instance. Your audio never shares resources with other users. Containers spin up, process, and terminate, leaving no persistent state, so your data stays ephemeral throughout processing. Only you can generate keys to your data on our storage endpoints, via short-term, auto-rotating SAS tokens.

Multi-Model Classification Engine

Multiple ML models run across three frameworks: TensorFlow.js, TensorFlow 2, and PyTorch. AudioSet-trained models like YAMNet (521 classes) and PANNs (527 classes) classify environmental sounds such as dogs, traffic, aircraft, and machinery. BirdNET adds 6,522 species for biodiversity monitoring. Each model outputs timestamped classification annotations. Contact us if you want your model hosted.

Browser-Side Preprocessing

Before upload, your browser validates file format, codec compatibility, and duration. Client-side checks ensure only supported media reaches our servers. Early validation prevents wasted bandwidth and provides instant feedback on file compatibility.

Secure Storage

Completed analyses come with 2 GB of free, enterprise-grade, encrypted-at-rest storage. See Data Settings for secure, cheap, PAYG, long-term storage without limits.

Why Choose H‑ear?

Advanced AI-powered noise analysis designed for the World...

AI Efficient & Effective

H‑ear output empowers your AI with a highly token-efficient intermediate format, roughly 100 to 400 times smaller!

Spatiotemporal Verification

Accurate time and date stamps and GPS for every noise event detected.
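
One way to derive a wall-clock time for each event is to add its offset within the clip to the recording start time. This is a minimal sketch, assuming events carry an offset in seconds. The start time comes from the demo timeline. The year is an arbitrary placeholder, since the demo shows only day, month, and time, and time zone handling is left out.

```python
from datetime import datetime, timedelta

def event_wall_time(recording_start, offset_s):
    """Return the wall-clock time of an event at offset_s into the recording."""
    return recording_start + timedelta(seconds=offset_s)

# 02:31:49 AM on 7 Apr is the demo's recording start.
# The year 2000 is a placeholder, not a value from the demo.
start = datetime(2000, 4, 7, 2, 31, 49)

for offset in (0.0, 8.8, 17.6):
    stamp = event_wall_time(start, offset).isoformat(timespec="milliseconds")
    print(f"{offset:5.1f}s -> {stamp}")
```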

Enterprise-Grade Security

Azure-hosted with encrypted storage, secure authentication, and full audit trails.

Algorithm

Our H-ear sound classification and analysis algorithm, layered on top of plain ML parsing, empowers your flow via Human, Agent or Machine.