Bit-perfect capture (Windows)
Bit-perfect means the WAV you record contains exactly the same samples the Windows audio engine produced: no sample-rate conversion, no bit-depth conversion, no gain change, no DSP. On Windows, AudioRoute achieves this through WASAPI process loopback at the device's native mix rate. We don't ask you to take that on trust; we prove it. A verification harness null-tests the captured audio against the source, and the result is max |c − s| = 0, meaning the samples are bit-identical. This guide explains the setup, the proof, and the honest boundaries.
Every audio stream that leaves an app on its way to your speakers has a sample rate (44.1 kHz, 48 kHz, 96 kHz, …) and a bit depth (16-bit, 24-bit, 32-bit float). A capture is bit-perfect if the resulting file contains those exact same samples, with no conversion of any kind between the source and the file.
In practice this means three things didn't happen: sample-rate conversion, bit-depth conversion, and gain change or other DSP.
On Windows, there is one honest boundary worth being precise about. The OS audio engine sits between any application and the loopback tap that capture tools use. The engine mixes all simultaneous streams, applies per-session volume, and may run endpoint effects (audio processing objects, or APOs — "Audio Enhancements" in Sound Settings). So the claim we make and that we verify is:
We capture, bit-for-bit, exactly what the audio engine produces at the device's native mix rate. Our pipeline adds no resampling and no gain. If you set the engine to play at the source's native rate and don't have effects enabled, that means the captured file is bit-identical to the source.
If an application plays 44.1 kHz audio into a device configured at 48 kHz, the engine resamples 44.1 → 48 before our tap ever sees it — outside any capture tool's reach. What we guarantee is that we add nothing on top of the engine's output.
Windows shared-mode audio runs at a single mix format — almost always 32-bit float at the endpoint's currently-configured rate (44.1 kHz, 48 kHz, 96 kHz, 192 kHz). Every render stream is converted to that mix format. Loopback captures the mix.
The process-loopback API (AUDIOCLIENT_ACTIVATION_TYPE_PROCESS_LOOPBACK) will deliver capture in whatever format you ask for, converting internally if needed. Ask for 48 kHz while the device runs at 96 kHz and the API resamples 96 → 48 for you — silent quality loss, and no longer bit-perfect.
The fix is the obvious one: stop hardcoding a rate and capture at the device's own mix rate. Then the requested format equals the delivered format equals the engine's mix format, and the API's internal converter is a no-op:
App → [engine: mix @ R] → loopback tap (request R) → ring buffer → WAV
└— no conversion: R == R —┘
That's exactly what AudioRoute 0.1.15+ does on Windows. The Auto recording rate resolves at capture start to the default output device's current mix-format rate via WASAPI's GetMixFormat, then locks it for the session so the WAV header stays consistent with the data.
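The rate decision described above can be modelled in a few lines. This is a minimal sketch, not AudioRoute's actual code: `CapturePlan` and `plan_capture` are hypothetical names, and the device mix rate is passed in as a plain integer where the real daemon would obtain it from `GetMixFormat`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CapturePlan:
    capture_rate_hz: int   # rate requested from the loopback client and written to the WAV header
    api_resamples: bool    # True when the loopback API has to convert between rates


def plan_capture(device_mix_rate_hz: int, requested_rate_hz: Optional[int] = None) -> CapturePlan:
    """Hypothetical model of the recording-rate choice; None stands for Auto."""
    if device_mix_rate_hz <= 0:
        raise ValueError("device mix rate must be positive")
    # Auto: adopt the device's own mix rate, so request == delivery == mix format.
    rate = device_mix_rate_hz if requested_rate_hz is None else requested_rate_hz
    return CapturePlan(capture_rate_hz=rate, api_resamples=(rate != device_mix_rate_hz))
```

With a device at 96 kHz, `plan_capture(96000)` reports no resampling, while `plan_capture(96000, 48000)` flags the 96 → 48 conversion that breaks bit-perfect capture.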
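Get bit-perfect capture working in about 90 seconds: set your output device's Default Format to your source's rate and turn off Audio Enhancements, set AudioRoute's Rec Rate to Auto with Bit Depth at 32-bit float, then record.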
The next three sections walk through each step with the exact menus, paths, and what to look for.
Windows doesn't auto-switch the device's sample rate to match what an app is playing. The device runs at whatever Default Format you've picked in its properties, and the engine resamples every incoming stream to that rate. If you want the capture (and therefore the file) to be bit-identical to a 96 kHz source, the device's Default Format has to be 96 kHz.
Open Settings → System → Sound. Under Output, click your current output device's name — this opens its device-specific page. Scroll to Properties (or sometimes Output settings depending on Windows build) and look for Format. On older builds the equivalent path is Sound Control Panel → right-click your device → Properties → Advanced → Default Format.
Pick the format that matches your source:
Bluetooth devices ignore Default Format. Bluetooth audio uses its own codec negotiation (SBC / AAC / aptX) that doesn't expose the engine's mix rate the same way. End-to-end bit-perfect work over Bluetooth isn't possible because the link is lossy at the codec level: loopback still captures the engine's PCM before it is encoded, but what reaches the headphones is the codec's output, not the original samples. Use wired headphones, speakers, or your built-in DAC for bit-perfect work.
In the same Properties → Advanced area, look for Audio Enhancements and set it to Off. Enhancements are processing the engine runs after mixing but before the loopback tap sees the audio — if they're on, the engine alters the stream and we then capture that altered stream exactly. The capture is still bit-perfect relative to what the engine produced, just not relative to the original source. For a fully transparent chain, disable enhancements.
Open the AudioRoute tray popup (left-click the AudioRoute icon in the system tray; right-click for the menu). Expand Advanced. Find the Rec Rate dropdown and pick Auto. Set Bit Depth to 32-bit float so the recording doesn't lose precision even if your source is 24-bit.
Auto is the magic setting here. Under the hood AudioRoute's daemon calls WASAPI's GetMixFormat at capture start, reads the device's current native rate, initialises the process-loopback client at that exact rate, and writes the WAV at that rate. The API's internal sample-rate converter is a no-op because the requested rate matches the engine's mix rate. The path from the engine's mix to the WAV is a straight memcpy — no resampler, no gain, no conversion.
If you pick an explicit rate instead of Auto, AudioRoute asks WASAPI for that exact rate. If the device is at a different rate, the WASAPI loopback API resamples internally to honour your choice. That's the right behaviour when you specifically want a 44.1 kHz WAV regardless of the device rate — e.g. for import into a 44.1 kHz DAW session — but the file is no longer bit-identical to the source. For bit-perfect work, leave it on Auto.
Start your source playing (Apple Music for Windows, foobar2000, VLC, browser tab, Spotify, whatever), then hit Record to File in the AudioRoute tray. The status row shows the rate it's actually recording at — if your device's Default Format is 96 kHz, you'll see 96000 Hz.
Stop the recording when you're done. By default the file lands in %USERPROFILE%\Music\AudioRoute\, named with the timestamp.
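If you want to confirm the rate from the file itself, the WAV header can be read directly. The sketch below parses the standard RIFF `fmt ` chunk by hand, because Python's built-in `wave` module may reject float WAVs. `read_wav_format` is our own illustrative helper, and we are not asserting which format tag AudioRoute writes (3 is IEEE float, 1 is integer PCM, 0xFFFE is the extensible form).

```python
import struct


def read_wav_format(path):
    """Return (format_tag, channels, sample_rate_hz, bits_per_sample) from a RIFF/WAVE file."""
    with open(path, "rb") as f:
        head = f.read(12)
        if len(head) < 12:
            raise ValueError("file too short to be a WAV")
        riff, _size, wave_id = struct.unpack("<4sI4s", head)
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            chunk_header = f.read(8)
            if len(chunk_header) < 8:
                raise ValueError("no fmt chunk found")
            chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
            if chunk_id == b"fmt ":
                fmt = f.read(chunk_size)
                tag, channels, rate, _byte_rate, _align, bits = struct.unpack("<HHIIHH", fmt[:16])
                return tag, channels, rate, bits
            # Skip any other chunk; chunks are padded to an even byte count.
            f.seek(chunk_size + (chunk_size & 1), 1)
```

For a recording made with the device at 96 kHz and Bit Depth at 32-bit float, you would expect the third value to be 96000 and the fourth to be 32.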
"Bit-perfect" is a claim that's testable. A capture chain that says "no resampling" can still be quietly altering samples in ways the user can't see — gain changes, dither, anti-click ramps, format conversions in the loopback API itself. The only way to be sure is to compare the captured samples to the original and measure the residual.
AudioRoute ships with a verification test (BitPerfectLoopbackTest) that does exactly that. It runs entirely on the user's machine, against the user's audio hardware, so the result reflects the real path through Windows.
Pass criteria are deliberately strict: the rate is preserved, the residual SNR is ≥ 100 dB (transparent), and the capture also passes the bit-identical check (gain ≈ 1.0 and max sample error == 0).
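The three numbers in those criteria are straightforward to compute. The following is a minimal sketch of a null test, not the BitPerfectLoopbackTest source: `null_test` is a hypothetical helper, it assumes the two sequences are already time-aligned and of equal length, and it uses a plain least-squares fit for the gain.

```python
import math


def null_test(source, captured):
    """Return (recovered_gain, residual_snr_db, max_abs_error) for aligned sample sequences."""
    if len(source) != len(captured) or not source:
        raise ValueError("need two non-empty sequences of equal length")
    energy = sum(s * s for s in source)
    if energy == 0.0:
        raise ValueError("source is silent; gain is undefined")
    # Least-squares gain g that minimises sum((c - g*s)^2).
    gain = sum(c * s for c, s in zip(captured, source)) / energy
    residual = sum((c - gain * s) ** 2 for c, s in zip(captured, source))
    snr_db = math.inf if residual == 0.0 else 10.0 * math.log10(gain * gain * energy / residual)
    # The strict check: largest difference between any captured sample and its source sample.
    max_err = max(abs(c - s) for c, s in zip(captured, source))
    return gain, snr_db, max_err
```

A bit-identical capture returns a gain of exactly 1.0, an infinite SNR, and a max error of 0. Any single altered sample makes the max error non-zero even when the SNR still looks transparent, which is why the strict check is kept alongside the SNR threshold.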
With the OS's stream-startup window excluded (more on that in the next section), the captured audio is literally bit-identical to the source on every input tested:
| Device rate | Source | Recovered gain | Max \|c − s\| | Verdict |
|---|---|---|---|---|
| 48 kHz | Synthetic noise (quiet) | 1.000000 | 0.000e+00 | bit-identical |
| 48 kHz | Synthetic noise (loud, −0.4 dBFS) | 1.000000 | 0.000e+00 | bit-identical |
| 48 kHz | Decorrelated stereo noise (L ≠ R) | 1.000000 | 0.000e+00 | bit-identical |
| 96 kHz | Synthesised A-major chord | 1.000000 | 0.000e+00 | bit-identical |
| 48 kHz | Windows Notify Calendar.wav | 1.000000 | 0.000e+00 | bit-identical |
| 48 kHz | Windows Foreground.wav | 1.000000 | 0.000e+00 | bit-identical |
| 48 kHz | Windows User Account Control.wav | 1.000000 | 0.000e+00 | bit-identical |
The 96 kHz row confirms the full chain at a different device rate — the system was reconfigured to 96 kHz and every captured sample still equals every source sample. The "max |c − s|" column is the maximum absolute difference between any captured sample and its corresponding source sample, where 0.000e+00 means literally not one sample was changed.
A test that only ever prints "PASS" proves very little. The reason we trust this one is that it initially failed on real audio — and chasing down why is what makes the eventual pass meaningful.
Synthesised content (noise, chords) captured perfectly: max error 0, residual essentially infinite. But the first real file we ran — Windows Notify Calendar.wav, a 48 kHz stereo notification chime — read:
recovered gain : 0.999984
residual SNR : 57.9 dB
max |c - s| : 3.594e-03 ← NOT bit-identical
The real file scored 57.9 dB SNR while synthetic noise was passing with an effectively infinite SNR. Something in the real-file path was different, and we needed to find it before claiming bit-perfect.
Each hypothesis got a controlled run. Rationalising the failure and moving on would have been faster, but the test would have lost its credibility. Here's what we ruled out:
| Experiment | Result | Rules out |
|---|---|---|
| Attenuate the file ×0.2 | SNR unchanged at 57.9 dB; max error scaled exactly ×0.2 | Limiter / loudness compression (would be non-linear) |
| Loud synthetic noise (−0.4 dBFS) | bit-perfect | Clipping / level-dependent enhancement |
| Decorrelated stereo noise (L ≠ R) | bit-perfect | Stereo widening / crossfeed |
| Synthetic noise → our WAV writer → reload → run | bit-perfect | Our WAV I/O / bit-depth conversion |
A linear, level-independent distortion that leaves broadband white noise untouched cannot be a time-invariant filter, because any such filter that altered the chime would also have altered the noise. The error had to be localised in time, not spread across the spectrum.
Splitting the analysis window into 16 segments and reporting per-segment SNR made it obvious:
seg: 52 96 96 96 96 96 96 96
96 96 96 96 96 96 96 96
Every bit of the error lived in segment 0 — the onset. A head-trim sweep pinned the boundary precisely:
| Head-trim | Result |
|---|---|
| 50 ms | 57.9 dB (dirty) |
| 100 ms | 999 dB, max error 0 |
| 200 / 300 ms | 999 dB, max error 0 |
The first ~50–100 ms of any freshly started loopback stream carries the Windows audio engine's anti-click fade-in ramp plus loopback latency-settling. A notification sound's loudest transient sits right at its onset, so that corrupted region dominated the global gain/alignment fit and dragged the entire residual down. Stationary noise never exposed this because there was no dominant onset transient to land in the affected window.
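Both diagnostics, the per-segment SNR and the head-trim sweep, can be reproduced on simulated data. The sketch below is illustrative only: the helper names are ours, and the "capture" is synthetic noise with a linear 60 ms fade-in, a ramp length and shape we picked for the demonstration, not a measurement of what Windows applies.

```python
import math
import random


def segment_snr_db(source, captured, segments=16):
    """Residual SNR per equal-length segment; math.inf where a segment is bit-identical."""
    n = len(source) // segments
    out = []
    for k in range(segments):
        s = source[k * n:(k + 1) * n]
        c = captured[k * n:(k + 1) * n]
        signal = sum(x * x for x in s)
        noise = sum((y - x) ** 2 for x, y in zip(s, c))
        if noise == 0.0:
            out.append(math.inf)
        elif signal == 0.0:
            out.append(-math.inf)
        else:
            out.append(10.0 * math.log10(signal / noise))
    return out


def max_error_after_trim(source, captured, rate_hz, trim_ms):
    """Max |c - s| once the first trim_ms milliseconds are discarded."""
    start = rate_hz * trim_ms // 1000
    return max(abs(c - s) for s, c in zip(source[start:], captured[start:]))


# Simulated capture: 1 s of noise with an assumed linear fade-in over the first 60 ms.
RATE = 48000
rng = random.Random(0)
source = [rng.uniform(-0.5, 0.5) for _ in range(RATE)]
ramp_len = RATE * 60 // 1000
captured = [s * min(1.0, i / ramp_len) for i, s in enumerate(source)]

per_segment = segment_snr_db(source, captured)
sweep = {ms: max_error_after_trim(source, captured, RATE, ms) for ms in (50, 100, 200)}
```

In this simulation only `per_segment[0]` is finite, and the sweep is non-zero at 50 ms but exactly zero at 100 ms and 200 ms: the same shape as the real results, where the error sat entirely in the onset.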
Two important things about this ramp:
The test now judges steady-state fidelity (200 ms head-trim by default) while keeping the per-segment numbers visible so the onset can never be silently hidden. The "999 dB" results in the table above are with the OS startup ramp excluded — the steady-state path is genuinely bit-identical.
Three places where the chain stops being bit-perfect — worth knowing about so the claim doesn't surprise you in edge cases:
We are bit-perfect relative to the engine's mix. If you have Audio Enhancements enabled on your output device (Bass Boost, Loudness Equalization, Sonic, third-party APOs like Nahimic / Dolby Atmos), the engine alters the stream before our tap sees it. We then capture that altered stream exactly. Bit-perfect against the original source additionally requires enhancements off and unity per-app volume.
The investigation section above covers this in detail. Briefly: the OS applies a short anti-click fade-in to any freshly started render stream. This is faithfully captured by our tap because it really is what the engine produces. For continuous capture this is invisible (no fresh stream starts inside the recording window). For brand-new sounds whose loudest content is at their onset (notification chimes, click samples), the first ~50–100 ms isn't bit-identical to the source — everything after it is.
Bit-perfect requires the played content to be authored at the device rate (or for the app to play it at the device rate). If an app plays 44.1 kHz audio into a 48 kHz device, the engine resamples 44.1 → 48 upstream of any capture tool. The captured 48 kHz file is bit-identical to the engine's mix, but the engine's mix is no longer bit-identical to the original 44.1 kHz file. Set the device's Default Format to match your source rate to avoid this.
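A little arithmetic shows why a rate mismatch can never be bit-identical. In this sketch (helper names are ours) we compute the resampling ratio and check which output samples even coincide in time with a source sample; all the others have to be interpolated.

```python
from fractions import Fraction


def resample_ratio(source_rate_hz, device_rate_hz):
    """Output samples produced per input sample, as an exact fraction."""
    return Fraction(device_rate_hz, source_rate_hz)


def lands_on_source_sample(n, source_rate_hz, device_rate_hz):
    """True if output sample n falls at the same instant as some source sample."""
    position = Fraction(n * source_rate_hz, device_rate_hz)  # measured in source samples
    return position.denominator == 1
```

For 44.1 kHz into a 48 kHz device the ratio is 160/147, so only one output sample in every 160 lines up in time with a source sample. Even that one is not guaranteed to keep its original value once the resampler's filter has run.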
This is the most common reason a capture isn't bit-perfect, even with everything set up correctly:
In short: leave Rec Rate on Auto, match the device's Default Format to your source's native rate, disable enhancements / spatial, and your captures are bit-perfect by default.
All of these can be captured bit-perfect on Windows as long as you set the device's Default Format to match their native rate:
The architectural principle is the same on both platforms: AudioRoute itself adds no resampling. The OS-level capture path is different, though.
On Windows, WASAPI process loopback reads the audio engine's already-mixed PCM frames directly — at matched rates the API's internal converter is a no-op and the captured samples equal the engine's mix samples bit-for-bit. That's what the null-test results above demonstrate.
On macOS, Apple's Core Audio process-tap API (CATapDescription + AudioHardwareCreateProcessTap) performs some ingestion-side reconstruction filtering that no third-party developer can opt out of. We've verified this with the same null-test approach — the captured samples are practically transparent for music and speech (signals that don't carry content right up to Nyquist), but a square-wave test shows sinc-style pre/post-ringing at every transition. AudioRoute's pipeline on Mac is still zero-resampling, zero-conversion — what comes out of Apple's tap is what lands in the WAV — but the tap itself isn't bit-perfect.
Read the macOS side: Bit-perfect system audio capture on Mac.
Two possibilities, in order of likelihood:
AudioRoute captures stereo today. Multi-channel surround capture is on the roadmap but not shipping yet. Sources with surround channels are mixed down by the engine to stereo before AudioRoute sees them. Spatial Audio (Windows Sonic, Dolby Atmos for Headphones) is an additional renderer that resamples and mixes inside the engine — disable it for bit-perfect work.
32-bit float is the format the engine uses internally on Windows, so writing it directly avoids any depth conversion. A 32-bit float file holds an exact representation of any 24-bit source — nothing is lost. If you specifically need 24-bit integer for an older DAW, set Bit Depth to 24-bit explicitly; AudioRoute will dither the conversion. For bit-perfect work, stay on 32-bit float.
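The claim that 32-bit float holds any 24-bit sample exactly is easy to check. This is a small sketch under one assumption: samples are scaled by dividing by 2**23, a common convention that we have not confirmed is the one AudioRoute uses. Any power-of-two scale behaves the same way.

```python
import struct


def survives_float32(sample_24bit: int) -> bool:
    """True if a signed 24-bit integer sample is stored in a 32-bit float without rounding."""
    as_float = sample_24bit / 8388608.0   # 8388608 == 2**23, maps samples to [-1.0, 1.0)
    # Round-trip through the 4-byte IEEE 754 encoding a float32 WAV would contain.
    stored = struct.unpack("<f", struct.pack("<f", as_float))[0]
    return stored == as_float


edge_cases = [-8388608, -8388607, -1, 0, 1, 4194303, 8388606, 8388607]
all_exact = all(survives_float32(v) for v in edge_cases)
```

Every value passes because a float32 significand carries 24 bits of precision, enough for the magnitude of any signed 24-bit sample.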
Exclusive mode lets one app take over the device and bypass the engine's mixer entirely. It can give app → device bit-perfect playback (the app's samples go straight to the hardware DAC), but the trade-off is that nothing else on the system can play audio simultaneously — including any capture tool. AudioRoute uses shared-mode process loopback because it has to coexist with all other audio. The shared-mode boundary is the cost of that coexistence.
Bluetooth audio is inherently lossy: the audio is encoded by Windows (SBC, AAC, or aptX) and decoded by the headphones. Whatever AudioRoute captures via loopback is the engine's pre-encode PCM — that part is bit-perfect — but the headphones are receiving the lossy codec output. For bit-perfect playback as well as capture, use wired headphones, speakers, or your built-in DAC.
No. Process loopback is a non-blocking observer of the engine's mix — it doesn't insert itself in the playback path. Speakers and headphones play with the same latency as without AudioRoute installed.
Yes — the BitPerfectLoopbackTest harness is part of the AudioRoute Windows source tree. We're happy to share it with anyone who wants to verify on their own hardware; reach out at support@audio-route.com and we'll send the binary + instructions.
If something's not behaving the way this guide describes — the WAV's sample rate doesn't match the device, the API resamples when it shouldn't, anything — we'd like to hear about it.