• Random/OT: Low sample rate audio weirdness/mystery

    From BGB@cr88192@gmail.com to comp.arch on Sat Sep 6 05:28:16 2025
    From Newsgroup: comp.arch

    Just randomly thinking again about some things I noticed with audio at
    low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky
    32000: Sounds good
    22050: Moderate
    16000: OK, Modest size, acceptable quality.
    Seems like best tradeoff if not going for high quality.
    11025: Poor, muffled.
    8000: Very poor, speech almost unintelligible (normally).
    But, it is seeming like a "weird hack" may exist here.

    For sample formats:
    16-bit PCM: Good
    Binary16: Also good
    A-Law: Decent (space efficient)
    8-bit PCM: Sounds like crap at all sample rates.
    Tends to introduce an obvious hiss.

    So, at higher sample rates, 16-bit PCM or A-Law are clearly the better
    options. And, 16000 16-bit sounds better than 44100 8-bit, despite the
    latter having the higher data rate, because 8-bit PCM adds a very
    obvious hiss.
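Where A-Law here refers to G.711 companding; a minimal Python port of the classic public-domain Sun g711.c routines looks roughly like:

```python
# G.711 A-law companding: 16-bit PCM <-> 8-bit log-companded bytes,
# giving roughly 13 bits of usable dynamic range in 8 bits.

SEG_END = [0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF]

def linear_to_alaw(pcm: int) -> int:
    """Encode a 16-bit signed sample to one A-law byte."""
    pcm >>= 3                          # A-law works on 13-bit magnitudes
    if pcm >= 0:
        mask = 0xD5                    # sign bit plus even-bit inversion
    else:
        mask = 0x55
        pcm = -pcm - 1
    # Find the logarithmic segment (0..7).
    for seg, end in enumerate(SEG_END):
        if pcm <= end:
            break
    else:
        return 0x7F ^ mask             # clip to maximum
    aval = seg << 4
    aval |= (pcm >> 1) & 0x0F if seg < 2 else (pcm >> seg) & 0x0F
    return aval ^ mask

def alaw_to_linear(a: int) -> int:
    """Decode one A-law byte back to a 16-bit signed sample."""
    a ^= 0x55
    t = (a & 0x0F) << 4
    seg = (a & 0x70) >> 4
    if seg == 0:
        t += 8
    elif seg == 1:
        t += 0x108
    else:
        t = (t + 0x108) << (seg - 1)
    return t if (a & 0x80) else -t
```

The relative error stays roughly constant across magnitudes (about 3% of the sample value), which is why it beats 8-bit linear PCM's fixed-size quantization hiss.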


    For upsampling, a usual filtering strategy that works well seems to be
    to use a cubic spline.
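The post doesn't say which cubic; as an illustrative sketch, Catmull-Rom is one common interpolating choice that passes through every input sample (matching the "interpolation passes through each control point" behavior described later in the thread):

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Evaluate the interpolating cubic at t in [0,1]; gives p1 at t=0, p2 at t=1."""
    return 0.5 * ((2.0 * p1)
                  + (p2 - p0) * t
                  + (2.0*p0 - 5.0*p1 + 4.0*p2 - p3) * t * t
                  + (3.0*p1 - p0 - 3.0*p2 + p3) * t * t * t)

def upsample(x, factor):
    """Upsample a list of samples by an integer factor via cubic interpolation."""
    n = len(x)
    out = []
    for i in range(n - 1):
        p0 = x[max(i - 1, 0)]          # clamp at the edges
        p1, p2 = x[i], x[i + 1]
        p3 = x[min(i + 2, n - 1)]
        for k in range(factor):
            out.append(catmull_rom(p0, p1, p2, p3, k / factor))
    out.append(x[-1])
    return out
```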

    For downsampling, there are a few options:
    Nearest Neighbor:
    Simplest, poor quality
    Introduces very weird distortions when going to low sample rates.
    Box average:
    Take N samples and average them;
    Only works well for power-of-2 resampling.
    Pseudo tricubic:
    Take a block of N samples;
    Downsample by halves until the rate brackets the target (one above, one below);
    Weighted sum of lower (average) and cubic interpolation (higher).
    Sinc:
    Theoretically exists, but never made it work well.

    As a general strategy, pseudo tricubic had worked well.
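The "pseudo tricubic" steps above might be sketched like this (the blend details and helper functions are my guesses at what is meant, not the author's actual code):

```python
def halve(x):
    """Box-average pairs of samples (downsample by exactly 2)."""
    return [(a + b) * 0.5 for a, b in zip(x[0::2], x[1::2])]

def cubic_at(x, pos):
    """Catmull-Rom evaluation of sample list x at fractional position pos."""
    n = len(x)
    i = int(pos)
    t = pos - i
    p0 = x[max(i - 1, 0)]
    p1 = x[min(i, n - 1)]
    p2 = x[min(i + 1, n - 1)]
    p3 = x[min(i + 2, n - 1)]
    return 0.5 * (2*p1 + (p2 - p0)*t + (2*p0 - 5*p1 + 4*p2 - p3)*t*t
                  + (3*p1 - p0 - 3*p2 + p3)*t*t*t)

def pseudo_tricubic(x, src_rate, dst_rate):
    """Downsample x from src_rate to dst_rate (dst_rate <= src_rate)."""
    hi, hi_rate = list(x), float(src_rate)
    while hi_rate * 0.5 >= dst_rate and len(hi) >= 2:
        hi, hi_rate = halve(hi), hi_rate * 0.5   # halve until just above target
    lo, lo_rate = halve(hi), hi_rate * 0.5       # one averaged level below target
    w = (dst_rate - lo_rate) / (hi_rate - lo_rate)  # blend weight toward hi level
    n_out = int(len(x) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        a = cubic_at(lo, i * lo_rate / dst_rate)  # coarse (average-based) estimate
        b = cubic_at(hi, i * hi_rate / dst_rate)  # finer cubic estimate
        out.append((1.0 - w) * a + w * b)
    return out
```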


    So, seemingly, at least when working at 16kHz or above:
    16-bit PCM, A-Law, or Binary16 is a win;
    Pseudo tricubic seems to give the best perceptual audio quality.


    If going to low sample rates (11 or 8kHz), a problem emerges:
    Speech becomes muffled and unintelligible.
    So, 16-bit PCM and A-Law don't help; the audio is still muffled.


    But, there is something weird I had noticed at low rates (eg, 8kHz):
    ADPCM encoding seems to increase the intelligibility.
    Speech seemingly more intelligible after ADPCM than before.
    More so when not using the "obvious choice" of minimizing error.
    It works better if the encoder is tuned to slightly overshoot.
    The effect is more obvious with 2-bit ADPCM than with 4-bit.
    Like, some sort of weird "less is more" with the quality.

    My sense of hearing and the RMSE heuristic somewhat disagree about
    which is "better" quality. RMSE seems to prefer it when the ADPCM
    encoder tends to undershoot, and also prefers the more muffled versions
    (those closer to the down-sampled input audio).
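For reference, the codec being tweaked here is standard IMA ADPCM; a generic textbook 4-bit implementation can be sketched as follows (the 2-bit variant and the deliberate-overshoot tuning described above are nonstandard tweaks and are not shown):

```python
# Standard IMA ADPCM step and index-adjustment tables.
STEP = [7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 23, 25, 28, 31,
        34, 37, 41, 45, 50, 55, 60, 66, 73, 80, 88, 97, 107, 118, 130,
        143, 157, 173, 190, 209, 230, 253, 279, 307, 337, 371, 408, 449,
        494, 544, 598, 658, 724, 796, 876, 963, 1060, 1166, 1282, 1411,
        1552, 1707, 1878, 2066, 2272, 2499, 2749, 3024, 3327, 3660, 4026,
        4428, 4871, 5358, 5894, 6484, 7132, 7845, 8630, 9493, 10442,
        11487, 12635, 13899, 15289, 16818, 18500, 20350, 22385, 24623,
        27086, 29794, 32767]
INDEX_ADJ = [-1, -1, -1, -1, 2, 4, 6, 8] * 2

def _step(pred, idx, code):
    """Shared reconstruction: advance predictor and step index by one code."""
    step = STEP[idx]
    diffq = step >> 3
    if code & 4: diffq += step
    if code & 2: diffq += step >> 1
    if code & 1: diffq += step >> 2
    pred = pred - diffq if code & 8 else pred + diffq
    pred = max(-32768, min(32767, pred))
    idx = max(0, min(88, idx + INDEX_ADJ[code]))
    return pred, idx

def ima_encode(samples):
    """Quantize each sample's prediction error to a 4-bit code."""
    pred, idx, codes = 0, 0, []
    for s in samples:
        step = STEP[idx]
        diff = s - pred
        code = 8 if diff < 0 else 0
        if code: diff = -diff
        if diff >= step:      code |= 4; diff -= step
        if diff >= step >> 1: code |= 2; diff -= step >> 1
        if diff >= step >> 2: code |= 1
        pred, idx = _step(pred, idx, code)   # track the decoder's state
        codes.append(code)
    return codes

def ima_decode(codes):
    pred, idx, out = 0, 0, []
    for code in codes:
        pred, idx = _step(pred, idx, code)
        out.append(pred)
    return out
```

The "overshoot" tuning would bias the quantization decision in `ima_encode`, rather than change the tables or decoder.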


    Similarly (partly inspired by the ADPCM effect), a new contender seems
    to have arrived on the scene as a resampling algorithm:
    Treat the previously generated sample points as a reference, and try to
    pick a next point that best fits a line or B-Spline to the intermediate
    points (in effect, treating each sample point as if it were a control
    point for a B-Spline fitted to the input samples).


    The line-fitting is simpler, but the B-Spline seems to give a similar
    effect with better quality (even if RMSE does not agree, it sees the
    error from this as worse than with the other methods if the audio is
    upsampled using the normal spline method).

    Though, RMSE is lower if the upsampler also treats the audio samples
    more like the control points in a B-Spline.

    Where, in my usual cubic-spline upsampler, the interpolation passes
    through each control point (if the interpolated position directly aligns
    with a control point, it returns this point). This differs from a
    B-Spline, where generally the curve undershoots the control points.


    Then, it seems like for storage, the low-rate audio (control points) can
    be stored in ADPCM (though this time, error minimization during
    encoding gives the best results).

    And, oddly, it seems like the audio (in this low-sample rate,
    control-points form) actually has higher perceptual audio quality (and
    things like speech seem more intelligible; despite the low 8kHz sample
    rate).


    But, I am at a loss here as to why any of this would be true in a
    theoretical sense.



    Stuff online mentioning the use of B-Splines for audio seems to work on
    the assumption of generating control points and then using another
    B-Spline to generate audio at the target rate (rather than directly
    listening to the control points as audio).

    Stuff online also mentions needing to low-pass filter the audio before
    generating the spline, but if any sort of low-pass filtering is applied
    (before spline generation) then (again) it becomes muffled and
    unintelligible.

    Presumably, the idea would be to filter out things above the Nyquist
    frequency of the target sample rate (so, say, 4kHz for 8kHz audio), but,
    as noted, a 4kHz low-pass filter (in general) wrecks intelligibility.

    Then again, maybe the mention of low-pass filtering assumes operating at somewhat higher target sample rates?...


    Where, seemingly for speech and frequency ranges:
    under 1 kHz: mystery range...
    Filtering out has little effect.
    1-2 kHz; "fullness"
    Filtering out this range causes a "tinny" sound
    Filtering this out seems to strongly displease cats.
    2-4 kHz: Has vowel sounds
    Filtering out this range makes voices sound robotic.
    Many of the distinguishing parts of the voice go away.
    4-8 kHz: Consonants / etc seem to live here
    Filtering this out removes the "what is being said" part.
    8-16 kHz: Mostly optional
    Improves quality, but no effect on intelligibility.
    16 kHz: Upper end of hearing
    Things like CRT TV whistling are up here.

    Where, I had noted that general intelligibility of speech and other
    audio remains intact with a 4kHz to 8kHz band-pass filter, though with
    a "robotic" sound, and it is harder to tell people's voices apart
    (like, everyone is speaking with a similar-sounding robotic voice).
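Band observations like these can be experimented with using a generic windowed-sinc FIR band-pass (a textbook design; the tap count and Hamming window here are arbitrary choices of mine, not from the post):

```python
import math

def lowpass_kernel(cutoff_hz, fs, taps):
    """Windowed-sinc low-pass FIR kernel (Hamming window), unity DC gain."""
    fc = cutoff_hz / fs
    m = taps - 1
    h = []
    for i in range(taps):
        k = i - m / 2.0
        v = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        v *= 0.54 - 0.46 * math.cos(2 * math.pi * i / m)   # Hamming window
        h.append(v)
    s = sum(h)
    return [v / s for v in h]

def bandpass_kernel(lo_hz, hi_hz, fs, taps=101):
    """Band-pass as the difference of two low-pass kernels."""
    lp_hi = lowpass_kernel(hi_hz, fs, taps)
    lp_lo = lowpass_kernel(lo_hz, fs, taps)
    return [a - b for a, b in zip(lp_hi, lp_lo)]

def convolve(x, h):
    """Direct-form FIR filtering (same length as input)."""
    n, m = len(x), len(h)
    return [sum(h[j] * x[i - j] for j in range(m) if 0 <= i - j < n)
            for i in range(n)]
```

At a 16kHz sample rate, `bandpass_kernel(4000, 8000, 16000)` gives the 4-8kHz band described above.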

    But, with 8kHz audio having a 4kHz Nyquist frequency, this creates a
    problem. Can sort of hear vowel sounds, but sounds are often largely
    undifferentiated. Like, can hear that someone is talking, or whose
    voice it is, but not really what they are saying.

    Though, does leave a mystery then of why telephony would have used 8kHz,
    when presumably intelligible speech is the whole point of a telephone?...

    Then again, my actual phone experience has mostly been muffled with a
    rather obnoxious hiss (like, if the general phone experience wasn't bad enough, they have to punish people for using the phone by having some
    truly awful audio quality...).



    But, then, had noted that with the ADPCM hack, or the B-Spline fitting
    hack, it is again possible to hear what is being said at an 8kHz
    sampling rate. But...

    I don't really know why, or how it makes a difference, because
    presumably the Nyquist frequency is the same either way (but it is
    almost like the 4-8kHz band is still present somehow).

    Seemingly can't really push it down to a 6kHz sample rate though (seems
    like 6kHz might be closer to a hard-limit here).


    It is a mystery; does anyone have a possible explanation for these effects?...


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sat Sep 6 16:21:12 2025
    From Newsgroup: comp.arch


    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at
    low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    32000: Sounds good
    22050: Moderate
    16000: OK, Modest size, acceptable quality.
    Seems like best tradeoff if not going for high quality.
    11025: Poor, muffled.
    8000: Very poor, speech almost unintelligible (normally).
    But, it is seeming like a "weird hack" may exist here.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Schultz@david.schultz@earthlink.net to comp.arch on Sat Sep 6 11:59:37 2025
    From Newsgroup: comp.arch

    On 9/6/25 5:28 AM, BGB wrote:
       8000: Very poor, speech almost unintelligible (normally).
         But, it is seeming like a "weird hack" may exist here.

    You might want to look at how AT&T did it. It has been a while but I
    think this is near what they used. Back when phones were analog and
    digital was just getting started.
    --
    http://davesrocketworks.com
    David Schultz
    "The cheaper the crook, the gaudier the patter." - Sam Spade
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Brian G. Lucas@bagel99@gmail.com to comp.arch on Sat Sep 6 12:31:31 2025
    From Newsgroup: comp.arch

    On 9/6/25 11:59 AM, David Schultz wrote:
    On 9/6/25 5:28 AM, BGB wrote:
        8000: Very poor, speech almost unintelligible (normally).
          But, it is seeming like a "weird hack" may exist here.

    You might want to look at how AT&T did it. It has been a while but I think this
    is near what they used. Back when phones were analog and digital was just getting started.

    That was T1 carrier. When I looked at the schematics, I was surprised to see the audio compression was done in analog, using the exponential curve of a diode to get logarithmic compression. If I remember correctly:-)

    Brian

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.arch on Sat Sep 6 20:52:51 2025
    From Newsgroup: comp.arch

    On Sat, 6 Sep 2025 05:28:16 -0500
    BGB <cr88192@gmail.com> wrote:

    Just randomly thinking again about some things I noticed with audio
    at low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky
    32000: Sounds good
    22050: Moderate
    16000: OK, Modest size, acceptable quality.
    Seems like best tradeoff if not going for high quality.
    11025: Poor, muffled.
    8000: Very poor, speech almost unintelligible (normally).
    But, it is seeming like a "weird hack" may exist here.


    8000 x 8bit (mu-law in USA, A-law in majority of the world) was a
    standard sampling rate for digital back ends of analog wired telephony
    for more than 50 years. I didn't check, but would assume that it still
    is.
    Most people found it quite intelligible. Certainly more intelligible
    than cellular telephony, until, less than 20 years ago, cellular
    improved a little.


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Sat Sep 6 13:54:54 2025
    From Newsgroup: comp.arch

    On 9/6/2025 11:21 AM, MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at
    low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.


    Dunno. I mostly use headphones.


    Seemingly, at least with the headphones I have, I can hear tones up to
    around 17 kHz, but above this, pretty much nothing.

    I noticed when trying to get new headphones, I got some cheap ones at
    first that sounded like muffled crap (they were around $10 IIRC). I
    tried generating tones and with these headphones audio dropped off to
    nothing after around 11 kHz. Ended up needing to buy some slightly more expensive headphones (around $30 IIRC, from Logitech), which sounded a
    bit better.

    Ended up giving the cheap ones to my dad, they apparently worked fine
    for him.



    Below 1kHz, sine waves rapidly drop off in intensity, whereas square and sawtooth waves retain full loudness.

    On the headphones, I can still hear sine waves (well under 1kHz) if the
    volume is fairly high.


    IRL, I have noted that I am mostly unable to hear tuning forks.

    My mom also recently got a "steel tongue drum" (with an apparent 432Hz tuning), which I had noted I can sorta hear, but the sound is very
    quiet. I mostly hear the "thwap" sound when she uses the little
    rubber-tipped mallet on it.

    If I put my hand near it, I can feel vibrations, but I don't really hear anything.



    Personally, much over a 32 kHz sample rate, any difference rapidly drops
    off, so 44100 and 48000 seem to sound basically the same.


    I was mostly trying to explore the area around 8000 though, where
    normally I hear crap-all. But, seemingly, with some questionable
    filtering, intelligible speech can come through, I just don't entirely understand how it works.

    But, as noted, there are several variations of the trick:
    Feed audio through ADPCM;
    Works better with either 2-bit/sample IMA,
    or with encoder tuned to overshoot.
    Model audio as line-fitting during downsampling.
    This is likely similar to what ADPCM ends up doing.
    Model audio as B-spline fitting.
    Seems to preserve more perceptual quality than the line fitting.

    But, what I am not entirely sure of is why this would make any real difference.

    But, can note that it does differ from the more conventional
    downsampling strategies of "just average stuff", in that both approaches
    tend to generate points outside the original curve.



    32000: Sounds good
    22050: Moderate
    16000: OK, Modest size, acceptable quality.
    Seems like best tradeoff if not going for high quality.
    11025: Poor, muffled.
    8000: Very poor, speech almost unintelligible (normally).
    But, it is seeming like a "weird hack" may exist here.


    Seemingly, there is no general disagreement that 11025 and 8000 sound
    kinda like crap?...

    I guess 11025 worked OK for Doom and Quake.
    Quake 2 had used 22050 (but, still 8-bit PCM).
    Quake 3 had used 22050 (but 16-bit PCM now).

    With Wolfenstein 3D, it wasn't until hearing some slightly better
    quality versions of the sound effects from the iOS port that I realized
    the enemies were saying stuff for their sound effects. Like, the
    low-level enemies apparently saying "Achtung!" rather than "Aaah-Uuuh"
    (but, with the audio from the DOS version, just sorta heard a whole lot
    of the latter).


    But, as noted, I mostly ended up preferring 16000 A-Law for sound
    effects and similar as a good tradeoff for space and quality. Also
    ADPCM, which uses less space.

    Some people seem to try to use MP3 or OGG for sound effects, but:
    128 kbps: Bulky
    64 kbps: Poor
    32 kbps: Sounds like a can full of broken glass.
    In addition, both formats are complicated, computationally expensive
    to decode, and typically need a third-party library to decode them.


    Also, in this case, 2-bit IMA ADPCM seems to somewhat beat MP3 at the
    low bitrate game (at least to my hearing).


    Not sure of a good way to go lower, best way I have found in past
    fiddling was, eg:
    Downsample by 1/16 or so to generate a reference line;
    Eg, spline-fitting the samples;
    Also generate the side-intensity
    Eg, standard deviation from samples and the spline.
    Store this line in some form, such as via ADPCM;
    Approximate the intermediate samples with patterns from a table.
    The table of patterns itself derived partly from the frequencies.
    Stores the relative intensity above/below the spline curve.

    Where, one way of storing the line is, say:
    4x 3-bit, each control-point sample, as ADPCM
    3 or 4-bit, side/intensity sample (eg, standard deviation channel).
    Pattern table might be stored as 4 or 8 bits per block.
    Pattern is chosen by whichever best fits the intermediate samples.

    But with, say, 8 bits per sample block and a 16x internal downsample,
    it could work out to 0.5 bits/sample (or, 16kHz audio at 8 kbps). With
    8-bit patterns, it is 0.75 bits/sample.


    Example patterns:
    0: Flat line, follow spline
    1: Positive (sin 8*PI)
    2: Positive Hump (sin PI)
    3: Negative Hump
    4: Positive (sin 2*PI)
    5: Negative (sin 2*PI)
    6: Positive (sin 3*PI)
    7: Negative (sin 3*PI)
    8/9: 4*PI
    A/B: 5*PI
    C/D: 6*PI
    E/F: 7*PI
    ...
    If using 6 or 8-bit patterns, it can include a second (or 3rd)
    sub-frequency.
    00..0F: Same as above
    10..1F: Same main pattern as 00..0F
    Sub-frequency mirrored in frequency and polarity (+8 mod 16).
    Roughly 5/8 amplitude of main frequency.
    2x: Same, but lower intensity sub-frequency (3/8).
    3x: Same, but lower intensity sub-frequency (1/8).
    4x..7x: Same, but use a different sub-frequency index (+/-5 mod 16).
    Encodes offset sign and intensity (5/8 or 3/8).
    8x..Fx: Add a 3rd frequency, lower intensity than the second (1/8).
    Similar strategy to above.
    ...


    Decoding algorithm would work in blocks, eg:
    Unpack spline points;
    Interpolate splines for each sample;
    Multiply deviation channel with the values from the pattern table.
    This is then added onto the base spline.
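A heavily simplified sketch of those decode steps (linear interpolation stands in for the spline here, and the sine-based pattern is a placeholder for the actual pattern table, so the details are mine, not the format's):

```python
import math

def decode_block(ctrl, deviation, pattern_id, n=16):
    """Decode one block: interpolate between the bracketing control points,
    then add deviation * pattern on top.  ctrl holds the 2 control points
    bracketing the block; pattern_id selects a half-wave count (0 = flat)."""
    out = []
    for i in range(n):
        t = i / n
        base = ctrl[0] + (ctrl[1] - ctrl[0]) * t     # stand-in for spline interp
        pat = 0.0 if pattern_id == 0 else math.sin(pattern_id * math.pi * t)
        out.append(base + deviation * pat)
    return out
```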


    However, this sort of approach is somewhat more complicated than just
    using a low-bitrate ADPCM (and I haven't used it much).

    Also, quality is inferior to 2-bit ADPCM.

    But, not a lot in this area that doesn't sound like total garbage...


    I had noted in past experiments that, seemingly, the lower-limit scheme
    for intelligible speech (for me) was:
    Split audio into blocks of 128 samples (at a 16kHz sample rate);
    Match a sine-wave between 4 and 8 kHz (picking the loudest sine wave);
    Encode the frequency and intensity of this sine wave.

    This can achieve ~ 0.125 bits per sample, or 2 kbps.
    However, speech is very unnatural sounding;
    Pretty much any non-speech audio becomes unrecognizable noise.
    Where, say, frequency is a byte in steps of 16 Hz; and intensity is an
    A-Law value.

    Though, this pushes the limits of intelligibility, and it is possible
    that others might find such a scheme unintelligible.
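A sketch of the analysis side of that scheme (the 128-sample block and 16 Hz frequency step come from the description above; the brute-force correlation search is my guess at an implementation):

```python
import math

def loudest_sine(block, fs=16000, f_lo=4000, f_hi=8000, step=16):
    """Find the strongest sinusoid in [f_lo, f_hi) by direct correlation.
    Returns (frequency_hz, amplitude)."""
    n = len(block)
    best_f, best_a = f_lo, 0.0
    f = f_lo
    while f < f_hi:
        w = 2 * math.pi * f / fs
        c = sum(x * math.cos(w * i) for i, x in enumerate(block))
        s = sum(x * math.sin(w * i) for i, x in enumerate(block))
        a = 2.0 * math.hypot(c, s) / n
        if a > best_a:
            best_f, best_a = f, a
        f += step
    return best_f, best_a

def encode_blocks(samples, block=128):
    """Per block: one frequency byte (16 Hz steps above 4 kHz) + amplitude."""
    out = []
    for i in range(0, len(samples) - block + 1, block):
        f, a = loudest_sine(samples[i:i + block])
        out.append(((f - 4000) // 16, a))
    return out
```

The amplitude would then be stored as an A-Law byte, giving the ~2 byte/block (2 kbps) figure above.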



    Had also experimented with schemes of encoding the relative intensity
    of a series of 16 bands (between 4 and 8 kHz), but quality was also
    pretty low here (and it won neither on quality nor on the ability to
    achieve a low bitrate). Quality is better with more bands, but this
    quickly reaches a practical limit.

    It being seemingly more effective to pick 1 or 2 sine waves, and then
    encoding the specific frequency and intensity of each.

    For slightly more natural sound, can pick N sine waves from within
    specific frequency ranges, say, 4 waves:
    2-3kHz, 3-4kHz, 4-6 kHz, 6-8kHz
    Resulting in something slightly more like a normal human voice.
    But, still sounds unnatural.
    And, it still falls on its face for any non-speech audio.


    Also, can note:
    While I am saying sine waves here, wave shape is non-critical; it also
    seems to work if using square waves or similar.

    For these experiments, had mostly ended up discarding everything below
    2kHz, as it seems to not contain anything particularly relevant.

    Can note that the "block sampling rate" for this approach seems to
    need to be over 100 Hz for best effect (a block size of 128 samples
    giving a 125Hz block-sampling frequency).


    But, can note that seemingly no mainline audio codecs work this way...

    ...


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Sat Sep 6 14:19:40 2025
    From Newsgroup: comp.arch

    On 9/6/2025 12:52 PM, Michael S wrote:
    On Sat, 6 Sep 2025 05:28:16 -0500
    BGB <cr88192@gmail.com> wrote:

    Just randomly thinking again about some things I noticed with audio
    at low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky
    32000: Sounds good
    22050: Moderate
    16000: OK, Modest size, acceptable quality.
    Seems like best tradeoff if not going for high quality.
    11025: Poor, muffled.
    8000: Very poor, speech almost unintelligible (normally).
    But, it is seeming like a "weird hack" may exist here.


    8000 x 8bit (mu-law in USA, A-law in majority of the world) was a
    standard sampling rate for digital back ends of analog wired telephony
    for more than 50 years. I didn't check, but would assume that it still
    is.

    It seems that the issue isn't (purely) with the sample rate or encoding.


    But, there are some "weird hacks" that can be done in audio processing
    when downsampling that seem to notably increase intelligibility at an
    8kHz sample rate (in which case, A-Law is back to being effective again).


    In general, I don't fault mu-law or A-Law, as the quality they give is
    by far superior to 8-bit linear PCM.


    Just, the "standard" audio down-sampling strategies (glorified
    averaging in various forms) sort of result in the audio becoming
    muffled and speech poorly intelligible at low sample rates.

    Whereas, the "hacky" strategies (line or curve fitting) seem to give a
    better result.

    I am more just sort of at a loss as to what is going on here exactly, as
    it appears to defy common wisdom about how audio resampling (or audio
    quality) should work.


    Well, and as noted my perception is often at odds with what an RMSE
    score says (where RMSE tends to more strongly prefer the
    muffled-sounding versions). Where, usually, RMSE is sort of the gold
    standard of measuring quality in image and audio processing.


    Most people found it quite intelligible. Certainly more intelligible
    than cellular telephony, until, less than 20 years ago, cellular
    improved a little.


    Whatever the cellphones are doing now still sounds like garbage, even
    versus what I get from 2-bit IMA ADPCM at 8kHz ...

    Like, if I encode some speech as 2-bit ADPCM, at least I can still
    understand what is being said (even if 2-bit ADPCM doesn't necessarily
    give the best audio quality; and is kinda poorly supported in SW vs the
    more common 4-bit ADPCM).


    Like, it is a disincentive to talk over the phone when I have to make
    extra effort to try to decipher what people are saying due to poor audio quality.





    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.arch on Sat Sep 6 13:18:46 2025
    From Newsgroup: comp.arch

    On 9/6/2025 11:54 AM, BGB wrote:
    On 9/6/2025 11:21 AM, MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    ...
    But, can note that seemingly no mainline audio codecs work this way...

    Playing around with WAV almost destroyed my eardrums and my speakers.
    FWIW, I have an example of a wav experiment right here:

    https://youtu.be/DrPp6xfLe4Q?t=63



    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Sat Sep 6 15:42:17 2025
    From Newsgroup: comp.arch

    On 9/6/2025 3:18 PM, Chris M. Thomasson wrote:
    On 9/6/2025 11:54 AM, BGB wrote:
    On 9/6/2025 11:21 AM, MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:


    ...



    Also, can note:
    While I am saying sine waves here, wave shape is non critical, it also
    seems to work if using square waves or similar.

    for these experiments, had mostly ended up discarding everything below
    2kHz, as it seems to not contain anything particularly relevant.

    Can note that the "block sampling rate" for this approach seems to
    needs to be over 100 Hz for best effect (a block size of 128 samples
    giving a 125Hz block-sampling frequency).


    But, can note that seemingly no mainline audio codecs work this way...

    Playing around with WAV almost destroyed my eardrums and my speakers.
    FWIW, I have an example of a wav experiment right here:

    https://youtu.be/DrPp6xfLe4Q?t=63


    This is fairly quiet (apart from a slight warbling sound) until the
    piano comes in at around 1:20...


    I don't have any of my experiments here on YouTube; would likely need to
    find some good public domain audio test examples (and/or record myself speaking, but would rather not), and set something up here.

    Though, in this case, it would be more examples of "pushing the limits
    for poor audio quality".



    Did find a video of another guy doing something vaguely similar to what
    I have done in some experiments:
    https://www.youtube.com/watch?v=qosYRO6WjkQ

    But, his examples sound very different from mine (with a characteristic
    sound more like bad MP3 compression), I suspect because he was using
    different frequency bands or similar (as noted, mine ignored pretty much everything below 2kHz).





    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.arch on Sat Sep 6 14:37:29 2025
    From Newsgroup: comp.arch

    On 9/6/2025 1:42 PM, BGB wrote:
    On 9/6/2025 3:18 PM, Chris M. Thomasson wrote:
    On 9/6/2025 11:54 AM, BGB wrote:
    On 9/6/2025 11:21 AM, MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:


    ...



    Also, can note:
    While I am saying sine waves here, wave shape is non critical, it
    also seems to work if using square waves or similar.

    for these experiments, had mostly ended up discarding everything
    below 2kHz, as it seems to not contain anything particularly relevant.

    Can note that the "block sampling rate" for this approach seems to
    needs to be over 100 Hz for best effect (a block size of 128 samples
    giving a 125Hz block-sampling frequency).


    But, can note that seemingly no mainline audio codecs work this way...

    Playing around with WAV almost destroyed my eardrums and my speakers.
    FWIW, I have an example of a wav experiment right here:

    https://youtu.be/DrPp6xfLe4Q?t=63


    This is fairly quiet (apart from a slight warbling sound) until the
    piano comes in at around 1:20...

    The rest of it is all from my MIDI program. I got afraid of
    experimenting with raw WAV, and made the volume lower. Have you ever had
    the piercing screech that makes you say WTF! Or, a massive bass that
    makes your speakers want to snuff themselves? Shit man.


    I don't have any of my experiments here on YouTube; would likely need to find some good public domain audio test examples (and/or record myself speaking, but would rather not), and set something up here.

    Though, in this case, it would be more examples of "pushing the limits
    for poor audio quality".

    :^) I still love music from the SNES:

    (Donkey Kong Country 2 - Stickerbush Symphony) https://youtu.be/nwBlulZ2Uq8?list=RDnwBlulZ2Uq8




    Did find a video of another guy doing something vaguely similar to what
    I have done in some experiments:
    https://www.youtube.com/watch?v=qosYRO6WjkQ

    But, his examples sound very different from mine (with a characteristic sound more like bad MP3 compression), I suspect because he was using different frequency bands or similar (as noted, mine ignored pretty much everything below 2kHz).

    Thanks for that link.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Terje Mathisen@terje.mathisen@tmsw.no to comp.arch on Sun Sep 7 12:26:37 2025
    From Newsgroup: comp.arch

    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at
    low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    My ears are not good enough to notice the difference between CD quality
    and AAC/high-sample-rate MP3/Ogg Vorbis/etc., but according to my
    savant (?) cousin, who could listen to a 16 min piece of music once and
    then write down the score for all the instruments, none of them sound
    like live; but they are close enough that he can listen and internally
    translate to what it would have sounded like in a concert.

    Terje
    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Sun Sep 7 14:59:23 2025
    From Newsgroup: comp.arch

    On 9/7/2025 5:26 AM, Terje Mathisen wrote:
    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at
    low sample rates.

    For baseline, can note, basic sample rates:
        44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    My ears are not good enough to notice the difference between CD quality, AAC/high sample rate MP3/ogg vorbis/etc, but according to my savant (?) cousin who could listen to a 16 min piece of music once and then write
    down the score for all the instruments, none of them sound like live,
    but they are close enough that he can listen and internally translate to what it would have sounded like in a concert.


    To me, 44100 and 48000 sound basically the same, so not much gain in
    going higher.

    The difference between 32000 and 44100 is slight.

    Though, one merit of 32000 is that it is 2x 16000.

    So, audio can be split into two domains based on power-of-2 relations:
    8000 / 16000 / 32000
    11025 / 22050 / 44100


    On a PC, 32K or 44K are good "output" sample rates, though usually
    things like sound-effects are preferably stored at lower rates (8/11/16K).

    In some cases, size matters more than quality. For example, in 3D
    engines, "ambient" sound effects can be quite large, and it may be
    preferable to store them at the lowest quality one can get away with
    (like, few people are going to notice if the "wind howling in the
    background" is stored at poor quality).


    Though, sometimes it is noticeable:
    One animated show known as "Bravest Warriors" (made by the same guy who
    made "Adventure Time", with a similar art style) typically uses high
    quality audio for the voice acting, but usually "kinda poor" audio
    quality for any sound effects and background music (enough so that it
    is kinda noticeable). Like, someone was like "8kHz 8-bit PCM?... Good
    enough." (not sure of the specifics exactly).

    Like, it is noticeable when one mixes 44K voice acting with 8K
    background music.



    But, in general, for something like a 3D engine I would think, say:
    16K: any voice dialog or similar;
    11K: generic weapon sound effects or explosions or similar.
    8K: most ambient sound effects
    Such as wind or fluorescent light hum, etc.
    Maybe pay a little more for background music, like maybe 16kHz.

    A-Law works well, ADPCM is also OK.
    Just, preferably avoid 8-bit linear PCM due to the hiss issue.
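    For reference, G.711-style A-Law companding can be sketched as follows
    (a Python port of the classic reference logic; it works on the 13-bit
    magnitude per G.711's segment layout):

```python
def linear2alaw(pcm):
    # Encode a 16-bit signed sample to one G.711 A-Law byte.
    seg_end = [0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF]
    pcm >>= 3                      # work in the 13-bit domain
    if pcm >= 0:
        mask = 0xD5                # sign bit set, even bits toggled
    else:
        mask = 0x55
        pcm = -pcm - 1
    seg = next((i for i, e in enumerate(seg_end) if pcm <= e), 8)
    if seg >= 8:
        return 0x7F ^ mask
    aval = seg << 4
    aval |= (pcm >> 1 if seg < 2 else pcm >> seg) & 0x0F
    return aval ^ mask

def alaw2linear(aval):
    # Decode one A-Law byte back to a 16-bit signed sample.
    aval ^= 0x55
    t = (aval & 0x0F) << 4
    seg = (aval & 0x70) >> 4
    if seg == 0:
        t += 8
    elif seg == 1:
        t += 0x108
    else:
        t = (t + 0x108) << (seg - 1)
    return t if (aval & 0x80) else -t
```

    The logarithmic segments are why it avoids the hiss: quantization error
    stays roughly proportional to the signal level, instead of being a
    fixed-size error as with 8-bit linear PCM.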

    Here, 2-bit ADPCM is interesting, as it can allow 16K at the same
    bitrate as 8K, or 2b 8K if the quality doesn't matter (or there is some
    other reason to save space). Though the quality tradeoff of 2b/16K vs
    4b/8K is likely to depend on the use-case.


    Then again, maybe audio filtering could remove the hiss from 8-bit PCM,
    or avoid generating it, but A-Law is still preferable here.

    And, just say no to low bitrate MP3 (why do people even do this?...).

    Like the hiss from 8-bit PCM, the artifacts of the MP3 compression would
    blend into other stuff and make everything else sound worse.

    ...

    OTOH, if one has music that is likely to be "actively listened to", then
    32K or 44K makes sense, but then MP3 also makes sense, as pretty much
    any "reasonable" way of storing 32K or 44K music is going to exceed the bitrate of 128kbps MP3 (and, at 128kbps, it avoids the artifacts).

    Likewise music storage is an area where ADPCM is weak.
    But, not everything is music.

    But, can note that, seemingly, 16 kHz 4-bit ADPCM is a pretty good
    format for speech, quality wise. This is 64 kbps, and IMHO beats MP3 at this task.




    Also, both Int16 and Binary16 float can give what seems like "perfect"
    audio quality when used as storage formats. But, sometimes one may need
    more as a working format (such as Binary32).

    Or, say:
    Int16: Final Output
    Binary16: Intermediate Storage or Math
    A-Law: Storage / Sound-Effects
    Binary32: Active processing (such as mixing or filtering operations).

    For many less trivial audio tasks, integer or fixed-point math is
    lacking (10.22 fixed point or similar isn't quite ideal).



    In my case, generally 128 kbps MP3 sounds OK (and is pretty much the
    "gold standard" MP3 bitrate), but the quality of MP3 drops off fast if
    one goes much lower. At roughly 64 kbps or below, MP3 is kinda trash.
    Even 96 kbps is borderline, as the quality drops off very rapidly.

    OGG Vorbis is kinda similar, it has a slightly different sound to it but
    is basically the same stuff.

    The issue seems to be for me that MP3 and OGG are *very* abusive to the
    higher frequency ranges. They don't just discard them (as downsampling
    would do), but instead result in a whole lot of chaotic noise in the
    higher frequencies, which is kinda obnoxious to me.

    So, OK format for audio distribution or music on the internet.
    But, poor formats for sound effects.




    One effect I can hear IRL, that I sometimes get annoyed with in games
    lacking, is that when I hear a sound IRL, I don't just hear the sound,
    but also the shape of the room or similar that the sound is present in.
    Like, I can hear the sounds reflecting off walls, and sometimes between
    the panels in doors and other things.

    Whereas, most games pretty much don't bother. Any sound effects are
    played as point sources in an empty void.


    Sometimes, games have tried to fake it (with the whole "EAX" thing that
    was popular at one point), but then they use "presets" of just some
    generic space that doesn't really match at all with the space the player
    is actually in.

    Like, "Woo, now you have a closet sized wooden box centered around your head...", it follows you along, until it then switches out for a
    slightly bigger concrete-walled box (still centered around the player's
    head). Sometimes the box is tall or narrow, or shorter and wider, but,
    yeah, "kinda weak", not much better than just playing the sounds into an
    empty void.


    Well, the "EAX" thing did at least try to make the audio more directional.
    Many games just adjust the left/right balance for stereo;
    But, for better effect, one needs to do a phase offset;
    And, some other adjustments for up/down and forward/back.
    For example, above/below/behind has less stereo separation.
    Also the amount of stereo separation is correlated to distance.
    More distant also reduces separation, not just direction.
    ...
    And, also apply Doppler shifts for the relative velocities;
    ...

    Though, neither EAX nor OpenAL seemingly bothered with Doppler effects.
    But, at the speeds characters often move in games, it is enough that
    Doppler shifting becomes at least semi-relevant.

    Though, in this case, the phase offset and Doppler effects are
    "basically" the same phenomena, so in effect one can sort of represent
    each "ear" as a point in terms of the Doppler math, with the appropriate propagation delay, then the phase offsets happen naturally.



    In my 3D engines, I have sometimes made an effort to do a better job
    here, typically trying to calculate the audio contribution of sound
    reflecting off each block in the general vicinity, and then
    approximating for some more distant blocks.

    The effect is still rather poor, and the audio still often sounds a bit
    jank, but I had usually made an effort at least.

    Doing this "well" would be computationally expensive.


    Though, one partial workaround being to do some of this at lower sample
    rates internally (like, say, it is less noticeable if the audio
    reflections off a wall are happening as 8kHz and at A-Law quality, ...).

    Also, my more recent 3D engine doesn't handle the "underwater"
    situation very well, just sort of giving the "very excessive" reverb of
    the player's head being stuck inside a large number of blocks *all*
    reflecting the audio. But, it sorta works. Also a little wacky as it
    applies to the background music, and makes it more obvious that the
    player is themselves the point-source of the background music
    (alternative would be to mix this in afterwards, and not have it be
    like the player is carrying around a boombox playing the background
    music). But, underwater as a sort of "reverb hell" sorta works; still
    maybe better IMO than "well, you are still in a void, but now it is
    playing scuba related sound effects...".


    Though it may not seem like much, this sort of audio processing can
    also eat a lot of RAM for the intermediate audio data. So, one doesn't
    really want to just store all the intermediate audio as 'float'.

    But, then again, most people seemingly don't bother with any of this.



    Terje



    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sun Sep 7 21:12:03 2025
    From Newsgroup: comp.arch


    Terje Mathisen <terje.mathisen@tmsw.no> posted:

    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at
    low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    My ears are not good enough to notice the difference between CD quality, AAC/high sample rate MP3/ogg vorbis/etc, but according to my savant (?) cousin who could listen to a 16 min piece of music once and then write
    down the score for all the instruments, none of them sound like live,
    but they are close enough that he can listen and internally translate to what it would have sounded like in a concert.

    Just after graduating CMU I worked in a high end stereo store. The
    listening room was 4 walls, none of them parallel, and a slanted
    ceiling; so it had essentially no reverberation. The Pittsburgh string
    quartet rented out the room for various practices, and we recorded on
    9-track tape at 60"/s and played it back on Dahlquist speakers and
    other high end amplification; diddling with the equalization until the
    recording sounded like the live string quartet (only seconds apart
    live<->recorded).

    I have/had 2 brothers who could listen to a movie and then go write
    down the score of one or two of the tunes. I, personally, can't carry a
    tune in a basket--but I admire those who can. I can hear things that
    others don't seem to. Things like whether the phono section of a
    pre-amp has a tube or not--it's all in the harmonics!!

    Terje

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sun Sep 7 21:13:31 2025
    From Newsgroup: comp.arch


    BGB <cr88192@gmail.com> posted:

    On 9/7/2025 5:26 AM, Terje Mathisen wrote:
    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at low sample rates.

    For baseline, can note, basic sample rates:
        44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    My ears are not good enough to notice the difference between CD quality, AAC/high sample rate MP3/ogg vorbis/etc, but according to my savant (?) cousin who could listen to a 16 min piece of music once and then write down the score for all the instruments, none of them sound like live,
    but they are close enough that he can listen and internally translate to what it would have sounded like in a concert.


    To me, 44100 and 48000 sound basically the same, so not much gain in
    going higher.

    The difference between 32000 and 44100 is slight.

    The difference is in the phase of the high end spectrum 15K-22K
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Sun Sep 7 18:58:18 2025
    From Newsgroup: comp.arch

    On 9/7/2025 4:13 PM, MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    On 9/7/2025 5:26 AM, Terje Mathisen wrote:
    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at low sample rates.

    For baseline, can note, basic sample rates:
        44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    My ears are not good enough to notice the difference between CD quality, AAC/high sample rate MP3/ogg vorbis/etc, but according to my savant (?)
    cousin who could listen to a 16 min piece of music once and then write
    down the score for all the instruments, none of them sound like live,
    but they are close enough that he can listen and internally translate to what it would have sounded like in a concert.


    To me, 44100 and 48000 sound basically the same, so not much gain in
    going higher.

    The difference between 32000 and 44100 is slight.

    The difference is in the phase of the high end spectrum 15K-22K

    I can notice a slight difference, but as noted, it isn't much...


    Meanwhile, decided to check the delta between:
    Audio downsampled from 16K to 8K via averaging pairs of samples;
    Audio downsampled from 16K to 8K via spline curve fitting.

    And, I had noticed there is a difference in the 8 kHz signals.
    The curve-fitting delta signal is quite strong in high frequencies (with
    much of the total energy in the 2 to 4 kHz range); and actually a fair
    bit louder than could have been expected.

    The difference signal itself contains intelligible speech (and most
    other significant aspects of the audio), though exists pretty much
    entirely in the high part of the frequency range.


    Where, say (S0/S1/S2/S3) for a spline, evaluating a point between S1 and S2:
    Linear:
    V=(S1*(1-F))+(S2*F)
    Quadratic Spline (Bezier):
    P1=(S0*(1-F))+(S1*F)
    P2=(S1*(1-F))+(S2*F)
    V=(P1*(1-F))+(P2*F)
    Cubic Spline (Bezier):
    P1=(S0*(1-F))+(S1*F)
    P2=(S1*(1-F))+(S2*F)
    P3=(S2*(1-F))+(S3*F)
    Q1=(P1*(1-F))+(P2*F)
    Q2=(P2*(1-F))+(P3*F)
    V=(Q1*(1-F))+(Q2*F)


    This is a different spline construction than I had usually used for
    audio processing:
    G=1-F
    P1=(S1*(1+F))-(S0*F)
    P2=(S2*(1+G))-(S3*G)
    V=(P1*(1-F))+(P2*F)

    But it seems the former may have more useful properties in this case
    (mostly in that, when estimating the control points, the former spline
    better preserves high-frequency properties of the signal); whereas the
    latter is mostly only useful for interpolation tasks (such as upsampling).
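    Writing both constructions out makes the difference concrete: the
    Bezier (de Casteljau) form passes through S0 and S3 at the endpoints,
    treating S1/S2 as control points, while the other form passes through
    S1 and S2 directly (which is why it works for interpolation without any
    fitting step):

```python
def bezier_cubic(s0, s1, s2, s3, f):
    # De Casteljau evaluation: hits s0 at f=0 and s3 at f=1;
    # s1/s2 act as control points, not passed-through samples.
    p1 = s0 * (1 - f) + s1 * f
    p2 = s1 * (1 - f) + s2 * f
    p3 = s2 * (1 - f) + s3 * f
    q1 = p1 * (1 - f) + p2 * f
    q2 = p2 * (1 - f) + p3 * f
    return q1 * (1 - f) + q2 * f

def interp_cubic(s0, s1, s2, s3, f):
    # The interpolation-oriented form: passes through s1 at f=0 and
    # s2 at f=1, so it can be used directly on sample values.
    g = 1 - f
    p1 = s1 * (1 + f) - s0 * f
    p2 = s2 * (1 + g) - s3 * g
    return p1 * (1 - f) + p2 * f
```

    So using the Bezier form on a sample stream requires estimating control
    points that make the curve pass near the samples, which is where the
    fitting described below comes in.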


    Though, for 2x cases, F is only ever 0.25 or 0.75, partly simplifying
    the math.

    But, for calculating the points, one doesn't actually have the
    preceding or following control points, so it is necessary to carry the
    math out over additional samples into the past and future to estimate
    the other control points before calculating the current one (or, it
    gets a bit more hairy). For the terminal points, linear extrapolation
    seems to work.



    But, yeah, the control-points style signal seems to be significantly
    boosted in terms of high-frequency components.

    And, as audio, it seems to preserve some aspects of the 16kHz signal
    that are otherwise lost when downsampling to 8 kHz.


    I guess I could try looking some at a reconstructed version of the 16
    kHz sample and see if anything survives past the 4kHz mark.

    Well, OK, trying to resample it up to 16kHz using the B-spline is just
    sort of weird. Seems almost like the math is broken somehow.

    In the reconstruction attempt there are a few big notches in the
    spectrum; seems to be an issue with the output spline rather than the
    input signal.

    Seems to not be an issue with my typical spline, rather something
    specific about my attempt at upsampling again with a cubic Bezier
    spline.


    The upsampled reconstruction attempt sounds like dog crap; but it does interestingly seem to have stuff going on past the 4kHz Nyquist
    cutoff (so, this still leaves the possibility that parts of the higher frequencies may be surviving the downsampling process).

    Though, curiously, despite sounding like dog-crap and having big notches
    in the spectrum, the Bezier Spline reconstruction does have the lower
    RMSE value for some reason.


    Though, can note that the input audio, it seems, is fairly weak in the
    4-8 kHz range, so it isn't entirely obvious what specifically is being affected
    in downsampling, but seemingly clearly something at least.


    Well, more fiddling it seem to try to figure this out...


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Schultz@david.schultz@earthlink.net to comp.arch on Sun Sep 7 21:16:34 2025
    From Newsgroup: comp.arch

    On 9/7/25 6:58 PM, BGB wrote:
    Meanwhile, decided to check the delta between:
      Audio downsampled from 16K to 8K via averaging pairs of samples;
      Audio downsampled from 16K to 8K via spline curve fitting.

    Seems inadequate to satisfy the Nyquist criterion.
    --
    http://davesrocketworks.com
    David Schultz
    "The cheaper the crook, the gaudier the patter." - Sam Spade
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Sun Sep 7 22:55:34 2025
    From Newsgroup: comp.arch

    On 9/7/2025 9:16 PM, David Schultz wrote:
    On 9/7/25 6:58 PM, BGB wrote:
    Meanwhile, decided to check the delta between:
       Audio downsampled from 16K to 8K via averaging pairs of samples;
       Audio downsampled from 16K to 8K via spline curve fitting.

    Seems, inadequate to satisfy the Nyquist criteria.


    Dunno.

    Averaging pairs would be the traditional method for downsampling, but,
    when downsampling to 8kHz, the audio sounds muffled, and
    intelligibility of speech is poor.


    Curve fitting seems to generate a "perceptually better" result at 8kHz.
    But, is weirdly behaved;
    Apparently makes the audio harder to upsample again without it sounding
    like crap (at least when upsampling with some variant of a cubic spline;
    seems naive LERP is still happy here).

    Honestly, I am not sure what I am doing here, nor of the exact nature
    of the effect I have stumbled onto...



    General algorithm for spline fitting ATM:
    Reuse the two previous points (S0 and S1);
    Make a guess for S2 and S3;
    Use iterative convergence to reduce error for S3;
    Use iterative convergence to reduce error for S2;
    Use S2 as next point.

    The spline function can be swapped out.
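    One possible reading of this fitting loop, as a coordinate-descent
    sketch (the error function, the F positions of 1/4 and 3/4 for 2x
    downsampling, the initial guess, and the pass counts are all
    assumptions on my part, not the exact procedure described above):

```python
def bezier(s0, s1, s2, s3, f):
    # Cubic Bezier via de Casteljau.
    p1 = s0 * (1 - f) + s1 * f
    p2 = s1 * (1 - f) + s2 * f
    p3 = s2 * (1 - f) + s3 * f
    q1 = p1 * (1 - f) + p2 * f
    q2 = p2 * (1 - f) + p3 * f
    return q1 * (1 - f) + q2 * f

def fit_next_points(s0, s1, targets, fpos=(0.25, 0.75)):
    # Given fixed S0/S1 and the input samples this span must match,
    # guess S2/S3, then refine S3 and S2 by halving-step descent.
    s2 = targets[-1]
    s3 = 2 * targets[-1] - targets[0]   # linear-extrapolation guess
    def err(a, b):
        return sum((bezier(s0, s1, a, b, f) - t) ** 2
                   for f, t in zip(fpos, targets))
    for _ in range(40):                 # convergence passes
        for refine_s3 in (True, False): # S3 first, then S2
            step = 1.0
            while step > 1e-4:
                for d in (step, -step):
                    a = s2 + (0 if refine_s3 else d)
                    b = s3 + (d if refine_s3 else 0)
                    if err(a, b) < err(s2, s3):
                        s2, s3 = a, b
                step *= 0.5
    return s2, s3
```

    S2 would then be emitted as the next control point and the window
    advanced, matching the "use S2 as next point" step.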

    Of the two:
    Well, if used in the convergence-step, my usual spline function
    generates terrible results.

    The Bezier spline function seems better behaved.

    After upsampling:
    Usual spline gives poor results;
    Bezier spline gives low RMSE, but has a big ugly notch at around 4 kHz;
    Naive LERP seemingly does OK though (not great, but avoids the frequency notches).

    Tried using a quadratic spline, didn't really work very well.


    In most variations, the notch appears to be near 4kHz, or near the
    Nyquist frequency of the 8kHz sample rate (with another smaller notch
    one octave lower, around 2 kHz; and another small dip around 1 kHz).

    Actually, thinking about it, a notch right near the Nyquist frequency of
    the control points might be an inescapable side effect of using a spline
    here. So, it can somehow encode some information slightly above the
    Nyquist rate into the spline, but not *at* the Nyquist rate.


    So, it seems the curve-fitting is causing things to go weird with the
    spline during upsampling.


    Well, it is a weird effect, in any case.


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.arch on Mon Sep 8 10:59:50 2025
    From Newsgroup: comp.arch

    On 07/09/2025 23:12, MitchAlsup wrote:

    Terje Mathisen <terje.mathisen@tmsw.no> posted:

    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    My ears are not good enough to notice the difference between CD quality,
    AAC/high sample rate MP3/ogg vorbis/etc, but according to my savant (?)
    cousin who could listen to a 16 min piece of music once and then write
    down the score for all the instruments, none of them sound like live,
    but they are close enough that he can listen and internally translate to
    what it would have sounded like in a concert.

    Just after graduating CMU I worked in a high end stereo store. The listening room was 4 walls none of them parallel and a slanted ceiling; so it had essentially no reverberation. The Pittsburgh string quartet rented out
    the room for various practices, and we recorded on 9 track tape at 60"/s
    and played it back on Dalquist speakers and other high end amplification; diddling with the equalization until the recording sounded like the live string quartet (only seconds apart live<->recorded).

    I have/had 2 brothers who could listen to a movie and then go write down
    the score of one or two of the tunes. I, personally, can't carry a tune
    in a basket--but I admire those who can. I can hear things that others don't seem to. Things like whether the phono section of a pre-amp has a tube or not--its all in the harmonics!!

    Having a good memory for tunes, or being able to replicate tunes, and
    being able to distinguish the quality of sound reproduction is not
    actually highly correlated.  The former is primarily a higher-level
    brain function, while the latter is partly physical, partly low-level
    brain (or software vs. hardware, to suit the group better!).

    For those that work directly with music, age brings experience and
    improves abilities like recognising or duplicating tunes. Age also
    brings deterioration in the physical aspects of hearing - especially at
    higher frequencies.

    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.

    One key difference, however, is that it is easy to appreciate when
    people can listen to a tune once and play it again afterwards - you can
    watch them do it. For people who say they can distinguish CD audio from
    AAC or other high bps compressed audio, and other "golden ears"
    distinctions, it's a different matter - in double-blind tests, most fail badly. There are a great many factors involved in high-quality audio reproduction - the basic sample rate is only one of them.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Schultz@david.schultz@earthlink.net to comp.arch on Mon Sep 8 06:49:55 2025
    From Newsgroup: comp.arch

    On 9/7/25 10:55 PM, BGB wrote:
    Dunno.

    Averaging pairs would be the traditional method for downsample, but,
    when downsampling to 8kHz, audio sounds muffled, and intelligibility of speech is poor.

    It has been a couple of decades since that discrete time signal
    processing course so the details have faded. But I do know that aliasing
    can be a big problem. Hence the good low pass filter as part of the
    decimation process.

    Assuming your 16KSPS data started with a good presample filter, there
    was no signal or noise (or at least negligible) above 8KHz, but it is
    still going to have stuff between 4KHz and 8KHz. Fail to filter that
    adequately and it gets folded/aliased to a lower frequency.

    So you need a discrete time filter to remove most of the information in
    your data above 4KHz while leaving what you want alone.

    The moving average filter is not a good choice. Sure it has a zero at
    your new sample rate but it has poor performance in general.
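    For comparison, a standard windowed-sinc decimator (Hamming window; the
    tap count and 3.5 kHz cutoff here are illustrative choices) looks like:

```python
import math

def design_lowpass(num_taps=101, fc=3500.0, fs=16000.0):
    # Hamming-windowed sinc FIR, cutoff fc, for use before 2:1 decimation.
    m = num_taps - 1
    h = []
    for n in range(num_taps):
        k = n - m / 2
        x = 2.0 * fc / fs
        ideal = x if k == 0 else math.sin(math.pi * x * k) / (math.pi * k)
        w = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / m)
        h.append(ideal * w)
    return h

def decimate2(sig, h):
    # Filter, then keep every second sample.
    out = []
    for i in range(0, len(sig), 2):
        acc = 0.0
        for j, c in enumerate(h):
            if 0 <= i - j < len(sig):
                acc += c * sig[i - j]
        out.append(acc)
    return out
```

    Unlike the two-sample moving average, this attenuates the 4-8 kHz band
    by tens of dB before decimation, so very little of it aliases back into
    the 0-4 kHz output.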
    --
    http://davesrocketworks.com
    David Schultz
    "The cheaper the crook, the gaudier the patter." - Sam Spade
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Mon Sep 8 15:10:31 2025
    From Newsgroup: comp.arch

    On 9/8/2025 3:59 AM, David Brown wrote:
    On 07/09/2025 23:12, MitchAlsup wrote:

    Terje Mathisen <terje.mathisen@tmsw.no> posted:

    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at low sample rates.

    For baseline, can note, basic sample rates:
         44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    My ears are not good enough to notice the difference between CD quality,
    AAC/high sample rate MP3/ogg vorbis/etc, but according to my savant (?)
    cousin who could listen to a 16 min piece of music once and then write
    down the score for all the instruments, none of them sound like live,
    but they are close enough that he can listen and internally translate to
    what it would have sounded like in a concert.

    Just after graduating CMU I worked in a high end stereo store. The
    listening
    room was 4 walls none of them parallel and a slanted ceiling; so it had
    essentially no reverberation. The Pittsburgh string quartet rented out
    the room for various practices, and we recorded on 9 track tape at 60"/s
    and played it back on Dalquist speakers and other high end amplification;
    diddling with the equalization until the recording sounded like the live
    string quartet (only seconds apart live<->recorded).

    I have/had 2 brothers who could listen to a movie and then go write down
    the score of one or two of the tunes. I, personally, can't carry a tune
    in a basket--but I admire those who can. I can hear things that others
    don't
    seem to. Things like whether the phono section of a pre-amp has a tube or
    not--its all in the harmonics!!

    Having a good memory for tunes, or being able to replicate tunes, and
    being able to distinguish the quality of sound reproduction is not
    actually highly correlated.  The former is primarily a higher-level
    brain function, while the latter is partly physical, partly low-level
    brain (or software vs. hardware, to suit the group better!).


    FWIW: My musical ability is almost non-existent.


    For those that work directly with music, age brings experience and
    improves abilities like recognising or duplicating tunes.  Age also
    brings deterioration in the physical aspects of hearing - especially at higher frequencies.


    This is a concern for me, partly if I lose high frequencies, seemingly I
    won't have anything, as my hearing of low frequencies (sub 1kHz) is
    seemingly already impaired.

    Seemingly, most of my world of audio perception is located between 1kHz
    and 8kHz.

    Most lower frequencies are more felt than heard.



    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.


    I listen to music a lot, but usually House and EDM and similar.

    When I was younger, a lot of Goth (mostly of the Synthpop/Synthwave
    variety), and Industrial.

    There was Dubstep for a while, but seemingly the whole genre kind of
    imploded (though, never got much mainstream popularity aside from
    "Skrillex").



    When I was in childhood, there were mostly bands like "Nine Inch Nails"
    and "Marilyn Manson" and similar (FWIW: I am Gen Y / Millennial).

    Well, I think also popular (but I wasn't into them) were bands like
    "Eminem" and "Backstreet Boys", etc, then a little later, I think a lot
    of people were into "Linkin Park" and similar.

    Had noted that some older music was also pretty good, like from bands
    like "Depeche Mode" and similar (though, had noted seemingly some
    people object to this band, due to its apparent popularity with gays).


    Contrast, my dad is mostly more into "Heavy Metal" and similar.

    Not actually sure what anyone younger than me is into.



    One key difference, however, is that it is easy to appreciate when
    people can listen to a tune once and play it again afterwards - you can watch them do it.  For people who say they can distinguish CD audio from AAC or other high bps compressed audio, and other "golden ears" distinctions, it's a different matter - in double-blind tests, most fail badly.  There are a great many factors involved in high-quality audio reproduction - the basic sample rate is only one of them.


    I am not really a "golden ear" AFAICT.

    At high bitrates, or high sample rates, I can't hear much difference.



    But, mostly just noting that it is at low bitrates where things like MP3
    and similar start to sound like crap.

    Where, say:
    128-192 kbps: Good
    96: Obvious degradation
    64: Kinda bleh
    48: Rather poor
    32: Rattling cans of broken glass.
    24: Mostly chaotic whistling noises
    ...

    Well, kinda like JPEG:
    Looks good at 80-95% quality;
    But, load and resave, it gets worse each time.
    Save and resave an image at 0% a few times, yeah...
    This itself became a meme at one point.

    At the lower end, things like 16-color BMP images can still be useful
    (and with LZ compression, like gzip or similar, is often smaller than
    the same image expressed as a PNG). Likewise for monochrome.

    Whereas, despite being good for photos, JPEG sucks for pixel art or
    16-color or monochrome graphics.

    Well, and I guess, much like not all images are photos, not all audio is music.


    Well, unless there is someone who wants to disagree with me about the
    relative (lacking) audio quality of 32 kbps MP3, ...




    For low bitrate uses (usually mono), I have personally often found ADPCM
    to be a good/better option.

    Where, say:
    16000, 4-bit ADPCM: 64 kbps
    11025, 4-bit ADPCM: 44 kbps
    16000, 2-bit ADPCM: 32 kbps
    8000, 4-bit ADPCM: 32 kbps
    11025, 2-bit ADPCM: 22 kbps
    8000, 2-bit ADPCM: 16 kbps (current lowest-quality option in BGBCC)
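The figures above are just sample rate times bits per sample; a quick mechanical check (a hypothetical helper, with 11025 x 4 = 44100 truncating to 44):

```c
/* Quick check of the ADPCM bitrate figures above:
 * kbps = sample_rate * bits_per_sample / 1000 (integer truncation). */
#include <assert.h>

static int adpcm_kbps(int sample_rate, int bits_per_sample)
{
    return sample_rate * bits_per_sample / 1000;
}
```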


    The recent "mystery" implied that there might be a way to squeeze more perceptual quality out of 8000 2-bit ADPCM (observing already that it
    seemed like it could somehow "enhance" audio to some extent depending on
    how the encoder was tuned, *1).

    But, it seems this "mystery" property might be poorly behaved, and may interact poorly with my usual upsampling/interpolation filters to
    produce a significant reduction in audio quality (effectively limiting
    the usefulness of trying to use it deliberately).



    *1: The usual strategy for 4-bit ADPCM encoding is that for each sample,
    one will linearly quantize the delta based on the step size, maybe round
    it, and encode this. This approach partly breaks down with 2-bit ADPCM,
    so a more effective strategy is to effectively "brute force search" the encoding space multiple samples at a time to find the path with the
    least error. But, with slight tweaks to the error estimation math, it is possible to get different effects.

    But, say, one option is searching 3 samples at a time, with a roughly 6
    bit search space. Searching 5 or 6 might seem obvious (then you can emit
    a whole byte at a time), but this is an order of magnitude slower. So,
    one can instead chain two dependent 3 sample searches to get a block of
    4 samples (or 1 byte).

    Theoretically, this can also be done with 4-bit ADPCM, but 4-bit has a
    bigger search space, so it is inherently slow. While 1 sample at a time encoding is possible with 2-bit ADPCM, its quality is poor.
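A minimal sketch of the multi-sample search idea above: try all 4^3 = 64 code paths for a 3-sample group and keep the one with least squared error. The step table and update rule below are generic IMA-style placeholders, not BGB's actual format:

```c
/* Sketch of "brute force search" 2-bit ADPCM encoding, 3 samples at a
 * time (a 6-bit search space).  Tables are illustrative placeholders. */
#include <assert.h>

static const int step_tab[16] = {
    7, 9, 11, 14, 17, 21, 27, 34, 42, 53, 66, 83, 104, 130, 162, 203 };
static const int idx_adj[2] = { -1, 2 };    /* per magnitude bit */

typedef struct { int pred, idx; } AdpcmState;

/* Decode one 2-bit code: low bit = magnitude, high bit = sign. */
static int decode_step(AdpcmState *st, int code)
{
    int step = step_tab[st->idx];
    int diff = step >> 2;
    if (code & 1) diff += step;
    if (code & 2) diff = -diff;
    st->pred += diff;
    if (st->pred < -32768) st->pred = -32768;
    if (st->pred >  32767) st->pred =  32767;
    st->idx += idx_adj[code & 1];
    if (st->idx < 0)  st->idx = 0;
    if (st->idx > 15) st->idx = 15;
    return st->pred;
}

/* Search all 64 paths for 3 samples; return the best 6-bit code pack. */
static int encode3(AdpcmState *st, const short *smp)
{
    long best_err = -1;
    int best = 0;
    AdpcmState best_st = *st;
    for (int c = 0; c < 64; c++) {
        AdpcmState t = *st;
        long err = 0;
        for (int k = 0; k < 3; k++) {
            int v = decode_step(&t, (c >> (k * 2)) & 3);
            long d = v - smp[k];
            err += d * d;
        }
        if (best_err < 0 || err < best_err) {
            best_err = err; best = c; best_st = t;
        }
    }
    *st = best_st;   /* encoder tracks the decoder's state */
    return best;
}
```

Chaining two of these dependent 3-sample searches, as described above, yields a 4-sample (1 byte) block at far lower cost than a single 5- or 6-sample search.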

    There were several variations of 2-bit ADPCM:
    ITU-T G.726: Exists, but pretty much no software support.
    IMA ADPCM, 2-bit:
    Supported by VLC Media Player and Audacity as an input format.
    Other software either refuses to open it or gives broken audio.
    Has a partial drawback of needing 4 bytes/block for header.
    Also stereo, if used, takes 2x the bitrate...
    Custom: I can just do my own thing here.


    For custom variants, had noted a few tricks:
    Store initial predictor sample as A-Law, as a full PCM sample is overkill;
    Use a 64 entry step table rather than 89 entries (in this case, the step
    table can be seen as a 4.2 bit fixed-point value in Log2 space);
    Skip use of range clamping (instead, encoder is not allowed to go out of
    range for either the predictor or step index).
    This can allow the header to be encoded in 16 bits (with 2-bits
    remaining for decoder control flags).

    The Log2 stepping and elimination of range clamping would allow for a
    cheaper hardware decoder, but the loss of range clamping negatively
    affects audio quality in some cases (when the audio uses the full
    dynamic range). Likewise, the encoder is not allowed to take paths that
    would take the step index out of range (unlike IMA, which requires
    clamping here as well).
    It also reduces LOC and allows for a faster decoder (can decode 4
    samples at a time without the code becoming unwieldy, and with no
    intermediate "if()" branches).


    A lot of my use-cases had been mono, but can note that there is a
    cheaper way to do stereo (vs fully duplicating the left/right channels,
    eg, "joint stereo"):
    Split audio into Center and Side channels;
    C=(L+R)/2, S=L-R
    Encode the center channel at full sample rate;
    Encode side channel at 1/4 sample rate.

    This way, stereo is 1.25x the bitrate of mono, rather than 2x. Though,
    one needs to decide how the side-channel is interpolated (usual options
    being either linear interpolation or a cubic spline).
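The center/side math above, as a sketch. Using floor division for the center, the pair reconstructs exactly when the side channel is kept at full rate; the lossy part in the scheme above comes from subsampling the side channel by 4:

```c
/* Sketch of the center/side split: C=(L+R)/2, S=L-R.
 * Uses floor division (arithmetic >> on negatives, which is the usual
 * behavior but formally implementation-defined in C), so that
 * L = C + floor((S+1)/2), R = L - S reconstructs exactly. */
#include <assert.h>

static void to_mid_side(const short *l, const short *r,
                        int *c, int *s, int n)
{
    for (int i = 0; i < n; i++) {
        c[i] = (l[i] + r[i]) >> 1;   /* floor((L+R)/2) */
        s[i] = l[i] - r[i];
    }
}

static void from_mid_side(const int *c, const int *s,
                          short *l, short *r, int n)
{
    for (int i = 0; i < n; i++) {
        l[i] = (short)(c[i] + ((s[i] + 1) >> 1));
        r[i] = (short)(l[i] - s[i]);
    }
}
```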

    Another option being the 4x 3-bit center, 4-bit side, scheme (had used
    this a few times, same bitrate as 4-bit mono). For 2-bit, makes more
    sense though to have a center block and a 1/4 sub-sampled side block.

    Can note that 2 bits/sample is the bottom end of what is possible with
    the ADPCM strategy.



    Going much lower is harder.
    Getting under 2 bits per sample requires a more complex strategy.

    Best I am aware of ATM being to use a pattern table, eg, for each block:
    Split audio into a base curve and deviation (say, at 1/16 sample rate);
    ADPCM encode the base curve and deviation (with a sub-sampled
    deviation), reusing the joint-stereo encoding;
    Then follow this with a list of block pattern indices (4 or 8 bits each).

    So, while ADPCM itself can't go lower than 2 bits, it can encode lower
    if encoded at what is effectively 1 kHz and 250 Hz (and can still be
    leveraged here as a component in a more complex format).


    Can get to, say, around 0.5 bits/sample (and doesn't need an entropy
    encoder or much of anything else fancy), but quality is lacking.

    However, resource cost and decoding speeds can be similar, and only
    around 2x more code versus a normal ADPCM decoder (some fudging possible here). Generally, roughly half of the code in this case being for the
    ADPCM decoder. Or, with precomputed tables, in the area of around 500
    lines of C.


    The encoding process is a fair bit more involved though, so is generally
    a fair bit bigger, more complex, and slower than an ADPCM encoder.


    Haven't made much active use of this approach thus far, since as noted,
    audio quality is inferior to normal ADPCM.

    Though, have not explored its use at higher sample rates.
    A 0.5 bit/sample variant could give 28kbps for 44.1 stereo, so might
    make sense to compare against MP3 (if it can avoid sounding like
    complete garbage with music playback, it would probably still be a win).

    Still basically "scraping the bottom of the barrel" though.


    Though, the more likely option being a lower bitrate option for sticking
    sound effects into a PE/COFF resource section (where, currently the
    lowest quality option in BGBCC being 16kbps 8000 2-bit ADPCM).

    ...


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Mon Sep 8 21:10:49 2025
    From Newsgroup: comp.arch

    BGB <cr88192@gmail.com> writes:
    On 9/8/2025 3:59 AM, David Brown wrote:
    On 07/09/2025 23:12, MitchAlsup wrote:

    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.


    I listen to music a lot, but usually House and EDM and similar.

    Not that I listen to that, but IIRC, most of that is fairly narrow
    in frequency range and mostly generated electronically instead
    of by actual instruments (drum machines, synthesizers, etc.)

    So you're starting with "artificial" digital signals.

    While classical, progressive rock, jazz and classic rock music all leverage real-world analog instruments, most of which have unique and complex
    harmonic elements and remain in the analog domain until converted
    to final digital form.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From George Neuner@gneuner2@comcast.net to comp.arch on Mon Sep 8 17:57:42 2025
    From Newsgroup: comp.arch

    On Mon, 8 Sep 2025 10:59:50 +0200, David Brown
    <david.brown@hesbynett.no> wrote:

    ... For people who say they can distinguish CD audio from
    AAC or other high bps compressed audio, and other "golden ears"
    distinctions, it's a different matter - in double-blind tests, most
    fail badly. There are a great many factors involved in high-quality
    audio reproduction - the basic sample rate is only one of them.

    That's true, but the basic sample rate does make a significant
    difference. I don't know if it is true, but I have read that to
    /accurately/ reproduce a given note requires 10 to 11 harmonics: the
    primary note, 7 higher, and 2 to 3 lower.

    This means most notes will include sounds that are outside the range
    of (normal) human hearing, but you can still /feel/ these sounds [even
    the high ones] and miss them when they are absent.


    C8 (high C) on the piano is ~4186 Hz. Assuming the need for the 7th
    higher harmonic - 29302 Hz - Nyquist would demand a minimum sampling
    rate of 58604/s to accurately reproduce C8.
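The arithmetic above can be checked mechanically (hypothetical helpers; the Nyquist criterion is just twice the highest frequency to be captured):

```c
/* Checking the figures above: the 7th harmonic of C8 (~4186 Hz),
 * and the minimum (Nyquist) sampling rate needed to capture it. */
#include <assert.h>

static int harmonic(int f_hz, int n) { return n * f_hz; }
static int nyquist_rate(int f_hz)    { return 2 * f_hz; }
```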


    In practice, unless you like orchestral, or certain folk or country,
    you are not likely to hear much difference between a CD and a decent
    quality compressed version of it. But the CD itself is not a faithful reproduction of the live performance.

    And, of course, if you like orchestral you are more likely to be
    listening to vinyl rather than CD. 8-)
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon Sep 8 23:51:45 2025
    From Newsgroup: comp.arch


    George Neuner <gneuner2@comcast.net> posted:

    On Mon, 8 Sep 2025 10:59:50 +0200, David Brown
    <david.brown@hesbynett.no> wrote:

    ... For people who say they can distinguish CD audio from
    AAC or other high bps compressed audio, and other "golden ears"
    distinctions, it's a different matter - in double-blind tests, most
    fail badly. There are a great many factors involved in high-quality
    audio reproduction - the basic sample rate is only one of them.

    That's true, but the basic sample rate does make a significant
    difference. I don't know if it is true, but I have read that to
    /accurately/ reproduce a given note requires 10 to 11 harmonics: the
    primary note, 7 higher, and 2 to 3 lower.

    This means most notes will include sounds that are outside the range
    of (normal) human hearing, but you can still /feel/ these sounds [even
    the high ones] and miss them when they are absent.

    Or when the reproduction of the harmonics is out-of-phase wrt the
    harmonics of any live version of the same note.

    C8 (high C) on the piano is ~4186 Hz. Assuming the need for the 7th
    higher harmonic - 29302 Hz - Nyquist would demand a minimum sampling
    rate of 58604/s to accurately reproduce C8.

    While I can guarantee that I could not hear a 24 KHz pure sine wave tone;
    I can guarantee that I can hear a phase shift of the 24 KHz harmonic of
    the non-sinusoidal musical note at 6 KHz.

    In practice, unless you like orchestral, or certain folk or country,
    you are not likely to hear much difference between a CD and a decent
    quality compressed version of it. But the CD itself is not a faithful reproduction of the live performance.

    And, of course, if you like orchestral you are more likely to be
    listening to vinyl rather than CD. 8-)
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Mon Sep 8 18:57:33 2025
    From Newsgroup: comp.arch

    On 9/8/2025 4:10 PM, Scott Lurndal wrote:
    BGB <cr88192@gmail.com> writes:
    On 9/8/2025 3:59 AM, David Brown wrote:
    On 07/09/2025 23:12, MitchAlsup wrote:

    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.


    I listen to music a lot, but usually House and EDM and similar.

    Not that I listen to that, but IIRC, most of that is fairly narrow
    in frequency range and mostly generated electronically instead
    of by actual instruments (drum machines, synthesizers, etc.)

    So you're starting with "artificial" digital signals.


    Fair enough.

    I had noted when looking at some of this that typically the frequency
    spectrum drops off sharply to pretty much nothing. Exactly where this
    point is depends a lot on the song, but somewhere in the area of 11 to
    16 kHz seems typical.

    But, yeah, it appears like things often drop off steeply after a few
    points: 8kHz, 11kHz, and 16kHz. With a few songs looked at having a sort
    of "stair step" look in their spectrum.


    If I take a song and do an 8kHz high-pass filter, what is left mostly
    sounds like varying levels of white noise.

    One of my other test cases (mostly for speech; ripped from the
    audio-track of an episode of an animated TV show), has a drop-off wall
    at 8kHz (nothing over 8kHz).

    Checking for another animated show, it seems to have an 8kHz wall for
    the speech, but a 4kHz wall for the background music.

    The presence of a sharp 8kHz frequency wall in several cases does imply
    that 16kHz recording is likely popular for voice acting.



    Some songs are also weak for testing stereo encodings (or, "too easy")
    because they are essentially mono with little (if any) stereo
    divergence. Often, if there is stereo divergence, it is usually a pan adjustment.


    There are some songs I had found where there is a stronger stereo
    component. For example, the intro song for "Ghost in the Shell: Stand
    Alone Complex" being a stronger test case for stereo encodings (and has
    a more obvious loss of quality when converted to mono).


    Semi-interesting is one can invert one of the channels and see how much
    of the song drops out.
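The channel-inversion trick can be sketched as: mix L with an inverted R and measure what remains. Purely mono material cancels to zero; what survives is the side (L-R) signal:

```c
/* Sketch of the "invert one channel" test: summing L with an inverted
 * R leaves only the side signal (L-R), so mono content drops out.
 * Returns the mean squared difference as a rough divergence measure. */
#include <assert.h>

static int side_energy(const short *l, const short *r, int n)
{
    long long e = 0;
    for (int i = 0; i < n; i++) {
        int d = l[i] - r[i];   /* L + (-R) */
        e += (long long)d * d;
    }
    return (int)(e / n);
}
```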


    While classical, progressive rock, jazz and classic rock music all leverage real-world analog instruments, most of which have unique and complex
    harmonic elements and remain in the analog domain until converted
    to final digital form.


    OK.

    I guess, in any case, natural recordings probably wouldn't have an
    8/11/16 kHz stair-step pattern.

    It seems like one could expect people to use 44 or 48 as a default for
    pretty much everything, but if everything were being done at 44 or 48, wouldn't likely see this stair-step pattern either (nor the apparent "frequency walls" effect when looking at audio pulled from TV shows;
    where whatever is going on with the audio is at a sampling rate lower
    than 44 or 48).


    I haven't really listened that much to non-electronic music.
    Pretty much my whole life has been in the time where people mostly make
    music on computers.

    Haven't listened that much to classical or similar apart from being in
    the form of MIDI files and similar.



    Well, apart from the band at the church I go to. They are mostly using a synthesizer and electronic drums and similar though (where the drum set
    is the black plastic disks that make drum sounds when hit variety). The instruments then connect up to a computer in the back of the room where
    most of the audio mixing happens. So, probably not a true analog
    experience here.


    IIRC, they are using 3-pin XLR for the microphones, with DIN-5 for some
    of the other instruments (the keyboard and drums use DIN-5). My dad
    plays guitar sometimes, and they use an interface box to plug 1/4" TRS
    cable into XLR.


    Not entirely sure of the specifics, but they can adjust the volume and
    similar of the various instruments on the computer (not entirely sure
    how it works, not looked into it too much). IIRC, I think all the XLR connectors and similar plug into a box which then plugs into the
    computer over USB or something.

    Looking on Amazon, a similar looking sort of box seems to go for around
    $500 (for a "Multi-track mixer/recorder").

    Not sure of the audio properties of all of this.

    Also they have a wireless microphone for the pastor and similar, which
    also feeds into it somehow.

    Sometimes they need to test things before starting (having him turn on
    the microphone and say stuff), as sometimes the microphone fails to
    connect to the computer in the back.

    Not sure, but seems to behave sort of like it is using a Bluetooth
    interface or similar. Where, turn it on, say stuff, usually connects (if
    it does so) after around 10 or 15 seconds or so (and if/when it loses connection, it goes silent until it can reconnect; after another 10
    seconds or so, when audio comes back).

    Not seeing an exact match, but similar looking sorts of microphones seem
    to go for around $40 to $60 on Amazon.


    I don't really mess with any of this as it isn't really my area.


    ...


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Mon Sep 8 21:49:33 2025
    From Newsgroup: comp.arch

    On 9/8/2025 6:57 PM, BGB wrote:
    On 9/8/2025 4:10 PM, Scott Lurndal wrote:
    BGB <cr88192@gmail.com> writes:
    On 9/8/2025 3:59 AM, David Brown wrote:
    On 07/09/2025 23:12, MitchAlsup wrote:

    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.


    I listen to music a lot, but usually House and EDM and similar.

    Not that I listen to that, but IIRC, most of that is fairly narrow
    in frequency range and mostly generated electronically instead
    of by actual instruments (drum machines, synthesizers, etc.)

    So you're starting with "artificial" digital signals.


    Fair enough.

    I had noted when looking at some of this that typically the frequency spectrum drops off sharply to pretty much nothing. Exactly where this
    point is depends a lot on the song, but somewhere in the area of 11 to
    16 kHz seems typical.

    But, yeah, it appears like things often drop off steeply after a few
    points: 8kHz, 11kHz, and 16kHz. With a few songs looked at having a sort
    of "stair step" look in their spectrum.


    If I take a song and do an 8kHz high-pass filter, what is left mostly
    sounds like varying levels of white noise.

    One of my other test cases (mostly for speech; ripped from the audio-
    track of an episode of an animated TV show), has a drop-off wall at 8kHz (nothing over 8kHz).

    Checking for another animated show, it seems to have an 8kHz wall for
    the speech, but a 4kHz wall for the background music.

    The presence of a sharp 8kHz frequency wall in several cases does imply
    that 16kHz recording is likely popular for voice acting.


    OK, going and ripping the audio tracks off episodes from a few more
    animated shows and manually looking at the spectrum, it seems:
    Sonic Boom:
    Pretty much everything seems to be ~ 8 kHz / 16000.
    Miraculous Ladybug:
    Pretty much everything seems to be ~ 11 kHz / 22050.
    Bravest Warriors (S4):
    Voice frequency wall: ~ 16kHz, so likely 32000 or maybe 44100
    Some distorted voices appear to be using 44100.
    Music / SFX: ~ 4kHz, so likely still 8000
    Though, the show is fast-paced enough that it is difficult to isolate.
    Amazing Digital Circus:
    Voice frequency wall: ~ 11 kHz, so likely 22050.
    Various sound effects: 4 and 5 kHz, so likely 8000 or 11025.
    Actually, whole episode seems to have a hard cutoff at 11 kHz.
    So, unclear 22050 was input or output side.
    (Would need to look at more episodes to figure it out).


    It would appear that some of these shows might be scavenging audio from
    places where 8000 and 11025 is popular for things like sound effects.
    While using higher sampling rates for the voice acting.


    Though, I suspect it might just have been "more obvious" in the case of Bravest Warriors, possibly because the music is used more prominently,
    and the mix of high-fidelity voice recording with poor-fidelity
    background music is more obvious than if everything were more "samey".


    Of the examples looked at this far, it would appear that the shows made
    by "indie" studios are more likely to make use of low-fidelity sound
    effects.

    May be a case of trying not to be too obvious.
    If all the voices sound cooked, it is more obvious;
    If a random sound effect sounds cooked, much less obvious;
    If background music for a scene sounds cooked, also obvious.


    Not sure where people are getting sound effects, but I would still find
    it kinda amusing whenever watching something and hearing a sound effect
    that I recognize from somewhere else (like "FreeDoom" or "OpenQuartz" or similar; seems there is now also LibreQuake which is using the same
    terms as "FreeDoom").

    But, I guess probably the goal is to not be too obvious unless maybe as
    part of an in-joke.

    Though, does seem like possibly if someone is using something, and it is obvious that it was originally 8000 or 11025, probably scavenged from somewhere (say, if one assumes people will not tend to make new sound
    effects for a TV show and then save them as 8000 or similar).



    Some stuff online says that voice-acting should use 48000, but this
    would not appear to be the case for the examples I have looked at (well, either this, or it was "lost" somewhere in the process).

    My analysis process here being to extract the audio track using "VLC
    Media Player" and then looking at stuff manually in Audacity.

    ...



    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Mon Sep 8 23:28:23 2025
    From Newsgroup: comp.arch

    On 9/8/2025 6:49 AM, David Schultz wrote:
    On 9/7/25 10:55 PM, BGB wrote:
    Dunno.

    Averaging pairs would be the traditional method for downsample, but,
    when downsampling to 8kHz, audio sounds muffled, and intelligibility
    of speech is poor.

    It has been a couple of decades since that discrete time signal
    processing course so the details have faded. But I do know that aliasing
    can be a big problem. Hence the good low pass filter as part of the decimation process.

    Assuming your 16KSPS data started with a good presample filter so there
    was no signal or noise (or at least negligible) above 8KHz, it is still
    going to
    have stuff between 4KHz and 8KHz. Fail to filter that adequately and it
    gets folded/aliased to a lower frequency.

    So you need a discrete time filter to remove most of the information in
    your data above 4KHz while leaving what you want alone.

    The moving average filter is not a good choice. Sure it has a zero at
    your new sample rate but it has poor performance in general.


    OK.

    I was just seeing here if I could make stuff "less muffled".
    A cleaner averaging filter, like:
    (S0+3*S1+3*S2+S3)/8
    Tends to have more muffle.

    Whereas:
    (5*S1+5*S2-S0-S3)/4
    Slightly boosts high frequencies (so less muffle) but may have other drawbacks.
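The two kernels above as code. Note that the taps of the second kernel sum to 8, so dividing by 8 (rather than the /4 written above, which would double the DC gain) keeps unity gain; the negative outer taps are what boost the highs, at the cost of possible ringing/overshoot:

```c
/* The two 4-tap decimation kernels from the post, applied to one
 * 4-sample window (S1,S2 being the pair kept on 2:1 decimation).
 * The second kernel uses /8 for unity DC gain, since its taps sum
 * to 8; the post writes /4, which appears to be a typo. */
#include <assert.h>

static short clamp16(int v)
{
    if (v < -32768) return -32768;
    if (v >  32767) return  32767;
    return (short)v;
}

/* (S0 + 3*S1 + 3*S2 + S3) / 8: smoothing (binomial) kernel. */
static short down_smooth(const short *s)
{
    return clamp16((s[0] + 3*s[1] + 3*s[2] + s[3]) / 8);
}

/* (5*S1 + 5*S2 - S0 - S3) / 8: negative outer taps boost highs. */
static short down_sharp(const short *s)
{
    return clamp16((5*s[1] + 5*s[2] - s[0] - s[3]) / 8);
}
```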

    ...


    A lot of the 16000 audio in this case is actually from opening a 44100
    or 48000 WAV and then dynamically resampling to 16000 because this is
    what I tend to use in a lot of my experiments of this sort.


    The typical downsampler that I use for general stuff works like:
    Downsample by powers of 2 (via pairwise averaging);
    Interpolate samples from above and below the target rate.
    Or, basically, sorta like mip-mapping.

    I had tried various strategies long ago, and I think this is the one
    that won out (though, gives weak results when the target is a low sample rate).



    This was in contrast to a different strategy (IIRC also used in the
    Quake engine), something like:
    void Resample8(byte *dst, byte *src, int dlen, int slen)
    {
        int i, spos, sstep;

        spos=0; sstep=(256*slen)/dlen;
        for(i=0; i<dlen; i++)
            { dst[i]=src[spos>>8]; spos+=sstep; }
    }
    Or, 16-bit:
    void Resample16(s16 *dst, s16 *src, int dlen, int slen)
    {
        int i, spos, sstep;

        spos=0; sstep=(256*slen)/dlen;
        for(i=0; i<dlen; i++)
            { dst[i]=src[spos>>8]; spos+=sstep; }
    }

    Which gives results that sound poor.


    Versus, say (algo from memory, slower but less bad):

    void ResampleHalf16(s16 *dst, s16 *src, int dlen)
    {
        int i;
        for(i=0; i<dlen; i++)
            dst[i]=(src[i*2+0]+src[i*2+1])/2;
    }

    void ResampleInterpSpline(s16 *dst, s16 *src, int dlen, int slen)
    {
        int s0, s1, s2, s3, t0, t1, t2;
        int i, j, spos, sstep, sfr, tfr;

        spos=0; sstep=(256*slen)/dlen;
        for(i=0; i<dlen; i++)
        {
            j=spos>>8;
            s2=src[j+0];
            s0=s2; s1=s2; s3=s2;
            if(j>0) s1=src[j-1];
            if(j>1) s0=src[j-2];
            if((j+1)<slen) s3=src[j+1];

            sfr=spos&255;
            tfr=sfr^255;
            t0=(((256+sfr)*s1)-(sfr*s0))>>8;
            t1=(((256+tfr)*s2)-(tfr*s3))>>8;
            t2=(((256-sfr)*t0)+(sfr*t1))>>8;
            if(t2<-32767)t2=-32767;
            if(t2> 32767)t2= 32767;

            dst[i]=t2;
            spos+=sstep;
        }
    }

    void ResampleInterpSpline2x(s16 *dst,
        s16 *src1, s16 *src2, int dlen, int slen)
    {
        int s0, s1, s2, s3, s4, t0, t1, t2, t3;
        int i, j, k, spos, sstep, ufr, sfr, tfr;

        spos=0; sstep=(256*slen)/dlen;

        //try to estimate the interpolation values between the levels
        //assume sstep is between 0x80 and 0xFF in this case
        //0x80 means dlen is ~ 2x slen
        //0xFF means dlen ~= slen
        ufr=511-(sstep*2);

        for(i=0; i<dlen; i++)
        {
            j=spos>>8;
            k=spos>>7;
            s2=src2[j+0];
            s0=s2; s1=s2; s3=s2;
            if(j>0) s1=src2[j-1];
            if(j>1) s0=src2[j-2];
            if((j+1)<slen) s3=src2[j+1];
            s4=src1[k];

            sfr=spos&255;
            tfr=sfr^255;
            t0=(((256+sfr)*s1)-(sfr*s0))>>8;
            t1=(((256+tfr)*s2)-(tfr*s3))>>8;
            t2=(((256-sfr)*t0)+(sfr*t1))>>8;
            t3=(((256-ufr)*t2)+(ufr*s4))>>8;

            if(t3<-32767)t3=-32767;
            if(t3> 32767)t3= 32767;

            dst[i]=t3;
            spos+=sstep;
        }
    }

    void Resample16(s16 *dst, s16 *src, int dlen, int slen)
    {
        s16 *sbuf1, *sbuf2, *sbt;
        int sl1, sl2;

        if(dlen>=slen)
        {
            if(dlen==slen)
            {
                memcpy(dst, src, dlen*2);
                return;
            }
            ResampleInterpSpline(dst, src, dlen, slen);
            return;
        }

        sl1=slen/2; sl2=sl1/2;
        sbuf1=malloc(sl1*sizeof(s16));
        sbuf2=malloc(sl2*sizeof(s16));
        ResampleHalf16(sbuf1, src, sl1);
        ResampleHalf16(sbuf2, sbuf1, sl2);
        while(sl2>=dlen)
        {
            sbt=sbuf1; sbuf1=sbuf2; sbuf2=sbt;
            sl1=sl2; sl2=sl1/2;
            ResampleHalf16(sbuf2, sbuf1, sl2);
        }

        if(sl1==dlen)
        {
            memcpy(dst, sbuf1, dlen*2);
            free(sbuf1); free(sbuf2);
            return;
        }

        ResampleInterpSpline2x(dst, sbuf1, sbuf2, dlen, sl2);
        free(sbuf1); free(sbuf2);
    }




    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Schultz@david.schultz@earthlink.net to comp.arch on Tue Sep 9 07:06:17 2025
    From Newsgroup: comp.arch

    On 9/8/25 11:28 PM, BGB wrote:
    OK.

    I was just seeing here if I could make stuff "less muffled".
      A cleaner averaging filter, like:
        (S0+3*S1+3*S2+S3)/8
    Tends to have more muffle.

    Whereas:
        (5*S1+5*S2-S0-S3)/4
    Slightly boosts high frequencies (so less muffle) but may have other drawbacks.

    Use a real FIR filter:
    http://t-filter.engineerjs.com/
    As an example, I tried:
    16KSPS
    Pass band: 0 to 3500Hz, 1dB dips
    Stop band: 4KHz to 8KHz: 40dB attenuation

    This resulted in a filter with 51 taps. The more brick wall like the
    filter is, the more taps required.
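    Applying the generated coefficients is then a straight convolution. A
    minimal sketch (the 3 taps below are a unity-gain placeholder, not the
    actual 51-tap design exported from t-filter):

```c
typedef short s16;

/* Direct-form FIR application. The taps here are a placeholder
   unity-gain smoother, not the 51-tap design from t-filter; drop in
   the generated coefficients (and NTAPS=51) to reproduce that filter. */
#define NTAPS 3
static const double fir_taps[NTAPS] = { 0.25, 0.50, 0.25 };

void FirApply(s16 *dst, const s16 *src, int n)
{
    int i, k, j;
    double acc;
    for(i=0; i<n; i++)
    {
        acc=0;
        for(k=0; k<NTAPS; k++)
        {
            j=i-k;              /* clamp history at the start */
            if(j<0) j=0;
            acc+=fir_taps[k]*src[j];
        }
        dst[i]=(s16)acc;
    }
}
```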
    --
    http://davesrocketworks.com
    David Schultz
    "The cheaper the crook, the gaudier the patter." - Sam Spade
  • From David Brown@david.brown@hesbynett.no to comp.arch on Tue Sep 9 15:06:44 2025
    From Newsgroup: comp.arch

    On 08/09/2025 22:10, BGB wrote:
    On 9/8/2025 3:59 AM, David Brown wrote:
    On 07/09/2025 23:12, MitchAlsup wrote:

    Terje Mathisen <terje.mathisen@tmsw.no> posted:

    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:



    For those that work directly with music, age brings experience and
    improves abilities like recognising or duplicating tunes.  Age also
    brings deterioration in the physical aspects of hearing - especially
    at higher frequencies.


    This is a concern for me, partly if I lose high frequencies, seemingly I won't have anything, as my hearing of low frequencies (sub 1kHz) is
    seemingly already impaired.

    Seemingly, most of my world of audio perception is located between 1kHz
    and 8kHz.

    Most lower frequencies are more felt than heard.



    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.


    I listen to music a lot, but usually House and EDM and similar.

    Isn't that an oxymoron? "Music" and "House / EDM" don't belong together
    in the same sentence :-)


    When I was younger, a lot of Goth (mostly of the Synthpop/Synthwave variety), and Industrial.

    There was Dubstep for a while, but seemingly the whole genre kind of imploded (though, never got much mainstream popularity aside from "Skrillex").



    It sounds like you have likely damaged your hearing - but those kinds of "music" (not that I am opinionated...) are often played at very high
    volumes.

    One key difference, however, is that it is easy to appreciate when
    people can listen to a tune once and play it again afterwards - you
    can watch them do it.  For people who say they can distinguish CD
    audio from AAC or other high bps compressed audio, and other "golden
    ears" distinctions, it's a different matter - in double-blind tests,
    most fail badly.  There are a great many factors involved in
    high-quality audio reproduction - the basic sample rate is only one of
    them.


    I am not really a "golden ear" AFAICT.

    At high bitrates, or high sample rates, I can't hear much difference.



    But, mostly just noting that it is at low bitrates where things like MP3
    and similar start to sound like crap.


    There are, as I said, /many/ factors involved - you are mixing up
    bitrates and sample rates.

    If everything else is "perfect", 44.1 kHz sample rate can reproduce frequencies (including phase information) up to 22.05 kHz - more than
    enough for anyone but some young children.

    But everything else is usually very far from perfect. A particular
    issue is the dynamic range - 16-bit linear coding does not have enough
    range for a lot of music. Either quiet sounds are "pixelated", losing a
    lot of important information, or the dynamic range is compressed before
    the CD quality image is generated - giving the music a "flat" sound.
    When compressed audio formats are used, they may start off at higher bit depths and sample rates, but in effect the bit depth also gets
    compressed and you lose resolution as well as sample rate and high
    frequency information for high compression ratios. And just as high
    jpeg compression produces artefacts for some images, such as ghosting,
    so does high audio compression.


  • From David Brown@david.brown@hesbynett.no to comp.arch on Tue Sep 9 15:51:10 2025
    From Newsgroup: comp.arch

    On 08/09/2025 23:57, George Neuner wrote:
    On Mon, 8 Sep 2025 10:59:50 +0200, David Brown
    <david.brown@hesbynett.no> wrote:

    ... For people who say they can distinguish CD audio from
    AAC or other high bps compressed audio, and other "golden ears"
    distinctions, it's a different matter - in double-blind tests, most fail
    badly. There are a great many factors involved in high-quality audio
    reproduction - the basic sample rate is only one of them.

    That's true, but the basic sample rate does make a significant
    difference. I don't know if it is true, but I have read that to
    /accurately/ reproduce a given note requires 10 to 11 harmonics: the
    primary note, 7 higher, and 2 to 3 lower.

    There you are talking about /timbre/, not just frequency. It's the
    harmonics that make the same note sound differently on a piano and a
    violin. But at the high frequency range, you can't distinguish these.
    If you can hear sounds up to 16 kHz (which would be better than most
    regulars in this group, with the demographics of males of a certain
    age), you would not be able to determine if it is a piano note
    or a violin note. Musical instruments don't go anything like that high
    - even a piccolo won't go over about 4 kHz. Sounds about that range
    have all the musical nuances of chalk on a blackboard.

    High-pitched music or very high-pitched singing usually tops out at about
    2 kHz for the base frequency. The tenth harmonic would then be at 20 kHz.
    So the maths works out fine for 44.1 kHz sampling rate.

    Of course there is the very significant matter of how the reproduced
    44.1 kHz is filtered - it must, in an analogue filter, try to cut out virtually everything of 22.05 kHz and above while allowing 20 kHz and
    below to pass through with a flat power response and linear phase
    response. That does not happen in practice - and that is a key reason
    why harmonics of higher pitched notes are usually poor from CD quality
    music, especially on low-end audio systems. (High-end audio systems
    up-sample the 44.1 kHz / 16-bit to perhaps 192 kHz / 24-bit, so that the filtering is vastly better).


    This means most notes will include sounds that are outside the range
    of (normal) human hearing, but you can still /feel/ these sounds [even
    the high ones] and miss them when they are absent.


    Nope. Most notes are much lower, and harmonics of relevance are within
    the range of human hearing. For high enough notes, you simply don't
    hear as much harmonic information.


    C8 (high C) on the piano is ~4186 Hz. Assuming the need for the 7th
    higher harmonic - 29302 Hz - Nyquist would demand a minimum sampling
    rate of 58604/s to accurately reproduce C8.


    You can't accurately hear C8 even when live - you don't get the same
    harmonic information as you do with C6, because your ears can't
    distinguish the higher harmonics. Your ears have the same limitations
    as any other senses in this manner - you can look at your cat's feet and
    count its toes, but if you look at a fly's feet you can't count the toes.


    In practice, unless you like orchestral, or certain folk or country,
    you are not likely to hear much difference between a CD and a decent
    quality compressed version of it. But the CD itself is not a faithful reproduction of the live performance.


    Good quality compressed formats are often better than CD quality. The
    killer for CD quality is not the sample rate, it is the limited dynamic
    range from the linear 16-bit range. Compressed formats will, in effect,
    use a more logarithmic scale (like A-law and mu-law, used to get comprehensible speech despite a much smaller sample size) that is more
    in line with the way the human brain interprets sound.

    And, of course, if you like orchestral you are more likely to be
    listening to vinyl rather than CD. 8-)

    In theory (but very rarely in practice), when combined with good enough amplifiers and speakers, vinyl has a higher dynamic range than CD
    audio. But that is only the case when the record is new. Play it a few times, and the wear from the needle will smooth out the tracks enough to eliminate the difference.

    But enjoying music is a psychologically, physically, mentally and
    biologically complex hobby. The comfort of the chair you are sitting
    in, or the type of reflections and absorptions from the rest of the
    room, can make a big difference. Knowing that you have spent a great
    deal of money on your impressive-looking hifi system will improve your listening experience regardless of what any audio measurement might say.
    Some audiophiles prefer the "valve sound" over "transistor sound" -
    not because the sound reproduction is more accurate (it is not - valves
    add second harmonic distortion that is non-existent in transistor
    amplifiers), but simply because they like it better.


  • From Michael S@already5chosen@yahoo.com to comp.arch on Tue Sep 9 18:47:00 2025
    From Newsgroup: comp.arch

    On Tue, 9 Sep 2025 07:06:17 -0500
    David Schultz <david.schultz@earthlink.net> wrote:
    On 9/8/25 11:28 PM, BGB wrote:
    OK.

    I was just seeing here if I could make stuff "less muffled".
      A cleaner averaging filter, like:
        (S0+3*S1+3*S2+S3)/8
    Tends to have more muffle.

    Whereas:
        (5*S1+5*S2-S0-S3)/4
    Slightly boosts high frequencies (so less muffle) but may have
    other drawbacks.

    Use a real FIR filter:
    http://t-filter.engineerjs.com/
    As an example, I tried:
    16KSPS
    Pass band: 0 to 3500Hz, 1dB dips
    Stop band: 4KHz to 8KHz: 40dB attenuation

    This resulted in a filter with 51 taps. The more brick wall like the
    filter is, the more taps required.

    IIRC, AT&T had Fpass = 3.2 KHz. That's significantly easier than 3.5.
    Of course, AT&T used analog IIR filter rather than digital FIR. I have
    no idea what sort of IIR it was, but would guess that 5th order Bessel
    filter with -3dB point at 3.6 KHz could serve as a fair digital
    imitation of their circuit.
  • From BGB@cr88192@gmail.com to comp.arch on Tue Sep 9 14:13:03 2025
    From Newsgroup: comp.arch

    On 9/9/2025 8:06 AM, David Brown wrote:
    On 08/09/2025 22:10, BGB wrote:
    On 9/8/2025 3:59 AM, David Brown wrote:
    On 07/09/2025 23:12, MitchAlsup wrote:

    Terje Mathisen <terje.mathisen@tmsw.no> posted:

    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:



    For those that work directly with music, age brings experience and
    improves abilities like recognising or duplicating tunes.  Age also
    brings deterioration in the physical aspects of hearing - especially
    at higher frequencies.


    This is a concern for me, partly if I lose high frequencies, seemingly
    I won't have anything, as my hearing of low frequencies (sub 1kHz) is
    seemingly already impaired.

    Seemingly, most of my world of audio perception is located between
    1kHz and 8kHz.

    Most lower frequencies are more felt than heard.



    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.


    I listen to music a lot, but usually House and EDM and similar.

    Isn't that an oxymoron?  "Music" and "House / EDM" don't belong together
    in the same sentence :-)


    Probably still more "music" than "Gangsta Rap" though...



    Where I am living, most of the local people seem to be into "Country",
    but I am not so much into this, like no, will take a pass.

    Though, there were a few decent Country musicians, like "Johnny Cash"
    and similar.


    But mostly people are all like, "this stuff is great" and it is mostly
    guitar slide effects and people going on about how their wife ran off
    and took their dog and pickup truck and similar...

    Vs, say, "Gangsta Rap" being mostly about doing crimes and picking up
    girls and similar.


    Had a few times imagined hybrid styles, usually taking one sort of
    instrument usage style and having the lyrics in a different clashing style.


    Say, for example, taking the instrument styles of Country or Disco or
    similar, and then having the singer do it "Gangsta Rap" style, etc.
    Like, say, Country style guitar slides, followed by lyrics like:
    "I be roamin da hood, and be gettin da wood."
    "I see dem honeys, I show em my moneys."
    "'Where it be at?' She be a gyat."
    "Daammnn, G."
    ...



    Well, and there is a lot of Mexican music where, even if one can't
    really hear the song in much detail (due to distance or whatever), it is recognizable due to a "boop boop" sound, with two different tones of
    "boop" with a roughly 1 second "boop" spacing.


    Whereas, with House and EDM and similar, usually it is more about the
    beats. Also a decent BPM range, etc...



    When I was younger, a lot of Goth (mostly of the Synthpop/Synthwave
    variety), and Industrial.

    There was Dubstep for a while, but seemingly the whole genre kind of
    imploded (though, never got much mainstream popularity aside from
    "Skrillex").



    It sounds like you have likely damaged your hearing - but those kinds of "music" (not that I am opinionated...) are often played at very high volumes.


    I don't usually go to clubs or anything...


    One of the loudest things I had tended to deal with in my experiences
    wasn't so much music, rather running a waterjet.

    This was one use-case for noise-canceling wireless headphones, but
    headphones couldn't entirely defeat this.

    Not sure of the exact volume, but basically however much noise is
    generated by pushing 90k PSI water through a 1/16" hole.



    But, yeah, in any case, I seem to have experienced some form of "reverse slope" hearing loss, where I hear relatively little under around 1kHz,
    so things like tuning forks and similar are basically silent IRL.


    Seemingly, I can hear them better on computers than IRL, where:
    440 Hz sine wave on PC is quiet, but still audible;
    440 Hz tuning fork is inaudible.

    My mom got a steel drum tuned to 432 Hz, it is barely audible to me. Can
    put my hand near it and feel vibrations, but don't hear much.

    Would be easier if these things generated square waves, I can hear
    square waves.

    My dad plays guitar some, but to me guitar mostly generates a kind of
    buzzing sound, or a sound vaguely similar to a chainsaw (but, like, each string being a different speed of chainsaw).


    Apparently, there was a difference between me and my dad as to what an
    air compressor sounds like. To him, he says it is very loud. To me, it
    sounds more like a pond pump (and mostly dominated by a buzzing noise),
    but a little louder and more annoying.

    One time not too long ago, the air line for the mill had developed a
    leak (as a small hole in the hose), but this leak was to me very obvious
    due to the hiss. Apparently my dad didn't hear it.

    Also I can note that I don't hear car engines (which apparently make a
    lot of noise or something), usually I more hear cars in the form of the
    noise made by the tire rolling on the ground.

    ...


    One key difference, however, is that it is easy to appreciate when
    people can listen to a tune once and play it again afterwards - you
    can watch them do it.  For people who say they can distinguish CD
    audio from AAC or other high bps compressed audio, and other "golden
    ears" distinctions, it's a different matter - in double-blind tests,
    most fail badly.  There are a great many factors involved in high-
    quality audio reproduction - the basic sample rate is only one of them.


    I am not really a "golden ear" AFAICT.

    At high bitrates, or high sample rates, I can't hear much difference.



    But, mostly just noting that it is at low bitrates where things like
    MP3 and similar start to sound like crap.


    There are, as I said, /many/ factors involved - you are mixing up
    bitrates and sample rates.


    At least with ADPCM, bitrate and sample rate are tied together.

    MP3 is more independent here.

    So, you can keep the sample rate high but drive the bitrate low, and it
    sounds kinda terrible.



    If everything else is "perfect", 44.1 kHz sample rate can reproduce frequencies (including phase information) up to 22.05 kHz - more than
    enough for anyone but some young children.


    Yes, granted.

    MP3 seemingly always tries to operate at 44.1, but often still goes to
    crap when set to encode lower bitrates.



    As noted, I had had good experiences with ADPCM variants, but usually
    the only way to get to lower bitrates is to drop the sample rate.

    But, then, if going below 16000, perceptual quality drops off sharply.
    Usual unavoidable enemy being that stuff starts sounding muffled, which
    is most obvious at 8000.



    But, even as such, I can note (when downloading and looking for the
    audio files for LibreQuake) that, even despite being fairly recent, they
    are still doing most of their sound effects at 11025 (apart from the
    random straggler files at 22050 or 44100).


    Well, anyways, I guess I can fiddle more with trying to use a
    combination of ADPCM and a pattern table to get to a lower bits-per-sample.

    At least, the decoding process isn't too expensive.


    But everything else is usually very far from perfect.  A particular
    issue is the dynamic range - 16-bit linear coding does not have enough
    range for a lot of music.  Either quite sounds are "pixelated", losing a lot of important information, or the dynamic range is compressed before
    the CD quality image is generated - giving the music a "flat" sound.
    When compressed audio formats are used, they may start off at higher bit depths and sample rates, but in effect the bit depth also gets
    compressed and you lose resolution as well as sample rate and high
    frequency information for high compression ratios.  And just as high
    jpeg compression produces artefacts for some images, such as ghosting,
    so does high audio compression.


    Yeah, I am aware that seemingly a lot of music uses compression to try
    to achieve a sort of semi-uniform "loudness wall".

    I can note that when looking at music in an audio program, it tends to
    use the full amplitude range for pretty much the whole song.

    Contrast, the audio from TV shows tends to use closer to around 25% to
    33% of the amplitude range, and with a lot more variability in loudness between sections.


    Though, if there is one merit to compression, it does make it easier to
    hear low frequencies.

    So, while a 440Hz pure sine wave is very quiet; a 440Hz sine wave fed
    through the compressor filter is a lot louder, even if the overall
    amplitude hasn't really changed much. Possibly because compression often
    makes the shape of a sine wave closer to that of a square wave.



    Ironically, one could almost make a case here for using an A-Law variant
    with the low-order bits XOR'ed with the sign, in which case it could
    function as both a higher dynamic-range format and as 8-bit PCM.

    Though, using it as 8-bit PCM would still have the "noise floor"
    annoyance; where it almost invariably adds a low intensity hiss to the
    audio.
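    The log-coding trade being described (keeping relative precision on
    quiet material, instead of the flat quantization step that causes the
    8-bit PCM hiss) can be sketched roughly as below. The 1-sign/3-segment/
    4-mantissa layout is my assumption, not BGB's exact A-Law variant:

```c
typedef short s16;

/* Sketch of A-law-style companding: 1 sign bit, 3-bit segment, 4-bit
   mantissa (layout is an assumption, not BGB's exact variant). Quiet
   samples keep proportional precision instead of the fixed 8-bit-PCM
   step that produces the audible noise floor. */
unsigned char EncLog8(int v)
{
    int sg=(v<0)?0x80:0x00, a=sg?-v:v, e=0;
    if(a>32767)a=32767;
    while(a>=512){a>>=1;e++;}      /* e = segment, 0..6 */
    return sg|(e<<4)|((a>>5)&15);  /* keep top 4 mantissa bits */
}

int DecLog8(unsigned char b)
{
    int sg=b&0x80, e=(b>>4)&7, m=b&15;
    int a=(((m<<5)+16)<<e);        /* +16: midpoint of the bucket */
    return sg?-a:a;
}
```

    Relative error stays bounded (roughly 1/32 of the magnitude) across the
    whole range, which is the dynamic-range win over linear 8-bit PCM.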


  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Tue Sep 9 20:55:56 2025
    From Newsgroup: comp.arch

    BGB <cr88192@gmail.com> writes:
    On 9/9/2025 8:06 AM, David Brown wrote:
    On 08/09/2025 22:10, BGB wrote:
    On 9/8/2025 3:59 AM, David Brown wrote:
    On 07/09/2025 23:12, MitchAlsup wrote:

    Terje Mathisen <terje.mathisen@tmsw.no> posted:

    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:



    For those that work directly with music, age brings experience and
    improves abilities like recognising or duplicating tunes.  Age also
    brings deterioration in the physical aspects of hearing - especially
    at higher frequencies.


    This is a concern for me, partly if I lose high frequencies, seemingly
    I won't have anything, as my hearing of low frequencies (sub 1kHz) is
    seemingly already impaired.

    Seemingly, most of my world of audio perception is located between
    1kHz and 8kHz.

    Most lower frequencies are more felt than heard.



    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.


    I listen to music a lot, but usually House and EDM and similar.

    Isn't that an oxymoron?  "Music" and "House / EDM" don't belong together
    in the same sentence :-)


    Probably still more "music" than "Gangsta Rap" though...

    It's not music, it is C-rap.

    Try some prog rock: https://www.youtube.com/watch?v=0HWv7YJtYCA

  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.arch on Tue Sep 9 14:15:17 2025
    From Newsgroup: comp.arch

    On 9/9/2025 12:13 PM, BGB wrote:
    [...]

    This is fairly nice:

    https://youtu.be/RijB8wnJCN0?list=RDRijB8wnJCN0

    :^)
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.arch on Tue Sep 9 14:17:20 2025
    From Newsgroup: comp.arch

    On 9/9/2025 2:15 PM, Chris M. Thomasson wrote:
    On 9/9/2025 12:13 PM, BGB wrote:
    [...]

    This is fairly nice:

    https://youtu.be/RijB8wnJCN0?list=RDRijB8wnJCN0

    :^)

    More ambient: https://youtu.be/x1afn71-0sI?list=RDMM ;^D
  • From David Brown@david.brown@hesbynett.no to comp.arch on Tue Sep 9 23:23:40 2025
    From Newsgroup: comp.arch

    On 09/09/2025 21:13, BGB wrote:

    My mom got a steel drum tuned to 432 Hz, it is barely audible to me. Can
    put my hand near it and feel vibrations, but don't hear much.

    Would be easier if these things generated square waves, I can hear
    square waves.

    For a square wave, you have a third harmonic at third volume, a fifth
    harmonic at fifth volume, and so on. So you have very large harmonics -
    if you have trouble hearing at 400 Hz but can hear somewhat at 800 Hz
    and fine at 1200 Hz, then a 400 Hz sine wave will be inaudible but a 400
    Hz square wave will be merely a little dampened.
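    That harmonic structure is easy to check numerically: summing odd
    harmonics with 1/n amplitudes converges to the square wave. A small
    sketch, evaluating the Fourier partial sum at the middle of the "high"
    half-cycle, where each harmonic contributes exactly +1 or -1:

```c
/* Partial Fourier sum of a +/-1 square wave, evaluated at w*t = pi/2
   (the centre of the high half-cycle), where sin((2k+1)*pi/2) = (-1)^k.
   Demonstrates the point above: with the n-th odd harmonic at 1/n
   amplitude, the sum converges to the square wave's value of 1. */
double SquareWavePartial(int nterms)
{
    double pi=3.14159265358979323846, acc=0;
    int k;
    for(k=0; k<nterms; k++)
        acc += ((k&1)?-1.0:1.0)/(2*k+1);  /* harmonic 2k+1, weight 1/(2k+1) */
    return (4.0/pi)*acc;
}
```

    With one term (the fundamental alone) the value overshoots to 4/pi, about
    1.27; with many terms it settles at 1.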


  • From BGB@cr88192@gmail.com to comp.arch on Tue Sep 9 20:27:14 2025
    From Newsgroup: comp.arch

    On 9/9/2025 10:47 AM, Michael S wrote:
    On Tue, 9 Sep 2025 07:06:17 -0500
    David Schultz <david.schultz@earthlink.net> wrote:

    On 9/8/25 11:28 PM, BGB wrote:
    OK.

    I was just seeing here if I could make stuff "less muffled".
      A cleaner averaging filter, like:
        (S0+3*S1+3*S2+S3)/8
    Tends to have more muffle.

    Whereas:
        (5*S1+5*S2-S0-S3)/4
    Slightly boosts high frequencies (so less muffle) but may have
    other drawbacks.

    Use a real FIR filter:
    http://t-filter.engineerjs.com/
    As an example, I tried:
    16KSPS
    Pass band: 0 to 3500Hz, 1dB dips
    Stop band: 4KHz to 8KHz: 40dB attenuation

    This resulted in a filter with 51 taps. The more brick wall like the
    filter is, the more taps required.




    Well, but downsides:
    FIR filtering would likely reduce noise;
    But, would not reduce the muffle, which is the bigger issue at 8000;
    FIR filtering, particularly larger filters, tends to be slow.


    But, yeah, could in theory use a slightly bigger sliding filter, say:
    ( 1 1 -2 -2 6 6 -2 -2 1 1 ) /8

    Which could in theory maybe have less noise.
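    Applied during 2:1 downsampling, that sliding kernel might look like the
    following (a sketch; the centering of the taps over the merged sample
    pair and the edge clamping are my guesses, not spelled out in the post):

```c
typedef short s16;

/* Sketch of the proposed 10-tap sliding kernel applied during 2:1
   downsampling. Taps sum to 8, so the >>3 keeps unity gain at DC;
   centering (taps 4..5 over the output's two source samples) and edge
   clamping are assumptions. */
static const int kern10[10] = { 1, 1, -2, -2, 6, 6, -2, -2, 1, 1 };

void ResampleHalf16Kern(s16 *dst, const s16 *src, int dlen)
{
    int i, k, j, acc, slen=dlen*2;
    for(i=0; i<dlen; i++)
    {
        acc=0;
        for(k=0; k<10; k++)
        {
            j=i*2+k-4;              /* taps 4..5 hit the sample pair */
            if(j<0) j=0;
            if(j>=slen) j=slen-1;
            acc+=kern10[k]*src[j];
        }
        acc>>=3;                    /* /8 */
        if(acc<-32767)acc=-32767;
        if(acc> 32767)acc= 32767;
        dst[i]=acc;
    }
}
```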


    IIRC, AT&T had Fpass = 3.2 KHz. That's significantly easier than 3.5.

    Of course, AT&T used analog IIR filter rather than digital FIR. I have
    no idea what sort of IIR it was, but would guess that 5th order Bessel
    filter with -3dB point at 3.6 KHz could serve as a fair digital
    imitation of their circuit.


    Dunno...

    Seems like one has to be careful not to lose too much more.


    But, seems like this still mostly just leaves trying to use a more
    complex encoding and a higher sample rate (such as 16000) as the more
    likely path forward.



    So, say, more complex format:

    Base Block, 2-bit ADPCM:
    1 byte: Initial Predictor (A-Law)
    1 byte: Initial Step (low 6 bits, log2 4.2)
    N/4 bytes: Audio Samples

    Where each sample is:
    00: Small Positive (V+=1*StepScale, StepIndex-=1)
    01: Large Positive (V+=3*StepScale, StepIndex+=2)
    10: Small Negative (V-=1*StepScale, StepIndex-=1)
    11: Large Negative (V-=3*StepScale, StepIndex+=2)
    Encoder responsible for avoiding V or StepIndex going out of range.
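    The decode side of the base block can be sketched as below. The step
    table (an ad hoc ~2^(idx/4) curve standing in for the "log2 4.2" scale),
    the LSB-first packing of the 2-bit codes, and the clamping are all my
    assumptions; the post only specifies the code meanings and the StepIndex
    updates:

```c
typedef short s16;

/* Sketch of the 2-bit ADPCM decode step described above. StepScale
   approximates 2^(idx/4) in a 4.2-style log2 format (an assumption).
   The clamps are belt-and-braces; per the post, the encoder is
   responsible for keeping V and StepIndex in range. */
static int StepScale(int idx)
{
    return ((4+(idx&3))<<(idx>>2))>>2;
}

void AdpcmDecode2b(s16 *dst, const unsigned char *src,
    int n, int v, int idx)
{
    int i, code, sc, d;
    for(i=0; i<n; i++)
    {
        code=(src[i>>2]>>((i&3)*2))&3;  /* LSB-first 2-bit codes */
        sc=StepScale(idx);
        d=(code&1)?(3*sc):sc;           /* 00/10 small, 01/11 large */
        if(code&2) v-=d; else v+=d;     /* 1x = negative */
        if(code&1) idx+=2; else idx-=1;
        if(idx<0) idx=0;
        if(idx>63) idx=63;
        if(v<-32767) v=-32767;
        if(v> 32767) v= 32767;
        dst[i]=v;
    }
}
```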

    Mono Block:
    Encode a Base Spline at 1/16 rate
    Encode a Deviation Spline at 1/64 rate
    Encode a pattern table index at 4 or 8 bits per 16-samples.
    4b: ~ 0.5 bits/sample
    8b: ~ 0.75 bits/sample


    Issue:
    More fiddling, and it is proving difficult to get it above "some minimum standard of acceptable audio quality".

    In any case, it can't be a good alternative to 8kHz 2b ADPCM if it sounds significantly worse than 8kHz 2b ADPCM (and 12kbps isn't a huge win over 16kbps...).


    ...

  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.arch on Thu Sep 11 01:33:44 2025
    From Newsgroup: comp.arch

    On Sat, 6 Sep 2025 14:19:40 -0500, BGB wrote:

    But, there is some "weird hacks" that can be done in audio processing
    when downsampling that seems to notably increase intelligibility at an
    8kHz sample rate ...

    There are digital encoding formats used with mobile phones that are
    optimized for speech. Ever heard a call where the other end sounded every
    now and then like they were underwater? That’s the kind of compression artifact you get.
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.arch on Thu Sep 11 01:35:00 2025
    From Newsgroup: comp.arch

    On Sat, 06 Sep 2025 16:21:12 GMT, MitchAlsup wrote:

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high end of
    the audio spectrum.

    I wonder how that works, given that the audio engineer that mastered the recording was using speakers that cost a fraction of the price.
  • From BGB@cr88192@gmail.com to comp.arch on Thu Sep 11 02:05:59 2025
    From Newsgroup: comp.arch

    On 9/10/2025 8:33 PM, Lawrence D’Oliveiro wrote:
    On Sat, 6 Sep 2025 14:19:40 -0500, BGB wrote:

    But, there is some "weird hacks" that can be done in audio processing
    when downsampling that seems to notably increase intelligibility at an
    8kHz sample rate ...

    There are digital encoding formats used with mobile phones that are
    optimized for speech. Ever heard a call where the other end sounded every
    now and then like they were underwater? That’s the kind of compression artifact you get.

    Looking some at it, apparently a lot of the current modern phone class
    audio codecs are based on trying to run a model of the human vocal tract
    and then adding white noise to make it sound more natural (with some apparently partly based on vocoder technology).

    But, in my case, I don't really hear speech effectively over phones, I
    mostly hear a lot of warbling that I am left trying to decipher over all
    the hiss.


    As noted, the filtering hack mostly kept to normal PCM handling, but I
    soon realized can't work as a general solution to "stuff sounding bad"
    at an 8kHz sample rate.



    When I was looking into it, 4-channel sinewave synthesis is possible, but:
      Quality is still poor;
      At a 125Hz update frequency, at 16 bits per sinewave, it still takes
      around 8kbps.

    Roughly 16 bits are needed to encode both the frequency and amplitude of each sinewave to an acceptable degree.

    When fiddling with it, I ended up finding an OK strategy of:
    Sample for 12 sine waves, dividing the 2-8 kHz range into roughly 1/6
    octave chunks (picking the loudest wave within each chunk);
    Pick the top 4 loudest waves from the 12 sampled.
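    One way to measure the "loudest wave within each chunk" (my suggestion;
    the post does not say how the measurement is done) is a Goertzel probe
    at each chunk's centre frequency, keeping the biggest response:

```c
typedef short s16;

/* Squared Goertzel magnitude at one probe frequency, with the
   coefficient c = 2*cos(2*pi*f/rate) precomputed by the caller.
   Probing the centre of each 1/6-octave chunk and keeping the largest
   response is one way (a suggestion, not necessarily BGB's method) to
   pick the loudest sine wave per chunk. */
double GoertzelMag2(const s16 *x, int n, double c)
{
    double s0, s1=0, s2=0;
    int i;
    for(i=0; i<n; i++)
    {
        s0=x[i]+c*s1-s2;    /* second-order resonator update */
        s2=s1; s1=s0;
    }
    return s1*s1+s2*s2-c*s1*s2;
}
```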


    I was experimenting with pushing the scheme I mentioned else-thread to
    around 6 kbps, which (last I messed with it) still generates some truly
    awful audio quality.

    Posted an example to my twitter feed: https://x.com/cr88192/status/1965694742186049683

    It does sound a fair bit better with a 16kHz sampling rate (12kbps), but
    is still notably inferior to 8kHz 2-bit ADPCM (16kbps).


    The 6kbps case is interesting as it gets a 2-minute song into around
    96K, which is kinda pushing into MIDI territory. But, MIDI would have
    sounded better (though, no real obvious way to auto-convert PCM audio
    into MIDI commands).

    Well, unless maybe doing something like sinewave synthesis but then
    trying to convert the sine waves into Note On/Off commands. Though,
    naively mapping sinewave synthesis to MIDI commands would likely add a
    fair bit of bulk and overhead.



    It is possible that I may need to take a different approach to
    generating the pattern table.

    Initial approach:
    Fill it with sine-waves;
    Didn't work very well.
    Current strategy:
    Start with a table of 16-bit patterns (curated manually);
    Map each to samples, 0=full negative, 1=full positive;
    Run N passes of averaging;
    Generate a pattern table with 4-bits per pattern sample.

    Possible pattern-table generation strategy (not yet tried):
    Use the sign of each sample relative to the base curve to generate a
    16-bit key;
    average the relative values for each key, keeping track of relative
    usage frequency;
    Pick the top-N else merge similar patterns until one has fewer than 256
    or so.

    Note that any sounds much over ~250Hz at an 8000 Hz sample rate are being generated from the pattern table.
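    The key step of the proposed strategy can be sketched as below (assuming
    bit i of the key is set when sample i sits at or above the base curve):

```c
typedef short s16;

/* Sketch of the proposed pattern-key step: take the sign of each of 16
   samples relative to the base curve and pack it into a 16-bit key
   (bit i set when sample i is at or above the base). Keys would then
   be bucketed, averaged, and merged as described. */
unsigned PatternKey16(const s16 *blk, const s16 *base)
{
    unsigned key=0;
    int i;
    for(i=0; i<16; i++)
        if(blk[i]>=base[i]) key|=1u<<i;
    return key;
}
```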


    But, it is possible this approach may be a lost cause (it could not be
    made to give anywhere near acceptable quality at these bitrates).


    Note that I don't want something significantly more complicated or
    expensive than ADPCM (so, ideally no entropy coding or fancy transforms
    on the decoder side...).

    To be useful, would need to either:
    Do better than ADPCM at a similar bitrate;
    Achieve bitrates lower than what is possible with ADPCM.

    Was partly looking at the latter, but to be useful it needs to have some
    level of "passable" quality, which I have yet to achieve at this target
    (eg, particularly at 6 kbps).

    ...


  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Thu Sep 11 15:06:57 2025
    From Newsgroup: comp.arch

    Lawrence D’Oliveiro <ldo@nz.invalid> writes:
    On Sat, 06 Sep 2025 16:21:12 GMT, MitchAlsup wrote:

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high end of
    the audio spectrum.

    I wonder how that works, given that the audio engineer that mastered the
    recording was using speakers that cost a fraction of the price.

    Have you priced quality studio monitors? Obviously not.

    A nice pair of intro electrostatics runs about USD 1200 (Magnepan LRS+).

    A single studio monitor can easily cost more than USD12000.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu Sep 11 15:59:32 2025
    From Newsgroup: comp.arch


    scott@slp53.sl.home (Scott Lurndal) posted:

Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Sat, 06 Sep 2025 16:21:12 GMT, MitchAlsup wrote:

    No it does not sound "good" on a system that accurately reproduces
22KHz; like systems with electrostatic speakers covering the high end of the audio spectrum.

I wonder how that works, given that the audio engineer that mastered the recording was using speakers that cost a fraction of the price.

    Have you priced quality studio monitors? Obviously not.

A nice pair of intro electrostatics run about USD1200 (Magnepan LRS+).

Magnepans are not electrostatic, but use a moving Mylar plane sort-of
like an electrostatic--but they use magnetic strips on the backplane
to impart forces onto the Mylar plane.

Martin Logan speakers are electrostatic (I have a pair from 1986-ish,
revved up from B-to-G in 1996.) They sound much like electrostatic
headphones except at room-sized sound pressure levels. These cost
around $2,000 in 1986...

Dahlquists are electrostatic; around since 1973-ish.

    A single studio monitor can easily cost more than USD12000.

    And often accompanied by a tuning system to allow the speakers to be tuned
    to the room in which they are used. Velodyne sub-woofers allow the woofer
    to be tuned to the room and phase aligned with the main speakers.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Thu Sep 11 12:56:09 2025
    From Newsgroup: comp.arch

    On 9/11/2025 10:06 AM, Scott Lurndal wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Sat, 06 Sep 2025 16:21:12 GMT, MitchAlsup wrote:

    No it does not sound "good" on a system that accurately reproduces
22KHz; like systems with electrostatic speakers covering the high end of the audio spectrum.

    I wonder how that works, given that the audio engineer that mastered the
    recording was using speakers that cost a fraction of the price.

    Have you priced quality studio monitors? Obviously not.

A nice pair of intro electrostatics run about USD1200 (Magnepan LRS+).

    A single studio monitor can easily cost more than USD12000.


I guess this is a fair bit different, say, from using some $35
headphones, or $60 for some external speakers, probably throwing some
money Logitech's way...

Or, slightly cheaper, some "Amazon Basics" equivalents.
Or, more expensive, throwing their money at Bosch or Sennheiser or similar.


    Granted, there are cheaper headphones, but they are often lacking in
    terms of comfort and/or audio quality.


    Otherwise, I would think an option would be to try to guess which sort
    of hardware consumers are most likely to be using, and then tune for
    best results on this (so, say, aim for cheap, but not too cheap).

    ...


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From George Neuner@gneuner2@comcast.net to comp.arch on Fri Sep 12 13:01:36 2025
    From Newsgroup: comp.arch

    On Tue, 9 Sep 2025 15:51:10 +0200, David Brown
    <david.brown@hesbynett.no> wrote:

    On 08/09/2025 23:57, George Neuner wrote:

    :

    This means most notes will include sounds that are outside the range
    of (normal) human hearing, but you can still /feel/ these sounds [even
    the high ones] and miss them when they are absent.


    Nope. Most notes are much lower, and harmonics of relevance are within
    the range of human hearing. For high enough notes, you simply don't
    hear as much harmonic information.

    You are forgetting the lower harmonics. If it is true about 3 lower,
    then ~1/3 of notes on the piano will include an overtone that is below
    the (average) hearing threshold.


    C8 (high C) on the piano is ~4186 Hz. Assuming the need for the 7th
    higher harmonic - 29302 Hz - Nyquist would demand a minimum sampling
    rate of 58604/s to accurately reproduce C8.
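The arithmetic here can be checked directly:

```python
c8 = 4186                 # fundamental of C8 (high C), in Hz
harmonic7 = 7 * c8        # 7th harmonic: 29302 Hz
nyquist = 2 * harmonic7   # minimum sample rate to capture it: 58604/s
```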


You can't accurately hear C8 even when live - you don't get the same
harmonic information as you do with C6, because your ears can't
distinguish the higher harmonics. Your ears have the same limitations
as any other senses in this manner - you can look at your cat's feet and
count its toes, but if you look at a fly's feet you can't count the toes.

    My point was about sampling and reproduction, not whether the note
    could be heard. There is not a lot of piano music that involves the
    1st, 7th or 8th octaves - because the 1st octave is jarring and the
    7th and 8th (in general) are too high to carry to the audience without amplification.


    In practice, unless you like orchestral, or certain folk or country,
    you are not likely to hear much difference between a CD and a decent
    quality compressed version of it. But the CD itself is not a faithful
    reproduction of the live performance.


Good quality compressed formats are often better than CD quality. The
killer for CD quality is not the sample rate, it is the limited dynamic
range from the linear 16-bit range. Compressed formats will, in effect,
use a more logarithmic scale (like A-law and mu-law, used to get
comprehensible speech despite a much smaller sample size) that is more
in line with the way the human brain interprets sound.
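As a small illustration of the logarithmic companding mentioned (this is the standard G.711 mu-law curve, mu = 255, not anything specific to the formats under discussion):

```python
import math

MU = 255.0  # G.711 mu-law parameter

def mulaw_encode(x):
    """Compress a sample in [-1, 1] onto a logarithmic scale."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_decode(y):
    """Invert the companding curve."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# Quiet signals get disproportionately more of the code space: a sample
# at 1% of full scale maps to roughly 23% of the output range, which is
# why 8-bit companded audio keeps usable dynamic range.
```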

    And, of course, if you like orchestral you are more likely to be
    listening to vinyl rather than CD. 8-)

In theory (but very rarely in practice), when combined with good enough
amplifiers and speakers, vinyl has a higher dynamic range than CD
audio. But that is only the case when the record is new. Play it a few
times, and the wear from the needle will smooth out the tracks enough to
eliminate the difference.

True, but in fact there are laser-based record players that do not
touch or damage the media. You still need to worry about warping, so
it is necessary to store your records properly.

    I don't deal much with vinyl records myself anymore, but my sister has
    an extensive collection.


But enjoying music is a psychologically, physically, mentally and
biologically complex hobby. The comfort of the chair you are sitting
in, or the type of reflections and absorptions from the rest of the
room, can make a big difference. Knowing that you have spent a great
deal of money on your impressive-looking hifi system will improve your
listening experience regardless of what any audio measurement might say.
Some audiophiles prefer the "valve sound" over "transistor sound" -
not because the sound reproduction is more accurate (it is not - valves
add second harmonic distortion that is non-existent in transistor
amplifiers), but simply because they like it better.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Schultz@david.schultz@earthlink.net to comp.arch on Fri Sep 12 12:23:39 2025
    From Newsgroup: comp.arch

    On 9/12/25 12:01 PM, George Neuner wrote:
    You are forgetting the lower harmonics. If it is true about 3 lower,
    then ~1/3 of notes on the piano will include an overtone that is below
    the (average) hearing threshold.

One of the coolest things I ever heard, felt really, were the beat tones
between a couple of pedal notes on the pipe organ at the Meyerson in
Dallas.
    --
    http://davesrocketworks.com
    David Schultz
    "The cheaper the crook, the gaudier the patter." - Sam Spade
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.arch on Fri Sep 12 20:32:48 2025
    From Newsgroup: comp.arch

    On 12/09/2025 19:23, David Schultz wrote:
    On 9/12/25 12:01 PM, George Neuner wrote:
    You are forgetting the lower harmonics. If it is true about 3 lower,
    then ~1/3 of notes on the piano will include an overtone that is below
    the (average) hearing threshold.

    Harmonics are always integer multiples of the base frequency, not
    fractions - that's the definition of a harmonic.

    You can get lower frequencies produced as beats when different
    instruments play nominally the same note, but are a little out of tune.
    That's not something you would normally want to have in music (though it
    is a very useful effect for getting things in tune).
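The beat effect described above can be made concrete with a quick sketch: summing two equal-amplitude sines at nearby frequencies produces a slow amplitude envelope, heard (or felt) as beats at the difference frequency. The organ-note frequencies below are illustrative:

```python
import math

def beat_signal(f1, f2, t):
    """Sum of two equal-amplitude sines. By the identity
    sin(a) + sin(b) = 2 sin((a+b)/2) cos((a-b)/2), the cosine
    envelope makes the loudness pulse at |f1 - f2| Hz."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)

# Two organ pedal notes near 32.7 Hz (C1) and 34.6 Hz (C#1) beat at
# roughly 1.9 Hz -- felt as a slow pulsing rather than heard as a pitch.
beat_rate = abs(34.6 - 32.7)
```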


One of the coolest things I ever heard, felt really, were the beat tones between a couple of pedal notes on the pipe organ at the Meyerson in Dallas.


    Big pipe organs have notes that are too low for human hearing, but the
    volume is enough to feel them. Infrasound (sound below the lowest
    audible frequency) has long been associated with feelings of
    "supernatural" or "paranormal", increasing stress and tension. Horror
    movies sometimes like to have them in their soundtracks, and more than
    one "haunted house" turned out to have issues with the plumbing,
    ventilation or nearby diesel engines that produced infrasound that made
    people feel uneasy without knowing why. I'd imagine that for a church
    organ, playing some infrasound notes will help the listeners feel
    "religious" or feel some kind of "spiritual presence", though I have not
    heard of that being done intentionally except by some of the more
    dedicated fake healer conmen.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Fri Sep 12 18:58:59 2025
    From Newsgroup: comp.arch


    David Schultz <david.schultz@earthlink.net> posted:

    On 9/12/25 12:01 PM, George Neuner wrote:
    You are forgetting the lower harmonics. If it is true about 3 lower,
    then ~1/3 of notes on the piano will include an overtone that is below
    the (average) hearing threshold.

One of the coolest things I ever heard, felt really, were the beat tones between a couple of pedal notes on the pipe organ at the Meyerson in Dallas.

    Have you listened to a helicopter-style sub-woofer ??

Generally housed between stories in a building--a helicopter-arranged set
of blades that can go all the way down to 0 Hz--and up to about 30 Hz.
    The low frequency components adjust the pitch of the blades through the
    cyclic.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Fri Sep 12 14:30:08 2025
    From Newsgroup: comp.arch

    On 9/12/2025 1:58 PM, MitchAlsup wrote:

    David Schultz <david.schultz@earthlink.net> posted:

    On 9/12/25 12:01 PM, George Neuner wrote:
    You are forgetting the lower harmonics. If it is true about 3 lower,
    then ~1/3 of notes on the piano will include an overtone that is below
    the (average) hearing threshold.

One of the coolest things I ever heard, felt really, were the beat tones
between a couple of pedal notes on the pipe organ at the Meyerson in
Dallas.

    Have you listened to a helicopter-style sub-woofer ??

Generally housed between stories in a building--a helicopter-arranged set
of blades that can go all the way down to 0 Hz--and up to about 30 Hz.
    The low frequency components adjust the pitch of the blades through the cyclic.

    Main subwoofers I am aware of/had seen:
    Large plastic-cone speakers;
    Seemingly, the relative rigidity of a plastic cone works well here.
Large solenoid driving a big/heavy weight (such as a big chunk of
steel), which is then bolted down to something (presumably the surface
of whatever it is bolted to serves a role similar to a speaker cone).




Well, in other news, have slightly improved the quality of my new
experimental audio compressor at 6 kbps, but it is still pretty bad.
Seems my previous attempt (that I had posted online) was suffering from
32-bit truncation in the pattern table (some stuff was happening with
'int' that should have been done with 'unsigned long long', which was
negatively affecting audio quality).
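The kind of truncation described can be illustrated by masking to emulate C integer widths (Python ints are arbitrary precision, so the masks stand in for 'int' and 'unsigned long long'; the pattern value is made up):

```python
# Emulate C integer widths by masking.
def as_u32(x):
    return x & 0xFFFFFFFF          # what survives a 32-bit 'int'

def as_u64(x):
    return x & 0xFFFFFFFFFFFFFFFF  # what 'unsigned long long' keeps

pattern = 0x123456789ABCDEF0       # a hypothetical 64-bit pattern-table word
shifted_u64 = as_u64(pattern << 4)
shifted_u32 = as_u32(pattern << 4)
# The high 32 bits of the pattern word are silently lost in the 32-bit
# case, corrupting the decoded patterns without any error or warning.
```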

    ...


TODO: Might still be worth testing it out at 32 kHz / 24 kbps, and see how
it compares against low-bitrate MP3. If it doesn't sound completely
awful, might be OK (the only reason I think it may stand a chance is
because of how terrible MP3 sounds at these sorts of bitrates).

At 16 kHz (and 12 kbps), it does sound a bit better, though audio quality
is still inferior at present to 8000 Hz 2-bit ADPCM (16 kbps).
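For reference, the bitrates being compared work out as follows (a trivial sketch; `bitrate_kbps` is just a name for illustration):

```python
def bitrate_kbps(sample_rate, bits_per_sample):
    """Uncompressed bitrate in kbps (kilobits per second)."""
    return sample_rate * bits_per_sample / 1000

# The ADPCM reference point: 8000 Hz at 2 bits/sample = 16 kbps.
adpcm = bitrate_kbps(8000, 2)
# The experimental codec at 16 kHz / 12 kbps averages only
# 12000 / 16000 = 0.75 bits per sample, which is why matching
# ADPCM quality at that rate is hard.
avg_bits = 12000 / 16000
```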

    ...


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Schultz@david.schultz@earthlink.net to comp.arch on Fri Sep 12 14:46:20 2025
    From Newsgroup: comp.arch

    On 9/12/25 1:58 PM, MitchAlsup wrote:

    David Schultz <david.schultz@earthlink.net> posted:
One of the coolest things I ever heard, felt really, were the beat tones
between a couple of pedal notes on the pipe organ at the Meyerson in
Dallas.

    Have you listened to a helicopter-style sub-woofer ??

    No. I built a sub a decade or three ago (JBL 2245H driver) but have
    never wanted to go quite so far as to build a rotary subwoofer.


    The 18" sub gets as freaky as I want it to.
    --
    http://davesrocketworks.com
    David Schultz
    "The cheaper the crook, the gaudier the patter." - Sam Spade
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From George Neuner@gneuner2@comcast.net to comp.arch on Sun Sep 14 07:43:07 2025
    From Newsgroup: comp.arch

    On Fri, 12 Sep 2025 20:32:48 +0200, David Brown
    <david.brown@hesbynett.no> wrote:

    On 12/09/2025 19:23, David Schultz wrote:
    On 9/12/25 12:01 PM, George Neuner wrote:
    You are forgetting the lower harmonics. If it is true about 3 lower,
    then ~1/3 of notes on the piano will include an overtone that is below
    the (average) hearing threshold.

    Harmonics are always integer multiples of the base frequency, not
    fractions - that's the definition of a harmonic.

    I learned them as "overtones" ... but it seems that musicians call
    them all "harmonics" regardless of whether they are higher or lower.

    MMV.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.arch on Sun Sep 14 15:08:28 2025
    From Newsgroup: comp.arch

    On 14/09/2025 13:43, George Neuner wrote:
    On Fri, 12 Sep 2025 20:32:48 +0200, David Brown
    <david.brown@hesbynett.no> wrote:

    On 12/09/2025 19:23, David Schultz wrote:
    On 9/12/25 12:01 PM, George Neuner wrote:
    You are forgetting the lower harmonics. If it is true about 3 lower,
then ~1/3 of notes on the piano will include an overtone that is below
the (average) hearing threshold.

    Harmonics are always integer multiples of the base frequency, not
    fractions - that's the definition of a harmonic.

    I learned them as "overtones" ... but it seems that musicians call
    them all "harmonics" regardless of whether they are higher or lower.

    MMV.

    I'm not a musician - my knowledge of harmonics is from maths, physics,
    signal processing, motor control, and that kind of thing. Maybe
    musicians use the term slightly differently.

    --- Synchronet 3.21a-Linux NewsLink 1.2