• Random/OT: Low sample rate audio weirdness/mystery

    From BGB@cr88192@gmail.com to comp.arch on Sat Sep 6 05:28:16 2025
    From Newsgroup: comp.arch

    Just randomly thinking again about some things I noticed with audio at
    low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky
    32000: Sounds good
    22050: Moderate
    16000: OK, Modest size, acceptable quality.
    Seems like best tradeoff if not going for high quality.
    11025: Poor, muffled.
    8000: Very poor, speech almost unintelligible (normally).
    But, it is seeming like a "weird hack" may exist here.

    For sample formats:
    16-bit PCM: Good
    Binary16: Also good
    A-Law: Decent (space efficient)
    8-bit PCM: Sounds like crap at all sample rates.
    Tends to introduce an obvious hiss.

    So, at higher sample rates, 16-bit PCM or A-Law are clearly the better
    options. And, 16000 16-bit sounds better than 44100 8-bit, despite the
    latter having the higher data rate, because 8-bit PCM adds a very
    obvious hiss.
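Where A-Law here refers to G.711 companding; a minimal Python port of the classic public-domain Sun g711.c routines looks roughly like:

```python
# G.711 A-law companding: 16-bit PCM <-> 8-bit log-companded bytes,
# giving roughly 13 bits of usable dynamic range in 8 bits.

SEG_END = [0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF]

def linear_to_alaw(pcm: int) -> int:
    """Encode a 16-bit signed sample to one A-law byte."""
    pcm >>= 3                          # A-law works on 13-bit magnitudes
    if pcm >= 0:
        mask = 0xD5                    # sign bit plus even-bit inversion
    else:
        mask = 0x55
        pcm = -pcm - 1
    # Find the logarithmic segment (0..7).
    for seg, end in enumerate(SEG_END):
        if pcm <= end:
            break
    else:
        return 0x7F ^ mask             # clip to maximum
    aval = seg << 4
    aval |= (pcm >> 1) & 0x0F if seg < 2 else (pcm >> seg) & 0x0F
    return aval ^ mask

def alaw_to_linear(a: int) -> int:
    """Decode one A-law byte back to a 16-bit signed sample."""
    a ^= 0x55
    t = (a & 0x0F) << 4
    seg = (a & 0x70) >> 4
    if seg == 0:
        t += 8
    elif seg == 1:
        t += 0x108
    else:
        t = (t + 0x108) << (seg - 1)
    return t if (a & 0x80) else -t
```

The relative error stays roughly constant across magnitudes (about 3% of the sample value), which is why it beats 8-bit linear PCM's fixed-size quantization hiss.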


    For upsampling, a usual filtering strategy that works well seems to be
    to use a cubic spline.
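The post doesn't say which cubic; as an illustrative sketch, Catmull-Rom is one common interpolating choice that passes through every input sample (matching the "interpolation passes through each control point" behavior described later in the thread):

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Evaluate the interpolating cubic at t in [0,1]; gives p1 at t=0, p2 at t=1."""
    return 0.5 * ((2.0 * p1)
                  + (p2 - p0) * t
                  + (2.0*p0 - 5.0*p1 + 4.0*p2 - p3) * t * t
                  + (3.0*p1 - p0 - 3.0*p2 + p3) * t * t * t)

def upsample(x, factor):
    """Upsample a list of samples by an integer factor via cubic interpolation."""
    n = len(x)
    out = []
    for i in range(n - 1):
        p0 = x[max(i - 1, 0)]          # clamp at the edges
        p1, p2 = x[i], x[i + 1]
        p3 = x[min(i + 2, n - 1)]
        for k in range(factor):
            out.append(catmull_rom(p0, p1, p2, p3, k / factor))
    out.append(x[-1])
    return out
```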

    For downsampling, there are a few options:
    Nearest Neighbor:
    Simplest, poor quality
    Introduces very weird distortions when going to low sample rates.
    Box average:
    Take N samples and average them;
    Only works well for power-of-2 resampling.
    Pseudo tricubic:
    Take a block of N samples;
    Downsample by halves until the rate brackets the target (one above, one below);
    Weighted sum of lower (average) and cubic interpolation (higher).
    Sinc:
    Theoretically exists, but never made it work well.

    As a general strategy, pseudo tricubic had worked well.
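The "pseudo tricubic" steps above might be sketched like this (the blend details and helper functions are my guesses at what is meant, not the author's actual code):

```python
def halve(x):
    """Box-average pairs of samples (downsample by exactly 2)."""
    return [(a + b) * 0.5 for a, b in zip(x[0::2], x[1::2])]

def cubic_at(x, pos):
    """Catmull-Rom evaluation of sample list x at fractional position pos."""
    n = len(x)
    i = int(pos)
    t = pos - i
    p0 = x[max(i - 1, 0)]
    p1 = x[min(i, n - 1)]
    p2 = x[min(i + 1, n - 1)]
    p3 = x[min(i + 2, n - 1)]
    return 0.5 * (2*p1 + (p2 - p0)*t + (2*p0 - 5*p1 + 4*p2 - p3)*t*t
                  + (3*p1 - p0 - 3*p2 + p3)*t*t*t)

def pseudo_tricubic(x, src_rate, dst_rate):
    """Downsample x from src_rate to dst_rate (dst_rate <= src_rate)."""
    hi, hi_rate = list(x), float(src_rate)
    while hi_rate * 0.5 >= dst_rate and len(hi) >= 2:
        hi, hi_rate = halve(hi), hi_rate * 0.5   # halve until just above target
    lo, lo_rate = halve(hi), hi_rate * 0.5       # one averaged level below target
    w = (dst_rate - lo_rate) / (hi_rate - lo_rate)  # blend weight toward hi level
    n_out = int(len(x) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        a = cubic_at(lo, i * lo_rate / dst_rate)  # coarse (average-based) estimate
        b = cubic_at(hi, i * hi_rate / dst_rate)  # finer cubic estimate
        out.append((1.0 - w) * a + w * b)
    return out
```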


    So, seemingly, at least when working at 16kHz or above:
    16-bit PCM, A-Law, or Binary16 is a win;
    Pseudo tricubic seems to give the best perceptual audio quality.


    If going to low sample rates (11 or 8kHz), a problem emerges:
    Speech becomes muffled and unintelligible.
    So, 16-bit PCM and A-Law don't help; the audio is still muffled.


    But, there is something weird I had noticed at low rates (eg, 8kHz):
    ADPCM encoding seems to increase the intelligibility.
    Speech seemingly more intelligible after ADPCM than before.
    More so when not using the "obvious choice" of minimizing error.
    It works better if the encoder is tuned to slightly overshoot.
    The effect is more obvious with 2-bit ADPCM than with 4-bit.
    Like, some sort of weird "less is more" with the quality.

    My sense of hearing and the RMSE heuristic somewhat disagree about
    which is "better" quality. RMSE seems to prefer it when the ADPCM
    encoder tends to undershoot, and also prefers the more muffled versions
    (those closer to the down-sampled input audio).
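For reference, the codec being tweaked here is standard IMA ADPCM; a generic textbook 4-bit implementation can be sketched as follows (the 2-bit variant and the deliberate-overshoot tuning described above are nonstandard tweaks and are not shown):

```python
# Standard IMA ADPCM step and index-adjustment tables.
STEP = [7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 23, 25, 28, 31,
        34, 37, 41, 45, 50, 55, 60, 66, 73, 80, 88, 97, 107, 118, 130,
        143, 157, 173, 190, 209, 230, 253, 279, 307, 337, 371, 408, 449,
        494, 544, 598, 658, 724, 796, 876, 963, 1060, 1166, 1282, 1411,
        1552, 1707, 1878, 2066, 2272, 2499, 2749, 3024, 3327, 3660, 4026,
        4428, 4871, 5358, 5894, 6484, 7132, 7845, 8630, 9493, 10442,
        11487, 12635, 13899, 15289, 16818, 18500, 20350, 22385, 24623,
        27086, 29794, 32767]
INDEX_ADJ = [-1, -1, -1, -1, 2, 4, 6, 8] * 2

def _step(pred, idx, code):
    """Shared reconstruction: advance predictor and step index by one code."""
    step = STEP[idx]
    diffq = step >> 3
    if code & 4: diffq += step
    if code & 2: diffq += step >> 1
    if code & 1: diffq += step >> 2
    pred = pred - diffq if code & 8 else pred + diffq
    pred = max(-32768, min(32767, pred))
    idx = max(0, min(88, idx + INDEX_ADJ[code]))
    return pred, idx

def ima_encode(samples):
    """Quantize each sample's prediction error to a 4-bit code."""
    pred, idx, codes = 0, 0, []
    for s in samples:
        step = STEP[idx]
        diff = s - pred
        code = 8 if diff < 0 else 0
        if code: diff = -diff
        if diff >= step:      code |= 4; diff -= step
        if diff >= step >> 1: code |= 2; diff -= step >> 1
        if diff >= step >> 2: code |= 1
        pred, idx = _step(pred, idx, code)   # track the decoder's state
        codes.append(code)
    return codes

def ima_decode(codes):
    pred, idx, out = 0, 0, []
    for code in codes:
        pred, idx = _step(pred, idx, code)
        out.append(pred)
    return out
```

The "overshoot" tuning would bias the quantization decision in `ima_encode`, rather than change the tables or decoder.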


    Similarly (partly inspired by the ADPCM effect), a new contender seems
    to have arrived on the scene as a resampling algorithm:
    Treat the previously generated sample points as a reference, and try to
    pick a next point that best fits a line or B-Spline to the intermediate
    points (in effect, treating each sample point as if it were a control
    point for a B-Spline fitted to the input samples).


    The line-fitting is simpler, but the B-Spline seems to give a similar
    effect with better quality (even if RMSE does not agree, it sees the
    error from this as worse than with the other methods if the audio is
    upsampled using the normal spline method).

    Though, RMSE is lower if the upsampler also treats the audio samples
    more like the control points in a B-Spline.

    Where, in my usual cubic-spline upsampler, the interpolation passes
    through each control point (if the interpolated position directly aligns
    with a control point, it returns this point). This differs from a
    B-Spline, where generally the curve undershoots the control points.


    Then, it seems like for storage, the low-rate audio (control points) can
    be stored in ADPCM (though this time, error minimization during
    encoding gives the best results).

    And, oddly, it seems like the audio (in this low-sample rate,
    control-points form) actually has higher perceptual audio quality (and
    things like speech seem more intelligible; despite the low 8kHz sample
    rate).


    But, I am at a loss here as to why any of this would be true in a
    theoretical sense.



    Stuff online mentioning the use of B-Splines for audio seems to work on
    the assumption of generating control points and then using another
    B-Spline to generate audio at the target rate (rather than directly
    listening to the control points as audio).

    Stuff online also mentions needing to low-pass filter the audio before
    generating the spline, but if any sort of low-pass filtering is applied
    (before spline generation) then (again) it becomes muffled and
    unintelligible.

    Presumably, the idea would be to filter out things above the Nyquist
    frequency of the target sample rate (so, say, 4kHz for 8kHz audio), but,
    as noted, a 4kHz low-pass filter (in general) wrecks intelligibility.

    Then again, maybe the mention of low-pass filtering assumes operating at somewhat higher target sample rates?...


    Where, seemingly for speech and frequency ranges:
    under 1 kHz: mystery range...
    Filtering out has little effect.
    1-2 kHz; "fullness"
    Filtering out this range causes a "tinny" sound
    Filtering this out seems to strongly displease cats.
    2-4 kHz: Has vowel sounds
    Filtering out this range makes voices sound robotic.
    Many of the distinguishing parts of the voice go away.
    4-8 kHz: Consonants / etc seem to live here
    Filtering this out removes the "what is being said" part.
    8-16 kHz: Mostly optional
    Improves quality, but no effect on intelligibility.
    16 kHz: Upper end of hearing
    Things like CRT TV whistling are up here.

    Where, I had noted that general intelligibility of speech and other
    audio remains intact with a 4kHz to 8kHz band-pass filter, though with
    a "robotic" sound, and it is harder to tell people's voices apart
    (like, everyone is speaking with a similar-sounding robotic voice).
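Band observations like these can be experimented with using a generic windowed-sinc FIR band-pass (a textbook design; the tap count and Hamming window here are arbitrary choices of mine, not from the post):

```python
import math

def lowpass_kernel(cutoff_hz, fs, taps):
    """Windowed-sinc low-pass FIR kernel (Hamming window), unity DC gain."""
    fc = cutoff_hz / fs
    m = taps - 1
    h = []
    for i in range(taps):
        k = i - m / 2.0
        v = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        v *= 0.54 - 0.46 * math.cos(2 * math.pi * i / m)   # Hamming window
        h.append(v)
    s = sum(h)
    return [v / s for v in h]

def bandpass_kernel(lo_hz, hi_hz, fs, taps=101):
    """Band-pass as the difference of two low-pass kernels."""
    lp_hi = lowpass_kernel(hi_hz, fs, taps)
    lp_lo = lowpass_kernel(lo_hz, fs, taps)
    return [a - b for a, b in zip(lp_hi, lp_lo)]

def convolve(x, h):
    """Direct-form FIR filtering (same length as input)."""
    n, m = len(x), len(h)
    return [sum(h[j] * x[i - j] for j in range(m) if 0 <= i - j < n)
            for i in range(n)]
```

At a 16kHz sample rate, `bandpass_kernel(4000, 8000, 16000)` gives the 4-8kHz band described above.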

    But, with 8kHz audio having a 4kHz Nyquist frequency, this creates a
    problem. Can sort of hear vowel sounds, but sounds are often largely
    undifferentiated. Like, can hear that someone is talking, or whose
    voice it is, but not really what they are saying.

    Though, does leave a mystery then of why telephony would have used 8kHz,
    when presumably intelligible speech is the whole point of a telephone?...

    Then again, my actual phone experience has mostly been muffled with a
    rather obnoxious hiss (like, if the general phone experience wasn't bad enough, they have to punish people for using the phone by having some
    truly awful audio quality...).



    But, then, had noted that with the ADPCM hack, or the B-Spline fitting
    hack, it is again possible to hear what is being said at an 8kHz
    sampling rate. But...

    I don't really know why, or how it makes a difference, because
    presumably the Nyquist frequency is the same either way (but it is
    almost like the 4-8kHz band is still present somehow).

    Seemingly can't really push it down to a 6kHz sample rate though (seems
    like 6kHz might be closer to a hard-limit here).


    It is a mystery; does anyone have a possible explanation for these effects?...


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sat Sep 6 16:21:12 2025
    From Newsgroup: comp.arch


    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at
    low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    32000: Sounds good
    22050: Moderate
    16000: OK, Modest size, acceptable quality.
    Seems like best tradeoff if not going for high quality.
    11025: Poor, muffled.
    8000: Very poor, speech almost unintelligible (normally).
    But, it is seeming like a "weird hack" may exist here.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Schultz@david.schultz@earthlink.net to comp.arch on Sat Sep 6 11:59:37 2025
    From Newsgroup: comp.arch

    On 9/6/25 5:28 AM, BGB wrote:
       8000: Very poor, speech almost unintelligible (normally).
         But, it is seeming like a "weird hack" may exist here.

    You might want to look at how AT&T did it. It has been a while but I
    think this is near what they used. Back when phones were analog and
    digital was just getting started.
    --
    http://davesrocketworks.com
    David Schultz
    "The cheaper the crook, the gaudier the patter." - Sam Spade
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Brian G. Lucas@bagel99@gmail.com to comp.arch on Sat Sep 6 12:31:31 2025
    From Newsgroup: comp.arch

    On 9/6/25 11:59 AM, David Schultz wrote:
    On 9/6/25 5:28 AM, BGB wrote:
        8000: Very poor, speech almost unintelligible (normally).
          But, it is seeming like a "weird hack" may exist here.

    You might want to look at how AT&T did it. It has been a while but I think this
    is near what they used. Back when phones were analog and digital was just getting started.

    That was T1 carrier. When I looked at the schematics, I was surprised to see the audio compression was done in analog, using the exponential curve of a diode to get logarithmic compression. If I remember correctly:-)

    Brian

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.arch on Sat Sep 6 20:52:51 2025
    From Newsgroup: comp.arch

    On Sat, 6 Sep 2025 05:28:16 -0500
    BGB <cr88192@gmail.com> wrote:

    Just randomly thinking again about some things I noticed with audio
    at low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky
    32000: Sounds good
    22050: Moderate
    16000: OK, Modest size, acceptable quality.
    Seems like best tradeoff if not going for high quality.
    11025: Poor, muffled.
    8000: Very poor, speech almost unintelligible (normally).
    But, it is seeming like a "weird hack" may exist here.


    8000 x 8bit (mu-law in USA, A-law in majority of the world) was a
    standard sampling rate for digital back ends of analog wired telephony
    for more than 50 years. I didn't check, but would assume that it still
    is.
    Most people found it quite intelligible. Certainly more intelligible
    than cellular telephony, until, less than 20 years ago, cellular
    improved a little.


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Sat Sep 6 13:54:54 2025
    From Newsgroup: comp.arch

    On 9/6/2025 11:21 AM, MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at
    low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.


    Dunno. I mostly use headphones.


    Seemingly, at least with the headphones I have, I can hear tones up to
    around 17 kHz, but above this, pretty much nothing.

    I noticed when trying to get new headphones, I got some cheap ones at
    first that sounded like muffled crap (they were around $10 IIRC). I
    tried generating tones and with these headphones audio dropped off to
    nothing after around 11 kHz. Ended up needing to buy some slightly more expensive headphones (around $30 IIRC, from Logitech), which sounded a
    bit better.

    Ended up giving the cheap ones to my dad, they apparently worked fine
    for him.



    Below 1kHz, sine waves rapidly drop off in intensity, whereas square and sawtooth waves retain full loudness.

    On the headphones, I can still hear sine waves (well under 1kHz) if the
    volume is fairly high.


    IRL, I have noted that I am mostly unable to hear tuning forks.

    My mom also recently got a "steel tongue drum" (with an apparent 432Hz tuning), which I had noted I can sorta hear, but the sound is very
    quiet. I mostly hear the "thwap" sound when she uses the little
    rubber-tipped mallet on it.

    If I put my hand near it, I can feel vibrations, but I don't really hear anything.



    Personally, much over a 32 kHz sample rate, any difference rapidly drops
    off, so 44100 and 48000 seem to sound basically the same.


    I was mostly trying to explore the area around 8000 though, where
    normally I hear crap-all. But, seemingly, with some questionable
    filtering, intelligible speech can come through, I just don't entirely understand how it works.

    But, as noted, there are several variations of the trick:
    Feed audio through ADPCM;
    Works better with either 2-bit/sample IMA,
    or with encoder tuned to overshoot.
    Model audio as line-fitting during downsampling.
    This is likely similar to what ADPCM ends up doing.
    Model audio as B-spline fitting.
    Seems to preserve more perceptual quality than the line fitting.

    But, what I am not entirely sure of is why this would make any real difference.

    But, can note that it does differ from the more conventional
    downsampling strategies of "just average stuff", in that both approaches
    tend to generate points outside the original curve.



    32000: Sounds good
    22050: Moderate
    16000: OK, Modest size, acceptable quality.
    Seems like best tradeoff if not going for high quality.
    11025: Poor, muffled.
    8000: Very poor, speech almost unintelligible (normally).
    But, it is seeming like a "weird hack" may exist here.


    Seemingly, there is no general disagreement that 11025 and 8000 sound
    kinda like crap?...

    I guess 11025 worked OK for Doom and Quake.
    Quake 2 had used 22050 (but, still 8-bit PCM).
    Quake 3 had used 22050 (but 16-bit PCM now).

    With Wolfenstein 3D, it wasn't until hearing some slightly better
    quality versions of the sound effects from the iOS port that I realized
    the enemies were saying stuff for their sound effects. Like, the
    low-level enemies apparently saying "Achtung!" rather than "Aaah-Uuuh"
    (but, with the audio from the DOS version, just sorta heard a whole lot
    of the latter).


    But, as noted, I mostly ended up preferring 16000 A-Law for sound
    effects and similar as a good tradeoff for space and quality. Also
    ADPCM, which uses less space.

    Some people seem to try to use MP3 or OGG for sound effects, but:
    128 kbps: Bulky
    64 kbps: Poor
    32 kbps: Sounds like a can full of broken glass.
    In addition, both formats are complicated, computationally expensive
    to decode, and typically need a third-party library to decode them.


    Also, in this case, 2-bit IMA ADPCM seems to somewhat beat MP3 at the
    low bitrate game (at least to my hearing).


    Not sure of a good way to go lower, best way I have found in past
    fiddling was, eg:
    Downsample by 1/16 or so to generate a reference line;
    Eg, spline-fitting the samples;
    Also generate the side-intensity
    Eg, standard deviation from samples and the spline.
    Store this line in some form, such as via ADPCM;
    Approximate the intermediate samples with patterns from a table.
    The table of patterns itself derived partly from the frequencies.
    Stores the relative intensity above/below the spline curve.

    Where, one way of storing the line is, say:
    4x 3-bit, each control-point sample, as ADPCM
    3 or 4-bit, side/intensity sample (eg, standard deviation channel).
    Pattern table might be stored as 4 or 8 bits per block.
    Pattern is chosen by whichever best fits the intermediate samples.

    But with, say, 8 bits per sample block and a 16x internal downsample,
    it could work out to 0.5 bits/sample (or, 16kHz audio at 8 kbps). With
    8-bit patterns, it is 0.75 bits/sample.


    Example patterns:
    0: Flat line, follow spline
    1: Positive (sin 8*PI)
    2: Positive Hump (sin PI)
    3: Negative Hump
    4: Positive (sin 2*PI)
    5: Negative (sin 2*PI)
    6: Positive (sin 3*PI)
    7: Negative (sin 3*PI)
    8/9: 4*PI
    A/B: 5*PI
    C/D: 6*PI
    E/F: 7*PI
    ...
    If using 6 or 8-bit patterns, it can include a second (or 3rd)
    sub-frequency.
    00..0F: Same as above
    10..1F: Same main pattern as 00..0F
    Sub-frequency mirrored in frequency and polarity (+8 mod 16).
    Roughly 5/8 amplitude of main frequency.
    2x: Same, but lower intensity sub-frequency (3/8).
    3x: Same, but lower intensity sub-frequency (1/8).
    4x..7x: Same, but use a different sub-frequency index (+/-5 mod 16).
    Encodes offset sign and intensity (5/8 or 3/8).
    8x..Fx: Add a 3rd frequency, lower intensity than the second (1/8).
    Similar strategy to above.
    ...


    Decoding algorithm would work in blocks, eg:
    Unpack spline points;
    Interpolate splines for each sample;
    Multiply deviation channel with the values from the pattern table.
    This is then added onto the base spline.
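A heavily simplified sketch of those decode steps (linear interpolation stands in for the spline here, and the sine-based pattern is a placeholder for the actual pattern table, so the details are mine, not the format's):

```python
import math

def decode_block(ctrl, deviation, pattern_id, n=16):
    """Decode one block: interpolate between the bracketing control points,
    then add deviation * pattern on top.  ctrl holds the 2 control points
    bracketing the block; pattern_id selects a half-wave count (0 = flat)."""
    out = []
    for i in range(n):
        t = i / n
        base = ctrl[0] + (ctrl[1] - ctrl[0]) * t     # stand-in for spline interp
        pat = 0.0 if pattern_id == 0 else math.sin(pattern_id * math.pi * t)
        out.append(base + deviation * pat)
    return out
```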


    However, this sort of approach is somewhat more complicated than just
    using a low-bitrate ADPCM (and I haven't used it much).

    Also, quality is inferior to 2-bit ADPCM.

    But, not a lot in this area that doesn't sound like total garbage...


    I had noted in past experiments that, seemingly, the lower-limit scheme
    for intelligible speech (for me) was:
    Split audio into blocks of 128 samples (at a 16kHz sample rate);
    Match a sine-wave between 4 and 8 kHz (picking the loudest sine wave);
    Encode the frequency and intensity of this sine wave.

    This can achieve ~ 0.125 bits per sample, or 2 kbps.
    However, speech is very unnatural sounding;
    Pretty much any non-speech audio becomes unrecognizable noise.
    Where, say, frequency is a byte in steps of 16 Hz; and intensity is an
    A-Law value.

    Though, this pushes the limits of intelligibility, and it is possible
    that others might find such a scheme unintelligible.
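A sketch of the analysis side of that scheme (the 128-sample block and 16 Hz frequency step come from the description above; the brute-force correlation search is my guess at an implementation):

```python
import math

def loudest_sine(block, fs=16000, f_lo=4000, f_hi=8000, step=16):
    """Find the strongest sinusoid in [f_lo, f_hi) by direct correlation.
    Returns (frequency_hz, amplitude)."""
    n = len(block)
    best_f, best_a = f_lo, 0.0
    f = f_lo
    while f < f_hi:
        w = 2 * math.pi * f / fs
        c = sum(x * math.cos(w * i) for i, x in enumerate(block))
        s = sum(x * math.sin(w * i) for i, x in enumerate(block))
        a = 2.0 * math.hypot(c, s) / n
        if a > best_a:
            best_f, best_a = f, a
        f += step
    return best_f, best_a

def encode_blocks(samples, block=128):
    """Per block: one frequency byte (16 Hz steps above 4 kHz) + amplitude."""
    out = []
    for i in range(0, len(samples) - block + 1, block):
        f, a = loudest_sine(samples[i:i + block])
        out.append(((f - 4000) // 16, a))
    return out
```

The amplitude would then be stored as an A-Law byte, giving the ~2 byte/block (2 kbps) figure above.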



    Had also experimented with schemes of encoding the relative intensity
    of a series of 16 bands (between 4 and 8 kHz), but quality was also
    pretty low here (and it won neither on quality nor on the ability to
    achieve a low bitrate). Quality is better with more bands, but this
    quickly reaches a practical limit.

    It being seemingly more effective to pick 1 or 2 sine waves, and then
    encoding the specific frequency and intensity of each.

    For slightly more natural sound, can pick N sine waves from within
    specific frequency ranges, say, 4 waves:
    2-3kHz, 3-4kHz, 4-6 kHz, 6-8kHz
    Resulting in something slightly more like a normal human voice.
    But, still sounds unnatural.
    And, it still falls on its face for any non-speech audio.


    Also, can note:
    While I am saying sine waves here, wave shape is non-critical; it also
    seems to work if using square waves or similar.

    For these experiments, had mostly ended up discarding everything below
    2kHz, as it seems to not contain anything particularly relevant.

    Can note that the "block sampling rate" for this approach seems to
    need to be over 100 Hz for best effect (a block size of 128 samples
    giving a 125Hz block-sampling frequency).


    But, can note that seemingly no mainline audio codecs work this way...

    ...


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Sat Sep 6 14:19:40 2025
    From Newsgroup: comp.arch

    On 9/6/2025 12:52 PM, Michael S wrote:
    On Sat, 6 Sep 2025 05:28:16 -0500
    BGB <cr88192@gmail.com> wrote:

    Just randomly thinking again about some things I noticed with audio
    at low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky
    32000: Sounds good
    22050: Moderate
    16000: OK, Modest size, acceptable quality.
    Seems like best tradeoff if not going for high quality.
    11025: Poor, muffled.
    8000: Very poor, speech almost unintelligible (normally).
    But, it is seeming like a "weird hack" may exist here.


    8000 x 8bit (mu-law in USA, A-law in majority of the world) was a
    standard sampling rate for digital back ends of analog wired telephony
    for more than 50 years. I didn't check, but would assume that it still
    is.

    It seems that the issue isn't (purely) with the sample rate or encoding.


    But, there are some "weird hacks" that can be done in audio processing
    when downsampling that seem to notably increase intelligibility at an
    8kHz sample rate (in which case, A-Law is back to being effective again).


    In general, I don't fault mu-law or A-Law, as the quality they give is
    by far superior to 8-bit linear PCM.


    Just, the "standard" audio down-sampling strategies (glorified
    averaging in various forms) sort of result in the audio becoming
    muffled and speech poorly intelligible at low sample rates.

    Whereas, the "hacky" strategies (line or curve fitting) seem to give a
    better result.

    I am more just sort of at a loss as to what is going on here exactly, as
    it appears to defy common wisdom about how audio resampling (or audio
    quality) should work.


    Well, and as noted my perception is often at odds with what an RMSE
    score says (where RMSE tends to more strongly prefer the
    muffled-sounding versions). Where, usually, RMSE is sort of the gold
    standard of measuring quality in image and audio processing.


    Most people found it quite intelligible. Certainly more intelligible
    than cellular telephony, until, less than 20 years ago, cellular
    improved a little.


    Whatever the cellphones are doing now still sounds like garbage, even
    versus what I get from 2-bit IMA ADPCM at 8kHz ...

    Like, if I encode some speech as 2-bit ADPCM, at least I can still
    understand what is being said (even if 2-bit ADPCM doesn't necessarily
    give the best audio quality; and is kinda poorly supported in SW vs the
    more common 4-bit ADPCM).


    Like, it is a disincentive to talk over the phone when I have to make
    extra effort to try to decipher what people are saying due to poor audio quality.





    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.arch on Sat Sep 6 13:18:46 2025
    From Newsgroup: comp.arch

    On 9/6/2025 11:54 AM, BGB wrote:
    On 9/6/2025 11:21 AM, MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    ...
    But, can note that seemingly no mainline audio codecs work this way...

    Playing around with WAV almost destroyed my eardrums and my speakers.
    FWIW, I have an example of a wav experiment right here:

    https://youtu.be/DrPp6xfLe4Q?t=63



    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Sat Sep 6 15:42:17 2025
    From Newsgroup: comp.arch

    On 9/6/2025 3:18 PM, Chris M. Thomasson wrote:
    On 9/6/2025 11:54 AM, BGB wrote:
    On 9/6/2025 11:21 AM, MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:


    ...



    Also, can note:
    While I am saying sine waves here, wave shape is non critical, it also
    seems to work if using square waves or similar.

    for these experiments, had mostly ended up discarding everything below
    2kHz, as it seems to not contain anything particularly relevant.

    Can note that the "block sampling rate" for this approach seems to
    needs to be over 100 Hz for best effect (a block size of 128 samples
    giving a 125Hz block-sampling frequency).


    But, can note that seemingly no mainline audio codecs work this way...

    Playing around with WAV almost destroyed my eardrums and my speakers.
    FWIW, I have an example of a wav experiment right here:

    https://youtu.be/DrPp6xfLe4Q?t=63


    This is fairly quiet (apart from a slight warbling sound) until the
    piano comes in at around 1:20...


    I don't have any of my experiments here on YouTube; would likely need to
    find some good public domain audio test examples (and/or record myself speaking, but would rather not), and set something up here.

    Though, in this case, it would be more examples of "pushing the limits
    for poor audio quality".



    Did find a video of another guy doing something vaguely similar to what
    I have done in some experiments:
    https://www.youtube.com/watch?v=qosYRO6WjkQ

    But, his examples sound very different from mine (with a characteristic
    sound more like bad MP3 compression), I suspect because he was using
    different frequency bands or similar (as noted, mine ignored pretty much everything below 2kHz).





    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.arch on Sat Sep 6 14:37:29 2025
    From Newsgroup: comp.arch

    On 9/6/2025 1:42 PM, BGB wrote:
    On 9/6/2025 3:18 PM, Chris M. Thomasson wrote:
    On 9/6/2025 11:54 AM, BGB wrote:
    On 9/6/2025 11:21 AM, MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:


    ...



    Also, can note:
    While I am saying sine waves here, wave shape is non critical, it
    also seems to work if using square waves or similar.

    for these experiments, had mostly ended up discarding everything
    below 2kHz, as it seems to not contain anything particularly relevant.

    Can note that the "block sampling rate" for this approach seems to
    needs to be over 100 Hz for best effect (a block size of 128 samples
    giving a 125Hz block-sampling frequency).


    But, can note that seemingly no mainline audio codecs work this way...

    Playing around with WAV almost destroyed my eardrums and my speakers.
    FWIW, I have an example of a wav experiment right here:

    https://youtu.be/DrPp6xfLe4Q?t=63


    This is fairly quiet (apart from a slight warbling sound) until the
    piano comes in at around 1:20...

    The rest of it is all from my MIDI program. I got afraid of
    experimenting with raw WAV, and made the volume lower. Have you ever had
    the piercing screech that makes you say WTF! Or, a massive bass that
    makes your speakers want to snuff themselves? Shit man.


    I don't have any of my experiments here on YouTube; would likely need to find some good public domain audio test examples (and/or record myself speaking, but would rather not), and set something up here.

    Though, in this case, it would be more examples of "pushing the limits
    for poor audio quality".

    :^) I still love music from the SNES:

    (Donkey Kong Country 2 - Stickerbush Symphony) https://youtu.be/nwBlulZ2Uq8?list=RDnwBlulZ2Uq8




    Did find a video of another guy doing something vaguely similar to what
    I have done in some experiments:
    https://www.youtube.com/watch?v=qosYRO6WjkQ

    But, his examples sound very different from mine (with a characteristic sound more like bad MP3 compression), I suspect because he was using different frequency bands or similar (as noted, mine ignored pretty much everything below 2kHz).

    Thanks for that link.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Terje Mathisen@terje.mathisen@tmsw.no to comp.arch on Sun Sep 7 12:26:37 2025
    From Newsgroup: comp.arch

    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at
    low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    My ears are not good enough to notice the difference between CD quality
    and AAC/high-sample-rate MP3/Ogg Vorbis/etc., but according to my
    savant (?) cousin, who could listen to a 16 min piece of music once and
    then write down the score for all the instruments, none of them sound
    like live; but they are close enough that he can listen and internally
    translate to what it would have sounded like in a concert.

    Terje
    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Sun Sep 7 14:59:23 2025
    From Newsgroup: comp.arch

    On 9/7/2025 5:26 AM, Terje Mathisen wrote:
    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at
    low sample rates.

    For baseline, can note, basic sample rates:
        44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    My ears are not good enough to notice the difference between CD quality, AAC/high sample rate MP3/ogg vorbis/etc, but according to my savant (?) cousin who could listen to a 16 min piece of music once and then write
    down the score for all the instruments, none of them sound like live,
    but they are close enough that he can listen and internally translate to what it would have sounded like in a concert.


    To me, 44100 and 48000 sound basically the same, so not much gain in
    going higher.

    The difference between 32000 and 44100 is slight.

    Though, one merit of 32000 is that it is 2x 16000.

    So, audio can be split into two domains based on power-of-2 relations:
    8000 / 16000 / 32000
    11025 / 22050 / 44100


    On a PC, 32K or 44K are good "output" sample rates, though usually
    things like sound-effects are preferably stored at lower rates (8/11/16K).

    In some cases, size matters more than quality. For example, in 3D
    engines, "ambient" sound effects can be quite large, and it may be
    preferable to store them at the lowest quality one can get away with
    (like, few people are going to notice if the "wind howling in the
    background" is stored at poor quality).


    Though, sometimes it is noticeable:
    One animated show known as "Bravest Warriors" (made by the same guy who
    made "Adventure Time", with a similar art style) typically uses high
    quality audio for the voice acting, but usually "kinda poor" audio
    quality for any sound effects and background music (enough so that it
    is kinda noticeable). Like, someone was like "8kHz 8-bit PCM?... Good
    enough." (not sure of the specifics exactly).

    Like, it is noticeable when one mixes 44K voice acting with 8K
    background music.



    But, in general, for something like a 3D engine I would think, say:
    16K: any voice dialog or similar;
    11K: generic weapon sound effects or explosions or similar.
    8K: most ambient sound effects
    Such as wind or fluorescent light hum, etc.
    Maybe pay a little more for background music, like maybe 16kHz.

    A-Law works well, ADPCM is also OK.
    Just, preferably avoid 8-bit linear PCM due to the hiss issue.
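    For reference, G.711-style A-Law companding can be sketched as follows
    (a Python port of the classic reference logic; it works on the 13-bit
    magnitude per G.711's segment layout):

```python
def linear2alaw(pcm):
    # Encode a 16-bit signed sample to one G.711 A-Law byte.
    seg_end = [0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF]
    pcm >>= 3                      # work in the 13-bit domain
    if pcm >= 0:
        mask = 0xD5                # sign bit set, even bits toggled
    else:
        mask = 0x55
        pcm = -pcm - 1
    seg = next((i for i, e in enumerate(seg_end) if pcm <= e), 8)
    if seg >= 8:
        return 0x7F ^ mask
    aval = seg << 4
    aval |= (pcm >> 1 if seg < 2 else pcm >> seg) & 0x0F
    return aval ^ mask

def alaw2linear(aval):
    # Decode one A-Law byte back to a 16-bit signed sample.
    aval ^= 0x55
    t = (aval & 0x0F) << 4
    seg = (aval & 0x70) >> 4
    if seg == 0:
        t += 8
    elif seg == 1:
        t += 0x108
    else:
        t = (t + 0x108) << (seg - 1)
    return t if (aval & 0x80) else -t
```

    The logarithmic segments are why it avoids the hiss: quantization error
    stays roughly proportional to the signal level, instead of being a
    fixed-size error as with 8-bit linear PCM.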

    Here, 2-bit ADPCM is interesting, as it can allow 16K at the same
    bitrate as 8K, or 2b 8K if the quality doesn't matter (or there is some
    other reason to save space). Though the quality tradeoff of 2b/16K vs
    4b/8K is likely to depend on the use-case.


    Then again, maybe audio filtering could remove the hiss from 8-bit PCM,
    or avoid generating it, but A-Law is still preferable here.

    And, just say no to low bitrate MP3 (why do people even do this?...).

    Like the hiss from 8-bit PCM, the artifacts of the MP3 compression would
    blend into other stuff and make everything else sound worse.

    ...

    OTOH, if one has music that is likely to be "actively listened to", then
    32K or 44K makes sense, but then MP3 also makes sense, as pretty much
    any "reasonable" way of storing 32K or 44K music is going to exceed the bitrate of 128kbps MP3 (and, at 128kbps, it avoids the artifacts).

    Likewise music storage is an area where ADPCM is weak.
    But, not everything is music.

    But, can note that, seemingly, 16 kHz 4-bit ADPCM is a pretty good
    format for speech, quality wise. This is 64 kbps, and IMHO beats MP3 at this task.




    Also, both Int16 and Binary16 float can give what seems like "perfect"
    audio quality when used as storage formats. But, sometimes one may need
    more as a working format (such as Binary32).

    Or, say:
    Int16: Final Output
    Binary16: Intermediate Storage or Math
    A-Law: Storage / Sound-Effects
    Binary32: Active processing (such as mixing or filtering operations).

    For many less trivial audio tasks, integer or fixed-point math is
    lacking (10.22 fixed point or similar isn't quite ideal).



    In my case, generally 128 kbps MP3 sounds OK (and is pretty much the
    "gold standard" MP3 bitrate), but the quality of MP3 drops off fast if
    one goes much lower. At roughly 64 kbps or below, MP3 is kinda trash.
    Even 96 kbps is borderline, as the quality drops off very rapidly.

    OGG Vorbis is kinda similar, it has a slightly different sound to it but
    is basically the same stuff.

    The issue seems to be for me that MP3 and OGG are *very* abusive to the
    higher frequency ranges. They don't just discard them (as downsampling
    would do), but instead result in a whole lot of chaotic noise in the
    higher frequencies, which is kinda obnoxious to me.

    So, OK format for audio distribution or music on the internet.
    But, poor formats for sound effects.




    One effect I can hear IRL, that I sometimes get annoyed with in games
    lacking, is that when I hear a sound IRL, I don't just hear the sound,
    but also the shape of the room or similar that the sound is present in.
    Like, I can hear the sounds reflecting off walls, and sometimes between
    the panels in doors and other things.

    Whereas, most games pretty much don't bother. Any sound effects are
    played as point sources in an empty void.


    Sometimes, games have tried to fake it (with the whole "EAX" thing that
    was popular at one point), but then they use "presets" of just some
    generic space that doesn't really match at all with the space the player
    is actually in.

    Like, "Woo, now you have a closet sized wooden box centered around your head...", it follows you along, until it then switches out for a
    slightly bigger concrete-walled box (still centered around the player's
    head). Sometimes the box is tall or narrow, or shorter and wider, but,
    yeah, "kinda weak", not much better than just playing the sounds into an
    empty void.


    Well, the "EAX" thing did at least try to make the audio more directional.
    Many games just adjust the left/right balance for stereo;
    But, for better effect, one needs to do a phase offset;
    And, some other adjustments for up/down and forward/back.
    For example, above/below/behind has less stereo separation.
    Also the amount of stereo separation is correlated to distance.
    More distant also reduces separation, not just direction.
    ...
    And, also apply Doppler shifts for the relative velocities;
    ...

    Though, neither EAX nor OpenAL seemingly bothered with Doppler effects.
    But, at the speeds characters often move in games, it is enough that
    Doppler shifting becomes at least semi-relevant.

    Though, in this case, the phase offset and Doppler effects are
    "basically" the same phenomena, so in effect one can sort of represent
    each "ear" as a point in terms of the Doppler math, with the appropriate propagation delay, then the phase offsets happen naturally.



    In my 3D engines, I have sometimes made an effort to do a better job
    here, typically trying to calculate the audio contribution of sound
    reflecting off each block in the general vicinity, and then
    approximating for some more distant blocks.

    The effect is still rather poor, and the audio still often sounds a bit
    jank, but I had usually made an effort at least.

    Doing this "well" would be computationally expensive.


    Though, one partial workaround being to do some of this at lower sample
    rates internally (like, say, it is less noticeable if the audio
    reflections off a wall are happening as 8kHz and at A-Law quality, ...).

    Also, my more recent 3D engine doesn't handle the "underwater"
    situation very well, just sort of giving the "very excessive" reverb of
    the player's head being stuck inside a large number of blocks *all*
    reflecting the audio. But, it sorta works. Also a little wacky as it
    applies to the background music, and makes it more obvious that the
    player is themselves the point-source of the background music
    (alternative would be to mix this in afterwards, and not have it be
    like the player is carrying around a boombox playing the background
    music). But, underwater as a sort of "reverb hell" sorta works; still
    maybe better IMO than "well, you are still in a void, but now it is
    playing scuba related sound effects...".


    Though it may not seem like much, this sort of audio processing can
    also eat a lot of RAM for the intermediate audio data. So, one doesn't
    really want to just store all the intermediate audio as 'float'.

    But, then again, most people seemingly don't bother with any of this.



    Terje



    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sun Sep 7 21:12:03 2025
    From Newsgroup: comp.arch


    Terje Mathisen <terje.mathisen@tmsw.no> posted:

    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at
    low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    My ears are not good enough to notice the difference between CD quality, AAC/high sample rate MP3/ogg vorbis/etc, but according to my savant (?) cousin who could listen to a 16 min piece of music once and then write
    down the score for all the instruments, none of them sound like live,
    but they are close enough that he can listen and internally translate to what it would have sounded like in a concert.

    Just after graduating CMU I worked in a high end stereo store. The
    listening room was 4 walls, none of them parallel, and a slanted
    ceiling; so it had essentially no reverberation. The Pittsburgh string
    quartet rented out the room for various practices, and we recorded on
    9-track tape at 60"/s and played it back on Dahlquist speakers and
    other high end amplification; diddling with the equalization until the
    recording sounded like the live string quartet (only seconds apart
    live<->recorded).

    I have/had 2 brothers who could listen to a movie and then go write
    down the score of one or two of the tunes. I, personally, can't carry a
    tune in a basket--but I admire those who can. I can hear things that
    others don't seem to. Things like whether the phono section of a
    pre-amp has a tube or not--it's all in the harmonics!!

    Terje

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sun Sep 7 21:13:31 2025
    From Newsgroup: comp.arch


    BGB <cr88192@gmail.com> posted:

    On 9/7/2025 5:26 AM, Terje Mathisen wrote:
    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at low sample rates.

    For baseline, can note, basic sample rates:
        44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    My ears are not good enough to notice the difference between CD quality, AAC/high sample rate MP3/ogg vorbis/etc, but according to my savant (?) cousin who could listen to a 16 min piece of music once and then write down the score for all the instruments, none of them sound like live,
    but they are close enough that he can listen and internally translate to what it would have sounded like in a concert.


    To me, 44100 and 48000 sound basically the same, so not much gain in
    going higher.

    The difference between 32000 and 44100 is slight.

    The difference is in the phase of the high end spectrum 15K-22K
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Sun Sep 7 18:58:18 2025
    From Newsgroup: comp.arch

    On 9/7/2025 4:13 PM, MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    On 9/7/2025 5:26 AM, Terje Mathisen wrote:
    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at low sample rates.

    For baseline, can note, basic sample rates:
        44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    My ears are not good enough to notice the difference between CD quality, AAC/high sample rate MP3/ogg vorbis/etc, but according to my savant (?)
    cousin who could listen to a 16 min piece of music once and then write
    down the score for all the instruments, none of them sound like live,
    but they are close enough that he can listen and internally translate to what it would have sounded like in a concert.


    To me, 44100 and 48000 sound basically the same, so not much gain in
    going higher.

    The difference between 32000 and 44100 is slight.

    The difference is in the phase of the high end spectrum 15K-22K

    I can notice a slight difference, but as noted, it isn't much...


    Meanwhile, decided to check the delta between:
    Audio downsampled from 16K to 8K via averaging pairs of samples;
    Audio downsampled from 16K to 8K via spline curve fitting.

    And, I had noticed there is a difference in the 8 kHz signals.
    The curve-fitting delta signal is quite strong in high frequencies (with
    much of the total energy in the 2 to 4 kHz range); and actually a fair
    bit louder than could have been expected.

    The difference signal itself contains intelligible speech (and most
    other significant aspects of the audio), though exists pretty much
    entirely in the high part of the frequency range.


    Where, say (S0/S1/S2/S3) for a spline, evaluating a point between S1 and S2:
    Linear:
    V=(S1*(1-F))+(S2*F)
    Quadratic Spline (Bezier):
    P1=(S0*(1-F))+(S1*F)
    P2=(S1*(1-F))+(S2*F)
    V=(P1*(1-F))+(P2*F)
    Cubic Spline (Bezier):
    P1=(S0*(1-F))+(S1*F)
    P2=(S1*(1-F))+(S2*F)
    P3=(S2*(1-F))+(S3*F)
    Q1=(P1*(1-F))+(P2*F)
    Q2=(P2*(1-F))+(P3*F)
    V=(Q1*(1-F))+(Q2*F)


    This is a different spline construction than I had usually used for
    audio processing:
    G=1-F
    P1=(S1*(1+F))-(S0*F)
    P2=(S2*(1+G))-(S3*G)
    V=(P1*(1-F))+(P2*F)

    But it seems the former may have more useful properties in this case
    (mostly in that, when estimating the control points, the former spline
    better preserves high-frequency properties of the signal); whereas the
    latter is mostly only useful for interpolation tasks (such as upsampling).
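    Writing both constructions out makes the difference concrete: the
    Bezier (de Casteljau) form passes through S0 and S3 at the endpoints,
    treating S1/S2 as control points, while the other form passes through
    S1 and S2 directly (which is why it works for interpolation without any
    fitting step):

```python
def bezier_cubic(s0, s1, s2, s3, f):
    # De Casteljau evaluation: hits s0 at f=0 and s3 at f=1;
    # s1/s2 act as control points, not passed-through samples.
    p1 = s0 * (1 - f) + s1 * f
    p2 = s1 * (1 - f) + s2 * f
    p3 = s2 * (1 - f) + s3 * f
    q1 = p1 * (1 - f) + p2 * f
    q2 = p2 * (1 - f) + p3 * f
    return q1 * (1 - f) + q2 * f

def interp_cubic(s0, s1, s2, s3, f):
    # The interpolation-oriented form: passes through s1 at f=0 and
    # s2 at f=1, so it can be used directly on sample values.
    g = 1 - f
    p1 = s1 * (1 + f) - s0 * f
    p2 = s2 * (1 + g) - s3 * g
    return p1 * (1 - f) + p2 * f
```

    So using the Bezier form on a sample stream requires estimating control
    points that make the curve pass near the samples, which is where the
    fitting described below comes in.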


    Though, for 2x cases, F is only ever 0.25 or 0.75, partly simplifying
    the math.

    But, for calculating the points, one doesn't actually have the
    preceding or following control points, so it is necessary to carry the
    math out over additional samples into the past and future to estimate
    the other control points before calculating the current one (or, it
    gets a bit more hairy). For the terminal points, linear extrapolation
    seems to work.



    But, yeah, the control-points style signal seems to be significantly
    boosted in terms of high-frequency components.

    And, as audio, it seems to preserve some aspects of the 16kHz signal
    that are otherwise lost when downsampling to 8 kHz.


    I guess I could try looking some at a reconstructed version of the 16
    kHz sample and see if anything survives past the 4kHz mark.

    Well, OK, trying to resample it up to 16kHz using the B-spline is just
    sort of weird. Seems almost like the math is broken somehow.

    In the reconstruction attempt there are a few big notches in the
    spectrum; seems to be an issue with the output spline rather than the
    input signal.

    Seems to not be an issue with my typical spline, rather something
    specific about my attempt at upsampling again with a cubic Bezier
    spline.


    The upsampled reconstruction attempt sounds like dog crap; but it does interestingly seem to have stuff going on past the 4kHz Nyquist
    cutoff (so, this still leaves the possibility that parts of the higher frequencies may be surviving the downsampling process).

    Though, curiously, despite sounding like dog-crap and having big notches
    in the spectrum, the Bezier Spline reconstruction does have the lower
    RMSE value for some reason.


    Though, can note that the input audio, it seems, is fairly weak in the
    4-8 kHz range, so it isn't entirely obvious what specifically is being affected
    in downsampling, but seemingly clearly something at least.


    Well, more fiddling it seem to try to figure this out...


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Schultz@david.schultz@earthlink.net to comp.arch on Sun Sep 7 21:16:34 2025
    From Newsgroup: comp.arch

    On 9/7/25 6:58 PM, BGB wrote:
    Meanwhile, decided to check the delta between:
      Audio downsampled from 16K to 8K via averaging pairs of samples;
      Audio downsampled from 16K to 8K via spline curve fitting.

    Seems inadequate to satisfy the Nyquist criterion.
    --
    http://davesrocketworks.com
    David Schultz
    "The cheaper the crook, the gaudier the patter." - Sam Spade
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Sun Sep 7 22:55:34 2025
    From Newsgroup: comp.arch

    On 9/7/2025 9:16 PM, David Schultz wrote:
    On 9/7/25 6:58 PM, BGB wrote:
    Meanwhile, decided to check the delta between:
       Audio downsampled from 16K to 8K via averaging pairs of samples;
       Audio downsampled from 16K to 8K via spline curve fitting.

    Seems, inadequate to satisfy the Nyquist criteria.


    Dunno.

    Averaging pairs would be the traditional method for downsampling, but,
    when downsampling to 8kHz, the audio sounds muffled, and
    intelligibility of speech is poor.


    Curve fitting seems to generate a "perceptually better" result at 8kHz.
    But, is weirdly behaved;
    Apparently makes the audio harder to upsample again without it sounding
    like crap (at least when upsampling with some variant of a cubic spline;
    seems naive LERP is still happy here).

    Honestly, I am not sure what I am doing here, nor of the exact nature
    of the effect I have stumbled onto...



    General algorithm for spline fitting ATM:
    Reuse the two previous points (S0 and S1);
    Make a guess for S2 and S3;
    Use iterative convergence to reduce error for S3;
    Use iterative convergence to reduce error for S2;
    Use S2 as next point.

    The spline function can be swapped out.
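    One possible reading of this fitting loop, as a coordinate-descent
    sketch (the error function, the F positions of 1/4 and 3/4 for 2x
    downsampling, the initial guess, and the pass counts are all
    assumptions on my part, not the exact procedure described above):

```python
def bezier(s0, s1, s2, s3, f):
    # Cubic Bezier via de Casteljau.
    p1 = s0 * (1 - f) + s1 * f
    p2 = s1 * (1 - f) + s2 * f
    p3 = s2 * (1 - f) + s3 * f
    q1 = p1 * (1 - f) + p2 * f
    q2 = p2 * (1 - f) + p3 * f
    return q1 * (1 - f) + q2 * f

def fit_next_points(s0, s1, targets, fpos=(0.25, 0.75)):
    # Given fixed S0/S1 and the input samples this span must match,
    # guess S2/S3, then refine S3 and S2 by halving-step descent.
    s2 = targets[-1]
    s3 = 2 * targets[-1] - targets[0]   # linear-extrapolation guess
    def err(a, b):
        return sum((bezier(s0, s1, a, b, f) - t) ** 2
                   for f, t in zip(fpos, targets))
    for _ in range(40):                 # convergence passes
        for refine_s3 in (True, False): # S3 first, then S2
            step = 1.0
            while step > 1e-4:
                for d in (step, -step):
                    a = s2 + (0 if refine_s3 else d)
                    b = s3 + (d if refine_s3 else 0)
                    if err(a, b) < err(s2, s3):
                        s2, s3 = a, b
                step *= 0.5
    return s2, s3
```

    S2 would then be emitted as the next control point and the window
    advanced, matching the "use S2 as next point" step.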

    Of the two:
    Well, if used in the convergence-step, my usual spline function
    generates terrible results.

    The Bezier spline function seems better behaved.

    After upsampling:
    Usual spline gives poor results;
    Bezier spline gives low RMSE, but has a big ugly notch at around 4 kHz;
    Naive LERP seemingly does OK though (not great, but avoids the frequency notches).

    Tried using a quadratic spline, didn't really work very well.


    In most variations, the notch appears to be near 4kHz, or near the
    Nyquist frequency of the 8kHz sample rate (with another smaller notch
    one octave lower, around 2 kHz; and another small dip around 1 kHz).

    Actually, thinking about it, a notch right near the Nyquist frequency of
    the control points might be an inescapable side effect of using a spline
    here. So, it can somehow encode some information slightly above the
    Nyquist rate into the spline, but not *at* the Nyquist rate.


    So, it seems the curve-fitting is causing things to go weird with the
    spline during upsampling.


    Well, it is a weird effect, in any case.


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.arch on Mon Sep 8 10:59:50 2025
    From Newsgroup: comp.arch

    On 07/09/2025 23:12, MitchAlsup wrote:

    Terje Mathisen <terje.mathisen@tmsw.no> posted:

    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at low sample rates.

    For baseline, can note, basic sample rates:
    44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    My ears are not good enough to notice the difference between CD quality,
    AAC/high sample rate MP3/ogg vorbis/etc, but according to my savant (?)
    cousin who could listen to a 16 min piece of music once and then write
    down the score for all the instruments, none of them sound like live,
    but they are close enough that he can listen and internally translate to
    what it would have sounded like in a concert.

    Just after graduating CMU I worked in a high end stereo store. The listening room was 4 walls none of them parallel and a slanted ceiling; so it had essentially no reverberation. The Pittsburgh string quartet rented out
    the room for various practices, and we recorded on 9 track tape at 60"/s
    and played it back on Dalquist speakers and other high end amplification; diddling with the equalization until the recording sounded like the live string quartet (only seconds apart live<->recorded).

    I have/had 2 brothers who could listen to a movie and then go write down
    the score of one or two of the tunes. I, personally, can't carry a tune
    in a basket--but I admire those who can. I can hear things that others don't seem to. Things like whether the phono section of a pre-amp has a tube or not--its all in the harmonics!!

    Having a good memory for tunes, or being able to replicate tunes, and
    being able to distinguish the quality of sound reproduction is not
    actually highly correlated.  The former is primarily a higher-level
    brain function, while the latter is partly physical, partly low-level
    brain (or software vs. hardware, to suit the group better!).

    For those that work directly with music, age brings experience and
    improves abilities like recognising or duplicating tunes. Age also
    brings deterioration in the physical aspects of hearing - especially at
    higher frequencies.

    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.

    One key difference, however, is that it is easy to appreciate when
    people can listen to a tune once and play it again afterwards - you can
    watch them do it. For people who say they can distinguish CD audio from
    AAC or other high bps compressed audio, and other "golden ears"
    distinctions, it's a different matter - in double-blind tests, most fail badly. There are a great many factors involved in high-quality audio reproduction - the basic sample rate is only one of them.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Schultz@david.schultz@earthlink.net to comp.arch on Mon Sep 8 06:49:55 2025
    From Newsgroup: comp.arch

    On 9/7/25 10:55 PM, BGB wrote:
    Dunno.

    Averaging pairs would be the traditional method for downsample, but,
    when downsampling to 8kHz, audio sounds muffled, and intelligibility of speech is poor.

    It has been a couple of decades since that discrete time signal
    processing course so the details have faded. But I do know that aliasing
    can be a big problem. Hence the good low pass filter as part of the
    decimation process.

    Assuming your 16KSPS data started with a good presample filter, there
    was no signal or noise (or at least negligible) above 8KHz, but it is
    still going to have stuff between 4KHz and 8KHz. Fail to filter that
    adequately and it gets folded/aliased to a lower frequency.

    So you need a discrete time filter to remove most of the information in
    your data above 4KHz while leaving what you want alone.

    The moving average filter is not a good choice. Sure it has a zero at
    your new sample rate but it has poor performance in general.
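    For comparison, a standard windowed-sinc decimator (Hamming window; the
    tap count and 3.5 kHz cutoff here are illustrative choices) looks like:

```python
import math

def design_lowpass(num_taps=101, fc=3500.0, fs=16000.0):
    # Hamming-windowed sinc FIR, cutoff fc, for use before 2:1 decimation.
    m = num_taps - 1
    h = []
    for n in range(num_taps):
        k = n - m / 2
        x = 2.0 * fc / fs
        ideal = x if k == 0 else math.sin(math.pi * x * k) / (math.pi * k)
        w = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / m)
        h.append(ideal * w)
    return h

def decimate2(sig, h):
    # Filter, then keep every second sample.
    out = []
    for i in range(0, len(sig), 2):
        acc = 0.0
        for j, c in enumerate(h):
            if 0 <= i - j < len(sig):
                acc += c * sig[i - j]
        out.append(acc)
    return out
```

    Unlike the two-sample moving average, this attenuates the 4-8 kHz band
    by tens of dB before decimation, so very little of it aliases back into
    the 0-4 kHz output.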
    --
    http://davesrocketworks.com
    David Schultz
    "The cheaper the crook, the gaudier the patter." - Sam Spade
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Mon Sep 8 15:10:31 2025
    From Newsgroup: comp.arch

    On 9/8/2025 3:59 AM, David Brown wrote:
    On 07/09/2025 23:12, MitchAlsup wrote:

    Terje Mathisen <terje.mathisen@tmsw.no> posted:

    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:

    Just randomly thinking again about some things I noticed with audio at low sample rates.

    For baseline, can note, basic sample rates:
         44100: Standard, sounds good, but bulky

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high
    end of the audio spectrum.

    Might sound "good" to someone who does not know what it is supposed
    to actually sound like, though.

    My ears are not good enough to notice the difference between CD quality,
    AAC/high sample rate MP3/ogg vorbis/etc, but according to my savant (?)
    cousin who could listen to a 16 min piece of music once and then write
    down the score for all the instruments, none of them sound like live,
    but they are close enough that he can listen and internally translate to
    what it would have sounded like in a concert.

    Just after graduating CMU I worked in a high end stereo store. The
    listening
    room was 4 walls none of them parallel and a slanted ceiling; so it had
    essentially no reverberation. The Pittsburgh string quartet rented out
    the room for various practices, and we recorded on 9 track tape at 60"/s
    and played it back on Dalquist speakers and other high end amplification;
    diddling with the equalization until the recording sounded like the live
    string quartet (only seconds apart live<->recorded).

    I have/had 2 brothers who could listen to a movie and then go write down
    the score of one or two of the tunes. I, personally, can't carry a tune
    in a basket--but I admire those who can. I can hear things that others
    don't
    seem to. Things like whether the phono section of a pre-amp has a tube or
    not--its all in the harmonics!!

    Having a good memory for tunes, or being able to replicate tunes, and
    being able to distinguish the quality of sound reproduction is not
    actually highly correlated.  The former is primarily a higher-level
    brain function, while the latter is partly physical, partly low-level
    brain (or software vs. hardware, to suit the group better!).


    FWIW: My musical ability is almost non-existent.


    For those that work directly with music, age brings experience and
    improves abilities like recognising or duplicating tunes.  Age also
    brings deterioration in the physical aspects of hearing - especially at higher frequencies.


    This is a concern for me, partly if I lose high frequencies, seemingly I
    won't have anything, as my hearing of low frequencies (sub 1kHz) is
    seemingly already impaired.

    Seemingly, most of my world of audio perception is located between 1kHz
    and 8kHz.

    Most lower frequencies are more felt than heard.



    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.


    I listen to music a lot, but usually House and EDM and similar.

    When I was younger, a lot of Goth (mostly of the Synthpop/Synthwave
    variety), and Industrial.

    There was Dubstep for a while, but seemingly the whole genre kind of
    imploded (though, never got much mainstream popularity aside from
    "Skrillex").



    When I was in childhood, there were mostly bands like "Nine Inch Nails"
    and "Marilyn Manson" and similar (FWIW: I am Gen Y / Millennial).

    Well, I think also popular (but I wasn't into them) were bands like
    "Eminem" and "Backstreet Boys", etc, then a little later, I think a lot
    of people were into "Linkin Park" and similar.

    Had noted that some older music was also pretty good, like from bands
    like "Depeche Mode" and similar (though, had noted seemingly some
    people object to this band, due to its apparent popularity with gays).


    Contrast, my dad is mostly more into "Heavy Metal" and similar.

    Not actually sure what anyone younger than me is into.



    One key difference, however, is that it is easy to appreciate when
    people can listen to a tune once and play it again afterwards - you can watch them do it.  For people who say they can distinguish CD audio from AAC or other high bps compressed audio, and other "golden ears" distinctions, it's a different matter - in double-blind tests, most fail badly.  There are a great many factors involved in high-quality audio reproduction - the basic sample rate is only one of them.


    I am not really a "golden ear" AFAICT.

    At high bitrates, or high sample rates, I can't hear much difference.



    But, mostly just noting that it is at low bitrates where things like MP3
    and similar start to sound like crap.

    Where, say:
    128-192 kbps: Good
    96: Obvious degradation
    64: Kinda bleh
    48: Rather poor
    32: Rattling cans of broken glass.
    24: Mostly chaotic whistling noises
    ...

    Well, kinda like JPEG:
    Looks good at 80-95% quality;
    But, load and resave, it gets worse each time.
    Save and resave an image at 0% a few times, yeah...
    This itself became a meme at one point.

    At the lower end, things like 16-color BMP images can still be useful
    (and with LZ compression, like gzip or similar, is often smaller than
    the same image expressed as a PNG). Likewise for monochrome.

    Whereas, despite being good for photos, JPEG sucks for pixel art or
    16-color or monochrome graphics.

    Well, and I guess, much like not all images are photos, not all audio is music.


    Well, unless there is someone who wants to disagree with me about the
    relative (lacking) audio quality of 32 kbps MP3, ...




    For low bitrate uses (usually mono), I have personally often found ADPCM
    to be a good/better option.

    Where, say:
    16000, 4-bit ADPCM: 64 kbps
    11025, 4-bit ADPCM: 44 kbps
    16000, 2-bit ADPCM: 32 kbps
    8000, 4-bit ADPCM: 32 kbps
    11025, 2-bit ADPCM: 22 kbps
    8000, 2-bit ADPCM: 16 kbps (current lowest-quality option in BGBCC)
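The figures above are just sample rate times bits per sample; a quick mechanical check (a hypothetical helper, with 11025 x 4 = 44100 truncating to 44):

```c
/* Quick check of the ADPCM bitrate figures above:
 * kbps = sample_rate * bits_per_sample / 1000 (integer truncation). */
#include <assert.h>

static int adpcm_kbps(int sample_rate, int bits_per_sample)
{
    return sample_rate * bits_per_sample / 1000;
}
```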


    The recent "mystery" implied that there might be a way to squeeze more perceptual quality out of 8000 2-bit ADPCM (observing already that it
    seemed like it could somehow "enhance" audio to some extent depending on
    how the encoder was tuned, *1).

    But, it seems this "mystery" property might be poorly behaved, and may interact poorly with my usual upsampling/interpolation filters to
    produce a significant reduction in audio quality (effectively limiting
    the usefulness of trying to use it deliberately).



    *1: The usual strategy for 4-bit ADPCM encoding is that for each sample,
    one will linearly quantize the delta based on the step size, maybe round
    it, and encode this. This approach partly breaks down with 2-bit ADPCM,
    so a more effective strategy is to effectively "brute force search" the encoding space multiple samples at a time to find the path with the
    least error. But, with slight tweaks to the error estimation math, it is possible to get different effects.

    But, say, one option is searching 3 samples at a time, with a roughly 6
    bit search space. Searching 5 or 6 might seem obvious (then you can emit
    a whole byte at a time), but this is an order of magnitude slower. So,
    one can instead chain two dependent 3 sample searches to get a block of
    4 samples (or 1 byte).

    Theoretically, this can also be done with 4-bit ADPCM, but 4-bit has a
    bigger search space, so it is inherently slow. While 1 sample at a time encoding is possible with 2-bit ADPCM, its quality is poor.
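A minimal sketch of the multi-sample search idea above: try all 4^3 = 64 code paths for a 3-sample group and keep the one with least squared error. The step table and update rule below are generic IMA-style placeholders, not BGB's actual format:

```c
/* Sketch of "brute force search" 2-bit ADPCM encoding, 3 samples at a
 * time (a 6-bit search space).  Tables are illustrative placeholders. */
#include <assert.h>

static const int step_tab[16] = {
    7, 9, 11, 14, 17, 21, 27, 34, 42, 53, 66, 83, 104, 130, 162, 203 };
static const int idx_adj[2] = { -1, 2 };    /* per magnitude bit */

typedef struct { int pred, idx; } AdpcmState;

/* Decode one 2-bit code: low bit = magnitude, high bit = sign. */
static int decode_step(AdpcmState *st, int code)
{
    int step = step_tab[st->idx];
    int diff = step >> 2;
    if (code & 1) diff += step;
    if (code & 2) diff = -diff;
    st->pred += diff;
    if (st->pred < -32768) st->pred = -32768;
    if (st->pred >  32767) st->pred =  32767;
    st->idx += idx_adj[code & 1];
    if (st->idx < 0)  st->idx = 0;
    if (st->idx > 15) st->idx = 15;
    return st->pred;
}

/* Search all 64 paths for 3 samples; return the best 6-bit code pack. */
static int encode3(AdpcmState *st, const short *smp)
{
    long best_err = -1;
    int best = 0;
    AdpcmState best_st = *st;
    for (int c = 0; c < 64; c++) {
        AdpcmState t = *st;
        long err = 0;
        for (int k = 0; k < 3; k++) {
            int v = decode_step(&t, (c >> (k * 2)) & 3);
            long d = v - smp[k];
            err += d * d;
        }
        if (best_err < 0 || err < best_err) {
            best_err = err; best = c; best_st = t;
        }
    }
    *st = best_st;   /* encoder tracks the decoder's state */
    return best;
}
```

Chaining two of these dependent 3-sample searches, as described above, yields a 4-sample (1 byte) block at far lower cost than a single 5- or 6-sample search.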

    There were several variations of 2-bit ADPCM:
    ITU-T G.726: Exists, but pretty much no software support.
    IMA ADPCM, 2-bit:
    Supported by VLC Media Player and Audacity as an input format.
    Other software either refuses to open it or gives broken audio.
    Has a partial drawback of needing 4 bytes/block for header.
    Also stereo, if used, takes 2x the bitrate...
    Custom: I can just do my own thing here.


    For custom variants, had noted a few tricks:
    Store initial predictor sample as A-Law, as a full PCM sample is overkill;
    Use a 64 entry step table rather than 89 entries (in this case, the step
    table can be seen as a 4.2 bit fixed-point value in Log2 space);
    Skip use of range clamping (instead, encoder is not allowed to go out of
    range for either the predictor or step index).
    This can allow the header to be encoded in 16 bits (with 2-bits
    remaining for decoder control flags).

    The Log2 stepping and elimination of range clamping would allow for a
    cheaper hardware decoder, but the loss of range clamping negatively
    affects audio quality in some cases (when the audio uses the full
    dynamic range). Likewise, the encoder is not allowed to take paths that
    would take the step index out of range (unlike IMA, which requires
    clamping here as well).
    It also reduces LOC and allows for a faster decoder (can decode 4
    samples at a time without the code becoming unwieldy, and with no
    intermediate "if()" branches).


    A lot of my use-cases had been mono, but can note that there is a
    cheaper way to do stereo (vs fully duplicating the left/right channels,
    eg, "joint stereo"):
    Split audio into Center and Side channels;
    C=(L+R)/2, S=L-R
    Encode the center channel at full sample rate;
    Encode side channel at 1/4 sample rate.

    This way, stereo is 1.25x the bitrate of mono, rather than 2x. Though,
    one needs to decide how the side-channel is interpolated (usual options
    being either linear interpolation or a cubic spline).
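The center/side math above, as a sketch. Using floor division for the center, the pair reconstructs exactly when the side channel is kept at full rate; the lossy part in the scheme above comes from subsampling the side channel by 4:

```c
/* Sketch of the center/side split: C=(L+R)/2, S=L-R.
 * Uses floor division (arithmetic >> on negatives, which is the usual
 * behavior but formally implementation-defined in C), so that
 * L = C + floor((S+1)/2), R = L - S reconstructs exactly. */
#include <assert.h>

static void to_mid_side(const short *l, const short *r,
                        int *c, int *s, int n)
{
    for (int i = 0; i < n; i++) {
        c[i] = (l[i] + r[i]) >> 1;   /* floor((L+R)/2) */
        s[i] = l[i] - r[i];
    }
}

static void from_mid_side(const int *c, const int *s,
                          short *l, short *r, int n)
{
    for (int i = 0; i < n; i++) {
        l[i] = (short)(c[i] + ((s[i] + 1) >> 1));
        r[i] = (short)(l[i] - s[i]);
    }
}
```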

    Another option being the 4x 3-bit center, 4-bit side, scheme (had used
    this a few times, same bitrate as 4-bit mono). For 2-bit, makes more
    sense though to have a center block and a 1/4 sub-sampled side block.

    Can note that 2 bits/sample is the bottom end of what is possible with
    the ADPCM strategy.



    Going much lower is harder.
    Getting under 2 bits per sample requires a more complex strategy.

    Best I am aware of ATM being to use a pattern table, eg, for each block:
    Split audio into a base curve and deviation (say, at 1/16 sample rate);
    ADPCM encode the base curve and deviation (with a sub-sampled
    deviation), reusing the joint-stereo encoding;
    Then follow this with a list of block pattern indices (4 or 8 bits each).

    So, while ADPCM itself can't go lower than 2 bits, it can encode lower
    if encoded at what is effectively 1 kHz and 250 Hz (and can still be
    leveraged here as a component in a more complex format).


    Can get to, say, around 0.5 bits/sample (and doesn't need an entropy
    encoder or much of anything else fancy), but quality is lacking.

    However, resource cost and decoding speeds can be similar, and only
    around 2x more code versus a normal ADPCM decoder (some fudging possible here). Generally, roughly half of the code in this case being for the
    ADPCM decoder. Or, with precomputed tables, in the area of around 500
    lines of C.


    The encoding process is a fair bit more involved though, so is generally
    a fair bit bigger, more complex, and slower than an ADPCM encoder.


    Haven't made much active use of this approach thus far, since as noted,
    audio quality is inferior to normal ADPCM.

    Though, have not explored its use at higher sample rates.
    A 0.5 bit/sample variant could give 28kbps for 44.1 stereo, so might
    make sense to compare against MP3 (if it can avoid sounding like
    complete garbage with music playback, it would probably still be a win).

    Still basically "scraping the bottom of the barrel" though.


    Though, the more likely option being a lower bitrate option for sticking
    sound effects into a PE/COFF resource section (where, currently the
    lowest quality option in BGBCC being 16kbps 8000 2-bit ADPCM).

    ...


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Mon Sep 8 21:10:49 2025
    From Newsgroup: comp.arch

    BGB <cr88192@gmail.com> writes:
    On 9/8/2025 3:59 AM, David Brown wrote:
    On 07/09/2025 23:12, MitchAlsup wrote:

    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.


    I listen to music a lot, but usually House and EDM and similar.

    Not that I listen to that, but IIRC, most of that is fairly narrow
    in frequency range and mostly generated electronically instead
    of by actual instruments (drum machines, synthesizers, etc.)

    So you're starting with "artificial" digital signals.

    While classical, progressive rock, jazz and classic rock music all leverage real-world analog instruments, most of which have unique and complex
    harmonic elements and remain in the analog domain until converted
    to final digital form.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From George Neuner@gneuner2@comcast.net to comp.arch on Mon Sep 8 17:57:42 2025
    From Newsgroup: comp.arch

    On Mon, 8 Sep 2025 10:59:50 +0200, David Brown
    <david.brown@hesbynett.no> wrote:

    ... For people who say they can distinguish CD audio from
    AAC or other high bps compressed audio, and other "golden ears"
    distinctions, it's a different matter - in double-blind tests, most
    fail badly. There are a great many factors involved in high-quality
    audio reproduction - the basic sample rate is only one of them.

    That's true, but the basic sample rate does make a significant
    difference. I don't know if it is true, but I have read that to
    /accurately/ reproduce a given note requires 10 to 11 harmonics: the
    primary note, 7 higher, and 2 to 3 lower.

    This means most notes will include sounds that are outside the range
    of (normal) human hearing, but you can still /feel/ these sounds [even
    the high ones] and miss them when they are absent.


    C8 (high C) on the piano is ~4186 Hz. Assuming the need for the 7th
    higher harmonic - 29302 Hz - Nyquist would demand a minimum sampling
    rate of 58604/s to accurately reproduce C8.
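The arithmetic above can be checked mechanically (hypothetical helpers; the Nyquist criterion is just twice the highest frequency to be captured):

```c
/* Checking the figures above: the 7th harmonic of C8 (~4186 Hz),
 * and the minimum (Nyquist) sampling rate needed to capture it. */
#include <assert.h>

static int harmonic(int f_hz, int n) { return n * f_hz; }
static int nyquist_rate(int f_hz)    { return 2 * f_hz; }
```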


    In practice, unless you like orchestral, or certain folk or country,
    you are not likely to hear much difference between a CD and a decent
    quality compressed version of it. But the CD itself is not a faithful reproduction of the live performance.

    And, of course, if you like orchestral you are more likely to be
    listening to vinyl rather than CD. 8-)
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon Sep 8 23:51:45 2025
    From Newsgroup: comp.arch


    George Neuner <gneuner2@comcast.net> posted:

    On Mon, 8 Sep 2025 10:59:50 +0200, David Brown
    <david.brown@hesbynett.no> wrote:

    ... For people who say they can distinguish CD audio from
    AAC or other high bps compressed audio, and other "golden ears"
    distinctions, it's a different matter - in double-blind tests, most
    fail badly. There are a great many factors involved in high-quality
    audio reproduction - the basic sample rate is only one of them.

    That's true, but the basic sample rate does make a significant
    difference. I don't know if it is true, but I have read that to
    /accurately/ reproduce a given note requires 10 to 11 harmonics: the
    primary note, 7 higher, and 2 to 3 lower.

    This means most notes will include sounds that are outside the range
    of (normal) human hearing, but you can still /feel/ these sounds [even
    the high ones] and miss them when they are absent.

    Or when the reproduction of the harmonics is out-of-phase wrt the
    harmonics of any live version of the same note.

    C8 (high C) on the piano is ~4186 Hz. Assuming the need for the 7th
    higher harmonic - 29302 Hz - Nyquist would demand a minimum sampling
    rate of 58604/s to accurately reproduce C8.

    While I can guarantee that I could not hear a 24 KHz pure sine wave tone;
    I can guarantee that I can hear a phase shift of the 24 KHz harmonic of
    the non-sinusoidal musical note at 6 KHz.

    In practice, unless you like orchestral, or certain folk or country,
    you are not likely to hear much difference between a CD and a decent
    quality compressed version of it. But the CD itself is not a faithful reproduction of the live performance.

    And, of course, if you like orchestral you are more likely to be
    listening to vinyl rather than CD. 8-)
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Mon Sep 8 18:57:33 2025
    From Newsgroup: comp.arch

    On 9/8/2025 4:10 PM, Scott Lurndal wrote:
    BGB <cr88192@gmail.com> writes:
    On 9/8/2025 3:59 AM, David Brown wrote:
    On 07/09/2025 23:12, MitchAlsup wrote:

    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.


    I listen to music a lot, but usually House and EDM and similar.

    Not that I listen to that, but IIRC, most of that is fairly narrow
    in frequency range and mostly generated electronically instead
    of by actual instruments (drum machines, synthesizers, etc.)

    So you're starting with "artificial" digital signals.


    Fair enough.

    I had noted when looking at some of this that typically the frequency
    spectrum drops off sharply to pretty much nothing. Exactly where this
    point is depends a lot on the song, but somewhere in the area of 11 to
    16 kHz seems typical.

    But, yeah, it appears like things often drop off steeply after a few
    points: 8kHz, 11kHz, and 16kHz. With a few songs looked at having a sort
    of "stair step" look in their spectrum.


    If I take a song and do an 8kHz high-pass filter, what is left mostly
    sounds like varying levels of white noise.

    One of my other test cases (mostly for speech; ripped from the
    audio-track of an episode of an animated TV show), has a drop-off wall
    at 8kHz (nothing over 8kHz).

    Checking for another animated show, it seems to have an 8kHz wall for
    the speech, but a 4kHz wall for the background music.

    The presence of a sharp 8kHz frequency wall in several cases does imply
    that 16kHz recording is likely popular for voice acting.



    Some songs are also weak for testing stereo encodings (or, "too easy")
    because they are essentially mono with little (if any) stereo
    divergence. Often, if there is stereo divergence, it is usually a pan adjustment.


    There are some songs I had found where there is a stronger stereo
    component. For example, the intro song for "Ghost in the Shell: Stand
    Alone Complex" being a stronger test case for stereo encodings (and has
    a more obvious loss of quality when converted to mono).


    Semi-interesting is one can invert one of the channels and see how much
    of the song drops out.
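The channel-inversion trick can be sketched as: mix L with an inverted R and measure what remains. Purely mono material cancels to zero; what survives is the side (L-R) signal:

```c
/* Sketch of the "invert one channel" test: summing L with an inverted
 * R leaves only the side signal (L-R), so mono content drops out.
 * Returns the mean squared difference as a rough divergence measure. */
#include <assert.h>

static int side_energy(const short *l, const short *r, int n)
{
    long long e = 0;
    for (int i = 0; i < n; i++) {
        int d = l[i] - r[i];   /* L + (-R) */
        e += (long long)d * d;
    }
    return (int)(e / n);
}
```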


    While classical, progressive rock, jazz and classic rock music all leverage real-world analog instruments, most of which have unique and complex
    harmonic elements and remain in the analog domain until converted
    to final digital form.


    OK.

    I guess, in any case, natural recordings probably wouldn't have an
    8/11/16 kHz stair-step pattern.

    It seems like one could expect people to use 44 or 48 as a default for
    pretty much everything, but if everything were being done at 44 or 48, wouldn't likely see this stair-step pattern either (nor the apparent "frequency walls" effect when looking at audio pulled from TV shows;
    where whatever is going on with the audio is at a sampling rate lower
    than 44 or 48).


    I haven't really listened that much to non-electronic music.
    Pretty much my whole life has been in the time where people mostly make
    music on computers.

    Haven't listened that much to classical or similar apart from being in
    the form of MIDI files and similar.



    Well, apart from the band at the church I go to. They are mostly using a synthesizer and electronic drums and similar though (where the drum set
    is the black plastic disks that make drum sounds when hit variety). The instruments then connect up to a computer in the back of the room where
    most of the audio mixing happens. So, probably not a true analog
    experience here.


    IIRC, they are using 3-pin XLR for the microphones, with DIN-5 for some
    of the other instruments (the keyboard and drums use DIN-5). My dad
    plays guitar sometimes, and they use an interface box to plug 1/4" TRS
    cable into XLR.


    Not entirely sure of the specifics, but they can adjust the volume and
    similar of the various instruments on the computer (not entirely sure
    how it works, not looked into it too much). IIRC, I think all the XLR connectors and similar plug into a box which then plugs into the
    computer over USB or something.

    Looking on Amazon, a similar looking sort of box seems to go for around
    $500 (for a "Multi-track mixer/recorder").

    Not sure of the audio properties of all of this.

    Also they have a wireless microphone for the pastor and similar, which
    also feeds into it somehow.

    Sometimes they need to test things before starting (having him turn on
    the microphone and say stuff), as sometimes the microphone fails to
    connect to the computer in the back.

    Not sure, but seems to behave sort of like it is using a Bluetooth
    interface or similar. Where, turn it on, say stuff, usually connects (if
    it does so) after around 10 or 15 seconds or so (and if/when it loses connection, it goes silent until it can reconnect; after another 10
    seconds or so, when audio comes back).

    Not seeing an exact match, but similar looking sorts of microphones seem
    to go for around $40 to $60 on Amazon.


    I don't really mess with any of this as it isn't really my area.


    ...


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Mon Sep 8 21:49:33 2025
    From Newsgroup: comp.arch

    On 9/8/2025 6:57 PM, BGB wrote:
    On 9/8/2025 4:10 PM, Scott Lurndal wrote:
    BGB <cr88192@gmail.com> writes:
    On 9/8/2025 3:59 AM, David Brown wrote:
    On 07/09/2025 23:12, MitchAlsup wrote:

    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.


    I listen to music a lot, but usually House and EDM and similar.

    Not that I listen to that, but IIRC, most of that is fairly narrow
    in frequency range and mostly generated electronically instead
    of by actual instruments (drum machines, synthesizers, etc.)

    So you're starting with "artificial" digital signals.


    Fair enough.

    I had noted when looking at some of this that typically the frequency spectrum drops off sharply to pretty much nothing. Exactly where this
    point is depends a lot on the song, but somewhere in the area of 11 to
    16 kHz seems typical.

    But, yeah, it appears like things often drop off steeply after a few
    points: 8kHz, 11kHz, and 16kHz. With a few songs looked at having a sort
    of "stair step" look in their spectrum.


    If I take a song and do an 8kHz high-pass filter, what is left mostly
    sounds like varying levels of white noise.

    One of my other test cases (mostly for speech; ripped from the audio-
    track of an episode of an animated TV show), has a drop-off wall at 8kHz (nothing over 8kHz).

    Checking for another animated show, it seems to have an 8kHz wall for
    the speech, but a 4kHz wall for the background music.

    The presence of a sharp 8kHz frequency wall in several cases does imply
    that 16kHz recording is likely popular for voice acting.


    OK, going and ripping the audio tracks off episodes from a few more
    animated shows and manually looking at the spectrum, it seems:
    Sonic Boom:
    Pretty much everything seems to be ~ 8 kHz / 16000.
    Miraculous Ladybug:
    Pretty much everything seems to be ~ 11 kHz / 22050.
    Bravest Warriors (S4):
    Voice frequency wall: ~ 16kHz, so likely 32000 or maybe 44100
    Some distorted voices appear to be using 44100.
    Music / SFX: ~ 4kHz, so likely still 8000
    Though, the show is fast-paced enough that it is difficult to isolate.
    Amazing Digital Circus:
    Voice frequency wall: ~ 11 kHz, so likely 22050.
    Various sound effects: 4 and 5 kHz, so likely 8000 or 11025.
    Actually, whole episode seems to have a hard cutoff at 11 kHz.
    So, unclear 22050 was input or output side.
    (Would need to look at more episodes to figure it out).


    It would appear that some of these shows might be scavenging audio from
    places where 8000 and 11025 is popular for things like sound effects.
    While using higher sampling rates for the voice acting.


    Though, I suspect it might just have been "more obvious" in the case of Bravest Warriors, possibly because the music is used more prominently,
    and the mix of high-fidelity voice recording with poor-fidelity
    background music is more obvious than if everything were more "samey".


    Of the examples looked at this far, it would appear that the shows made
    by "indie" studios are more likely to make use of low-fidelity sound
    effects.

    May be a case of trying not to be too obvious.
    If all the voices sound cooked, it is more obvious;
    If a random sound effect sounds cooked, much less obvious;
    If background music for a scene sounds cooked, also obvious.


    Not sure where people are getting sound effects, but I would still find
    it kinda amusing whenever watching something and hearing a sound effect
    that I recognize from somewhere else (like "FreeDoom" or "OpenQuartz" or similar; seems there is now also LibreQuake which is using the same
    terms as "FreeDoom").

    But, I guess probably the goal is to not be too obvious unless maybe as
    part of an in-joke.

    Though, does seem like possibly if someone is using something, and it is obvious that it was originally 8000 or 11025, probably scavenged from somewhere (say, if one assumes people will not tend to make new sound
    effects for a TV show and then save them as 8000 or similar).



    Some stuff online says that voice-acting should use 48000, but this
    would not appear to be the case for the examples I have looked at (well, either this, or it was "lost" somewhere in the process).

    My analysis process here being to extract the audio track using "VLC
    Media Player" and then looking at stuff manually in Audacity.

    ...



    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Mon Sep 8 23:28:23 2025
    From Newsgroup: comp.arch

    On 9/8/2025 6:49 AM, David Schultz wrote:
    On 9/7/25 10:55 PM, BGB wrote:
    Dunno.

    Averaging pairs would be the traditional method for downsample, but,
    when downsampling to 8kHz, audio sounds muffled, and intelligibility
    of speech is poor.

    It has been a couple of decades since that discrete time signal
    processing course so the details have faded. But I do know that aliasing
    can be a big problem. Hence the good low pass filter as part of the decimation process.

    Assuming your 16KSPS data started with a good presample filter so there
    was no signal or noise (or at least negligible) above 8KHz, it is still
    going to
    have stuff between 4KHz and 8KHz. Fail to filter that adequately and it
    gets folded/aliased to a lower frequency.

    So you need a discrete time filter to remove most of the information in
    your data above 4KHz while leaving what you want alone.

    The moving average filter is not a good choice. Sure it has a zero at
    your new sample rate but it has poor performance in general.


    OK.

    I was just seeing here if I could make stuff "less muffled".
    A cleaner averaging filter, like:
    (S0+3*S1+3*S2+S3)/8
    Tends to have more muffle.

    Whereas:
    (5*S1+5*S2-S0-S3)/4
    Slightly boosts high frequencies (so less muffle) but may have other drawbacks.
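The two kernels above as code. Note that the taps of the second kernel sum to 8, so dividing by 8 (rather than the /4 written above, which would double the DC gain) keeps unity gain; the negative outer taps are what boost the highs, at the cost of possible ringing/overshoot:

```c
/* The two 4-tap decimation kernels from the post, applied to one
 * 4-sample window (S1,S2 being the pair kept on 2:1 decimation).
 * The second kernel uses /8 for unity DC gain, since its taps sum
 * to 8; the post writes /4, which appears to be a typo. */
#include <assert.h>

static short clamp16(int v)
{
    if (v < -32768) return -32768;
    if (v >  32767) return  32767;
    return (short)v;
}

/* (S0 + 3*S1 + 3*S2 + S3) / 8: smoothing (binomial) kernel. */
static short down_smooth(const short *s)
{
    return clamp16((s[0] + 3*s[1] + 3*s[2] + s[3]) / 8);
}

/* (5*S1 + 5*S2 - S0 - S3) / 8: negative outer taps boost highs. */
static short down_sharp(const short *s)
{
    return clamp16((5*s[1] + 5*s[2] - s[0] - s[3]) / 8);
}
```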

    ...


    A lot of the 16000 audio in this case is actually from opening a 44100
    or 48000 WAV and then dynamically resampling to 16000 because this is
    what I tend to use in a lot of my experiments of this sort.


    The typical downsampler that I use for general stuff works like:
    Downsample by powers of 2 (via pairwise averaging);
    Interpolate samples from above and below the target rate.
    Or, basically, sorta like mip-mapping.

    I had tried various strategies long ago, and I think this is the one
    that won out (though, gives weak results when the target is a low sample rate).



    This was in contrast to a different strategy (IIRC also used in the
    Quake engine), something like:
    void Resample8(byte *dst, byte *src, int dlen, int slen)
    {
        int i, spos, sstep;

        spos=0; sstep=(256*slen)/dlen;
        for(i=0; i<dlen; i++)
            { dst[i]=src[spos>>8]; spos+=sstep; }
    }
    Or, 16-bit:
    void Resample16(s16 *dst, s16 *src, int dlen, int slen)
    {
        int i, spos, sstep;

        spos=0; sstep=(256*slen)/dlen;
        for(i=0; i<dlen; i++)
            { dst[i]=src[spos>>8]; spos+=sstep; }
    }

    Which gives results that sound poor.


    Versus, say (algo from memory, slower but less bad):

    void ResampleHalf16(s16 *dst, s16 *src, int dlen)
    {
        int i;
        for(i=0; i<dlen; i++)
            dst[i]=(src[i*2+0]+src[i*2+1])/2;
    }

    void ResampleInterpSpline(s16 *dst, s16 *src, int dlen, int slen)
    {
        int s0, s1, s2, s3, t0, t1, t2;
        int i, j, spos, sstep, sfr, tfr;

        spos=0; sstep=(256*slen)/dlen;
        for(i=0; i<dlen; i++)
        {
            j=spos>>8;
            s2=src[j+0];
            s0=s2; s1=s2; s3=s2;
            if(j>0) s1=src[j-1];
            if(j>1) s0=src[j-2];
            if((j+1)<slen) s3=src[j+1];

            sfr=spos&255;
            tfr=sfr^255;
            t0=(((256+sfr)*s1)-(sfr*s0))>>8;
            t1=(((256+tfr)*s2)-(tfr*s3))>>8;
            t2=(((256-sfr)*t0)+(sfr*t1))>>8;
            if(t2<-32767)t2=-32767;
            if(t2> 32767)t2= 32767;

            dst[i]=t2;
            spos+=sstep;
        }
    }

    void ResampleInterpSpline2x(s16 *dst,
        s16 *src1, s16 *src2, int dlen, int slen)
    {
        int s0, s1, s2, s3, s4, t0, t1, t2, t3;
        int i, j, k, spos, sstep, ufr, sfr, tfr;

        spos=0; sstep=(256*slen)/dlen;

        //try to estimate the interpolation values between the levels
        //assume sstep is between 0x80 and 0xFF in this case
        //0x80 means dlen is ~ 2x slen
        //0xFF means dlen ~= slen
        ufr=511-(sstep*2);

        for(i=0; i<dlen; i++)
        {
            j=spos>>8;
            k=spos>>7;
            s2=src2[j+0];
            s0=s2; s1=s2; s3=s2;
            if(j>0) s1=src2[j-1];
            if(j>1) s0=src2[j-2];
            if((j+1)<slen) s3=src2[j+1];
            s4=src1[k];

            sfr=spos&255;
            tfr=sfr^255;
            t0=(((256+sfr)*s1)-(sfr*s0))>>8;
            t1=(((256+tfr)*s2)-(tfr*s3))>>8;
            t2=(((256-sfr)*t0)+(sfr*t1))>>8;
            t3=(((256-ufr)*t2)+(ufr*s4))>>8;

            if(t3<-32767)t3=-32767;
            if(t3> 32767)t3= 32767;

            dst[i]=t3;
            spos+=sstep;
        }
    }

    void Resample16(s16 *dst, s16 *src, int dlen, int slen)
    {
        s16 *sbuf1, *sbuf2, *sbt;
        int sl1, sl2;

        if(dlen>=slen)
        {
            if(dlen==slen)
            {
                memcpy(dst, src, dlen*2);
                return;
            }
            ResampleInterpSpline(dst, src, dlen, slen);
            return;
        }

        sl1=slen/2; sl2=sl1/2;
        sbuf1=malloc(sl1*sizeof(s16));
        sbuf2=malloc(sl2*sizeof(s16));
        ResampleHalf16(sbuf1, src, sl1);
        ResampleHalf16(sbuf2, sbuf1, sl2);
        while(sl2>=dlen)
        {
            sbt=sbuf1; sbuf1=sbuf2; sbuf2=sbt;
            sl1=sl2; sl2=sl1/2;
            ResampleHalf16(sbuf2, sbuf1, sl2);
        }

        if(sl1==dlen)
        {
            memcpy(dst, sbuf1, dlen*2);
            free(sbuf1); free(sbuf2);
            return;
        }

        ResampleInterpSpline2x(dst, sbuf1, sbuf2, dlen, sl2);
        free(sbuf1); free(sbuf2);
    }




    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Schultz@david.schultz@earthlink.net to comp.arch on Tue Sep 9 07:06:17 2025
    From Newsgroup: comp.arch

    On 9/8/25 11:28 PM, BGB wrote:
    OK.

    I was just seeing here if I could make stuff "less muffled".
      A cleaner averaging filter, like:
        (S0+3*S1+3*S2+S3)/8
    Tends to have more muffle.

    Whereas:
        (5*S1+5*S2-S0-S3)/4
    Slightly boosts high frequencies (so less muffle) but may have other drawbacks.

    Use a real FIR filter:
    http://t-filter.engineerjs.com/
    As an example, I tried:
    16KSPS
    Pass band: 0 to 3500Hz, 1dB dips
    Stop band: 4KHz to 8KHz: 40dB attenuation

    This resulted in a filter with 51 taps. The more brick wall like the
    filter is, the more taps required.
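    Applying the generated coefficients is then a straight convolution. A
    minimal sketch (the 3 taps below are a unity-gain placeholder, not the
    actual 51-tap design exported from t-filter):

```c
typedef short s16;

/* Direct-form FIR application. The taps here are a placeholder
   unity-gain smoother, not the 51-tap design from t-filter; drop in
   the generated coefficients (and NTAPS=51) to reproduce that filter. */
#define NTAPS 3
static const double fir_taps[NTAPS] = { 0.25, 0.50, 0.25 };

void FirApply(s16 *dst, const s16 *src, int n)
{
    int i, k, j;
    double acc;
    for(i=0; i<n; i++)
    {
        acc=0;
        for(k=0; k<NTAPS; k++)
        {
            j=i-k;              /* clamp history at the start */
            if(j<0) j=0;
            acc+=fir_taps[k]*src[j];
        }
        dst[i]=(s16)acc;
    }
}
```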
    --
    http://davesrocketworks.com
    David Schultz
    "The cheaper the crook, the gaudier the patter." - Sam Spade
  • From David Brown@david.brown@hesbynett.no to comp.arch on Tue Sep 9 15:06:44 2025
    From Newsgroup: comp.arch

    On 08/09/2025 22:10, BGB wrote:
    On 9/8/2025 3:59 AM, David Brown wrote:
    On 07/09/2025 23:12, MitchAlsup wrote:

    Terje Mathisen <terje.mathisen@tmsw.no> posted:

    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:



    For those that work directly with music, age brings experience and
    improves abilities like recognising or duplicating tunes.  Age also
    brings deterioration in the physical aspects of hearing - especially
    at higher frequencies.


    This is a concern for me, partly if I lose high frequencies, seemingly I won't have anything, as my hearing of low frequencies (sub 1kHz) is
    seemingly already impaired.

    Seemingly, most of my world of audio perception is located between 1kHz
    and 8kHz.

    Most lower frequencies are more felt than heard.



    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.


    I listen to music a lot, but usually House and EDM and similar.

    Isn't that an oxymoron? "Music" and "House / EDM" don't belong together
    in the same sentence :-)


    When I was younger, a lot of Goth (mostly of the Synthpop/Synthwave variety), and Industrial.

    There was Dubstep for a while, but seemingly the whole genre kind of imploded (though, never got much mainstream popularity aside from "Skrillex").



    It sounds like you have likely damaged your hearing - but those kinds of "music" (not that I am opinionated...) are often played at very high
    volumes.

    One key difference, however, is that it is easy to appreciate when
    people can listen to a tune once and play it again afterwards - you
    can watch them do it.  For people who say they can distinguish CD
    audio from AAC or other high bps compressed audio, and other "golden
    ears" distinctions, it's a different matter - in double-blind tests,
    most fail badly.  There are a great many factors involved in
    high-quality audio reproduction - the basic sample rate is only one of
    them.


    I am not really a "golden ear" AFAICT.

    At high bitrates, or high sample rates, I can't hear much difference.



    But, mostly just noting that it is at low bitrates where things like MP3
    and similar start to sound like crap.


    There are, as I said, /many/ factors involved - you are mixing up
    bitrates and sample rates.

    If everything else is "perfect", 44.1 kHz sample rate can reproduce frequencies (including phase information) up to 22.05 kHz - more than
    enough for anyone but some young children.

    But everything else is usually very far from perfect. A particular
    issue is the dynamic range - 16-bit linear coding does not have enough
    range for a lot of music. Either quiet sounds are "pixelated", losing a
    lot of important information, or the dynamic range is compressed before
    the CD quality image is generated - giving the music a "flat" sound.
    When compressed audio formats are used, they may start off at higher bit depths and sample rates, but in effect the bit depth also gets
    compressed and you lose resolution as well as sample rate and high
    frequency information for high compression ratios. And just as high
    jpeg compression produces artefacts for some images, such as ghosting,
    so does high audio compression.


  • From David Brown@david.brown@hesbynett.no to comp.arch on Tue Sep 9 15:51:10 2025
    From Newsgroup: comp.arch

    On 08/09/2025 23:57, George Neuner wrote:
    On Mon, 8 Sep 2025 10:59:50 +0200, David Brown
    <david.brown@hesbynett.no> wrote:

    ... For people who say they can distinguish CD audio from
    AAC or other high bps compressed audio, and other "golden ears"
    distinctions, it's a different matter - in double-blind tests, most fail
    badly. There are a great many factors involved in high-quality audio
    reproduction - the basic sample rate is only one of them.

    That's true, but the basic sample rate does make a significant
    difference. I don't know if it is true, but I have read that to
    /accurately/ reproduce a given note requires 10 to 11 harmonics: the
    primary note, 7 higher, and 2 to 3 lower.

    There you are talking about /timbre/, not just frequency. It's the
    harmonics that make the same note sound differently on a piano and a
    violin. But at the high frequency range, you can't distinguish these.
    If you can hear sounds up to 16 kHz (which would be better than most
    regulars in this group, with the demographics of males of a certain
    age), you would not be able to determine if it is a piano note
    or a violin note. Musical instruments don't go anything like that high
    - even a piccolo won't go over about 4 kHz. Sounds about that range
    have all the musical nuances of chalk on a blackboard.

    High-pitched music or very high-pitched singing usually tops out at about
    2 kHz for the base frequency. The tenth harmonic would then be at 20 kHz.
    So the maths works out fine for 44.1 kHz sampling rate.

    Of course there is the very significant matter of how the reproduced
    44.1 kHz is filtered - it must, in an analogue filter, try to cut out virtually everything of 22.05 kHz and above while allowing 20 kHz and
    below to pass through with a flat power response and linear phase
    response. That does not happen in practice - and that is a key reason
    why harmonics of higher pitched notes are usually poor from CD quality
    music, especially on low-end audio systems. (High-end audio systems
    up-sample the 44.1 kHz / 16-bit to perhaps 192 kHz / 24-bit, so that the filtering is vastly better).


    This means most notes will include sounds that are outside the range
    of (normal) human hearing, but you can still /feel/ these sounds [even
    the high ones] and miss them when they are absent.


    Nope. Most notes are much lower, and harmonics of relevance are within
    the range of human hearing. For high enough notes, you simply don't
    hear as much harmonic information.


    C8 (high C) on the piano is ~4186 Hz. Assuming the need for the 7th
    higher harmonic - 29302 Hz - Nyquist would demand a minimum sampling
    rate of 58604/s to accurately reproduce C8.


    You can't accurately hear C8 even when live - you don't get the same
    harmonic information as you do with C6, because your ears can't
    distinguish the higher harmonics. Your ears have the same limitations
    as any other senses in this manner - you can look at your cat's feet and
    count its toes, but if you look at a fly's feet you can't count the toes.


    In practice, unless you like orchestral, or certain folk or country,
    you are not likely to hear much difference between a CD and a decent
    quality compressed version of it. But the CD itself is not a faithful reproduction of the live performance.


    Good quality compressed formats are often better than CD quality. The
    killer for CD quality is not the sample rate, it is the limited dynamic
    range from the linear 16-bit range. Compressed formats will, in effect,
    use a more logarithmic scale (like A-law and mu-law, used to get comprehensible speech despite a much smaller sample size) that is more
    in line with the way the human brain interprets sound.

    And, of course, if you like orchestral you are more likely to be
    listening to vinyl rather than CD. 8-)

    In theory (but very rarely in practice), when combined with good enough amplifiers and speakers, vinyl has a higher dynamic range than CD
    audio. But that is only the case when the record is new. Play it a few times, and the wear from the needle will smooth out the tracks enough to eliminate the difference.

    But enjoying music is a psychologically, physically, mentally and
    biologically complex hobby. The comfort of the chair you are sitting
    in, or the type of reflections and absorptions from the rest of the
    room, can make a big difference. Knowing that you have spent a great
    deal of money on your impressive-looking hifi system will improve your listening experience regardless of what any audio measurement might say.
    Some audiophiles prefer the "valve sound" over "transistor sound" -
    not because the sound reproduction is more accurate (it is not - valves
    add second harmonic distortion that is non-existent in transistor
    amplifiers), but simply because they like it better.


  • From Michael S@already5chosen@yahoo.com to comp.arch on Tue Sep 9 18:47:00 2025
    From Newsgroup: comp.arch

    On Tue, 9 Sep 2025 07:06:17 -0500
    David Schultz <david.schultz@earthlink.net> wrote:
    On 9/8/25 11:28 PM, BGB wrote:
    OK.

    I was just seeing here if I could make stuff "less muffled".
      A cleaner averaging filter, like:
        (S0+3*S1+3*S2+S3)/8
    Tends to have more muffle.

    Whereas:
        (5*S1+5*S2-S0-S3)/4
    Slightly boosts high frequencies (so less muffle) but may have
    other drawbacks.

    Use a real FIR filter:
    http://t-filter.engineerjs.com/
    As an example, I tried:
    16KSPS
    Pass band: 0 to 3500Hz, 1dB dips
    Stop band: 4KHz to 8KHz: 40dB attenuation

    This resulted in a filter with 51 taps. The more brick wall like the
    filter is, the more taps required.

    IIRC, AT&T had Fpass = 3.2 KHz. That's significantly easier than 3.5.
    Of course, AT&T used analog IIR filter rather than digital FIR. I have
    no idea what sort of IIR it was, but would guess that 5th order Bessel
    filter with -3dB point at 3.6 KHz could serve as a fair digital
    imitation of their circuit.
  • From BGB@cr88192@gmail.com to comp.arch on Tue Sep 9 14:13:03 2025
    From Newsgroup: comp.arch

    On 9/9/2025 8:06 AM, David Brown wrote:
    On 08/09/2025 22:10, BGB wrote:
    On 9/8/2025 3:59 AM, David Brown wrote:
    On 07/09/2025 23:12, MitchAlsup wrote:

    Terje Mathisen <terje.mathisen@tmsw.no> posted:

    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:



    For those that work directly with music, age brings experience and
    improves abilities like recognising or duplicating tunes.  Age also
    brings deterioration in the physical aspects of hearing - especially
    at higher frequencies.


    This is a concern for me, partly if I lose high frequencies, seemingly
    I won't have anything, as my hearing of low frequencies (sub 1kHz) is
    seemingly already impaired.

    Seemingly, most of my world of audio perception is located between
    1kHz and 8kHz.

    Most lower frequencies are more felt than heard.



    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.


    I listen to music a lot, but usually House and EDM and similar.

    Isn't that an oxymoron?  "Music" and "House / EDM" don't belong together
    in the same sentence :-)


    Probably still more "music" than "Gangsta Rap" though...



    Where I am living, most of the local people seem to be into "Country",
    but I am not so much into this, like no, will take a pass.

    Though, there were a few decent Country musicians, like "Johnny Cash"
    and similar.


    But mostly people are all like, "this stuff is great" and it is mostly
    guitar slide effects and people going on about how their wife ran off
    and took their dog and pickup truck and similar...

    Vs, say, "Gangsta Rap" being mostly about doing crimes and picking up
    girls and similar.


    Had a few times imagined hybrid styles, usually taking one sort of
    instrument usage style and having the lyrics in a different clashing style.


    Say, for example, taking the instrument styles of Country or Disco or
    similar, and then having the singer do it "Gangsta Rap" style, etc.
    Like, say, Country style guitar slides, followed by lyrics like:
    "I be roamin da hood, and be gettin da wood."
    "I see dem honeys, I show em my moneys."
    "'Where it be at?' She be a gyat."
    "Daammnn, G."
    ...



    Well, and there is a lot of Mexican music where, even if one can't
    really hear the song in much detail (due to distance or whatever), it is recognizable due to a "boop boop" sound, with two different tones of
    "boop" with a roughly 1 second "boop" spacing.


    Whereas, with House and EDM and similar, usually it is more about the
    beats. Also a decent BPM range, etc...



    When I was younger, a lot of Goth (mostly of the Synthpop/Synthwave
    variety), and Industrial.

    There was Dubstep for a while, but seemingly the whole genre kind of
    imploded (though, never got much mainstream popularity aside from
    "Skrillex").



    It sounds like you have likely damaged your hearing - but those kinds of "music" (not that I am opinionated...) are often played at very high volumes.


    I don't usually go to clubs or anything...


    One of the loudest things I had tended to deal with in my experiences
    wasn't so much music, rather running a waterjet.

    This was one use-case for noise-canceling wireless headphones, but
    headphones couldn't entirely defeat this.

    Not sure of the exact volume, but basically however much noise is
    generated by pushing 90k PSI water through a 1/16" hole.



    But, yeah, in any case, I seem to have experienced some form of "reverse slope" hearing loss, where I hear relatively little under around 1kHz,
    so things like tuning forks and similar are basically silent IRL.


    Seemingly, I can hear them better on computers than IRL, where:
    440 Hz sine wave on PC is quiet, but still audible;
    440 Hz tuning fork is inaudible.

    My mom got a steel drum tuned to 432 Hz, it is barely audible to me. Can
    put my hand near it and feel vibrations, but don't hear much.

    Would be easier if these things generated square waves, I can hear
    square waves.

    My dad plays guitar some, but to me guitar mostly generates a kind of
    buzzing sound, or a sound vaguely similar to a chainsaw (but, like, each string being a different speed of chainsaw).


    Apparently, there was a difference between me and my dad as to what an
    air compressor sounds like. To him, he says it is very loud. To me, it
    sounds more like a pond pump (and mostly dominated by a buzzing noise),
    but a little louder and more annoying.

    One time not too long ago, the air line for the mill had developed a
    leak (as a small hole in the hose), but this leak was to me very obvious
    due to the hiss. Apparently my dad didn't hear it.

    Also I can note that I don't hear car engines (which apparently make a
    lot of noise or something), usually I more hear cars in the form of the
    noise made by the tire rolling on the ground.

    ...


    One key difference, however, is that it is easy to appreciate when
    people can listen to a tune once and play it again afterwards - you
    can watch them do it.  For people who say they can distinguish CD
    audio from AAC or other high bps compressed audio, and other "golden
    ears" distinctions, it's a different matter - in double-blind tests,
    most fail badly.  There are a great many factors involved in high-
    quality audio reproduction - the basic sample rate is only one of them.


    I am not really a "golden ear" AFAICT.

    At high bitrates, or high sample rates, I can't hear much difference.



    But, mostly just noting that it is at low bitrates where things like
    MP3 and similar start to sound like crap.


    There are, as I said, /many/ factors involved - you are mixing up
    bitrates and sample rates.


    At least with ADPCM, bitrate and sample rate are tied together.

    MP3 is more independent here.

    So, you can keep the sample rate high but drive the bitrate low, and it
    sounds kinda terrible.



    If everything else is "perfect", 44.1 kHz sample rate can reproduce frequencies (including phase information) up to 22.05 kHz - more than
    enough for anyone but some young children.


    Yes, granted.

    MP3 seemingly always tries to operate at 44.1, but often still goes to
    crap when set to encode lower bitrates.



    As noted, I had had good experiences with ADPCM variants, but usually
    the only way to get to lower bitrates is to drop the sample rate.

    But, then, if going below 16000, perceptual quality drops off sharply.
    Usual unavoidable enemy being that stuff starts sounding muffled, which
    is most obvious at 8000.



    But, even as such, I can note (when downloading and looking for the
    audio files for LibreQuake) that, even despite being fairly recent, they
    are still doing most of their sound effects at 11025 (apart from the
    random straggler files at 22050 or 44100).


    Well, anyways, I guess I can fiddle more with trying to use a
    combination of ADPCM and a pattern table to get to a lower bits-per-sample.

    At least, the decoding process isn't too expensive.


    But everything else is usually very far from perfect.  A particular
    issue is the dynamic range - 16-bit linear coding does not have enough
    range for a lot of music.  Either quite sounds are "pixelated", losing a lot of important information, or the dynamic range is compressed before
    the CD quality image is generated - giving the music a "flat" sound.
    When compressed audio formats are used, they may start off at higher bit depths and sample rates, but in effect the bit depth also gets
    compressed and you lose resolution as well as sample rate and high
    frequency information for high compression ratios.  And just as high
    jpeg compression produces artefacts for some images, such as ghosting,
    so does high audio compression.


    Yeah, I am aware that seemingly a lot of music uses compression to try
    to achieve a sort of semi-uniform "loudness wall".

    I can note that when looking at music in an audio program, it tends to
    use the full amplitude range for pretty much the whole song.

    Contrast, the audio from TV shows tends to use closer to around 25% to
    33% of the amplitude range, and with a lot more variability in loudness between sections.


    Though, if there is one merit to compression, it does make it easier to
    hear low frequencies.

    So, while a 440Hz pure sine wave is very quiet; a 440Hz sine wave fed
    through the compressor filter is a lot louder, even if the overall
    amplitude hasn't really changed much. Possibly because compression often
    makes the shape of a sine wave closer to that of a square wave.



    Ironically, one could almost make a case here for using an A-Law variant
    with the low-order bits XOR'ed with the sign, in which case it could
    function as both a higher dynamic-range format and as 8-bit PCM.

    Though, using it as 8-bit PCM would still have the "noise floor"
    annoyance; where it almost invariably adds a low intensity hiss to the
    audio.
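    The log-coding trade being described (keeping relative precision on
    quiet material, instead of the flat quantization step that causes the
    8-bit PCM hiss) can be sketched roughly as below. The 1-sign/3-segment/
    4-mantissa layout is my assumption, not BGB's exact A-Law variant:

```c
typedef short s16;

/* Sketch of A-law-style companding: 1 sign bit, 3-bit segment, 4-bit
   mantissa (layout is an assumption, not BGB's exact variant). Quiet
   samples keep proportional precision instead of the fixed 8-bit-PCM
   step that produces the audible noise floor. */
unsigned char EncLog8(int v)
{
    int sg=(v<0)?0x80:0x00, a=sg?-v:v, e=0;
    if(a>32767)a=32767;
    while(a>=512){a>>=1;e++;}      /* e = segment, 0..6 */
    return sg|(e<<4)|((a>>5)&15);  /* keep top 4 mantissa bits */
}

int DecLog8(unsigned char b)
{
    int sg=b&0x80, e=(b>>4)&7, m=b&15;
    int a=(((m<<5)+16)<<e);        /* +16: midpoint of the bucket */
    return sg?-a:a;
}
```

    Relative error stays bounded (roughly 1/32 of the magnitude) across the
    whole range, which is the dynamic-range win over linear 8-bit PCM.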


  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Tue Sep 9 20:55:56 2025
    From Newsgroup: comp.arch

    BGB <cr88192@gmail.com> writes:
    On 9/9/2025 8:06 AM, David Brown wrote:
    On 08/09/2025 22:10, BGB wrote:
    On 9/8/2025 3:59 AM, David Brown wrote:
    On 07/09/2025 23:12, MitchAlsup wrote:

    Terje Mathisen <terje.mathisen@tmsw.no> posted:

    MitchAlsup wrote:

    BGB <cr88192@gmail.com> posted:



    For those that work directly with music, age brings experience and
    improves abilities like recognising or duplicating tunes.  Age also
    brings deterioration in the physical aspects of hearing - especially
    at higher frequencies.


    This is a concern for me, partly if I lose high frequencies, seemingly
    I won't have anything, as my hearing of low frequencies (sub 1kHz) is
    seemingly already impaired.

    Seemingly, most of my world of audio perception is located between
    1kHz and 8kHz.

    Most lower frequencies are more felt than heard.



    There is /some/ overlap, because both groups spend a lot of time
    listening to music, which exercises and improves both functions.


    I listen to music a lot, but usually House and EDM and similar.

    Isn't that an oxymoron?  "Music" and "House / EDM" don't belong together
    in the same sentence :-)


    Probably still more "music" than "Gangsta Rap" though...

    It's not music, it is C-rap.

    Try some prog rock: https://www.youtube.com/watch?v=0HWv7YJtYCA

  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.arch on Tue Sep 9 14:15:17 2025
    From Newsgroup: comp.arch

    On 9/9/2025 12:13 PM, BGB wrote:
    [...]

    This is fairly nice:

    https://youtu.be/RijB8wnJCN0?list=RDRijB8wnJCN0

    :^)
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.arch on Tue Sep 9 14:17:20 2025
    From Newsgroup: comp.arch

    On 9/9/2025 2:15 PM, Chris M. Thomasson wrote:
    On 9/9/2025 12:13 PM, BGB wrote:
    [...]

    This is fairly nice:

    https://youtu.be/RijB8wnJCN0?list=RDRijB8wnJCN0

    :^)

    More ambient: https://youtu.be/x1afn71-0sI?list=RDMM ;^D
  • From David Brown@david.brown@hesbynett.no to comp.arch on Tue Sep 9 23:23:40 2025
    From Newsgroup: comp.arch

    On 09/09/2025 21:13, BGB wrote:

    My mom got a steel drum tuned to 432 Hz, it is barely audible to me. Can
    put my hand near it and feel vibrations, but don't hear much.

    Would be easier if these things generated square waves, I can hear
    square waves.

    For a square wave, you have a third harmonic at third volume, a fifth
    harmonic at fifth volume, and so on. So you have very large harmonics -
    if you have trouble hearing at 400 Hz but can hear somewhat at 800 Hz
    and fine at 1200 Hz, then a 400 Hz sine wave will be inaudible but a 400
    Hz square wave will be merely a little dampened.
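    That harmonic structure is easy to check numerically: summing odd
    harmonics with 1/n amplitudes converges to the square wave. A small
    sketch, evaluating the Fourier partial sum at the middle of the "high"
    half-cycle, where each harmonic contributes exactly +1 or -1:

```c
/* Partial Fourier sum of a +/-1 square wave, evaluated at w*t = pi/2
   (the centre of the high half-cycle), where sin((2k+1)*pi/2) = (-1)^k.
   Demonstrates the point above: with the n-th odd harmonic at 1/n
   amplitude, the sum converges to the square wave's value of 1. */
double SquareWavePartial(int nterms)
{
    double pi=3.14159265358979323846, acc=0;
    int k;
    for(k=0; k<nterms; k++)
        acc += ((k&1)?-1.0:1.0)/(2*k+1);  /* harmonic 2k+1, weight 1/(2k+1) */
    return (4.0/pi)*acc;
}
```

    With one term (the fundamental alone) the value overshoots to 4/pi, about
    1.27; with many terms it settles at 1.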


  • From BGB@cr88192@gmail.com to comp.arch on Tue Sep 9 20:27:14 2025
    From Newsgroup: comp.arch

    On 9/9/2025 10:47 AM, Michael S wrote:
    On Tue, 9 Sep 2025 07:06:17 -0500
    David Schultz <david.schultz@earthlink.net> wrote:

    On 9/8/25 11:28 PM, BGB wrote:
    OK.

    I was just seeing here if I could make stuff "less muffled".
      A cleaner averaging filter, like:
        (S0+3*S1+3*S2+S3)/8
    Tends to have more muffle.

    Whereas:
        (5*S1+5*S2-S0-S3)/4
    Slightly boosts high frequencies (so less muffle) but may have
    other drawbacks.

    Use a real FIR filter:
    http://t-filter.engineerjs.com/
    As an example, I tried:
    16KSPS
    Pass band: 0 to 3500Hz, 1dB dips
    Stop band: 4KHz to 8KHz: 40dB attenuation

    This resulted in a filter with 51 taps. The more brick wall like the
    filter is, the more taps required.




    Well, but downsides:
    FIR filtering would likely reduce noise;
    But, would not reduce the muffle, which is the bigger issue at 8000;
    FIR filtering, particularly larger filters, tends to be slow.


    But, yeah, could in theory use a slightly bigger sliding filter, say:
    ( 1 1 -2 -2 6 6 -2 -2 1 1 ) /8

    Which could in theory maybe have less noise.
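    Applied during 2:1 downsampling, that sliding kernel might look like the
    following (a sketch; the centering of the taps over the merged sample
    pair and the edge clamping are my guesses, not spelled out in the post):

```c
typedef short s16;

/* Sketch of the proposed 10-tap sliding kernel applied during 2:1
   downsampling. Taps sum to 8, so the >>3 keeps unity gain at DC;
   centering (taps 4..5 over the output's two source samples) and edge
   clamping are assumptions. */
static const int kern10[10] = { 1, 1, -2, -2, 6, 6, -2, -2, 1, 1 };

void ResampleHalf16Kern(s16 *dst, const s16 *src, int dlen)
{
    int i, k, j, acc, slen=dlen*2;
    for(i=0; i<dlen; i++)
    {
        acc=0;
        for(k=0; k<10; k++)
        {
            j=i*2+k-4;              /* taps 4..5 hit the sample pair */
            if(j<0) j=0;
            if(j>=slen) j=slen-1;
            acc+=kern10[k]*src[j];
        }
        acc>>=3;                    /* /8 */
        if(acc<-32767)acc=-32767;
        if(acc> 32767)acc= 32767;
        dst[i]=acc;
    }
}
```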


    IIRC, AT&T had Fpass = 3.2 KHz. That's significantly easier than 3.5.

    Of course, AT&T used analog IIR filter rather than digital FIR. I have
    no idea what sort of IIR it was, but would guess that 5th order Bessel
    filter with -3dB point at 3.6 KHz could serve as a fair digital
    imitation of their circuit.


    Dunno...

    Seems like one has to be careful not to lose too much more.


    But, seems like this still mostly just leaves trying to use a more
    complex encoding and a higher sample rate (such as 16000) as the more
    likely path forward.



    So, say, more complex format:

    Base Block, 2-bit ADPCM:
    1 byte: Initial Predictor (A-Law)
    1 byte: Initial Step (low 6 bits, log2 4.2)
    N/4 bytes: Audio Samples

    Where each sample is:
    00: Small Positive (V+=1*StepScale, StepIndex-=1)
    01: Large Positive (V+=3*StepScale, StepIndex+=2)
    10: Small Negative (V-=1*StepScale, StepIndex-=1)
    11: Large Negative (V-=3*StepScale, StepIndex+=2)
    Encoder responsible for avoiding V or StepIndex going out of range.
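    The decode side of the base block can be sketched as below. The step
    table (an ad hoc ~2^(idx/4) curve standing in for the "log2 4.2" scale),
    the LSB-first packing of the 2-bit codes, and the clamping are all my
    assumptions; the post only specifies the code meanings and the StepIndex
    updates:

```c
typedef short s16;

/* Sketch of the 2-bit ADPCM decode step described above. StepScale
   approximates 2^(idx/4) in a 4.2-style log2 format (an assumption).
   The clamps are belt-and-braces; per the post, the encoder is
   responsible for keeping V and StepIndex in range. */
static int StepScale(int idx)
{
    return ((4+(idx&3))<<(idx>>2))>>2;
}

void AdpcmDecode2b(s16 *dst, const unsigned char *src,
    int n, int v, int idx)
{
    int i, code, sc, d;
    for(i=0; i<n; i++)
    {
        code=(src[i>>2]>>((i&3)*2))&3;  /* LSB-first 2-bit codes */
        sc=StepScale(idx);
        d=(code&1)?(3*sc):sc;           /* 00/10 small, 01/11 large */
        if(code&2) v-=d; else v+=d;     /* 1x = negative */
        if(code&1) idx+=2; else idx-=1;
        if(idx<0) idx=0;
        if(idx>63) idx=63;
        if(v<-32767) v=-32767;
        if(v> 32767) v= 32767;
        dst[i]=v;
    }
}
```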

    Mono Block:
    Encode a Base Spline at 1/16 rate
    Encode a Deviation Spline at 1/64 rate
    Encode a pattern table index at 4 or 8 bits per 16-samples.
    4b: ~ 0.5 bits/sample
    8b: ~ 0.75 bits/sample


    Issue:
    More fiddling, and it is proving difficult to get it above "some minimum standard of acceptable audio quality".

    In any case, it can't be a good alternative to 8kHz 2b ADPCM if it sounds significantly worse than 8kHz 2b ADPCM (and 12kbps isn't a huge win over 16kbps...).


    ...

  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.arch on Thu Sep 11 01:33:44 2025
    From Newsgroup: comp.arch

    On Sat, 6 Sep 2025 14:19:40 -0500, BGB wrote:

    But, there is some "weird hacks" that can be done in audio processing
    when downsampling that seems to notably increase intelligibility at an
    8kHz sample rate ...

    There are digital encoding formats used with mobile phones that are
    optimized for speech. Ever heard a call where the other end sounded every
    now and then like they were underwater? That’s the kind of compression artifact you get.
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.arch on Thu Sep 11 01:35:00 2025
    From Newsgroup: comp.arch

    On Sat, 06 Sep 2025 16:21:12 GMT, MitchAlsup wrote:

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high end of
    the audio spectrum.

    I wonder how that works, given that the audio engineer that mastered the recording was using speakers that cost a fraction of the price.
  • From BGB@cr88192@gmail.com to comp.arch on Thu Sep 11 02:05:59 2025
    From Newsgroup: comp.arch

    On 9/10/2025 8:33 PM, Lawrence D’Oliveiro wrote:
    On Sat, 6 Sep 2025 14:19:40 -0500, BGB wrote:

    But, there is some "weird hacks" that can be done in audio processing
    when downsampling that seems to notably increase intelligibility at an
    8kHz sample rate ...

    There are digital encoding formats used with mobile phones that are
    optimized for speech. Ever heard a call where the other end sounded every
    now and then like they were underwater? That’s the kind of compression artifact you get.

    Looking some at it, apparently a lot of the current modern phone class
    audio codecs are based on trying to run a model of the human vocal tract
    and then adding white noise to make it sound more natural (with some apparently partly based on vocoder technology).

    But, in my case, I don't really hear speech effectively over phones, I
    mostly hear a lot of warbling that I am left trying to decipher over all
    the hiss.


    As noted, the filtering hack mostly kept to normal PCM handling, but I
    soon realized can't work as a general solution to "stuff sounding bad"
    at an 8kHz sample rate.



    When I was looking into it, 4-channel sinewave synthesis is possible, but:
      Quality is still poor;
      At a 125Hz update frequency, at 16 bits per sinewave, it still takes
      around 8kbps.

    Roughly 16 bits are needed to encode both the frequency and amplitude of each sinewave to an acceptable degree.

    When fiddling with it, I ended up finding an OK strategy of:
    Sample for 12 sine waves, dividing the 2-8 kHz range into roughly 1/6
    octave chunks (picking the loudest wave within each chunk);
    Pick the top 4 loudest waves from the 12 sampled.
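    One way to measure the "loudest wave within each chunk" (my suggestion;
    the post does not say how the measurement is done) is a Goertzel probe
    at each chunk's centre frequency, keeping the biggest response:

```c
typedef short s16;

/* Squared Goertzel magnitude at one probe frequency, with the
   coefficient c = 2*cos(2*pi*f/rate) precomputed by the caller.
   Probing the centre of each 1/6-octave chunk and keeping the largest
   response is one way (a suggestion, not necessarily BGB's method) to
   pick the loudest sine wave per chunk. */
double GoertzelMag2(const s16 *x, int n, double c)
{
    double s0, s1=0, s2=0;
    int i;
    for(i=0; i<n; i++)
    {
        s0=x[i]+c*s1-s2;    /* second-order resonator update */
        s2=s1; s1=s0;
    }
    return s1*s1+s2*s2-c*s1*s2;
}
```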


    I was experimenting with pushing the scheme I mentioned else-thread to
    around 6 kbps, which (last I messed with it) still generates some truly
    awful audio quality.

    Posted an example to my twitter feed: https://x.com/cr88192/status/1965694742186049683

    It does sound a fair bit better with a 16kHz sampling rate (12kbps), but
    is still notably inferior to 8kHz 2-bit ADPCM (16kbps).


    The 6kbps case is interesting as it gets a 2-minute song into around
    96K, which is kinda pushing into MIDI territory. But, MIDI would have
    sounded better (though, no real obvious way to auto-convert PCM audio
    into MIDI commands).

    Well, unless maybe doing something like sinewave synthesis but then
    trying to convert the sine waves into Note On/Off commands. Though,
    naively mapping sinewave synthesis to MIDI commands would likely add a
    fair bit of bulk and overhead.



    It is possible that I may need to take a different approach to
    generating the pattern table.

    Initial approach:
    Fill it with sine-waves;
    Didn't work very well.
    Current strategy:
    Start with a table of 16-bit patterns (curated manually);
    Map each to samples, 0=full negative, 1=full positive;
    Run N passes of averaging;
    Generate a pattern table with 4-bits per pattern sample.

    Possible pattern-table generation strategy (not yet tried):
    Use the sign of each sample relative to the base curve to generate a
    16-bit key;
    average the relative values for each key, keeping track of relative
    usage frequency;
    Pick the top-N else merge similar patterns until one has fewer than 256
    or so.

    Note that any sounds much over ~250Hz at an 8000 Hz sample rate are being generated from the pattern table.
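    The key step of the proposed strategy can be sketched as below (assuming
    bit i of the key is set when sample i sits at or above the base curve):

```c
typedef short s16;

/* Sketch of the proposed pattern-key step: take the sign of each of 16
   samples relative to the base curve and pack it into a 16-bit key
   (bit i set when sample i is at or above the base). Keys would then
   be bucketed, averaged, and merged as described. */
unsigned PatternKey16(const s16 *blk, const s16 *base)
{
    unsigned key=0;
    int i;
    for(i=0; i<16; i++)
        if(blk[i]>=base[i]) key|=1u<<i;
    return key;
}
```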


    But, it is possible this approach may be a lost cause (it could not be
    made to give anywhere near acceptable quality at these bitrates).


    Note that I don't want something significantly more complicated or
    expensive than ADPCM (so, ideally no entropy coding or fancy transforms
    on the decoder side...).

    To be useful, would need to either:
    Do better than ADPCM at a similar bitrate;
    Achieve bitrates lower than what is possible with ADPCM.

    Was partly looking at the latter, but to be useful it needs to have some
    level of "passable" quality, which I have yet to achieve at this target
    (eg, particularly at 6 kbps).

    ...


  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.arch on Thu Sep 11 15:06:57 2025
    From Newsgroup: comp.arch

    Lawrence D’Oliveiro <ldo@nz.invalid> writes:
    On Sat, 06 Sep 2025 16:21:12 GMT, MitchAlsup wrote:

    No it does not sound "good" on a system that accurately reproduces
    22KHz; like systems with electrostatic speakers covering the high end of
    the audio spectrum.

    I wonder how that works, given that the audio engineer that mastered the
    recording was using speakers that cost a fraction of the price.

    Have you priced quality studio monitors? Obviously not.

    A nice pair of intro electrostatics runs about USD 1200 (Magnepan LRS+).

    A single studio monitor can easily cost more than USD12000.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu Sep 11 15:59:32 2025
    From Newsgroup: comp.arch


    scott@slp53.sl.home (Scott Lurndal) posted:

Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Sat, 06 Sep 2025 16:21:12 GMT, MitchAlsup wrote:

    No it does not sound "good" on a system that accurately reproduces
22KHz; like systems with electrostatic speakers covering the high end of the audio spectrum.

I wonder how that works, given that the audio engineer that mastered the recording was using speakers that cost a fraction of the price.

    Have you priced quality studio monitors? Obviously not.

A nice pair of intro electrostatics run about USD1200 (Magnepan LRS+).

Magnepans are not electrostatic, but use a moving Mylar plane sort-of
like an electrostatic--but they use magnetic strips on the backplane
to impart forces onto the Mylar plane.

Martin Logan speakers are electrostatic (I have a pair from 1986-ish,
revved up from B-to-G in 1996.) They sound much like electrostatic
headphones except at room-sized sound pressure levels. These cost
around $2,000 in 1986...

Dahlquists are electrostatic; around since 1973-ish.

    A single studio monitor can easily cost more than USD12000.

    And often accompanied by a tuning system to allow the speakers to be tuned
    to the room in which they are used. Velodyne sub-woofers allow the woofer
    to be tuned to the room and phase aligned with the main speakers.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Thu Sep 11 12:56:09 2025
    From Newsgroup: comp.arch

    On 9/11/2025 10:06 AM, Scott Lurndal wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Sat, 06 Sep 2025 16:21:12 GMT, MitchAlsup wrote:

    No it does not sound "good" on a system that accurately reproduces
22KHz; like systems with electrostatic speakers covering the high end of the audio spectrum.

    I wonder how that works, given that the audio engineer that mastered the
    recording was using speakers that cost a fraction of the price.

    Have you priced quality studio monitors? Obviously not.

A nice pair of intro electrostatics run about USD1200 (Magnepan LRS+).

    A single studio monitor can easily cost more than USD12000.


I guess this is a fair bit different, say, from using some $35
headphones, or $60 for some external speakers, probably throwing some
money Logitech's way...

Or, slightly cheaper, some "Amazon Basics" equivalents.
Or, more expensive, throwing their money at Bosch or Sennheiser or similar.


    Granted, there are cheaper headphones, but they are often lacking in
    terms of comfort and/or audio quality.


    Otherwise, I would think an option would be to try to guess which sort
    of hardware consumers are most likely to be using, and then tune for
    best results on this (so, say, aim for cheap, but not too cheap).

    ...


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From George Neuner@gneuner2@comcast.net to comp.arch on Fri Sep 12 13:01:36 2025
    From Newsgroup: comp.arch

    On Tue, 9 Sep 2025 15:51:10 +0200, David Brown
    <david.brown@hesbynett.no> wrote:

    On 08/09/2025 23:57, George Neuner wrote:

    :

    This means most notes will include sounds that are outside the range
    of (normal) human hearing, but you can still /feel/ these sounds [even
    the high ones] and miss them when they are absent.


    Nope. Most notes are much lower, and harmonics of relevance are within
    the range of human hearing. For high enough notes, you simply don't
    hear as much harmonic information.

    You are forgetting the lower harmonics. If it is true about 3 lower,
    then ~1/3 of notes on the piano will include an overtone that is below
    the (average) hearing threshold.


    C8 (high C) on the piano is ~4186 Hz. Assuming the need for the 7th
    higher harmonic - 29302 Hz - Nyquist would demand a minimum sampling
    rate of 58604/s to accurately reproduce C8.
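The arithmetic here can be checked directly:

```python
c8 = 4186                 # fundamental of C8 (high C), in Hz
harmonic7 = 7 * c8        # 7th harmonic: 29302 Hz
nyquist = 2 * harmonic7   # minimum sample rate to capture it: 58604/s
```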


You can't accurately hear C8 even when live - you don't get the same
harmonic information as you do with C6, because your ears can't
distinguish the higher harmonics. Your ears have the same limitations
as any other senses in this manner - you can look at your cat's feet and
count its toes, but if you look at a fly's feet you can't count the toes.

    My point was about sampling and reproduction, not whether the note
    could be heard. There is not a lot of piano music that involves the
    1st, 7th or 8th octaves - because the 1st octave is jarring and the
    7th and 8th (in general) are too high to carry to the audience without amplification.


    In practice, unless you like orchestral, or certain folk or country,
    you are not likely to hear much difference between a CD and a decent
    quality compressed version of it. But the CD itself is not a faithful
    reproduction of the live performance.


Good quality compressed formats are often better than CD quality. The
killer for CD quality is not the sample rate, it is the limited dynamic
range from the linear 16-bit range. Compressed formats will, in effect,
use a more logarithmic scale (like A-law and mu-law, used to get
comprehensible speech despite a much smaller sample size) that is more
in line with the way the human brain interprets sound.
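As a small illustration of the logarithmic companding mentioned (this is the standard G.711 mu-law curve, mu = 255, not anything specific to the formats under discussion):

```python
import math

MU = 255.0  # G.711 mu-law parameter

def mulaw_encode(x):
    """Compress a sample in [-1, 1] onto a logarithmic scale."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_decode(y):
    """Invert the companding curve."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# Quiet signals get disproportionately more of the code space: a sample
# at 1% of full scale maps to roughly 23% of the output range, which is
# why 8-bit companded audio keeps usable dynamic range.
```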

    And, of course, if you like orchestral you are more likely to be
    listening to vinyl rather than CD. 8-)

In theory (but very rarely in practice), when combined with good enough
amplifiers and speakers, vinyl has a higher dynamic range than CD
audio. But that is only the case when the record is new. Play it a few
times, and the wear from the needle will smooth out the tracks enough to
eliminate the difference.

True, but in fact there are laser-based record players that do not
touch or damage the media. You still need to worry about warping, so
it is necessary to store your records properly.

    I don't deal much with vinyl records myself anymore, but my sister has
    an extensive collection.


But enjoying music is a psychologically, physically, mentally and
biologically complex hobby. The comfort of the chair you are sitting
in, or the type of reflections and absorptions from the rest of the
room, can make a big difference. Knowing that you have spent a great
deal of money on your impressive-looking hifi system will improve your
listening experience regardless of what any audio measurement might say.
Some audiophiles prefer the "valve sound" over "transistor sound" -
not because the sound reproduction is more accurate (it is not - valves
add second harmonic distortion that is non-existent in transistor
amplifiers), but simply because they like it better.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Schultz@david.schultz@earthlink.net to comp.arch on Fri Sep 12 12:23:39 2025
    From Newsgroup: comp.arch

    On 9/12/25 12:01 PM, George Neuner wrote:
    You are forgetting the lower harmonics. If it is true about 3 lower,
    then ~1/3 of notes on the piano will include an overtone that is below
    the (average) hearing threshold.

One of the coolest things I ever heard, felt really, were the beat tones
between a couple of pedal notes on the pipe organ at the Meyerson in
Dallas.
    --
    http://davesrocketworks.com
    David Schultz
    "The cheaper the crook, the gaudier the patter." - Sam Spade
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.arch on Fri Sep 12 20:32:48 2025
    From Newsgroup: comp.arch

    On 12/09/2025 19:23, David Schultz wrote:
    On 9/12/25 12:01 PM, George Neuner wrote:
    You are forgetting the lower harmonics. If it is true about 3 lower,
    then ~1/3 of notes on the piano will include an overtone that is below
    the (average) hearing threshold.

    Harmonics are always integer multiples of the base frequency, not
    fractions - that's the definition of a harmonic.

    You can get lower frequencies produced as beats when different
    instruments play nominally the same note, but are a little out of tune.
    That's not something you would normally want to have in music (though it
    is a very useful effect for getting things in tune).
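The beat effect described above can be made concrete with a quick sketch: summing two equal-amplitude sines at nearby frequencies produces a slow amplitude envelope, heard (or felt) as beats at the difference frequency. The organ-note frequencies below are illustrative:

```python
import math

def beat_signal(f1, f2, t):
    """Sum of two equal-amplitude sines. By the identity
    sin(a) + sin(b) = 2 sin((a+b)/2) cos((a-b)/2), the cosine
    envelope makes the loudness pulse at |f1 - f2| Hz."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)

# Two organ pedal notes near 32.7 Hz (C1) and 34.6 Hz (C#1) beat at
# roughly 1.9 Hz -- felt as a slow pulsing rather than heard as a pitch.
beat_rate = abs(34.6 - 32.7)
```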


One of the coolest things I ever heard, felt really, were the beat tones between a couple of pedal notes on the pipe organ at the Meyerson in Dallas.


    Big pipe organs have notes that are too low for human hearing, but the
    volume is enough to feel them. Infrasound (sound below the lowest
    audible frequency) has long been associated with feelings of
    "supernatural" or "paranormal", increasing stress and tension. Horror
    movies sometimes like to have them in their soundtracks, and more than
    one "haunted house" turned out to have issues with the plumbing,
    ventilation or nearby diesel engines that produced infrasound that made
    people feel uneasy without knowing why. I'd imagine that for a church
    organ, playing some infrasound notes will help the listeners feel
    "religious" or feel some kind of "spiritual presence", though I have not
    heard of that being done intentionally except by some of the more
    dedicated fake healer conmen.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Fri Sep 12 18:58:59 2025
    From Newsgroup: comp.arch


    David Schultz <david.schultz@earthlink.net> posted:

    On 9/12/25 12:01 PM, George Neuner wrote:
    You are forgetting the lower harmonics. If it is true about 3 lower,
    then ~1/3 of notes on the piano will include an overtone that is below
    the (average) hearing threshold.

One of the coolest things I ever heard, felt really, were the beat tones between a couple of pedal notes on the pipe organ at the Meyerson in Dallas.

    Have you listened to a helicopter-style sub-woofer ??

Generally housed between stories in a building--a helicopter-arranged set
of blades that can go all the way down to 0 Hz--and up to about 30 Hz.
    The low frequency components adjust the pitch of the blades through the
    cyclic.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.arch on Fri Sep 12 14:30:08 2025
    From Newsgroup: comp.arch

    On 9/12/2025 1:58 PM, MitchAlsup wrote:

    David Schultz <david.schultz@earthlink.net> posted:

    On 9/12/25 12:01 PM, George Neuner wrote:
    You are forgetting the lower harmonics. If it is true about 3 lower,
    then ~1/3 of notes on the piano will include an overtone that is below
    the (average) hearing threshold.

One of the coolest things I ever heard, felt really, were the beat tones
between a couple of pedal notes on the pipe organ at the Meyerson in
Dallas.

    Have you listened to a helicopter-style sub-woofer ??

Generally housed between stories in a building--a helicopter-arranged set
of blades that can go all the way down to 0 Hz--and up to about 30 Hz.
    The low frequency components adjust the pitch of the blades through the cyclic.

    Main subwoofers I am aware of/had seen:
    Large plastic-cone speakers;
    Seemingly, the relative rigidity of a plastic cone works well here.
Large solenoid driving a big/heavy weight (such as a big chunk of
steel), which is then bolted down to something (presumably the surface
of whatever it is bolted to serves a role similar to a speaker cone).




Well, in other news, have slightly improved the quality of my new
experimental audio compressor at 6 kbps, but it is still pretty bad.
Seems my previous attempt (that I had posted online) was suffering from
32-bit truncation in the pattern table (some stuff was happening with
'int' that should have been done with 'unsigned long long', which was
negatively affecting audio quality).
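The kind of truncation described can be illustrated by masking to emulate C integer widths (Python ints are arbitrary precision, so the masks stand in for 'int' and 'unsigned long long'; the pattern value is made up):

```python
# Emulate C integer widths by masking.
def as_u32(x):
    return x & 0xFFFFFFFF          # what survives a 32-bit 'int'

def as_u64(x):
    return x & 0xFFFFFFFFFFFFFFFF  # what 'unsigned long long' keeps

pattern = 0x123456789ABCDEF0       # a hypothetical 64-bit pattern-table word
shifted_u64 = as_u64(pattern << 4)
shifted_u32 = as_u32(pattern << 4)
# The high 32 bits of the pattern word are silently lost in the 32-bit
# case, corrupting the decoded patterns without any error or warning.
```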

    ...


TODO: Might still be worth testing it out at 32 kHz / 24 kbps, and see how
it compares against low-bitrate MP3. If it doesn't sound completely
awful, might be OK (the only reason I think it may stand a chance is
because of how terrible MP3 sounds at these sorts of bitrates).

At 16 kHz (and 12 kbps), it does sound a bit better, though audio quality
is still inferior at present to 8000 Hz 2-bit ADPCM (16 kbps).
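For reference, the bitrates being compared work out as follows (a trivial sketch; `bitrate_kbps` is just a name for illustration):

```python
def bitrate_kbps(sample_rate, bits_per_sample):
    """Uncompressed bitrate in kbps (kilobits per second)."""
    return sample_rate * bits_per_sample / 1000

# The ADPCM reference point: 8000 Hz at 2 bits/sample = 16 kbps.
adpcm = bitrate_kbps(8000, 2)
# The experimental codec at 16 kHz / 12 kbps averages only
# 12000 / 16000 = 0.75 bits per sample, which is why matching
# ADPCM quality at that rate is hard.
avg_bits = 12000 / 16000
```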

    ...


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Schultz@david.schultz@earthlink.net to comp.arch on Fri Sep 12 14:46:20 2025
    From Newsgroup: comp.arch

    On 9/12/25 1:58 PM, MitchAlsup wrote:

    David Schultz <david.schultz@earthlink.net> posted:
One of the coolest things I ever heard, felt really, were the beat tones
between a couple of pedal notes on the pipe organ at the Meyerson in
Dallas.

    Have you listened to a helicopter-style sub-woofer ??

    No. I built a sub a decade or three ago (JBL 2245H driver) but have
    never wanted to go quite so far as to build a rotary subwoofer.


    The 18" sub gets as freaky as I want it to.
    --
    http://davesrocketworks.com
    David Schultz
    "The cheaper the crook, the gaudier the patter." - Sam Spade
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From George Neuner@gneuner2@comcast.net to comp.arch on Sun Sep 14 07:43:07 2025
    From Newsgroup: comp.arch

    On Fri, 12 Sep 2025 20:32:48 +0200, David Brown
    <david.brown@hesbynett.no> wrote:

    On 12/09/2025 19:23, David Schultz wrote:
    On 9/12/25 12:01 PM, George Neuner wrote:
    You are forgetting the lower harmonics. If it is true about 3 lower,
    then ~1/3 of notes on the piano will include an overtone that is below
    the (average) hearing threshold.

    Harmonics are always integer multiples of the base frequency, not
    fractions - that's the definition of a harmonic.

    I learned them as "overtones" ... but it seems that musicians call
    them all "harmonics" regardless of whether they are higher or lower.

    MMV.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.arch on Sun Sep 14 15:08:28 2025
    From Newsgroup: comp.arch

    On 14/09/2025 13:43, George Neuner wrote:
    On Fri, 12 Sep 2025 20:32:48 +0200, David Brown
    <david.brown@hesbynett.no> wrote:

    On 12/09/2025 19:23, David Schultz wrote:
    On 9/12/25 12:01 PM, George Neuner wrote:
    You are forgetting the lower harmonics. If it is true about 3 lower,
then ~1/3 of notes on the piano will include an overtone that is below
the (average) hearing threshold.

    Harmonics are always integer multiples of the base frequency, not
    fractions - that's the definition of a harmonic.

    I learned them as "overtones" ... but it seems that musicians call
    them all "harmonics" regardless of whether they are higher or lower.

    MMV.

    I'm not a musician - my knowledge of harmonics is from maths, physics,
    signal processing, motor control, and that kind of thing. Maybe
    musicians use the term slightly differently.

    --- Synchronet 3.21a-Linux NewsLink 1.2