• Re: Tutorial: Windows/Android privacy de-googled STT optimized for speed

    From Maria Sophia@mariasophia@comprehension.com to comp.mobile.android,alt.comp.os.windows-10,alt.comp.microsoft.windows on Wed May 6 20:37:28 2026
    From Newsgroup: comp.mobile.android

    Maria Sophia wrote:
    a. Low end Android => use HeliBoard + WhisperIME STT
    b. High end Android => try the all-in-one Futo Keyboard

    Testing takes time...

    I'm having trouble with the tiny models in a noisy environment,
    with the transcription taking too long or not working at all.

    It seems the AGC on the mic is allowing too much noise to filter through.

    First, I confirmed the small models are running by running adb logcat.
    "testing testing 123"

    Since, at home, you never need to touch the phone itself, from Windows:
    adb shell logcat -c
    adb shell "logcat -d -v tag WhisperEngineJava:D *:S"
    --------- beginning of main
    D/WhisperEngineJava: Model is loaded.../storage/emulated/0/Android/data/org.woheller69.whisper/files/whisper-tiny.en.tflite
    D/WhisperEngineJava: Filters and Vocab are loaded.../storage/emulated/0/Android/data/org.woheller69.whisper/files/filters_vocab_en.bin
    D/WhisperEngineJava: Model is loaded.../storage/emulated/0/Android/data/org.woheller69.whisper/files/whisper-tiny.en.tflite
    D/WhisperEngineJava: Filters and Vocab are loaded.../storage/emulated/0/Android/data/org.woheller69.whisper/files/filters_vocab_en.bin

    Where the spectrogram was too big for such a small sentence:
    D/WhisperEngineJava: Calculating Mel spectrogram...
    D/WhisperEngineJava: Mel spectrogram is calculated...!
    D/WhisperEngineJava: output_len: 449
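
A minimal sketch for pulling the interesting fields out of a capture like the one above, so two runs can be compared without eyeballing the log. It assumes the text was saved from the `adb shell "logcat -d -v tag WhisperEngineJava:D *:S"` command shown in the post; the function names and the SAMPLE string are made up for illustration. It also copes with entries that got wrapped across two lines or fused onto one line when pasted.

```python
import re

TAG = "D/WhisperEngineJava:"

# Illustrative sample only, copied from the kind of output shown in the post.
SAMPLE = """--------- beginning of main
D/WhisperEngineJava: Model is
loaded.../storage/emulated/0/Android/data/org.woheller69.whisper/files/whisper-tiny.en.tflite
D/WhisperEngineJava: Calculating Mel spectrogram...
D/WhisperEngineJava: output_len: 449
D/WhisperEngineJava: Skipping token: 50259, word: [_extra_token_50259] D/WhisperEngineJava: It is Transcription...
"""


def split_entries(text):
    """Return one message per log entry, repairing wrapped or fused lines."""
    lines = [ln.strip() for ln in text.splitlines()]
    # Drop logcat buffer separators such as "--------- beginning of main".
    lines = [ln for ln in lines if ln and not ln.startswith("---------")]
    joined = " ".join(lines)
    # Everything before the first tag is discarded; each later piece is a message.
    return [part.strip() for part in joined.split(TAG)[1:] if part.strip()]


def summarize(text):
    """Collect the model file names and output_len values seen in the log."""
    models, output_lens = [], []
    for msg in split_entries(text):
        m = re.match(r"Model is loaded\.\.\.(\S+)", msg)
        if m:
            models.append(m.group(1).rsplit("/", 1)[-1])
        m = re.match(r"output_len:\s*(\d+)", msg)
        if m:
            output_lens.append(int(m.group(1)))
    return {"models": sorted(set(models)), "output_lens": output_lens}


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Running `summarize` on the before and after captures makes it easy to see whether `output_len` or the loaded model actually changed between runs.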

    So to lower the mic sensitivity on the Samsung A32-5G, I ran:
    adb shell settings put global call_noise_reduction 1
    adb reboot
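
A small sketch for reading the value back after the reboot, using `adb shell settings get`, which prints `null` when a key is not set. The helper names are hypothetical. Note the limits: this only confirms that the value is stored. Whether `call_noise_reduction` has any effect on the microphone path that WhisperIME uses is not established by anything in this thread.

```python
import subprocess


def settings_get_cmd(namespace, key):
    """Build the adb command that reads one Android settings key."""
    return ["adb", "shell", "settings", "get", namespace, key]


def interpret(raw):
    """Turn the raw command output into a value, or None if the key is unset."""
    value = raw.strip()
    if value in ("", "null"):
        return None
    return value


def read_setting(namespace, key):
    """Run the command on the attached device (requires adb on the PATH)."""
    done = subprocess.run(
        settings_get_cmd(namespace, key),
        capture_output=True, text=True, check=True,
    )
    return interpret(done.stdout)


if __name__ == "__main__":
    print(read_setting("global", "call_noise_reduction"))
```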

    Re-run "testing, testing, 123"
    adb shell logcat -c
    adb shell "logcat -d -v tag WhisperEngineJava:D *:S"
    --------- beginning of main
    D/WhisperEngineJava: Model is loaded.../storage/emulated/0/Android/data/org.woheller69.whisper/files/whisper-tiny.en.tflite
    D/WhisperEngineJava: Filters and Vocab are loaded.../storage/emulated/0/Android/data/org.woheller69.whisper/files/filters_vocab_en.bin
    D/WhisperEngineJava: Calculating Mel spectrogram...
    D/WhisperEngineJava: Mel spectrogram is calculated...!
    D/WhisperEngineJava: output_len: 449
    D/WhisperEngineJava: Skipping token: 50257, word: [_SOT_]
    D/WhisperEngineJava: Detected language code: en
    D/WhisperEngineJava: Skipping token: 50259, word: [_extra_token_50259]
    D/WhisperEngineJava: It is Transcription...
    D/WhisperEngineJava: Skipping token: 50359, word: [_extra_token_50359]
    D/WhisperEngineJava: Skipping token: 50363, word: [_BEG_]
    D/WhisperEngineJava: Skipping token: 50413, word: [_TT_50]
    D/WhisperEngineJava: Skipping token: 50513, word: [_TT_150]
    D/WhisperEngineJava: Inference is executed...!

    Drat. It's still 449.

    If that doesn't work in noisy environments, then I'll have to bump up
    to the next-sized model, which I think is the base model.

    adb push whisper-base.en.tflite /storage/emulated/0/Android/data/org.woheller69.whisper/files/
    adb shell "cp /storage/emulated/0/Android/data/org.woheller69.whisper/files/whisper-base.en.tflite /storage/emulated/0/Android/data/org.woheller69.whisper/files/whisper.tflite"
    adb shell "cp /storage/emulated/0/Android/data/org.woheller69.whisper/files/whisper-base.en.tflite /storage/emulated/0/Android/data/org.woheller69.whisper/files/whisper-tiny.en.tflite"
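
Since the last command above overwrites whisper-tiny.en.tflite, here is a hedged sketch that generates the same push and copy steps but saves a backup of the original file first. The `.bak` name and the helper functions are assumptions for illustration; how the app treats an extra file in that directory, or a base model stored under the tiny model's name, has not been tested here.

```python
from pathlib import PurePath

FILES_DIR = "/storage/emulated/0/Android/data/org.woheller69.whisper/files"


def swap_commands(new_model, target="whisper-tiny.en.tflite", files_dir=FILES_DIR):
    """Return adb commands that back up `target`, push `new_model`, then copy it over `target`."""
    remote_name = PurePath(new_model).name  # adb push into a directory keeps the file name
    backup = f"{files_dir}/{target}.bak"    # hypothetical backup name
    return [
        ["adb", "shell", f"cp {files_dir}/{target} {backup}"],
        ["adb", "push", new_model, f"{files_dir}/"],
        ["adb", "shell", f"cp {files_dir}/{remote_name} {files_dir}/{target}"],
    ]


def restore_command(target="whisper-tiny.en.tflite", files_dir=FILES_DIR):
    """Return the adb command that puts the backed-up file back in place."""
    return ["adb", "shell", f"cp {files_dir}/{target}.bak {files_dir}/{target}"]


if __name__ == "__main__":
    for cmd in swap_commands("whisper-base.en.tflite"):
        print(" ".join(cmd))
```

Printing the commands first, rather than running them directly, gives a chance to check the paths before anything on the phone is overwritten.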
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to comp.mobile.android,alt.comp.os.windows-10,alt.comp.microsoft.windows on Fri May 8 02:31:26 2026
    From Newsgroup: comp.mobile.android

    It missed the word "It's" in the picture.

    [Picture] dsnote-ubu2504.gif

    https://imgur.com/a/9VxuCCa

    https://postimg.cc/CRrHVQXP

    That's "dsnote" in Ubuntu using a Whisper model.
    I read the text of the lines above, and the model
    missed the "It's" on the recorded attempt. A
    previous attempt was OK.

    Microphone was a Blue Yeti, which doesn't have AGC.
    And the level wasn't all that high either, maybe
    -24dBm or so. I recorded the microphone first in
    Audacity, and found I had to hold the mic two inches
    from my face to get a signal.
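
For anyone wanting to check a recording's level outside Audacity, this is a rough sketch of the usual peak-level arithmetic. It assumes the figure quoted above is a peak reading relative to digital full scale (dBFS) on 16-bit samples, which the post does not actually state; the function name is made up.

```python
import math


def peak_dbfs(samples, full_scale=32768):
    """Peak level of integer PCM samples in dB relative to full scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")  # digital silence has no finite level
    return 20 * math.log10(peak / full_scale)


if __name__ == "__main__":
    # A 16-bit peak of about 2068 sits close to -24 dB below full scale.
    print(round(peak_dbfs([2068, -100, 35]), 1))
```

Under that assumption, a -24 dB peak means the loudest sample reaches only about 6% of the available range, which fits the description of a quiet signal.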

    While the spec for the microphone claims a 20-20000Hz
    response (which would be 3dB down at the ends),
    it is clearly a "voice" microphone and it
    cuts off the high frequencies. That's one of the reasons
    the fans in the room didn't get picked up. So as far as
    being a "live" mic, it's a bit of a "dull potato" as
    mics go. But it does seem to give a decent result.

    And when you "blast" the four lines above at the model,
    then stop and wait for the conversion, it must have taken
    at least 10-15 seconds to do the amount of text in the picture.
    It "feels" slightly better, if you feed it a sentence at a time.
    Feed it just a few words. It seems happier that way. Dragon
    NaturallySpeaking has nothing to worry about :-)

    Paul
  • From Maria Sophia@mariasophia@comprehension.com to comp.mobile.android,alt.comp.os.windows-10,alt.comp.microsoft.windows on Fri May 8 01:09:32 2026
    From Newsgroup: comp.mobile.android

    Hi Paul,

    Thanks for testing it out. I think there's a reason that WhisperIME defaults to the 435MB model instead of the 40MB "tiny" model.

    I agree with EVERYTHING you said (I'd never disagree with anything that is logically and sensibly stated). What I want to say is that when it's quiet, the tiny model works "just OK".

    I'm gonna switch to the default 435MB model and see if that does better.
    But I agree with you. YMMV.

    When it's noisy (like in a vehicle), the tiny model really sucks.
    So I guess we're doomed to have to use the largest model most of the time.

    I'm told (by the Internet) that the Futo Keyboard works better as it's more modern and it uses the C++ whisper models (if that matters).

    If I were to do it over again, I'd try that first.

    But I do THANK YOU VERY MUCH for testing this out for the team.
    People like you are wonderful because we all benefit from your efforts!

    What's really neat is that Windows/Linux controls the phone wonderfully.
    I never have to touch the phone when I'm sitting at my desk.
    a. adb controls the phone
    b. scrcpy/sndcpy displays the phone
    c. the keyboard types into the phone
    d. the mouse taps on the phone
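
As a side note on item (a), taps and typing can also be scripted through adb itself with `adb shell input`, which is separate from what scrcpy does with the live mouse and keyboard. A minimal sketch, with hypothetical helper names, that only builds the commands; it is meant for plain letters, digits and spaces, since other characters may need extra shell quoting.

```python
def tap_cmd(x, y):
    """Build the adb command that taps the screen at pixel (x, y)."""
    return ["adb", "shell", "input", "tap", str(x), str(y)]


def text_cmd(text):
    """Build the adb command that types simple text; `input text` takes %s for a space."""
    return ["adb", "shell", "input", "text", text.replace(" ", "%s")]


if __name__ == "__main__":
    print(" ".join(tap_cmd(540, 1200)))
    print(" ".join(text_cmd("testing testing 123")))
```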

    It's really neat having the phone show up as nearly two feet tall!