• Re: Hint For GNU/Linux Programmers

    From Lester Thorpe@lt@gnu.rocks to comp.os.linux.misc on Thu Sep 11 18:42:02 2025
    From Newsgroup: comp.os.linux.misc

    On Thu, 11 Sep 2025 15:11:24 -0000 (UTC), John McCue wrote:


    To me, testing and retesting different optimizations is a
    huge waste of time and at most you might save 1 second :)


    That was my original point and the reason I suggest that
    programmers should do the dirty work for the user.

    But seconds can quickly add up. For audio/video encoding and
    math/physics simulations, optimization can mean the difference
    between 20 minutes and 1 hour, which is highly significant.

    There is an "ancient" program called "paranoia" which evaluates
    a machine's floating-point accuracy:

    <https://netlib.sandia.gov/paranoia/paranoia.c>

    Compiling it with your "-O1" would produce erroneous results.
    In this case, "-O0" is required.
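
    For example, with standard GCC flags (a sketch; -lm is assumed for
    the math routines):

      # Build paranoia as the test expects: no optimization at all.
      gcc -O0 -o paranoia paranoia.c -lm

      # If optimizing anyway, -ffloat-store keeps floating-point
      # variables out of extended-precision registers, which may
      # restore the expected rounding behavior:
      gcc -O2 -ffloat-store -o paranoia paranoia.c -lm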

    Granted, this program predates Linux and GCC, but other, more
    recent programs may behave similarly with respect to optimization.

    Therefore, it should be the programmer's responsibility to
    indicate the correct optimization.
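
    A sketch of what such guidance could look like for an autoconf-style
    package (the flags here are illustrative, not from any real project):

      ./configure CFLAGS="-O2"    # author-recommended default
      ./configure CFLAGS="-O0"    # required for the floating-point tests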


    For programs created by others, I keep whatever settings
    they use, since they know much better than I do.
    --
    Gentoo: the only road to GNU/Linux perfection.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.os.linux.misc on Thu Sep 11 22:38:45 2025
    From Newsgroup: comp.os.linux.misc

    On Thu, 11 Sep 2025 15:11:24 -0000 (UTC), John McCue wrote:

    To me, testing and retesting different optimizations is a huge waste of
    time and at most you might save 1 second :)

    I was once hired to build an app in MATLAB for decoding and displaying
    multiple channels of EEG data, using its built-in GUI tools (momentary
    shudder as the PTSD kicks in), in real time. One of the original
    researchers had already written some stream-decoding code to start with; I
    had a go at doing it in different ways, and was able to achieve close to a
    2:1 speedup on the DEC Alpha I was using for testing.

    Then I ran the same code on the Windows NT box which was going to be used
    as the actual deployment platform ... and most of the speedup went away.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From c186282@c186282@nnada.net to comp.os.linux.misc on Fri Sep 12 00:15:41 2025
    From Newsgroup: comp.os.linux.misc

    On 9/11/25 11:11 AM, John McCue wrote:
    Follow-ups trimmed to comp.os.linux.misc

    In comp.os.linux.misc Lester Thorpe <lt@gnu.rocks> wrote:
    Program optimization is essential, yet it is difficult
    to arrive at a best method.
    <snip>
    I propose that GNU/Linux programmers should determine
    the best options and then publish these recommendations
    in the source tree to guide the interested user.

    I find O1 is good enough for all programs I create.

    Yep.

    As for the actual written code: write it once, wait
    a week or two, then write it over again, better.
    I tended to proto in Python, then re-do in Pascal.
    The re-do was always a lot tighter/smarter.

    To me, testing and retesting different optimizations is a
    huge waste of time and at most you might save 1 second :)

    Yep, esp at the compiler level.

    Refined source might improve things 25% or so - drop
    unneeded/weird steps.

    For programs created by others, I keep whatever settings
    they use, since they know much better than I do.

    Well, not necessarily ....
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From c186282@c186282@nnada.net to comp.os.linux.misc on Fri Sep 12 00:25:46 2025
    From Newsgroup: comp.os.linux.misc

    On 9/11/25 12:30 PM, John Ames wrote:
    On Thu, 11 Sep 2025 15:11:24 -0000 (UTC)
    John McCue <jmclnx@gmail.com.invalid> wrote:

    I find O1 is good enough for all programs I create.

    To me, testing and retesting different optimizations is a
    huge waste of time and at most you might save 1 second :)

    Even if you're an optimization freak (and there's nothing wrong with
    that,) the efficacy of tweaks like loop unrolling is highly dependent
    on machine particulars (cache size, etc.) - it's difficult if not
    impossible to establish a one-size-fits-all recipe for True Optimum
    Performance that could be handed out to non-freaks, as is being
    suggested here. Some level of tweaking may be warranted (e.g. unrolling
    loops in a way that suits the particular algorithm,) but there's little
    point trying to generalize deep grease-monkey fine-tuning across *all
    target systems* even for a single distro, let alone The World At Large.

    Best tactic - proto, then re-write a few weeks later.
    The 2nd take will be smarter, tighter.

    Compiler options ... only deliver slight improvements.
    Best used if you need SMALLER, not faster.
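
    For instance, a size-oriented build with standard GCC/binutils
    flags (program name hypothetical):

      gcc -Os -ffunction-sections -fdata-sections \
          -Wl,--gc-sections -o app app.c   # favor size, drop dead code
      strip app                            # remove symbol tables
      size app                             # compare against an -O2 build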

    Had one microcontroller app I kept tweaking for
    five or six generations. Each time I could zap
    unnecessary steps. Got it down nearly 50% from
    the original - saved power (it was a solar-powered
    field app so that was kinda important).

    New "AI" code-writing ... don't count on much
    "optimization". The AI won't really "get it".
    It may work - but be kinda messy. If it's a
    popular app, figure the power/time consumption
    of 'messy' for millions/billions of users.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From c186282@c186282@nnada.net to comp.os.linux.misc on Fri Sep 12 03:02:09 2025
    From Newsgroup: comp.os.linux.misc

    On 9/11/25 6:38 PM, Lawrence D’Oliveiro wrote:
    On Thu, 11 Sep 2025 15:11:24 -0000 (UTC), John McCue wrote:

    To me, testing and retesting different optimizations is a huge waste of
    time and at most you might save 1 second :)

    I was once hired to build an app in MATLAB for decoding and displaying
    multiple channels of EEG data, using its built-in GUI tools (momentary
    shudder as the PTSD kicks in), in real time. One of the original
    researchers had already written some stream-decoding code to start with;
    I had a go at doing it in different ways, and was able to achieve close
    to a 2:1 speedup on the DEC Alpha I was using for testing.

    Then I ran the same code on the Windows NT box which was going to be used
    as the actual deployment platform ... and most of the speedup went away.

    THE best op is to proto, look/think for a few weeks,
    then re-write.

    That will do FAR more than any compiler tweaks.

    I like to proto in Python, then re-write in Pascal
    or maybe K&R 'C' depending.

    New - "AI" generated code. The "AI" does NOT
    "get it". It's code will be MESSY - 'Lego'.
    Maybe not so bad for random utilities, but if
    the app is meant for millions/billions then
    shitty code sucks a LOT more CPU cycles and
    energy.

    "AI" ... at present it's gonna suck maybe
    25% of the entire global energy output just
    so it can pretend to be idiot people. BIZ
    loves it because they think it can replace
    all those annoying HUMANS. Alas, disemployed
    humans can't BUY their stuff so .......

    Can't get there from here.

    Sorry.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lester Thorpe@lt@gnu.rocks to comp.os.linux.misc on Fri Sep 12 08:19:07 2025
    From Newsgroup: comp.os.linux.misc

    On Fri, 12 Sep 2025 03:02:09 -0400, c186282 wrote:


    THE best op is to proto, look/think for a few weeks,
    then re-write.

    That will do FAR more than any compiler tweaks.


    Everyone is missing the main point.

    I am referring to optimizing code that is already published
    and available, e.g. the average GNU/Linux package.

    This code cannot be (easily) rewritten by the user, and the
    only way to optimize is at build time, which can be quite
    effective. I have experienced a performance increase of up
    to 40% using just compiler options.
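
    The kind of sweep that can turn up such gains, sketched for a
    package with a rebuildable benchmark (the ./bench step is
    hypothetical):

      for opt in "-O2" "-O3" "-O3 -march=native -flto"; do
          make clean >/dev/null && make CFLAGS="$opt" >/dev/null
          /usr/bin/time -f "$opt: %e s" ./bench
      done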

    But finding the best options can at times be difficult and
    that's why the code author should provide guidance.
    --
    Gentoo: the only road to GNU/Linux perfection.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Farley Flud@fsquared@fsquared.linux to comp.os.linux.misc on Fri Sep 12 10:17:25 2025
    From Newsgroup: comp.os.linux.misc

    On Fri, 12 Sep 2025 03:02:09 -0400, c186282 wrote:


    THE best op is to proto, look/think for a few weeks,
    then re-write.

    That will do FAR more than any compiler tweaks.


    Not necessarily.

    Consider the Automatically Tuned Linear Algebra Software (ATLAS):

    <https://math-atlas.sourceforge.net/>

    Linear algebra (i.e. matrix operations) software is used as a
    standard benchmark for all supercomputers.

    The ATLAS program will automatically tune itself, using compiler options,
    for the best performance on a particular machine.

    ATLAS has some pre-determined options for certain CPUs, but if a CPU
    is not on the list, ATLAS will undergo automatic tuning in which
    different options are tried and compared.
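
    For reference, roughly how a tuned ATLAS build goes per its install
    notes (it must be configured from a separate build directory):

      mkdir build && cd build
      ../configure            # probes the CPU, picks or searches kernels
      make build              # the timing search happens here; can take hours
      make check && make install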

    Compiler tweaks can make a big difference.

    The original point of this thread is that all software should
    emulate ATLAS to some extent.
    --
    Hail Linux! Hail FOSS! Hail Stallman!
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From c186282@c186282@nnada.net to comp.os.linux.misc on Fri Sep 12 06:51:53 2025
    From Newsgroup: comp.os.linux.misc

    On 9/12/25 6:17 AM, Farley Flud wrote:
    On Fri, 12 Sep 2025 03:02:09 -0400, c186282 wrote:


    THE best op is to proto, look/think for a few weeks,
    then re-write.

    That will do FAR more than any compiler tweaks.


    Not necessarily.

    Should ......... always did for me .......

    Consider the Automatically Tuned Linear Algebra Software (ATLAS):

    <https://math-atlas.sourceforge.net/>

    Ugh ....

    Gimme 'pure' 'C' or Pascal or FORTRAN.

    Linear algebra is not the best solution to everything.

    Linear algebra (i.e. matrix operations) software is used as a standard benchmark for all supercomputers.

    Not interested in that sort of benchmark.

    The ATLAS program will automatically tune itself, using compiler options,
    for the best performance on a particular machine.

    But we're not really talking 'compiler options' here
    but good/better/best source code. If your source is
    messy then no compiler can help you much.

    ATLAS has some pre-determined options for certain CPUs but if a CPU
    is not on the list ATLAS will then undergo an automatic tuning wherein different options are tried and compared.

    Compiler tweaks can make a big difference.

    Depends. Garbage IN = Garbage OUT.

    The original point of this thread is that all software should
    emulate ATLAS to some extent.

    Ummm ... maybe in deep theory ...... but that's not
    how the 99% will do it. "Hello World" does NOT need
    this approach.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From The Natural Philosopher@tnp@invalid.invalid to comp.os.linux.misc on Fri Sep 12 15:45:45 2025
    From Newsgroup: comp.os.linux.misc

    On 12/09/2025 11:17, Farley Flud wrote:
    The ATLAS program will automatically tune itself, using compiler options,
    for the best performance on a particular machine.

    How does it know what machine is the target?
    --
    There is something fascinating about science. One gets such wholesale
    returns of conjecture out of such a trifling investment of fact.

    Mark Twain

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Farley Flud@fsquared@fsquared.linux to comp.os.linux.misc on Fri Sep 12 15:20:39 2025
    From Newsgroup: comp.os.linux.misc

    On Fri, 12 Sep 2025 15:45:45 +0100, The Natural Philosopher wrote:

    On 12/09/2025 11:17, Farley Flud wrote:
    The ATLAS program will automatically tune itself, using compiler options,
    for the best performance on a particular machine.

    How does it know what machine is the target?


    The tuning occurs during build time. The "target" is the machine upon which
    it is being built.
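
    GCC's own version of "the target is the build machine" is
    -march=native; this shows what it resolves to on a given host:

      gcc -march=native -Q --help=target | grep -E 'march|mtune'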

    No binaries are distributed. Only the source code is available.

    However, some GNU/Linux distros will include binary Atlas packages
    but these are necessarily sub-optimal builds. Check out the blurb
    from Fedora:

    <https://www.rpmfind.net/linux/RPM/fedora/devel/rawhide/x86_64/a/atlas-3.10.3-30.fc43.x86_64.html>
    --
    Hail Linux! Hail FOSS! Hail Stallman!
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From John Ames@commodorejohn@gmail.com to comp.os.linux.misc on Fri Sep 12 10:11:03 2025
    From Newsgroup: comp.os.linux.misc

    On Fri, 12 Sep 2025 15:45:45 +0100
    The Natural Philosopher <tnp@invalid.invalid> wrote:

    The ATLAS program will automatically tune itself, using compiler
    options, for the best performance on a particular machine.

    How does it know what machine is the target?

    Presumably it targets the machine on which it's running. Reminds me a
    bit of one of the few genuinely smart things MS's .NET framework does -
    part of the install/update process involves it auto-profiling/tuning
    its core VM interpreter/library in situ so it can accurately benchmark
    itself. That only affects raw VM performance (a bad algorithm running
    on top of a fast VM is still gonna suck,) and it'd be more involved to
    do something comparable with a native-code application (dynamic linking
    might save you the trouble of a full recompile, but it'd still be
    non-trivial,) but it *is* a nice touch.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From The Natural Philosopher@tnp@invalid.invalid to comp.os.linux.misc on Sat Sep 13 07:57:24 2025
    From Newsgroup: comp.os.linux.misc

    On 12/09/2025 16:20, Farley Flud wrote:
    On Fri, 12 Sep 2025 15:45:45 +0100, The Natural Philosopher wrote:

    On 12/09/2025 11:17, Farley Flud wrote:
    The ATLAS program will automatically tune itself, using compiler
    options, for the best performance on a particular machine.

    How does it know what machine is the target?


    The tuning occurs during build time. The "target" is the machine upon
    which it is being built.

    That will do really nicely when I am compiling for an ARM 2040 on my
    x86 machine, then...
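
    (Cross-compiling is exactly where build-host tuning goes wrong. A
    sketch, assuming "ARM 2040" means the RP2040's Cortex-M0+: the
    target must be named explicitly, since -march=native would describe
    the x86 build host instead.)

      arm-none-eabi-gcc -mcpu=cortex-m0plus -mthumb -Os -o app.elf app.c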

    No binaries are distributed. Only the source code is available.

    However, some GNU/Linux distros will include binary Atlas packages
    but these are necessarily sub-optimal builds. Check out the blurb
    from Fedora:

    https://www.rpmfind.net/linux/RPM/fedora/devel/rawhide/x86_64/a/atlas-3.10.3-30.fc43.x86_64.html



    --
    “Politics is the art of looking for trouble, finding it everywhere,
    diagnosing it incorrectly and applying the wrong remedies.”
    ― Groucho Marx

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From =?UTF-8?Q?St=C3=A9phane?= CARPENTIER@sc@fiat-linux.fr to comp.os.linux.misc on Sat Sep 13 13:44:41 2025
    From Newsgroup: comp.os.linux.misc

    On 12-09-2025, Lester Thorpe <lt@gnu.rocks> wrote:
    On Fri, 12 Sep 2025 03:02:09 -0400, c186282 wrote:


    THE best op is to proto, look/think for a few weeks,
    then re-write.

    That will do FAR more than any compiler tweaks.


    Everyone is missing the main point.

    Which one?
    - That you are a fraud? Nope: I know it.
    - That you don't know how to optimize compilation? Nope: I know it.
    - That you can only copy/paste code? Nope: I know it.
    - That you are a distro lackey? Nope: I know it.
    - That the more you speak about something, the less you know
    about it? Nope: I know it.
    - That you are a Windows fanboy trying to make Linux users look like
    morons? Nope: I know it.

    I am referring to optimizing code that is already published
    and available, e.g. the average GNU/Linux package.

    You mean that the guys who wrote and published the code know how to
    compile it? Or do you mean that the people competent enough to write
    code for a great tool are too stupid to know how to compile it?

    Do you really understand how your claim is, at the same time, stupid
    and inconsistent? You say at the same time that they know what they
    are doing and that they don't know what they are doing. You just
    explained that you need random people to help you find a general way
    to sort out what experts do well and what they don't.

    I have experienced a performance increase of up to 40% using just
    compiler options.

    I don't believe that. And your last video proves that it's a lie.

    But finding the best options can at times be difficult

    Agreed. But I don't believe you can find them. And I believe the
    distro managers, helped by the people who provided the code, can do
    it. In any case, it would take me hours to find better options than
    what's provided by the distro managers, helped by the package
    producers, to get noticeable results on my own computer. Another way
    to state it: spending hours to save a few seconds each month is a
    waste of my precious time.
    --
    If you have some time to waste:
    https://scarpet42.gitlab.io
    --- Synchronet 3.21a-Linux NewsLink 1.2