• Re: Hint For GNU/Linux Programmers

    From Lester Thorpe@lt@gnu.rocks to comp.os.linux.misc on Thu Sep 11 18:42:02 2025
    From Newsgroup: comp.os.linux.misc

    On Thu, 11 Sep 2025 15:11:24 -0000 (UTC), John McCue wrote:


    To me, testing and retesting different optimizations is a
    huge waste of time and at most you might save 1 second :)


    That was my original point and the reason I suggest that
    programmers should do the dirty work for the user.

    But seconds can quickly add up. For audio/video encoding and
    math/physics simulations, optimization can mean the difference
    between 20 minutes and 1 hour, which is highly significant.

    There is an "ancient" program called "paranoia" which evaluates
    a machine's floating-point accuracy:

    <https://netlib.sandia.gov/paranoia/paranoia.c>

    Compiling it with your "-O1" would produce erroneous results.
    In this case, "-O0" is required.
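
    For example, with standard GCC flags (a sketch; -lm is assumed for
    the math routines):

      # Build paranoia as the test expects: no optimization at all.
      gcc -O0 -o paranoia paranoia.c -lm

      # If optimizing anyway, -ffloat-store keeps floating-point
      # variables out of extended-precision registers, which may
      # restore the expected rounding behavior:
      gcc -O2 -ffloat-store -o paranoia paranoia.c -lm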

    Granted, this program predates Linux and GCC, but other, more
    recent programs may behave similarly with respect to optimization.

    Therefore, it should be the programmer's responsibility to
    indicate the correct optimization.
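
    A sketch of what such guidance could look like for an autoconf-style
    package (the flags here are illustrative, not from any real project):

      ./configure CFLAGS="-O2"    # author-recommended default
      ./configure CFLAGS="-O0"    # required for the floating-point tests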


    For programs created by others, I keep whatever settings
    they use, since they know much better than I do.
    --
    Gentoo: the only road to GNU/Linux perfection.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.os.linux.misc on Thu Sep 11 22:38:45 2025
    From Newsgroup: comp.os.linux.misc

    On Thu, 11 Sep 2025 15:11:24 -0000 (UTC), John McCue wrote:

    To me, testing and retesting different optimizations is a huge waste of
    time and at most you might save 1 second :)

    I was once hired to build an app in MATLAB for decoding and displaying
    multiple channels of EEG data, using its built-in GUI tools (momentary
    shudder as the PTSD kicks in), in real time. One of the original
    researchers had already written some stream-decoding code to start with; I
    had a go at doing it in different ways, and was able to achieve close to a
    2:1 speedup on the DEC Alpha I was using for testing.

    Then I ran the same code on the Windows NT box which was going to be used
    as the actual deployment platform ... and most of the speedup went away.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From c186282@c186282@nnada.net to comp.os.linux.misc on Fri Sep 12 00:15:41 2025
    From Newsgroup: comp.os.linux.misc

    On 9/11/25 11:11 AM, John McCue wrote:
    Follow-ups trimmed to comp.os.linux.misc

    In comp.os.linux.misc Lester Thorpe <lt@gnu.rocks> wrote:
    Program optimization is essential, yet it is difficult
    to arrive at a best method.
    <snip>
    I propose that GNU/Linux programmers should determine
    the best options and then publish these recommendations
    in the source tree to guide the interested user.

    I find O1 is good enough for all programs I create.

    Yep.

    As for the actual written code: write it once, wait
    a week or two, then write it over again, better.
    I tended to proto in Python, then re-do in Pascal.
    The re-do was always a lot tighter/smarter.

    To me, testing and retesting different optimizations is a
    huge waste of time and at most you might save 1 second :)

    Yep, esp at the compiler level.

    Refined source might improve things 25% or so - drop
    unneeded/weird steps.

    For programs created by others, I keep whatever settings
    they use, since they know much better than I do.

    Well, not necessarily ....
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From c186282@c186282@nnada.net to comp.os.linux.misc on Fri Sep 12 00:25:46 2025
    From Newsgroup: comp.os.linux.misc

    On 9/11/25 12:30 PM, John Ames wrote:
    On Thu, 11 Sep 2025 15:11:24 -0000 (UTC)
    John McCue <jmclnx@gmail.com.invalid> wrote:

    I find O1 is good enough for all programs I create.

    To me, testing and retesting different optimizations is a
    huge waste of time and at most you might save 1 second :)

    Even if you're an optimization freak (and there's nothing wrong with
    that,) the efficacy of tweaks like loop unrolling is highly dependent
    on machine particulars (cache size, etc.) - it's difficult if not
    impossible to establish a one-size-fits-all recipe for True Optimum
    Performance that could be handed out to non-freaks, as is being
    suggested here. Some level of tweaking may be warranted (e.g. unrolling
    loops in a way that suits the particular algorithm,) but there's little
    point trying to generalize deep grease-monkey fine-tuning across *all
    target systems* even for a single distro, let alone The World At Large.

    Best tactic - proto, then re-write a few weeks later.
    The 2nd take will be smarter, tighter.

    Compiler options ... only deliver slight improvements.
    Best used if you need SMALLER, not faster.
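
    For instance, a size-oriented build with standard GCC/binutils
    flags (program name hypothetical):

      gcc -Os -ffunction-sections -fdata-sections \
          -Wl,--gc-sections -o app app.c   # favor size, drop dead code
      strip app                            # remove symbol tables
      size app                             # compare against an -O2 build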

    Had one microcontroller app I kept tweaking for
    five or six generations. Each time I could zap
    unnecessary steps. Got it down nearly 50% from
    the original - saved power (it was a solar-powered
    field app so that was kinda important).

    New "AI" code-writing ... don't count on much
    "optimization". The AI won't really "get it".
    It may work - but be kinda messy. If it's a
    popular app, figure the power/time consumption
    of 'messy' for millions/billions of users.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From c186282@c186282@nnada.net to comp.os.linux.misc on Fri Sep 12 03:02:09 2025
    From Newsgroup: comp.os.linux.misc

    On 9/11/25 6:38 PM, Lawrence D’Oliveiro wrote:
    On Thu, 11 Sep 2025 15:11:24 -0000 (UTC), John McCue wrote:

    To me, testing and retesting different optimizations is a huge waste of
    time and at most you might save 1 second :)

    I was once hired to build an app in MATLAB for decoding and displaying
    multiple channels of EEG data, using its built-in GUI tools (momentary
    shudder as the PTSD kicks in), in real time. One of the original
    researchers had already written some stream-decoding code to start with;
    I had a go at doing it in different ways, and was able to achieve close
    to a 2:1 speedup on the DEC Alpha I was using for testing.

    Then I ran the same code on the Windows NT box which was going to be used
    as the actual deployment platform ... and most of the speedup went away.

    THE best op is to proto, look/think for a few weeks,
    then re-write.

    That will do FAR more than any compiler tweaks.

    I like to proto in Python, then re-write in Pascal
    or maybe K&R 'C' depending.

    New - "AI" generated code. The "AI" does NOT
    "get it". It's code will be MESSY - 'Lego'.
    Maybe not so bad for random utilities, but if
    the app is meant for millions/billions then
    shitty code sucks a LOT more CPU cycles and
    energy.

    "AI" ... at present it's gonna suck maybe
    25% of the entire global energy output just
    so it can pretend to be idiot people. BIZ
    loves it because they think it can replace
    all those annoying HUMANS. Alas, disemployed
    humans can't BUY their stuff so .......

    Can't get there from here.

    Sorry.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lester Thorpe@lt@gnu.rocks to comp.os.linux.misc on Fri Sep 12 08:19:07 2025
    From Newsgroup: comp.os.linux.misc

    On Fri, 12 Sep 2025 03:02:09 -0400, c186282 wrote:


    THE best op is to proto, look/think for a few weeks,
    then re-write.

    That will do FAR more than any compiler tweaks.


    Everyone is missing the main point.

    I am referring to optimizing code that is already published
    and available, e.g. the average GNU/Linux package.

    This code cannot be (easily) rewritten by the user, and the
    only way to optimize is at build time, which can be quite
    effective. I have experienced a performance increase of up
    to 40% using just compiler options.
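
    The kind of sweep that can turn up such gains, sketched for a
    package with a rebuildable benchmark (the ./bench step is
    hypothetical):

      for opt in "-O2" "-O3" "-O3 -march=native -flto"; do
          make clean >/dev/null && make CFLAGS="$opt" >/dev/null
          /usr/bin/time -f "$opt: %e s" ./bench
      done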

    But finding the best options can at times be difficult and
    that's why the code author should provide guidance.
    --
    Gentoo: the only road to GNU/Linux perfection.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Farley Flud@fsquared@fsquared.linux to comp.os.linux.misc on Fri Sep 12 10:17:25 2025
    From Newsgroup: comp.os.linux.misc

    On Fri, 12 Sep 2025 03:02:09 -0400, c186282 wrote:


    THE best op is to proto, look/think for a few weeks,
    then re-write.

    That will do FAR more than any compiler tweaks.


    Not necessarily.

    Consider the Automatically Tuned Linear Algebra Software (ATLAS):

    <https://math-atlas.sourceforge.net/>

    Linear algebra (i.e. matrix operations) software is used as a
    standard benchmark for all supercomputers.

    The ATLAS program will automatically tune itself, using compiler options,
    for the best performance on a particular machine.

    ATLAS has some pre-determined options for certain CPUs, but if a CPU
    is not on the list, ATLAS will undergo automatic tuning in which
    different options are tried and compared.
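
    For reference, roughly how a tuned ATLAS build goes per its install
    notes (it must be configured from a separate build directory):

      mkdir build && cd build
      ../configure            # probes the CPU, picks or searches kernels
      make build              # the timing search happens here; can take hours
      make check && make install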

    Compiler tweaks can make a big difference.

    The original point of this thread is that all software should
    emulate ATLAS to some extent.
    --
    Hail Linux! Hail FOSS! Hail Stallman!
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From c186282@c186282@nnada.net to comp.os.linux.misc on Fri Sep 12 06:51:53 2025
    From Newsgroup: comp.os.linux.misc

    On 9/12/25 6:17 AM, Farley Flud wrote:
    On Fri, 12 Sep 2025 03:02:09 -0400, c186282 wrote:


    THE best op is to proto, look/think for a few weeks,
    then re-write.

    That will do FAR more than any compiler tweaks.


    Not necessarily.

    Should ......... always did for me .......

    Consider the Automatically Tuned Linear Algebra Software (ATLAS):

    <https://math-atlas.sourceforge.net/>

    Ugh ....

    Gimme 'pure' 'C' or Pascal or FORTRAN.

    Linear algebra is not the best solution to everything.

    Linear algebra (i.e. matrix operations) software is used as a standard benchmark for all supercomputers.

    Not interested in that sort of benchmark.

    The ATLAS program will automatically tune itself, using compiler options,
    for the best performance on a particular machine.

    But we're not really talking 'compiler options' here
    but good/better/best source code. If your source is
    messy then no compiler can help you much.

    ATLAS has some pre-determined options for certain CPUs but if a CPU
    is not on the list ATLAS will then undergo an automatic tuning wherein different options are tried and compared.

    Compiler tweaks can make a big difference.

    Depends. Garbage IN = Garbage OUT.

    The original point of this thread is that all software should
    emulate ATLAS to some extent.

    Ummm ... maybe in deep theory ...... but that's not
    how the 99% will do it. "Hello World" does NOT need
    this approach.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From The Natural Philosopher@tnp@invalid.invalid to comp.os.linux.misc on Fri Sep 12 15:45:45 2025
    From Newsgroup: comp.os.linux.misc

    On 12/09/2025 11:17, Farley Flud wrote:
    The ATLAS program will automatically tune itself, using compiler options,
    for the best performance on a particular machine.

    How does it know what machine is the target?
    --
    There is something fascinating about science. One gets such wholesale
    returns of conjecture out of such a trifling investment of fact.

    Mark Twain

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Farley Flud@fsquared@fsquared.linux to comp.os.linux.misc on Fri Sep 12 15:20:39 2025
    From Newsgroup: comp.os.linux.misc

    On Fri, 12 Sep 2025 15:45:45 +0100, The Natural Philosopher wrote:

    On 12/09/2025 11:17, Farley Flud wrote:
    The ATLAS program will automatically tune itself, using compiler options,
    for the best performance on a particular machine.

    How does it know what machine is the target?


    The tuning occurs during build time. The "target" is the machine upon which
    it is being built.
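
    GCC's own version of "the target is the build machine" is
    -march=native; this shows what it resolves to on a given host:

      gcc -march=native -Q --help=target | grep -E 'march|mtune'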

    No binaries are distributed. Only the source code is available.

    However, some GNU/Linux distros will include binary Atlas packages
    but these are necessarily sub-optimal builds. Check out the blurb
    from Fedora:

    <https://www.rpmfind.net/linux/RPM/fedora/devel/rawhide/x86_64/a/atlas-3.10.3-30.fc43.x86_64.html>
    --
    Hail Linux! Hail FOSS! Hail Stallman!
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From John Ames@commodorejohn@gmail.com to comp.os.linux.misc on Fri Sep 12 10:11:03 2025
    From Newsgroup: comp.os.linux.misc

    On Fri, 12 Sep 2025 15:45:45 +0100
    The Natural Philosopher <tnp@invalid.invalid> wrote:

    The ATLAS program will automatically tune itself, using compiler
    options, for the best performance on a particular machine.

    How does it know what machine is the target?

    Presumably it targets the machine on which it's running. Reminds me a
    bit of one of the few genuinely smart things MS's .NET framework does -
    part of the install/update process involves it auto-profiling/tuning
    its core VM interpreter/library in situ so it can accurately benchmark
    itself. That only affects raw VM performance (a bad algorithm running
    on top of a fast VM is still gonna suck,) and it'd be more involved to
    do something comparable with a native-code application (dynamic linking
    might save you the trouble of a full recompile, but it'd still be
    non-trivial,) but it *is* a nice touch.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From The Natural Philosopher@tnp@invalid.invalid to comp.os.linux.misc on Sat Sep 13 07:57:24 2025
    From Newsgroup: comp.os.linux.misc

    On 12/09/2025 16:20, Farley Flud wrote:
    On Fri, 12 Sep 2025 15:45:45 +0100, The Natural Philosopher wrote:

    On 12/09/2025 11:17, Farley Flud wrote:
    The ATLAS program will automatically tune itself, using compiler
    options, for the best performance on a particular machine.

    How does it know what machine is the target?


    The tuning occurs during build time. The "target" is the machine upon
    which it is being built.

    That will do really nicely when I am compiling for an ARM 2040 on my
    x86 machine, then...
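
    (Cross-compiling is exactly where build-host tuning goes wrong. A
    sketch, assuming "ARM 2040" means the RP2040's Cortex-M0+: the
    target must be named explicitly, since -march=native would describe
    the x86 build host instead.)

      arm-none-eabi-gcc -mcpu=cortex-m0plus -mthumb -Os -o app.elf app.c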

    No binaries are distributed. Only the source code is available.

    However, some GNU/Linux distros will include binary Atlas packages
    but these are necessarily sub-optimal builds. Check out the blurb
    from Fedora:

    https://www.rpmfind.net/linux/RPM/fedora/devel/rawhide/x86_64/a/atlas-3.10.3-30.fc43.x86_64.html



    --
    “Politics is the art of looking for trouble, finding it everywhere,
    diagnosing it incorrectly and applying the wrong remedies.”
    ― Groucho Marx

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From =?UTF-8?Q?St=C3=A9phane?= CARPENTIER@sc@fiat-linux.fr to comp.os.linux.misc on Sat Sep 13 13:44:41 2025
    From Newsgroup: comp.os.linux.misc

    On 12-09-2025, Lester Thorpe <lt@gnu.rocks> wrote:
    On Fri, 12 Sep 2025 03:02:09 -0400, c186282 wrote:


    THE best op is to proto, look/think for a few weeks,
    then re-write.

    That will do FAR more than any compiler tweaks.


    Everyone is missing the main point.

    Which one?
    - That you are a fraud? Nope: I know it.
    - That you don't know how to optimize compilation? Nope: I know it.
    - That you can only copy/paste code? Nope: I know it.
    - That you are a distro lackey? Nope: I know it.
    - That the more you speak about something, the less you know
    about it? Nope: I know it.
    - That you are a Windows fanboy trying to make Linux users look like
    morons? Nope: I know it.

    I am referring to optimizing code that is already published
    and available, e.g. the average GNU/Linux package.

    You mean that the guys who wrote and published the code know how to
    compile it? Or do you mean that the people competent enough to write
    code for a great tool are too stupid to know how to compile it?

    Do you really understand how your claim is, at the same time, stupid
    and inconsistent? You say at the same time that they know what they
    are doing and that they don't know what they are doing. You just
    explained that you need random people to help you find a general way
    to sort out what experts do well and what they don't.

    I have experienced a performance increase of up to 40% using just
    compiler options.

    I don't believe that. And your last video proves that it's a lie.

    But finding the best options can at times be difficult

    Agreed. But I don't believe you can find them. And I believe the
    distro managers, helped by the people who provided the code, can do
    it. In any case, it would take me hours to find better options than
    what's provided by the distro managers, helped by the package
    producers, to get noticeable results on my own computer. Another way
    to state it: spending hours to save a few seconds each month is a
    waste of my precious time.
    --
    If you have some time to waste:
    https://scarpet42.gitlab.io
    --- Synchronet 3.21a-Linux NewsLink 1.2