• Re: Article of Melissa O'Neill

    From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Wed Jan 7 10:51:16 2026
    From Newsgroup: comp.lang.c

    Michael S <already5chosen@yahoo.com> wrote:
    I experimented a bit more (in fact, more like a lot more) with
    the test batteries of L’Ecuyer. It led me to the conclusion that an
    occasional failure in either the middle or the big battery means nothing.
    Sometimes even a crypto-quality PRNG does not pass one or another test.
    Then you try to reproduce it and see that with any other seed that you
    try the failure does not happen.
    All in all, it makes me more suspicious of PRNGs that consistently pass
    both batteries with various seeds. I start to see it as a sign of a
    PRNG being rigged to pass tests.

    Well, that depends on the tests and the thresholds in the tests.
    Some tests, when fed with a truly random source, will produce
    very small variation in the results. With a generous threshold
    such a test will essentially never fail for a truly random source.
    OTOH when the expected variation of the results is larger and
    the threshold is tight, then a truly random source will fail the
    test from time to time. And if you test long enough you should
    be able to estimate the probability of failure and possibly compare
    it with the theoretical result if available.

    To tell the truth, I do not know what failures of a crypto-quality PRNG
    mean. It may mean that the test is of the tight kind that is supposed
    to fail from time to time for a truly random source. Or it may mean
    that the PRNG improves the crypto part at the cost of statistics. That
    is, a non-uniform distribution of the output is not a problem in
    crypto applications. Simply, for crypto purposes future output
    should not be predictable from current and past output alone.
    And a slightly non-uniform distribution can increase the probability
    of test failure enough that you can observe such failures.

    BTW: You mentioned using a counter and a hardware AES128 round.
    Given a cheap AES128 round I would try 128-bit state and an AES128
    round as the state update. I do not know if hardware AES128 is
    fast enough to make it worthwhile, but using an AES128 round as a
    state update should be much better than scrambling a counter.

    The above does not apply to the scomp_LinearComp() failures of mt19937.
    Those failures are very consistent. I just don't consider them
    significant for my own use of PRNGs or for any other uses of PRNGs that
    I have personally ever encountered.

    Overall, the experience strengthened my position that the general
    wisdom, previously shared by O'Neill herself, got it right: in the
    absence of special considerations people should select mt19937, and
    especially mt19937-64, as their default PRNG of choice.
    Looking closer, apart from its properties of randomness and apart from
    its huge period

    A huge period alone is easy. AFAICS matrix variants of the LCG can
    easily get quite large periods. I did not test matrix LCGs,
    but on statistical tests they should be essentially as good
    as multiprecision LCGs, while being cheaper to implement.

    Just to be clear, I mean the equation x_{n+1} = A*x_n + b, where x_n
    is a vector representing the n-th state, A is a matrix and b is a
    vector. Matrix A may be somewhat sparse, that is, have a limited
    number of non-zero entries, and some entries may be simple, like
    1 or -1. With a proper choice of A and b the period should be
    comparable with the number of available states.

    I see no reason to go for very long periods; already 512 bits
    of state allow periods which in practice should be indistinguishable
    from an infinite period.

    (which does not matter for me) I started to appreciate
    mt19937 for the following properties that I was not aware of before:
    - it does not use a multiplier, so it is a good fit for embedded
    systems that have no (or a very slow) multiplier in HW.

    The biggest MCU with no hardware multiplier that I have has 2kB RAM.
    I do not want mt19937 on such a machine. 64-bit multiplication
    on an 8-bitter with a hardware multiplier may be slow, so a 64-bit LCG
    (and improvements based on it) may be slow. But multiplication
    nicely spreads out bits, so it is not clear to me if there is an
    equally good cheaper alternative. If needed I would investigate
    matrix LCGs; they may be slightly cheaper.

    - the algorithm is very SIMD-friendly, so optimized implementations
    can be very fast on modern x86 and ARM64 hardware.

    Just the size of the state puts a limit on how fast it can be. And the
    size of the state means that it will compete for cache with user data.
    BTW: AFAICS a matrix LCG can be SIMD-friendly too.

    The latter property also means that a very fast FPGA implementation is
    easily possible, as long as the designer is willing to throw at it a
    moderate amount of logic resources.

    What I said above does not mean that the PCG generators of O'Neill
    have no place. Intuitively, they appear not bad. But the theory is
    unproven, an optimized implementation is likely slower than optimized
    mt19937, and the claimed "security" advantages are nonsense, as
    admitted later by O'Neill herself.
    And, as said above, I no longer trust her empirical methodology, based
    on the work of L’Ecuyer.
    So, PCG generators are a valuable addition to the toolbox, but not good
    enough to change my default.

    I agree that ATM it is not entirely clear if PCGs are as good as
    suggested by tests. But I am surprised by your opinion about
    speed. I did not analyze either of them deeply, but PCGs are way
    simpler, so I would expect them to be faster.
    --
    Waldek Hebisch
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed Jan 7 07:39:08 2026
    From Newsgroup: comp.lang.c

    John McCue <jmclnx@gmail.com.invalid> writes:

    Michael Sanders <porkchop@invalid.foo> wrote:

    Is it incorrect to use 0 (zero) to seed srand()?

    int seed = (argc >= 2 && strlen(argv[1]) == 9)
               ? atoi(argv[1])
               : (int)(time(NULL) % 900000000 + 100000000);

    srand(seed);

    I like to just read /dev/urandom when I need a random
    number. Seems easier and more portable across Linux &
    the *BSDs.

    int s;
    read(fd, &s, sizeof(int));

    Apples and oranges. Many applications that use random numbers
    need a stream of numbers that is deterministic and reproducible,
    which /dev/urandom is not.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed Jan 7 07:46:40 2026
    From Newsgroup: comp.lang.c

    Michael S <already5chosen@yahoo.com> writes:

    [regarding rand() and srand()]

    Pay attention that the C Standard only requires that the same seed
    always produce the same sequence. There is no requirement that different
    seeds have to produce different sequences.
    So, for the generator in your example, an implementation like the one
    below would be fully legal. Personally, I wouldn't even consider it
    particularly poor quality:

    void srand(unsigned seed) { init = seed | 1; }

    It seems better to do, for example,

    void srand(unsigned seed) { init = seed - !seed; }

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed Jan 7 07:50:09 2026
    From Newsgroup: comp.lang.c

    antispam@fricas.org (Waldek Hebisch) writes:

    [...]
    There is a paper "PCG: A Family of Simple Fast Space-Efficient
    Statistically Good Algorithms for Random Number Generation"
    by M. O'Neill where she gives a family of algorithms and runs
    several statistical tests against known algorithms. Mersenne
    Twister does not look good in tests. If you have enough (128) bits
    LCGs do pass tests. A bunch of generators with 64-bit state also
    passes tests. So the only reason to prefer Mersenne Twister is
    that it is implemented in available libraries. Otherwise it is
    not so good, has a large state and needs more execution time
    than alternatives.

    Interesting paper. Thank you.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Wed Jan 7 18:14:14 2026
    From Newsgroup: comp.lang.c

    On Wed, 07 Jan 2026 07:46:40 -0800
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    [regarding rand() and srand()]

    Pay attention that the C Standard only requires that the same seed
    always produce the same sequence. There is no requirement that
    different seeds have to produce different sequences.
    So, for the generator in your example, an implementation like the
    one below would be fully legal. Personally, I wouldn't even consider
    it particularly poor quality:

    void srand(unsigned seed) { init = seed | 1; }

    It seems better to do, for example,

    void srand(unsigned seed) { init = seed - !seed; }


    Yes, it is better.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Wed Jan 7 08:41:25 2026
    From Newsgroup: comp.lang.c

    Michael S <already5chosen@yahoo.com> writes:

    On Tue, 23 Dec 2025 17:54:05 -0000 (UTC)
    antispam@fricas.org (Waldek Hebisch) wrote:
    [...]
    There is a paper "PCG: A Family of Simple Fast Space-Efficient
    Statistically Good Algorithms for Random Number Generation"
    by M. O'Neill where she gives a family of algorithms and runs
    several statistical tests against known algorithms. Mersenne
    Twister does not look good in tests. If you have enough (128) bits
    LCGs do pass tests. A bunch of generators with 64-bit state also
    passes tests. So the only reason to prefer Mersenne Twister is
    that it is implemented in available libraries. Otherwise it is
    not so good, has a large state and needs more execution time
    than alternatives.

    I don't know. Testing randomness is a complicated matter.
    How can I be sure that L'Ecuyer and Simard's TestU01 suite tests
    things that I personally care about and that it does not test
    things that are of no interest to me? Especially the latter.

    Do you think any of the tests in the TestU01 suite are actually
    counter-indicated? As long as you don't think any TestU01 test
    makes things worse, there is no reason not to use all of them.
    You are always free to disregard tests you don't care about.

    Also, the TestU01 suite is made for generators with 32-bit output.
    M. O'Neill used an ad hoc technique to make it applicable to
    generators with 64-bit output. Is this technique right? Or maybe
    it puts 64-bit PRNGs at an unfair disadvantage?

    As long as the same mapping is applied to all 64-bit PRNGs under
    consideration I don't see a problem. The point of the test is to
    compare PRNGs, not to compare test methods. If someone thinks a
    different set of tests is called for they are free to run them.

    Besides, I strongly disagree with at least one assertion made by
    O'Neill: "While security-related applications should use a secure
    generator, because we cannot always know the future contexts in
    which our code will be used, it seems wise for all applications to
    avoid generators that make discovering their entire internal state
    completely trivial."
    No, I know exactly what I am doing. I know exactly that for my
    application easy discovery of the complete state of the PRNG is not a
    defect.

    You and she are talking about different things. You are talking
    about choosing a PRNG to be used only by yourself. She is talking
    about choosing a PRNG to be made available to other people without
    knowing who they are or what their needs are. In the second case
    it's reasonable to raise the bar for the set of criteria that need
    to be met.

    Anyway, even if I am skeptical about her criticism of popular PRNGs,
    intuitively I agree with the constructive part of the article: a
    medium-quality PRNG that feeds a medium-quality hash function can
    potentially produce a very good fast PRNG with a rather small internal
    state.

    After looking at one of the example PCG generators, I would
    describe it as a medium-quality PRNG that feeds a low-quality
    hash. The particular combination I looked at produced good
    results, but it isn't clear which combinations of PRNG and
    hash would do likewise.

    On a related note, I think that even a simple counter fed into a high
    quality hash function (not cryptographically high quality, far
    less than that) can produce an excellent PRNG with an even smaller
    internal state. But not a very fast one. Although the speed
    depends on the specifics of the computer used. I can imagine a
    computer that has a low-latency Rijndael128 instruction. On such a
    computer, running a counter through 3-4 rounds of Rijndael will
    produce a very good PRNG that is only 2-3 times slower than, for
    example, LCG 128/64.

    I think the point of her paper where she talks about determining
    how much internal state is needed is to measure the efficacy of
    the PRNG, not to try to reduce the amount of state needed. Based
    on my own experience with various PRNGs I think it's a mistake to
    try to minimize the amount of internal state needed. My own rule
    of thumb is to allow at least a factor of four: for example, a
    PRNG with a 32-bit output should have at least 128 bits of state.
    My latest favorite has 256 bits of state to produce 32-bit
    outputs (and so might also do well to produce 64-bit outputs, but
    I haven't tested that).
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Wed Jan 7 13:54:21 2026
    From Newsgroup: comp.lang.c

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    John McCue <jmclnx@gmail.com.invalid> writes:
    Michael Sanders <porkchop@invalid.foo> wrote:
    Is it incorrect to use 0 (zero) to seed srand()?

    int seed = (argc >= 2 && strlen(argv[1]) == 9)
               ? atoi(argv[1])
               : (int)(time(NULL) % 900000000 + 100000000);

    srand(seed);

    I like to just read /dev/urandom when I need a random
    number. Seems easier and more portable across Linux &
    the *BSDs.

    int s;
    read(fd, &s, sizeof(int));

    Apples and oranges. Many applications that use random numbers
    need a stream of numbers that is deterministic and reproducible,
    which /dev/urandom is not.

    And neither is the non-conforming rand() on OpenBSD.

    The rand(1) man page on OpenBSD 7.8 says:

    Standards insist that this interface return deterministic
    results. Unsafe usage is very common, so OpenBSD changed the
    subsystem to return non-deterministic results by default.

    To satisfy portable code, srand() may be called to initialize
    the subsystem. In OpenBSD the seed variable is ignored,
    and strong random number results will be provided from
    arc4random(3). In other systems, the seed variable primes a
    simplistic deterministic algorithm.

    It does provide an srand_deterministic() function that behaves the way
    srand() is supposed to.

    And a program that calls rand() produces a link-time warning, even
    though OpenBSD's rand() *doesn't* return deterministic values.

    ld: warning: rand_test.c(rand_test.o:(main)): warning: rand() may return deterministic values, is that what you want?

    (In a similarly questionable decision, OpenBSD's printf triggers a
    SIGABRT signal if the format string uses "%n".)
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Thu Jan 8 01:06:01 2026
    From Newsgroup: comp.lang.c

    On Wed, 07 Jan 2026 08:41:25 -0800
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Tue, 23 Dec 2025 17:54:05 -0000 (UTC)
    antispam@fricas.org (Waldek Hebisch) wrote:
    [...]
    There is a paper "PCG: A Family of Simple Fast Space-Efficient
    Statistically Good Algorithms for Random Number Generation"
    by M. O'Neill where she gives a family of algorithms and runs
    several statistical tests against known algorithms. Mersenne
    Twister does not look good in tests. If you have enough (128) bits
    LCGs do pass tests. A bunch of generators with 64-bit state also
    passes tests. So the only reason to prefer Mersenne Twister is
    that it is implemented in available libraries. Otherwise it is
    not so good, has a large state and needs more execution time
    than alternatives.

    I don't know. Testing randomness is a complicated matter.
    How can I be sure that L'Ecuyer and Simard's TestU01 suite tests
    things that I personally care about and that it does not test
    things that are of no interest to me? Especially the latter.

    Do you think any of the tests in the TestU01 suite are actually
    counter-indicated? As long as you don't think any TestU01 test
    makes things worse, there is no reason not to use all of them.
    You are always free to disregard tests you don't care about.


    Except that it's difficult psychologically.
    The batteries of tests gain a position of authority in your mind.
    Well, maybe you specifically are resistant, but I am not. Nor is
    Melissa O'Neill, it seems.

    To illustrate my point, I will tell you a story about myself.
    A sort of confession.
    If you have read the rest of this thread (or paid attention to the
    finer details in the article of O'Neill) then you already know that
    mt19937 consistently fails the scomp_LinearComp() subtest of the Crush
    and BigCrush batteries of TestU01. As reported, I "fixed" it by
    skipping every 19936th word of mt19937 output. This particular "fix"
    is benign. I am sure that it does not make the output of the generator
    less random. The only impact is a bit of slowness, because of the need
    to manage yet another modulo counter.
    What I did not tell so far is that I tried another "fix". I added
    leakage during the MT state update. That is, periodically I forced the
    two LS bits of a newly generated state word to '01', without affecting
    the word that goes to tempering and then to output.
    And according to TestU01 it worked! It flew through all the batteries!
    Luckily, I was sufficiently self-aware to understand that I don't
    understand nearly enough about the algebra of Galois fields to predict
    all the consequences of my modification. But that's me. I know a few
    people that are less aware of their limitations.

    Also, the TestU01 suite is made for generators with 32-bit output.
    M. O'Neill used an ad hoc technique to make it applicable to
    generators with 64-bit output. Is this technique right? Or maybe
    it puts 64-bit PRNGs at an unfair disadvantage?

    As long as the same mapping is applied to all 64-bit PRNGs under
    consideration I don't see a problem. The point of the test is to
    compare PRNGs, not to compare test methods. If someone thinks a
    different set of tests is called for they are free to run them.


    But Melissa, following the advice of L'Ecuyer, and I, following the
    advice of Melissa, tested 64-bit generators three times - LSW then
    MSW, only LSW, and only MSW - thus putting them under three times more
    serious scrutiny than their 32-bit counterparts.

    Besides, I strongly disagree with at least one assertion made by
    O'Neill: "While security-related applications should use a secure
    generator, because we cannot always know the future contexts in
    which our code will be used, it seems wise for all applications to
    avoid generators that make discovering their entire internal state
    completely trivial."
    No, I know exactly what I am doing. I know exactly that for my
    application easy discovery of the complete state of the PRNG is not a
    defect.

    You and she are talking about different things. You are talking
    about choosing a PRNG to be used only by yourself. She is talking
    about choosing a PRNG to be made available to other people without
    knowing who they are or what their needs are. In the second case
    it's reasonable to raise the bar for the set of criteria that need
    to be met.


    No, this part of her article is a mistake, plain and simple.
    She even sort of admitted it a couple of years later in her blog.
    In reality, we either need a secure PRNG or do not need a secure PRNG.
    There is no middle ground of "more or less secure insecure PRNGs".
    The PRNGs she advocates are insecure until proven otherwise by crypto
    analysts.

    Anyway, even if I am skeptical about her criticism of popular PRNGs,
    intuitively I agree with the constructive part of the article: a
    medium-quality PRNG that feeds a medium-quality hash function can
    potentially produce a very good fast PRNG with a rather small internal
    state.

    After looking at one of the example PCG generators, I would
    describe it as a medium-quality PRNG that feeds a low-quality
    hash. The particular combination I looked at produced good
    results, but it isn't clear which combinations of PRNG and
    hash would do likewise.


    I tend to like CRC32C for hashing 64 bits into 32 bits, for no reason
    other than that this primitive is available and fast on modern Intel
    and AMD CPUs. On ARM64 CPUs as well, I think, although I am less than
    100% sure.

    On a related note, I think that even a simple counter fed into a high
    quality hash function (not cryptographically high quality, far
    less than that) can produce an excellent PRNG with an even smaller
    internal state. But not a very fast one. Although the speed
    depends on the specifics of the computer used. I can imagine a
    computer that has a low-latency Rijndael128 instruction. On such a
    computer, running a counter through 3-4 rounds of Rijndael will
    produce a very good PRNG that is only 2-3 times slower than, for
    example, LCG 128/64.

    I think the point of her paper where she talks about determining
    how much internal state is needed is to measure the efficacy of
    the PRNG, not to try to reduce the amount of state needed. Based
    on my own experience with various PRNGs I think it's a mistake to
    try to minimize the amount of internal state needed. My own rule
    of thumb is to allow at least a factor of four: for example, a
    PRNG with a 32-bit output should have at least 128 bits of state.
    My latest favorite has 256 bits of state to produce 32-bit
    outputs (and so might also do well to produce 64-bit outputs, but
    I haven't tested that).

    One important point that I seem to have figured out recently is that
    the only practical way to produce a both solid and very fast PRNG
    that adheres to standard language APIs with 32-bit (and to a somewhat
    smaller extent 64-bit) output is to use buffering. I.e., most of the
    time the generator simply reads a pre-calculated word from the buffer
    and only once per N iterations runs the actual PRNG algorithm,
    probably in a loop, often in SIMD. In order for this approach to be
    effective, the buffer can't be particularly small. 32 bytes (256
    bits) appears to be the absolute minimum. The buffer, and the counter
    that manages buffering, are parts of the generator state. That alone
    sets a practical minimal limit on the size of the generator and
    diminishes the significance of the difference between PRNGs with an
    "algorithmic" state of 64 bits, 128 bits or even 256 bits.

    The observation certainly applies to PCGs, or to anything else that
    utilizes an LCG as its state update primitive.

    Now, if one does not look for ultimate speed then the above does not
    apply.


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Michael Sanders@porkchop@invalid.foo to comp.lang.c on Thu Jan 8 02:57:08 2026
    From Newsgroup: comp.lang.c

    On Mon, 22 Dec 2025 08:48:10 -0000 (UTC), Michael Sanders wrote:

    [...]

    This puppy is done (at least I'm happy with it).

    Main thing I wanted to accomplish was solid keyboard
    handling in the REPL (so I can apply the knowledge
    gained to another project & I believe I'm pretty close.)

    As it stands the game now has:

    - 3 games in a single binary (mastermind, moo, bagels)

    - man page

    - makefile

    - sparse comments...

    Zipped sources [8.3KB] for a few days below:

    <https://drive.google.com/file/d/1PAzCj-Mfx06y2oHz9LQTxa74Aerd2pYe/view?usp=sharing>

    Thanks for the help everyone, I'm gone fishin' for awhile =)
    --
    :wq
    Mike Sanders
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Thu Jan 8 14:03:35 2026
    From Newsgroup: comp.lang.c

    On Wed, 7 Jan 2026 10:51:16 -0000 (UTC)
    antispam@fricas.org (Waldek Hebisch) wrote:
    Michael S <already5chosen@yahoo.com> wrote:
    I experimented a bit more (in fact, more like a lot more) with
    the test batteries of L’Ecuyer. It led me to the conclusion that an
    occasional failure in either the middle or the big battery means
    nothing.
    Sometimes even a crypto-quality PRNG does not pass one or another
    test. Then you try to reproduce it and see that with any other seed
    that you try the failure does not happen.
    All in all, it makes me more suspicious of PRNGs that consistently
    pass both batteries with various seeds. I start to see it as a sign
    of a PRNG being rigged to pass tests.

    Well, that depends on the tests and the thresholds in the tests.
    Some tests, when fed with a truly random source, will produce
    very small variation in the results. With a generous threshold
    such a test will essentially never fail for a truly random source.
    OTOH when the expected variation of the results is larger and
    the threshold is tight, then a truly random source will fail the
    test from time to time.
    Yes, there are many stable tests. But there are also many unstable
    tests in the Crush/BigCrush batteries. I can't say if the ratio is
    50:50 or 75:25, but it is not 90:10.
    svaria_WeightDistrib() and swalk_RandomWalk1() appear most unstable.
    You feed them a particular pseudo-random vector (100,000,000 points)
    and the test fails. Then you shift the vector by just one point, to
    the left or to the right. And it passes; often not just passes, but
    produces a result that is not even close to the margin.
    And if you test long enough you should
    be able to estimate the probability of failure and possibly compare
    it with the theoretical result if available.

    Long enough in this case is very damn long.
    To tell the truth, I do not know what failures of a crypto-quality
    PRNG mean. It may mean that the test is of the tight kind that is
    supposed to fail from time to time for a truly random source. Or it
    may mean that the PRNG improves the crypto part at the cost of
    statistics. That is, a non-uniform distribution of the output is not
    a problem in crypto applications. Simply, for crypto purposes future
    output should not be predictable from current and past output alone.
    And a slightly non-uniform distribution can increase the probability
    of test failure enough that you can observe such failures.

    BTW: You mentioned using a counter and a hardware AES128 round.
    Given a cheap AES128 round I would try 128-bit state and an AES128
    round as the state update. I do not know if hardware AES128 is
    fast enough to make it worthwhile, but using an AES128 round as a
    state update should be much better than scrambling a counter.

    It depends on what one considers fast.
    My expectations, trained by LCGs and esp. by optimized MT, are rather
    high. [On relatively modern Intel and AMD hardware] generators that do
    feedback through even a single AES round (with another round for
    post-processing) do not meet my expectations.
    At best, fully inlined and after significant effort to remove all
    unnecessary bottlenecks, such generators produce one output word per
    4-5 clock cycles. More naive implementations, with state read from
    memory and stored back to memory on each iteration, are much slower
    than that, easily taking over 10 clocks per iteration.
    For comparison, a counter that goes through 5 rounds of AES without
    feedback runs at 3 clocks per word when inlined and at 3.5 clocks per
    word via a function call.
    It seems that given today's hardware limitations the only way to get
    a really fast PRNG with the state updated through an AES round is by
    running several chains interleaved. Preferably no less than 4 chains.
    Which increases the size of the basic PRNG state to 512 bits, plus
    some more for state management.
    Another problem is that when all state bits go through AES then I can
    not prove that the period of the PRNG is really long. I can't even be
    sure that it does not enter a very short loop at some stage of its
    life. I understand that it is highly unlikely, but I don't like
    uncertainty.
    Running half of the state through AES and taking the other half from
    a counter solves that, but at the cost of needing better diffusion in
    the post-processing (tempering) stage of the generator. If more than
    32 bits of output are produced per iteration, I would not feel good
    if tempering consisted of just one round of AES. I would want two
    rounds at least.
    Maybe a better option is to do something like this:

    interleave N times {
        prnd_out        = AES1R(state.w128);
        state.w128      = AES1R(state.w128) ^ state.counter64;
        state.counter64 = state.counter64 + 1;
    }

    I am still not 100% sure that a long period is guaranteed with such
    an arrangement, but I am far more sure of it than in the case of the
    original arrangement.
    However, it is far from simple and the state is not very small. So,
    the question is "Why bother?" What's wrong with a much simpler scheme
    that takes a 64-bit counter and runs it through 5 AES rounds? Or even
    6 rounds if being paranoid?
    The only downside that I can see is that this simple robust scheme is
    more dependent on the high throughput of the AES units, which is not
    necessarily available on somewhat older hardware.
    The above does not apply to the scomp_LinearComp() failures of
    mt19937. Those failures are very consistent. I just don't consider
    them significant for my own use of PRNGs or for any other uses of
    PRNGs that I have personally ever encountered.

    Overall, the experience strengthened my position that the general
    wisdom, previously shared by O'Neill herself, got it right: in the
    absence of special considerations people should select mt19937, and
    especially mt19937-64, as their default PRNG of choice.
    Looking closer, apart from its properties of randomness and apart
    from its huge period

    A huge period alone is easy. AFAICS matrix variants of the LCG can
    easily get quite large periods. I did not test matrix LCGs,
    but on statistical tests they should be essentially as good
    as multiprecision LCGs, while being cheaper to implement.

    Just to be clear, I mean the equation x_{n+1} = A*x_n + b, where x_n
    is a vector representing the n-th state, A is a matrix and b is a
    vector. Matrix A may be somewhat sparse, that is, have a limited
    number of non-zero entries, and some entries may be simple, like
    1 or -1. With a proper choice of A and b the period should be
    comparable with the number of available states.

    I see no reason to go for very long periods; already 512 bits
    of state allow periods which in practice should be indistinguishable
    from an infinite period.

    Didn't I agree with that in the very next statement?
    (which does not matter for me) I started to appreciate
    mt19937 for the following properties that I was not aware of before:
    - it does not use a multiplier, so it is a good fit for embedded
    systems that have no (or a very slow) multiplier in HW.

    The biggest MCU with no hardware multiplier that I have has 2kB RAM.
    I do not want mt19937 on such a machine. 64-bit multiplication
    on an 8-bitter with a hardware multiplier may be slow, so a 64-bit
    LCG (and improvements based on it) may be slow. But multiplication
    nicely spreads out bits, so it is not clear to me if there is an
    equally good cheaper alternative.
    64-bit multipliers spread bits nicely, indeed. But implementing a
    64-bit multiplier on a core that natively has only 8x8=16 is neither
    easy nor fast.
    Even implementing a 32-bit multiplier there is not easy or fast. And
    I would not say that 32-bit multipliers spread bits nicely.
    But I was not thinking about that sort of core.
    I had in mind minimalistic implementations of the soft-core
    architectures of Xilinx and Altera.
    These cores have several useful properties:
    - tested and supported much better than open-source alternatives
    - occupy relatively few logic and block-RAM resources
    - often capable of running at a higher frequency than their
    full-featured brethren
    - the fact that they can be used without buying a license (free as
    in beer) does not hurt either
    Of course, they are *much* slower (lower IPC) than full-featured
    cores, but in plenty of cases that's OK. One thing that these cores
    do not have is a multiplier. When the compiler sees a multiplication
    in source code it calls a support routine. And when that happens,
    instead of being *much* slower (say, 5x) than a full-featured core,
    which is commonly acceptable, these cores turn out a LOT slower
    (say, 200x), which is commonly unacceptable.
    Cores like that are sometimes used in applications that want a good
    PRNG, like various sorts of built-in tests. In such applications,
    spending 2K bytes on the mt19937 state is OK. Using emulated 64-bit
    multiplication for the PRNG is sometimes OK, but more commonly not.
    If needed I would investigate
    matrix LCGs; they may be slightly cheaper.

    - the algorithm is very SIMD-friendly, so optimized implementations
    can be very fast on modern x86 and ARM64 hardware.

    Just the size of the state puts a limit on how fast it can be. And
    the size of the state means that it will compete for cache with
    user data.
    BTW: AFAICS a matrix LCG can be SIMD-friendly too.

    The latter property also means that a very fast FPGA implementation
    is easily possible, as long as the designer is willing to throw a
    moderate amount of logic resources at it.

    What I said above does not mean that the PCG generators of O'Neill
    have no place. Intuitively, they appear not bad. But the theory is
    unproven, an optimized implementation is likely slower than optimized
    mt19937, and the claimed "security" advantages are nonsense, as
    admitted later by O'Neill herself. And, as said above, I no longer
    trust her empirical methodology, based on the work of L'Ecuyer.
    So, PCG generators are a valuable addition to the toolbox, but not
    good enough to change my default.

    I agree that ATM it is not entirely clear whether PCGs are as good
    as suggested by tests. But I am surprised by your opinion about
    speed. I did not analyze deeply either of them, but PCGs are way
    simpler, so I would expect them to be faster.

    When used in a simple way, PCG has the LCG latency (IMUL+ADD)
    embedded in its core.
    On modern x86-64 that tends to be 4 clocks at the very least. On
    ARM64 it's often better, but I am more interested in x86-64.
    OTOH, optimized implementations of mt19937 have [almost] no inherent
    latency limits. If one has wider hardware, it directly translates
    into both a faster state update and a faster tempering phase.
    More strictly speaking, for the state update there exists a strict
    algorithmic limit of 397 32-bit words. Plus, non-heroic
    implementations have a limit of 624-397=227 words. Both limits are
    far away from the capabilities of the widest commonly available
    hardware (2x or 3x 512 bits, i.e. no more than 48 32-bit operations
    per cycle).
    In practice, the biggest bottleneck, even without 512-bit processing,
    is not the state update and not the tempering, but fighting the
    impedance mismatch at the level of the API. That is, the
    implementation prefers to generate 256 bits at once, but the API
    insists on 32 bits. This leads to a need for buffering and for
    additional management overhead. More time is spent here than within
    the mt algorithm itself.
    Still, even with this overhead, when the buffer-management part is
    inlined (which is easy to achieve in practice and does not cause
    significant code bloat), the modern but not most modern Intel core
    (Raptor Cove) in my desktop at work is capable of delivering one
    32-bit word per exactly 3 clock cycles.
    That's a little better than even the most basic fully inlined LCG,
    so necessarily better than any PCG.
    There is little doubt that LCGs (and, as a result, PCGs) can be
    improved by advancing the state several steps at a time instead of
    one step at a time. But then they will suffer from the same impedance
    mismatch with APIs. The solution to the mismatch would be the same as
    with mt: buffering and additional management overhead. So gone is the
    simplicity, gone is the tiny state size :(
    I did not try it, but it is very likely that for a given width of
    output word the resulting speed of PCG would be almost exactly the
    same as optimized mt19937, because the generator would spend the
    majority of its time in code that is not merely almost the same
    between the two, but exactly the same.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Michael Sanders@porkchop@invalid.foo to comp.lang.c on Thu Jan 8 15:34:04 2026
    From Newsgroup: comp.lang.c

    On Wed, 07 Jan 2026 13:54:21 -0800, Keith Thompson wrote:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    John McCue <jmclnx@gmail.com.invalid> writes:
    Michael Sanders <porkchop@invalid.foo> wrote:
    Is it incorrect to use 0 (zero) to seed srand()?

    int seed = (argc >= 2 && strlen(argv[1]) == 9)
    ? atoi(argv[1])
    : (int)(time(NULL) % 900000000 + 100000000);

    srand(seed);

    I like to just read /dev/urandom when I need a random
    number. Seems easier and more portable across Linux &
    the *BSDs.

    int s;
    read(fd, &s, sizeof(int));

    Apples and oranges. Many applications that use random numbers
    need a stream of numbers that is deterministic and reproducible,
    which /dev/urandom is not.

    And neither is the non-conforming rand() on OpenBSD.

    The rand(1) man page on OpenBSD 7.8 says:

    Standards insist that this interface return deterministic
    results. Unsafe usage is very common, so OpenBSD changed the
    subsystem to return non-deterministic results by default.

    To satisfy portable code, srand() may be called to initialize
    the subsystem. In OpenBSD the seed variable is ignored,
    and strong random number results will be provided from
    arc4random(3). In other systems, the seed variable primes a
    simplistic deterministic algorithm.

    It does provide an srand_deterministic() function that behaves the way srand() is supposed to.

    So then clang would use:

    #ifdef __OpenBSD__
    srand_deterministic(seed);
    #else
    srand(seed);
    #endif

    But I don't know (yet) that gcc does as well under OpenBSD.
    --
    :wq
    Mike Sanders
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Thu Jan 8 09:26:18 2026
    From Newsgroup: comp.lang.c

    antispam@fricas.org (Waldek Hebisch) writes:

    Michael S <already5chosen@yahoo.com> wrote:
    [...]
    Anyway, even if I am skeptical about her criticism of popular PRNGs,
    intuitively I agree with the constructive part of the article -
    medium-quality PRNG that feeds medium quality hash function can
    potentially produce very good fast PRNG with rather small internal
    state.

    She seems to care very much about having the minimal possible state.
    That may be nice on embedded systems, but in general I would
    happily accept a slightly bigger state (say 256 bits). But if
    we can get good properties with a very small state, then why not?
    [...]

    That depends on whether one thinks the tests done to measure
    quality are sufficient to determine all the axes of "good
    properties". I'm not inclined to think that they do. The
    set of tests done in the TestU01 suite are quite good IMO,
    but I don't think they measure all the properties of interest.
    I prefer to err on the side of caution.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Thu Jan 8 09:40:21 2026
    From Newsgroup: comp.lang.c

    antispam@fricas.org (Waldek Hebisch) writes:

    Michael S <already5chosen@yahoo.com> wrote:

    I experimented a bit more (in fact, more like a lot more) with the
    test batteries of L'Ecuyer. It led me to the conclusion that an
    occasional failure in either the middle or the big battery means
    nothing.
    Sometimes even a crypto-quality PRNG does not pass one or another
    test. Then you try to reproduce it and see that with any other seed
    that you try the failure does not happen.
    All in all, it makes me more suspicious of PRNGs that consistently
    pass both batteries with various seeds. I start to see it as a sign
    of a PRNG being rigged to pass tests.

    Well, that depends on the tests and the thresholds in the tests.
    Some tests, when fed with a truly random source, will produce
    very small variation in the results. With a generous threshold
    such a test will essentially never fail for a truly random source.
    OTOH, when the expected variation of the results is larger and
    the threshold is tight, then a truly random source will fail the
    test from time to time. And if you test long enough you should
    be able to estimate the probability of failure and possibly compare
    it with the theoretical result, if available.

    It's inherent in the nature of statistical testing that every
    so often a statistical test will "fail" even for a truly random
    input. An input source that never fails can also be indicative
    of a low quality source (and perhaps one that was tuned to the
    particular set of tests being done). It took me a while to
    learn that the results of a PRNG test suite should not be seen
    as purely binary.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Thu Jan 8 14:44:27 2026
    From Newsgroup: comp.lang.c

    Michael Sanders <porkchop@invalid.foo> writes:
    On Wed, 07 Jan 2026 13:54:21 -0800, Keith Thompson wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    [...]
    Apples and oranges. Many applications that use random numbers
    need a stream of numbers that is deterministic and reproducible,
    which /dev/urandom is not.

    And neither is the non-conforming rand() on OpenBSD.

    The rand(1) man page on OpenBSD 7.8 says:

    Standards insist that this interface return deterministic
    results. Unsafe usage is very common, so OpenBSD changed the
    subsystem to return non-deterministic results by default.

    To satisfy portable code, srand() may be called to initialize
    the subsystem. In OpenBSD the seed variable is ignored,
    and strong random number results will be provided from
    arc4random(3). In other systems, the seed variable primes a
    simplistic deterministic algorithm.

    It does provide an srand_deterministic() function that behaves the way
    srand() is supposed to.

    So then clang would use:

    #ifdef __OpenBSD__
    srand_deterministic(seed);
    #else
    srand(seed);
    #endif

    But I don't know (yet) that gcc does as well under OpenBSD.

    I don't know what you mean when you say that clang "would use"
    that code.

    I'm not aware that either clang or gcc uses random numbers
    internally. I don't know why they would.

    You could certainly write the above code and compile it with either
    gcc or clang (or any other C compiler on OpenBSD). I've confirmed
    that gcc on OpenBSD does predefine the symbol __OpenBSD__. There
    should be no relevant difference between gcc and clang; random
    number generation is implemented in the library, not in the compiler.

    If your point is that a programmer using either gcc or clang could
    use the above code to get the required deterministic behavior
    for rand(), I agree. (Though it shouldn't be necessary; IMHO the
    OpenBSD folks made a very bad decision.)

    Relatedly, the NetBSD implementation of rand() is conforming, but
    of very low quality. The low-order bit alternates between 0 and
    1 on successive rand() calls, the two low-order bits repeat with
    a cycle of 4, and so on. Larry Jones wrote about it here in 2010:

    The even/odd problem was caused at Berkeley by a well meaning
    but clueless individual who increased the range of the generator
    (which originally matched the sample implementation) by returning
    the *entire* internal state rather than just the high-order
    bits of it. BSD was very popular, so that defective generator
    got around a lot, unfortunately.

    And I've just discovered that the OpenBSD rand() returns alternating
    odd and even results after a call to srand_deterministic().

    It's disturbing that this has never been fixed.
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Michael Sanders@porkchop@invalid.foo to comp.lang.c on Fri Jan 9 06:06:57 2026
    From Newsgroup: comp.lang.c

    On Thu, 08 Jan 2026 14:44:27 -0800, Keith Thompson wrote:

    Michael Sanders <porkchop@invalid.foo> writes:
    On Wed, 07 Jan 2026 13:54:21 -0800, Keith Thompson wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    [...]
    Apples and oranges. Many applications that use random numbers
    need a stream of numbers that is deterministic and reproducible,
    which /dev/urandom is not.

    And neither is the non-conforming rand() on OpenBSD.

    The rand(1) man page on OpenBSD 7.8 says:

    Standards insist that this interface return deterministic
    results. Unsafe usage is very common, so OpenBSD changed the
    subsystem to return non-deterministic results by default.

    To satisfy portable code, srand() may be called to initialize
    the subsystem. In OpenBSD the seed variable is ignored,
    and strong random number results will be provided from
    arc4random(3). In other systems, the seed variable primes a
    simplistic deterministic algorithm.

    It does provide an srand_deterministic() function that behaves the way
    srand() is supposed to.

    So then clang would use:

    #ifdef __OpenBSD__
    srand_deterministic(seed);
    #else
    srand(seed);
    #endif

    But I don't know (yet) that gcc does as well under OpenBSD.

    I don't know what you mean when you say that clang "would use"
    that code.

    I'm not aware that either clang or gcc uses random numbers
    internally. I don't know why they would.

    Well, I meant the macro itself is (I'm guessing) probably defined
    by clang since it's the default compiler.

    You could certainly write the above code and compile it with either
    gcc or clang (or any other C compiler on OpenBSD). I've confirmed
    that gcc on OpenBSD does predefine the symbol __OpenBSD__. There
    should be no relevant difference between gcc and clang; random
    number generation is implemented in the library, not in the compiler.

    This is the info I was wondering about: both clang & gcc predefine
    the symbol.

    If your point is that a programmer using either gcc or clang could
    use the above code to get the required deterministic behavior
    for rand(), I agree. (Though it shouldn't be necessary; IMHO the
    OpenBSD folks made a very bad decision.)

    Relatedly, the NetBSD implementation of rand() is conforming, but
    of very low quality. The low-order bit alternates between 0 and
    1 on successive rand() calls, the two low-order bits repeat with
    a cycle of 4, and so on. Larry Jones wrote about it here in 2010:

    The even/odd problem was caused at Berkeley by a well meaning
    but clueless individual who increased the range of the generator
    (which originally matched the sample implementation) by returning
    the *entire* internal state rather than just the high-order
    bits of it. BSD was very popular, so that defective generator
    got around a lot, unfortunately.

    And I've just discovered that the OpenBSD rand() returns alternating
    odd and even results after a call to srand_deterministic().

    It's disturbing that this has never been fixed.

    Yikes! Thanks Keith. This is sort of odd for OpenBSD.
    --
    :wq
    Mike Sanders
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Thu Jan 8 22:46:42 2026
    From Newsgroup: comp.lang.c

    Michael Sanders <porkchop@invalid.foo> writes:
    On Thu, 08 Jan 2026 14:44:27 -0800, Keith Thompson wrote:
    Michael Sanders <porkchop@invalid.foo> writes:
    [...]
    So then clang would use:

    #ifdef __OpenBSD__
    srand_deterministic(seed);
    #else
    srand(seed);
    #endif

    But I don't know (yet) that gcc does as well under OpenBSD.

    I don't know what you mean when you say that clang "would use"
    that code.

    I'm not aware that either clang or gcc uses random numbers
    internally. I don't know why they would.

    Well, I meant the macro itself is (I'm guessing) probably defined
    by clang since it's the default compiler.

    You mean the macro __OpenBSD__? Yes, that and other similar macros
    are predefined by the compiler, which is configured for each OS.
    gcc on OpenBSD also predefines it. (I don't know whether it's
    predefined by the preprocessor directly or by some header that's
    included implicitly. That doesn't really matter.) Compilers on
    other platforms will not predefine __OpenBSD__.

    But your original statement implied that clang would *use* that
    particular piece of code, which didn't make much sense. Were you
    just asking about how the __OpenBSD__ macro is defined, without
    reference to srand?

    [...]
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Fri Jan 9 00:36:21 2026
    From Newsgroup: comp.lang.c

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    John McCue <jmclnx@gmail.com.invalid> writes:

    Michael Sanders <porkchop@invalid.foo> wrote:

    Is it incorrect to use 0 (zero) to seed srand()?

    int seed = (argc >= 2 && strlen(argv[1]) == 9)
    ? atoi(argv[1])
    : (int)(time(NULL) % 900000000 + 100000000);

    srand(seed);

    I like to just read /dev/urandom when I need a random
    number. Seems easier and more portable across Linux &
    the *BSDs.

    int s;
    read(fd, &s, sizeof(int));

    Apples and oranges. Many applications that use random numbers
    need a stream of numbers that is deterministic and reproducible,
    which /dev/urandom is not.

    And neither is the non-conforming rand() on OpenBSD.

    The rand(1) man page on OpenBSD 7.8 says:

    Standards insist that this interface return deterministic
    results. Unsafe usage is very common, so OpenBSD changed the
    subsystem to return non-deterministic results by default.

    To satisfy portable code, srand() may be called to initialize
    the subsystem. In OpenBSD the seed variable is ignored,
    and strong random number results will be provided from
    arc4random(3). In other systems, the seed variable primes a
    simplistic deterministic algorithm.

    Apparently the OpenBSD folks have seen fit to remove the only
    desirable property that ISO C actually specifies for the standard
    library random number generator. Bravo!
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Michael Sanders@porkchop@invalid.foo to comp.lang.c on Fri Jan 9 22:38:59 2026
    From Newsgroup: comp.lang.c

    On Thu, 08 Jan 2026 22:46:42 -0800, Keith Thompson wrote:

    But your original statement implied that clang would *use* that
    particular piece of code, which didn't make much sense. Were you
    just asking about how the __OpenBSD__ macro is defined, without
    reference to srand?

    Well, under OpenBSD I plan on using:

    #ifdef __OpenBSD__
    srand_deterministic(seed);
    #else
    srand(seed);
    #endif

    But what I was asking is whether or not gcc would recognize
    the __OpenBSD__ macro (why wouldn't it, I'm assuming) since clang
    is the default compiler.

    But also about srand()... you've got me really wondering why
    OpenBSD would deviate from the standard as they have. I get
    that those folks disagree because it's deterministic, but
    it's the accepted standard for srand() to be deterministic.

    Only speaking for myself here: rather than srand_deterministic()
    and srand() (which is not deterministic under OpenBSD), it
    would've made more sense to have implemented
    srand_non_deterministic() and left srand() alone. That design
    decision on their part only muddies the waters in my thinking.
    Live & learn =)
    --
    :wq
    Mike Sanders
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Fri Jan 9 23:27:15 2026
    From Newsgroup: comp.lang.c

    Michael Sanders <porkchop@invalid.foo> writes:
    On Thu, 08 Jan 2026 22:46:42 -0800, Keith Thompson wrote:

    But your original statement implied that clang would *use* that
    particular piece of code, which didn't make much sense. Were you
    just asking about how the __OpenBSD__ macro is defined, without
    reference to srand?

    Well, under OpenBSD I plan on using:

    #ifdef __OpenBSD__
    srand_deterministic(seed);
    #else
    srand(seed);
    #endif

    But what I was asking is whether or not gcc would recognize
    the __OpenBSD__ macro (why wouldn't it, I'm assuming) since clang
    is the default compiler.

    $ gcc -dM -E - < /dev/null

    will show all the preprocessor macros predefined by the compiler.

    There are 397 predefined macros on my Fedora gcc 14 installation.

    Note that other macros may be defined in header files.



    But also about srand()... you've got me really wondering why
    OpenBSD would deviate from the standard as they have. I get
    that those folks disagree because it's deterministic, but
    it's the accepted standard for srand() to be deterministic.

    I expect they were primarily concerned with the security
    implications of a deterministic algorithm.


    Only speaking for myself here, rather than srand_deterministic()
    and srand() (that's not deterministic under OpenBSD) it
    would've made more sense to've implemented srand_non_deterministic()
    and left srand() alone. That design decision on their part only
    muddies the waters in my thinking. Live & learn =)

    I'm sure they wanted the change to apply by default to existing
    applications (many of them likely distributed with various BSD
    releases).

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Fri Jan 9 17:09:47 2026
    From Newsgroup: comp.lang.c

    Michael Sanders <porkchop@invalid.foo> writes:
    On Thu, 08 Jan 2026 22:46:42 -0800, Keith Thompson wrote:
    But your original statement implied that clang would *use* that
    particular piece of code, which didn't make much sense. Were you
    just asking about how the __OpenBSD__ macro is defined, without
    reference to srand?

    Well, under OpenBSD I plan on using:

    #ifdef __OpenBSD__
    srand_deterministic(seed);
    #else
    srand(seed);
    #endif

    But what I was asking is whether or not gcc would recognize
    the __OpenBSD__ macro (why wouldn't it, I'm assuming) since clang
    is the default compiler.

    OK.

    Do you understand that your original question was unclear?

    You said that "clang would use" the quoted 5-line code snippet,
    and asked whether "gcc does as well". It's not clang or gcc that
    would use that code. It would be used by a programmer writing code
    to be compiled with clang or gcc.

    I understand now what you meant. I'd like to be sure that you
    understand the problem with the question as you originally wrote it.

    I have clang 19.1.7 and gcc 13.2.0 installed on OpenBSD 7.8, and
    both predefine the macro __OpenBSD__.

    But also about srand()... you've got me really wondering why
    OpenBSD would deviate from the standard as they have. I get
    that those folks disagree because it's deterministic, but
    it's the accepted standard for srand() to be deterministic.

    Only speaking for myself here: rather than srand_deterministic()
    and srand() (which is not deterministic under OpenBSD), it
    would've made more sense to have implemented
    srand_non_deterministic() and left srand() alone. That design
    decision on their part only muddies the waters in my thinking.
    Live & learn =)

    I don't know why they made that decision. It was clearly deliberate.
    I agree that adding an srand_non_deterministic() function would
    have been better.
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Kaz Kylheku@046-301-5902@kylheku.com to comp.lang.c on Sat Jan 10 19:44:01 2026
    From Newsgroup: comp.lang.c

    On 2026-01-09, Michael Sanders <porkchop@invalid.foo> wrote:
    On Thu, 08 Jan 2026 22:46:42 -0800, Keith Thompson wrote:

    But your original statement implied that clang would *use* that
    particular piece of code, which didn't make much sense. Were you
    just asking about how the __OpenBSD__ macro is defined, without
    reference to srand?

    Well, under OpenBSD I plan on using:

    #ifdef __OpenBSD__
    srand_deterministic(seed);
    #else
    srand(seed);
    #endif

    This is better:

    // In some common configuration header:

    #ifdef __OpenBSD__
    #define HAVE_SRAND_DETERMINISTIC 1
    #define HAVE_... /* other such macros */
    #endif

    (Or the configuration can be generated by scripts which detect
    features in the environment.)

    Then in the code:

    #if HAVE_SRAND_DETERMINISTIC
    srand_deterministic(seed);
    #else
    srand(seed);
    #endif

    If a platform other than __OpenBSD__ comes along which has
    srand_deterministic(), you just make sure HAVE_SRAND_DETERMINISTIC
    is defined to 1; you don't have to edit the code where it is used.

    This idea is seen in the configuration of GNU programs and such.

    There is a "GDB Internals" document which discusses it in a section
    called "Clean Design"

    Partial quote:

    New #ifdef’s which test for specific compilers or manufacturers or
    operating systems are unacceptable. All #ifdef’s should test for
    features. The information about which configurations contain which
    features should be segregated into the configuration files. Experience
    has proven far too often that a feature unique to one particular system
    often creeps into other systems; and that a conditional based on some
    predefined macro for your current system will become worthless over
    time, as new versions of your system come out that behave differently
    with regard to this feature.

    [ ... more discussion ... ]

    https://www.sourceware.org/gdb/5/onlinedocs/gdbint.pdf

    I think the GNU Coding Standards document may have had a similar
    discussion; I don't see it in the current version though.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Tue Feb 3 05:26:47 2026
    From Newsgroup: comp.lang.c

    Michael S <already5chosen@yahoo.com> writes:

    On Wed, 07 Jan 2026 08:41:25 -0800
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Tue, 23 Dec 2025 17:54:05 -0000 (UTC)
    antispam@fricas.org (Waldek Hebisch) wrote:

    [...]

    There is a paper "PCG: A Family of Simple Fast Space-Efficient
    Statistically Good Algorithms for Random Number Generation"
    by M. O'Neill where she gives a family of algorithms and runs
    several statistical tests against known algorithms. Mersenne
    Twister does not look good in the tests. If you have enough (128)
    bits LCGs do pass the tests. A bunch of generators with 64-bit state
    also pass the tests. So the only reason to prefer Mersenne Twister
    is that it is implemented in available libraries. Otherwise it is
    not so good, has a large state and needs more execution time
    than alternatives.

    I don't know. Testing randomness is a complicated matter.
    How can I be sure that L'Ecuyer and Simard's TestU01 suite tests
    things that I personally care about and that it does not test
    things that are of no interest to me? Especially the latter.

    Do you think any of the tests in the TestU01 suite are actually
    counter-indicated? As long as you don't think any TestU01 test
    makes things worse, there is no reason not to use all of them.
    You are always free to disregard tests you don't care about.

    Except that it's difficult psychologically.
    A battery of tests gains a position of authority in your mind.
    Well, maybe you specifically are resistant, but I am not. Nor is
    Melissa O'Neill, it seems.

    To illustrate my point, I will tell you the story about myself.
    Sort of confession.
    [very large portion]

    I have read through your whole posting several times, and also
    looked through your other postings in this thread. Despite my
    efforts I am still not sure what you think or what you're trying to
    say.

    Let me put it as a question. Do you think there is a good and
    objective test for measuring the quality of a PRNG? If so what test
    (or tests) do you think would suffice? Here "quality" is meant as
    some sort of numeric measure, which could be a monotonic metric (as
    in "the larger the number the higher the quality") or just a simple
    pass/fail.

    If you don't think there is any such test, how do you propose that
    PRNGs be evaluated?

    One important point that I seem to have figured out recently is that
    the only practical way to produce a both solid and very fast PRNG
    that adheres to standard language APIs with 32-bit, and to a somewhat
    smaller extent 64-bit, output is to use buffering. I.e. most of the
    time the generator simply reads a pre-calculated word from the buffer
    and only once per N iterations runs the actual PRNG algorithm,
    probably in a loop, often in SIMD. For this approach to be effective,
    the buffer can't be particularly small. 32 bytes (256 bits) appears
    to be the absolute minimum. The buffer, and the counter that manages
    buffering, are parts of the generator state. That alone sets a
    practical lower limit on the size of the generator and diminishes the
    significance of the difference between PRNGs with an "algorithmic"
    state of 64 bits, 128 bits or even 256 bits.

    I understand what you're saying. These concerns are orthogonal to
    the question of the quality of the numbers generated, which for the
    purposes of this conversation is all I'm interested in.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Tue Feb 3 16:37:08 2026
    From Newsgroup: comp.lang.c

    On Tue, 03 Feb 2026 05:26:47 -0800
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Wed, 07 Jan 2026 08:41:25 -0800
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Tue, 23 Dec 2025 17:54:05 -0000 (UTC)
    antispam@fricas.org (Waldek Hebisch) wrote:

    [...]

    There is a paper "PCG: A Family of Simple Fast Space-Efficient
    Statistically Good Algorithms for Random Number Generation"
    by M. O'Neill where she gives a family of algorithms and runs
    several statistical tests against known algorithms. Mersenne
    Twister does not look good in the tests. If you have enough (128)
    bits, LCGs do pass the tests. A bunch of generators with 64-bit
    state also pass them. So the only reason to prefer Mersenne
    Twister is that it is implemented in available libraries.
    Otherwise it is not so good: it has a large state and needs more
    execution time than the alternatives.

    I don't know. Testing randomness is a complicated matter.
    How can I be sure that L'Ecuyer and Simard's TestU01 suite tests
    things that I personally care about and that it does not test
    things that are of no interest for me? Especially, the latter.

    Do you think any of the tests in the TestU01 suite are actually
    counter-indicated? As long as you don't think any TestU01 test
    makes things worse, there is no reason not to use all of them.
    You are always free to disregard tests you don't care about.

    Except that it's difficult psychologically.
    A battery of tests gains a position of authority in your mind.
    Well, maybe you specifically are resistant, but I am not. Nor is
    Melissa O'Neill, it seems.

    To illustrate my point, I will tell you the story about myself.
    Sort of confession.
    [very large portion]

    I have read through your whole posting several times, and also
    looked through your other postings in this thread. Despite my
    efforts I am still not sure what you think or what you're trying to
    say.

    Let me put it as a question. Do you think there is a good and
    objective test for measuring the quality of a PRNG? If so what test
    (or tests) do you think would suffice? Here "quality" is meant as
    some sort of numeric measure, which could be a monotonic metric (as
    in "the larger the number the higher the quality") or just a simple pass/fail.


    I don't think that it is possible to create generic empirical tests
    of this sort.
    What is possible is a test that measures a specific property that is
    known to be important for a specific use.

    If you don't think there is any such test, how do you propose that
    PRNGs be evaluated?


    I don't know.
    But I believe in one philosophical principle that I first learned
    ~40 years ago, in a field unrelated to PRNGs or to Computer Science:
    when you don't know how to measure the quality of your product, it
    is advisable to ask your consumer. Do not ask a random consumer;
    ask the one who requested improvement of the property that you have
    difficulty measuring. He is the most likely one to know.

    The field where I first heard about that principle was manufacturing of ultra-pure chemicals.


    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Tue Feb 17 23:47:06 2026
    From Newsgroup: comp.lang.c

    Michael S <already5chosen@yahoo.com> writes:

    On Tue, 03 Feb 2026 05:26:47 -0800
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Wed, 07 Jan 2026 08:41:25 -0800
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Tue, 23 Dec 2025 17:54:05 -0000 (UTC)
    antispam@fricas.org (Waldek Hebisch) wrote:

    [...]

    There is a paper "PCG: A Family of Simple Fast Space-Efficient
    Statistically Good Algorithms for Random Number Generation"
    by M. O'Neill where she gives a family of algorithms and runs
    several statistical tests against known algorithms. Mersenne
    Twister does not look good in the tests. If you have enough (128)
    bits, LCGs do pass the tests. A bunch of generators with 64-bit
    state also pass them. So the only reason to prefer Mersenne
    Twister is that it is implemented in available libraries.
    Otherwise it is not so good: it has a large state and needs more
    execution time than the alternatives.

    I don't know. Testing randomness is a complicated matter.
    How can I be sure that L'Ecuyer and Simard's TestU01 suite tests
    things that I personally care about and that it does not test
    things that are of no interest for me? Especially, the latter.

    Do you think any of the tests in the TestU01 suite are actually
    counter-indicated? As long as you don't think any TestU01 test
    makes things worse, there is no reason not to use all of them.
    You are always free to disregard tests you don't care about.

    Except that it's difficult psychologically.
    A battery of tests gains a position of authority in your mind.
    Well, maybe you specifically are resistant, but I am not. Nor is
    Melissa O'Neill, it seems.

    To illustrate my point, I will tell you the story about myself.
    Sort of confession.
    [very large portion]

    I have read through your whole posting several times, and also
    looked through your other postings in this thread. Despite my
    efforts I am still not sure what you think or what you're trying to
    say.

    Let me put it as a question. Do you think there is a good and
    objective test for measuring the quality of a PRNG? If so what test
    (or tests) do you think would suffice? Here "quality" is meant as
    some sort of numeric measure, which could be a monotonic metric (as
    in "the larger the number the higher the quality") or just a simple
    pass/fail.

    I don't think that it is possible to create generic empirical tests
    of this sort.
    What is possible is a test that measures a specific property that is
    known to be important for a specific use.

    If you don't think there is any such test, how do you propose that
    PRNGs be evaluated?

    I don't know.
    But I believe in one philosophical principle that I first learned
    ~40 years ago, in a field unrelated to PRNGs or to Computer Science:
    when you don't know how to measure the quality of your product, it
    is advisable to ask your consumer. Do not ask a random consumer;
    ask the one who requested improvement of the property that you have
    difficulty measuring. He is the most likely one to know.

    The field where I first heard about that principle was manufacturing
    of ultra-pure chemicals.

    I'm surprised to see this answer from you, primarily because it
    seems to confuse different aspects of the subject.

    The key property of a (pseudo) random number generator is that the
    values produced exhibit no discernible pattern. Of course this
    question cannot be answered absolutely, or more precisely it can be
    answered definitively only in the negative. Any affirmative answer
    must be partial and also statistical rather than absolute.

    If someone wants to write a PRNG for general use, there is no point
    in asking users what they want, because they don't know. Very few
    people in the entire world have the mathematical sophistication
    necessary to answer the question competently (I know I don't, and I
    would put myself in the 75th percentile or above). The sort of
    people who could answer the question are also likely to have written
    test suites like TestU01, so it seems reasonable to use one or more
    of those test suites to establish a lower bar for any proposed
    general purpose PRNG.
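
    TestU01 itself is of course far more elaborate, but as a toy
    illustration of the kind of check such suites are built from, a
    bare-bones chi-square uniformity test over k buckets might look
    like this (names invented for the sketch, not TestU01 API):

```c
#include <stdint.h>
#include <stdlib.h>

/* Chi-square statistic for uniformity of n samples over k buckets.
 * For k = 256 there are 255 degrees of freedom; with large n, values
 * far above ~311 (the 1% critical value) suggest non-uniform output. */
double chi_square(const uint32_t *samples, size_t n, unsigned k)
{
    double expected = (double)n / k;
    double stat = 0.0;
    unsigned long *counts = calloc(k, sizeof *counts);
    if (counts == NULL)
        return -1.0;
    for (size_t i = 0; i < n; i++)
        counts[samples[i] % k]++;
    for (unsigned j = 0; j < k; j++) {
        double d = (double)counts[j] - expected;
        stat += d * d / expected;
    }
    free(counts);
    return stat;
}
```

    A generator whose statistic hovers near k - 1 across many runs looks
    uniform on this one narrow property; real suites apply hundreds of
    such tests to many different properties.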

    Of course there are properties besides statistical quality measures
    that might be desirable, such as whether the state of the PRNG can
    be readily ascertained, the size of the state space, various sorts
    of cryptographic concerns, and so forth. For some applications
    there could be performance thresholds that are essential to meet,
    such as how long it takes to produce a new random value, or what the
    memory
    footprint is. But these concerns are outside the domain of the
    question, which is only about the quality of the values produced.
    Do you see now what I'm getting at?
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Tristan Wibberley@tristan.wibberley+netnews2@alumni.manchester.ac.uk to comp.lang.c,sci.math.num-analysis on Wed Feb 18 11:21:22 2026
    From Newsgroup: comp.lang.c

    On 18/02/2026 07:47, Tim Rentsch wrote:

    The key property of a (pseudo) random number generator is that the
    values produced exhibit no discernible pattern.

    For a PRNG, they exhibit the pattern of following the sequence of the PRNG!

    Is it that, for any finite sequence of numbers from a PRNG, without
    information about where it came from and how many numbers came before
    you can't predict the next number better than chance?
    --
    Tristan Wibberley

    The message body is Copyright (C) 2026 Tristan Wibberley except
    citations and quotations noted. All Rights Reserved except that you may,
    of course, cite it academically giving credit to me, distribute it
    verbatim as part of a usenet system or its archives, and use it to
    promote my greatness and general superiority without misrepresentation
    of my opinions other than my opinion of my greatness and general
    superiority which you _may_ misrepresent. You definitely MAY NOT train
    any production AI system with it but you may train experimental AI that
    will only be used for evaluation of the AI methods it implements.

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.lang.c,sci.math.num-analysis on Thu Feb 19 10:01:04 2026
    From Newsgroup: comp.lang.c

    On 18/02/2026 12:21, Tristan Wibberley wrote:
    On 18/02/2026 07:47, Tim Rentsch wrote:

    The key property of a (pseudo) random number generator is that the
    values produced exhibit no discernible pattern.

    For a PRNG, they exhibit the pattern of following the sequence of the PRNG!


    As a deterministic function, a PRNG will obviously follow the pattern of
    its generating function. But the aim is to have no /discernible/
    pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
    could be identified without knowledge of where they came from - and thus
    no way to predict the next number, 9, in the sequence. But there is a
    pattern there - it's the 91st to 100th digits of the decimal expansion of pi.

    Is it that, for any finite sequence of numbers from a PRNG, without information about where it came from and how many numbers came before
    you can't predict the next number better than chance?


    That's the general aim, yes.

    But Michael is absolutely correct that only the consumer can say what
    they want to measure in order to judge the quality of any piece of code.
    It is the customer that gives the requirement specifications, and the programmer's job is to write code that fulfils those specifications.
    PRNGs are no different. (In practice, many customers need help figuring
    out what their requirements are, and how to express those, but that's
    another matter.)

    In the case of PRNGs, there are many possible requirements beyond the
    "it's hard to predict the next number in the sequence". These include :

    * Simplicity of implementation
    * Cryptographic security of implementation
    * Running speed
    * Statistical distribution of values (with many possible patterns, and consideration of length of samples)
    * Repeat cycle length
    * Psychological factors (sometimes you roll five sixes in a row, but
    that might look like a loaded die. Randomised playlists often use
    modifications to their PRNGs to avoid repetition of songs, and
    plotting random points in a 2-D space does not look "random" to most
    people)
    * Interaction with added entropy sources


    As with most requirements for most software, turning most of these into
    some kind of directly and objectively measurable "quality" function is difficult or impossible in practice. As Michael says, the only thing
    you can do is when a consumer complains that it is not good enough for
    their purposes, ask them how to identify when it would be good enough.

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From James Kuyper@jameskuyper@alumni.caltech.edu to sci.math.num-analysis,comp.lang.c on Thu Feb 19 14:33:34 2026
    From Newsgroup: comp.lang.c

    On 2026-02-19 04:01, David Brown wrote:
    On 18/02/2026 12:21, Tristan Wibberley wrote:
    On 18/02/2026 07:47, Tim Rentsch wrote:

    The key property of a (pseudo) random number generator is that the
    values produced exhibit no discernible pattern.

    For a PRNG, they exhibit the pattern of following the sequence of the PRNG!

    As a deterministic function, a PRNG will obviously follow the pattern of
    its generating function. But the aim is to have no /discernible/
    pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
    could be identified without knowledge of where they came from - and thus
    no way to predict the next number, 9, in the sequence. But there is a pattern there - it's the 90th - 100th digits of the decimal expansion of pi.

    I think you're being overoptimistic. I suspect that the pattern
    could be identified, exactly, without knowing how it was generated.
    That's because every possible pattern has infinitely many different
    ways in which it can be produced. One of those other ways might be
    easier to describe than the way in which the numbers were actually
    produced, in which case that simpler way might be guessed more
    easily than the actual one - possibly a lot easier.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to sci.math.num-analysis,comp.lang.c on Thu Feb 19 20:47:41 2026
    From Newsgroup: comp.lang.c

    On 19/02/2026 20:33, James Kuyper wrote:
    On 2026-02-19 04:01, David Brown wrote:
    On 18/02/2026 12:21, Tristan Wibberley wrote:
    On 18/02/2026 07:47, Tim Rentsch wrote:

    The key property of a (pseudo) random number generator is that the
    values produced exhibit no discernible pattern.

    For a PRNG, they exhibit the pattern of following the sequence of the PRNG!

    As a deterministic function, a PRNG will obviously follow the pattern of
    its generating function. But the aim is to have no /discernible/
    pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
    could be identified without knowledge of where they came from - and thus
    no way to predict the next number, 9, in the sequence. But there is a
    pattern there - it's the 90th - 100th digits of the decimal expansion of pi.

    I think you're being overoptimistic. I suspect that the pattern
    could be identified, exactly, without knowing how it was generated.
    That's because every possible pattern has infinitely many different
    ways in which it can be produced. One of those other ways might be
    easier to describe than the way in which the numbers were actually
    produced, in which case that simpler way might be guessed more
    easily than the actual one - possibly a lot easier.

    How likely is it that someone would guess a formula that happened to
    generate the decimal digits of pi, without more knowledge than a part of
    the sequence? I don't believe it is possible to quantify such a
    probability, but I would expect it to be very low.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c,sci.math.num-analysis on Thu Feb 19 14:39:33 2026
    From Newsgroup: comp.lang.c

    David Brown <david.brown@hesbynett.no> writes:
    [...]
    As a deterministic function, a PRNG will obviously follow the pattern
    of its generating function. But the aim is to have no /discernible/
    pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
    could be identified without knowledge of where they came from - and
    thus no way to predict the next number, 9, in the sequence. But there
    is a pattern there - it's the 90th - 100th digits of the decimal
    expansion of pi.
    [...]

    A Google search for 342117067 gives numerous hits referring to the
    digits of pi.
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.lang.c,sci.math.num-analysis on Fri Feb 20 09:16:04 2026
    From Newsgroup: comp.lang.c

    On 19/02/2026 23:39, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    As a deterministic function, a PRNG will obviously follow the pattern
    of its generating function. But the aim is to have no /discernible/
    pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
    could be identified without knowledge of where they came from - and
    thus no way to predict the next number, 9, in the sequence. But there
    is a pattern there - it's the 90th - 100th digits of the decimal
    expansion of pi.
    [...]

    A Google search for 342117067 gives numerous hits referring to the
    digits of pi.


    That is using knowledge of where the sequence comes from - something
    else's knowledge rather than your own, but it's the same principle.

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c,sci.math.num-analysis on Fri Feb 20 16:01:02 2026
    From Newsgroup: comp.lang.c

    On 2026-02-19 14:47, David Brown wrote:
    ...
    How likely is it that someone would guess a formula that happened to
    generate the decimal digits of pi, without more knowledge than a part
    of the sequence? I don't believe it is possible to quantify such a probability, but I would expect it to be very low.

    I'm thinking of the kind of software that looks for patterns in
    something, such as compression utilities. A compression utility
    basically converts a long string of numbers into a much shorter string
    that can be expanded by the decompression utility to recover the
    original pattern. If you look at the algorithms such code uses, you
    realize that they do not attempt to recreate the process that originally generated the long string, they just, in effect, characterize the
    resulting sequence.

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.lang.c,sci.math.num-analysis on Sat Feb 21 11:09:12 2026
    From Newsgroup: comp.lang.c

    On 20/02/2026 22:01, James Kuyper wrote:
    On 2026-02-19 14:47, David Brown wrote:
    ...
    How likely is it that someone would guess a formula that happened to
    generate the decimal digits of pi, without more knowledge than a part
    of the sequence? I don't believe it is possible to quantify such a
    probability, but I would expect it to be very low.

    I'm thinking of the kind of software that looks for patterns in
    something, such as compression utilities. A compression utility
    basically converts a long string of numbers into a much shorter string
    that can be expanded by the decompression utility to recover the
    original pattern. If you look at the algorithms such code uses, you
    realize that they do not attempt to recreate the process that originally generated the long string, they just, in effect, characterize the
    resulting sequence.


    One of the characteristics of a good PRNG is that its
    unpredictability and its statistical properties (typically they aim
    to be like white noise, but other distributions are sometimes
    useful) make its output incompressible with generic algorithms.
    Since the PRNG's sequence is generated from an algorithm, it is of
    course always possible to re-create that algorithm as a "compressed"
    form of the sequence. But generic algorithms will never manage it.

    Indeed, you can /define/ a random sequence as a series x(i) such
    that for any given compression algorithm Z, there is an integer n
    such that the Z-compressed version of x is bigger than the original
    x for any sequence length of n or more.

    And for any given compression algorithm Z, you can find a PRNG
    algorithm that Z cannot compress (after a possible initial segment).
    I haven't dug through the details, but I am confident that you could
    show this with a diagonalisation argument over Turing machines, or
    something akin to the halting-problem proofs.

    Or you can just try it yourself:

    $ dd if=/dev/urandom of=rand bs=1M count=1
    $ cp rand rand1
    $ cp rand rand2
    $ gzip rand1
    $ bzip2 rand2
    $ ls -l
    -rw-rw-r-- 1 david david 1048576 Feb 20 09:12 rand
    -rw-rw-r-- 1 david david 1048760 Feb 20 09:12 rand1.gz
    -rw-rw-r-- 1 david david 1053414 Feb 20 09:12 rand2.bz2


    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to comp.lang.c,sci.math.num-analysis on Mon Feb 23 08:32:38 2026
    From Newsgroup: comp.lang.c

    On Fri, 2/20/2026 3:16 AM, David Brown wrote:
    On 19/02/2026 23:39, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    As a deterministic function, a PRNG will obviously follow the pattern
    of its generating function.  But the aim is to have no /discernible/
    pattern.  The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
    could be identified without knowledge of where they came from - and
    thus no way to predict the next number, 9, in the sequence.  But there
    is a pattern there - it's the 90th - 100th digits of the decimal
    expansion of pi.
    [...]

    A Google search for 342117067 gives numerous hits referring to the
    digits of pi.


    That is using knowledge of where the sequence comes from - something else's knowledge rather than your own, but it's the same principle.


    "In the following sequence, what is the next digit 7,7,7,7,7,7,7,7,7 ? " :-)

    PI=3.

    1415926535 8979323846 2643383279 5028841971 6939937510
    ...
    7777777772 4846769425 9310468643 5260899021 0266057232 # Line 517834

    I suspect seeing that, that's not good.

    Using pgmp-chudnovsky.c , and dumping pi as a binary float to a file,
    I get this:

    (text version of PI) 100,000,022 bytes
    PI-Binary.bin 41,524,121 bytes exponent and limbs
    PI-Binary.bin.7Z 41,526,823 bytes 7Z Ultra compression, running on 1 core

    The entropy property looks pretty good, but I doubt I would
    be using that for my supply of random numbers :-)

    https://gmplib.org/list-archives/gmp-discuss/2008-November/003444.html
    https://stackoverflow.com/questions/3318979/how-to-serialize-the-gmp-mpf-type
    https://gmplib.org/list-archives/gmp-discuss/2007-November/002981.html

    gcc -DNO_FACTOR -fopenmp -Wall -O2 -o pgmp-chudnovsky pgmp-chudnovsky.c -lgmp -lm

    Paul

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.lang.c,sci.math.num-analysis on Mon Feb 23 16:05:45 2026
    From Newsgroup: comp.lang.c

    On 23/02/2026 14:32, Paul wrote:
    On Fri, 2/20/2026 3:16 AM, David Brown wrote:
    On 19/02/2026 23:39, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    As a deterministic function, a PRNG will obviously follow the pattern
    of its generating function.  But the aim is to have no /discernible/
    pattern.  The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
    could be identified without knowledge of where they came from - and
    thus no way to predict the next number, 9, in the sequence.  But there
    is a pattern there - it's the 90th - 100th digits of the decimal
    expansion of pi.
    [...]

    A Google search for 342117067 gives numerous hits referring to the
    digits of pi.


    That is using knowledge of where the sequence comes from - something else's knowledge rather than your own, but it's the same principle.


    "In the following sequence, what is the next digit 7,7,7,7,7,7,7,7,7 ? " :-)

    PI=3.

    1415926535 8979323846 2643383279 5028841971 6939937510
    ...
    7777777772 4846769425 9310468643 5260899021 0266057232 # Line 517834

    I suspect seeing that, that's not good.

    Using pgmp-chudnovsky.c , and dumping pi as a binary float to a file,
    I get this:

    (text version of PI) 100,000,022 bytes
    PI-Binary.bin 41,524,121 bytes exponent and limbs
    PI-Binary.bin.7Z 41,526,823 bytes 7Z Ultra compression, running on 1 core

    The entropy property looks pretty good, but I doubt I would
    be using that for my supply of random numbers :-)


    In a random sequence of decimal digits, you would expect a sequence of
    nine identical digits to turn up on average every 10 ^ 8 digits or so.
    You calculated 10 ^ 8 digits, so it's not surprising to see that here.

    As for your compression, remember that your text file contains only
    the digits 0 to 9, spaces and newlines - 12 different characters in
    8-bit bytes. If these were purely randomly distributed, you'd expect
    a best compression ratio of log(12) / log(256), or 0.448. But they
    are not completely random - your space characters and newlines are
    predictably spaced, so you get marginally better compression ratios.
    Without spaces and newlines, you'd expect log(10) / log(256)
    compression - 0.415241012. What a coincidence - this matches your
    "exponent and limbs", and your compressor can't improve on it. (I
    downloaded a billion digits of pi and gzip'ed it, for a compression
    ratio of 0.469.)

    It turns out that the pseudo-randomness here is extremely good. While
    it has not been proven that pi is "normal" (that is to say, its digits
    are all evenly distributed), it is strongly believed to be so.

    Of course it's not a great source of entropy for secure random numbers,
    but the digits of pi form a fine pseudo-random generator function (if
    you don't mind the calculation time).


    https://gmplib.org/list-archives/gmp-discuss/2008-November/003444.html
    https://stackoverflow.com/questions/3318979/how-to-serialize-the-gmp-mpf-type
    https://gmplib.org/list-archives/gmp-discuss/2007-November/002981.html

    gcc -DNO_FACTOR -fopenmp -Wall -O2 -o pgmp-chudnovsky pgmp-chudnovsky.c -lgmp -lm

    Paul


    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.lang.c,sci.math.num-analysis on Mon Feb 23 19:59:17 2026
    From Newsgroup: comp.lang.c

    On Mon, 23 Feb 2026 16:05:45 +0100
    David Brown <david.brown@hesbynett.no> wrote:
    On 23/02/2026 14:32, Paul wrote:
    On Fri, 2/20/2026 3:16 AM, David Brown wrote:
    On 19/02/2026 23:39, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    As a deterministic function, a PRNG will obviously follow the
    pattern of its generating function.  But the aim is to have no
    /discernible/ pattern.  The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7
    has no pattern that could be identified without knowledge of
    where they came from - and thus no way to predict the next
    number, 9, in the sequence.  But there is a pattern there - it's
    the 90th - 100th digits of the decimal expansion of pi.
    [...]

    A Google search for 342117067 gives numerous hits referring to the
    digits of pi.


    That is using knowledge of where the sequence comes from -
    something else's knowledge rather than your own, but it's the same
    principle.

    "In the following sequence, what is the next digit
    7,7,7,7,7,7,7,7,7 ? " :-)

    PI=3.

    1415926535 8979323846 2643383279 5028841971 6939937510
    ...
    7777777772 4846769425 9310468643 5260899021 0266057232 # Line
    517834

    I suspect seeing that, that's not good.

    Using pgmp-chudnovsky.c , and dumping pi as a binary float to a
    file, I get this:

    (text version of PI) 100,000,022 bytes
    PI-Binary.bin 41,524,121 bytes exponent and limbs
    PI-Binary.bin.7Z 41,526,823 bytes 7Z Ultra
    compression, running on 1 core

    The entropy property looks pretty good, but I doubt I would
    be using that for my supply of random numbers :-)


    In a random sequence of decimal digits, you would expect a sequence
    of nine identical digits to turn up on average every 10 ^ 8 digits or
    so. You calculated 10 ^ 8 digits, so it's not surprising to see that
    here.

    As for your compression, remember that your text file contains only
    the digits 0 to 9, spaces and newlines - 12 different characters in
    8-bit bytes. If these were purely randomly distributed, you'd expect
    a best compression ratio of log(12) / log(256), or 0.448. But they
    are not completely random - your space characters and newlines are predictably spaced so you get marginally better compression ratios.
    Without spaces and newlines, you'd expect log(10) / log(256)
    compression - 0.415241012. What a coincidence - this matches your
    "exponent and limbs", and your compressor can't improve on it. (I
    downloaded a billion digits of pi and gzip'ed it, for a compression
    ratio of 0.469.)

    It turns out that the pseudo-randomness here is extremely good.
    While it has not been proven that pi is "normal" (that is to say, its
    digits are all evenly distributed), it is strongly believed to be so.

    Of course it's not a great source of entropy for secure random
    numbers, but the digits of pi form a fine pseudo-random generator
    function (if you don't mind the calculation time).

    Would be interesting to find out if it passes L'Ecuyer's BigCrush.
    Of course, one would need far more than a billion decimal digits to
    have a chance. Something like 100B hexadecimal digits appears to be
    a minimum.

    https://gmplib.org/list-archives/gmp-discuss/2008-November/003444.html
    https://stackoverflow.com/questions/3318979/how-to-serialize-the-gmp-mpf-type
    https://gmplib.org/list-archives/gmp-discuss/2007-November/002981.html

    gcc -DNO_FACTOR -fopenmp -Wall -O2 -o pgmp-chudnovsky
    pgmp-chudnovsky.c -lgmp -lm

    Paul


    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.lang.c,sci.math.num-analysis on Mon Feb 23 20:06:23 2026
    From Newsgroup: comp.lang.c

    On 23/02/2026 18:59, Michael S wrote:
    On Mon, 23 Feb 2026 16:05:45 +0100
    David Brown <david.brown@hesbynett.no> wrote:

    On 23/02/2026 14:32, Paul wrote:
    On Fri, 2/20/2026 3:16 AM, David Brown wrote:
    On 19/02/2026 23:39, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    As a deterministic function, a PRNG will obviously follow the
    pattern of its generating function.  But the aim is to have no
    /discernible/ pattern.  The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7
    has no pattern that could be identified without knowledge of
    where they came from - and thus no way to predict the next
    number, 9, in the sequence.  But there is a pattern there - it's
    the 90th - 100th digits of the decimal expansion of pi.
    [...]

    A Google search for 342117067 gives numerous hits referring to the
    digits of pi.


    That is using knowledge of where the sequence comes from -
    something else's knowledge rather than your own, but it's the same
    principle.

    "In the following sequence, what is the next digit
    7,7,7,7,7,7,7,7,7 ? " :-)

    PI=3.

    1415926535 8979323846 2643383279 5028841971 6939937510
    ...
    7777777772 4846769425 9310468643 5260899021 0266057232 # Line
    517834

    I suspect seeing that, that's not good.

    Using pgmp-chudnovsky.c , and dumping pi as a binary float to a
    file, I get this:

    (text version of PI)  100,000,022 bytes
    PI-Binary.bin          41,524,121 bytes   exponent and limbs
    PI-Binary.bin.7Z       41,526,823 bytes   7Z Ultra compression, running on 1 core

    The entropy property looks pretty good, but I doubt I would
    be using that for my supply of random numbers :-)


    In a random sequence of decimal digits, you would expect a sequence
    of nine identical digits to turn up on average every 10^8 digits or
    so. You calculated 10^8 digits, so it's not surprising to see that
    here.

    As for your compression, remember that your text file contains only
    the digits 0 to 9, spaces and newlines - 12 different characters in
    8-bit bytes. If these were purely randomly distributed, you'd expect
    a best compression ratio of log(12) / log(256), or 0.448. But they
    are not completely random - your space characters and newlines are
    predictably spaced so you get marginally better compression ratios.
    Without spaces and newlines, you'd expect log(10) / log(256)
    compression - 0.415241012. What a coincidence - this matches your
    "exponent and limbs", and your compressor can't improve on it. (I
    downloaded a billion digits of pi and gzip'ed it, for a compression
    ratio of 0.469.)
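    Those bounds are just information-content arithmetic: the best
    possible ratio when packing an alphabet of n equiprobable symbols
    into 8-bit bytes is log(n) / log(256). A quick check:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* digits 0-9 plus space and newline: 12 symbols */
        printf("12 symbols: %.6f\n", log(12.0) / log(256.0)); /* ~0.448 */
        /* digits only: 10 symbols */
        printf("10 symbols: %.6f\n", log(10.0) / log(256.0)); /* ~0.415 */
        return 0;
    }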

    It turns out that the pseudo-randomness here is extremely good.
    While it has not been proven that pi is "normal" (that is to say, its
    digits are all evenly distributed), it is strongly believed to be so.

    Of course it's not a great source of entropy for secure random
    numbers, but the digits of pi form a fine pseudo-random generator
    function (if you don't mind the calculation time).


    Would be interesting to find out if it passes BigCrush of L’Ecuyer.
    Of course, one would need far more than a billion decimal digits to
    have a chance. Something like 100B hexadecimal digits appears to be a minimum.


    Weirdly (at least /I/ think it is weird), it is easier to calculate hexadecimal digits of pi than decimal digits. At least, it is possible
    to calculate them independently, without having to calculate and
    remember all the previous digits. So it should be possible to split the
    task up and run it in parallel on multiple systems.

    Of course, confirming that the hexadecimal digits of pi are random
    enough to pass such a test does not ensure that the decimal digits would
    do so too.
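    The independent-digit property mentioned above is the
    Bailey-Borwein-Plouffe (BBP) formula, which yields the hexadecimal
    digit of pi at an arbitrary position without computing the earlier
    ones. A minimal double-precision sketch (adequate for modest
    positions; large positions need more careful modular arithmetic):

    #include <math.h>
    #include <stdio.h>

    /* 16^p mod m by binary exponentiation in doubles
       (exact while m*m stays below 2^53). */
    static double pow16_mod(int p, int m)
    {
        double r = 1.0, b = 16.0;
        while (p > 0) {
            if (p & 1)
                r = fmod(r * b, (double)m);
            b = fmod(b * b, (double)m);
            p >>= 1;
        }
        return r;
    }

    /* Fractional part of sum over k of 16^(d-k)/(8k+j). */
    static double bbp_series(int j, int d)
    {
        double s = 0.0;
        for (int k = 0; k < d; k++) {
            s += pow16_mod(d - k, 8 * k + j) / (8 * k + j);
            s -= floor(s);             /* keep only the fraction */
        }
        for (int k = d; k <= d + 20; k++)   /* small tail, k >= d */
            s += pow(16.0, d - k) / (8 * k + j);
        return s;
    }

    /* Hex digit of pi at position d after the point (d = 0 gives 2). */
    static int pi_hex_digit(int d)
    {
        double x = 4.0 * bbp_series(1, d) - 2.0 * bbp_series(4, d)
                 - bbp_series(5, d) - bbp_series(6, d);
        x -= floor(x);                  /* reduce to [0,1) */
        return (int)(16.0 * x);
    }

    int main(void)
    {
        for (int d = 0; d < 12; d++)
            printf("%X", pi_hex_digit(d)); /* pi = 3.243F6A8885A3... */
        printf("\n");
        return 0;
    }

    This is what makes the split-across-machines scheme work: each
    worker gets a disjoint range of positions and needs no state from
    the others.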


    https://gmplib.org/list-archives/gmp-discuss/2008-November/003444.html
    https://stackoverflow.com/questions/3318979/how-to-serialize-the-gmp-mpf-type
    https://gmplib.org/list-archives/gmp-discuss/2007-November/002981.html

    gcc -DNO_FACTOR -fopenmp -Wall -O2 -o pgmp-chudnovsky pgmp-chudnovsky.c -lgmp -lm

    Paul

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to comp.lang.c,sci.math.num-analysis on Mon Feb 23 15:24:20 2026
    From Newsgroup: comp.lang.c

    On Mon, 2/23/2026 2:06 PM, David Brown wrote:

          (text version of PI)  100,000,022 bytes
          PI-Binary.bin          41,524,121 bytes   exponent and limbs

    Weirdly (at least /I/ think it is weird), it is easier to calculate hexadecimal digits of pi than decimal digits.

    I computed the decimal representation and the hex representation
    (by dumping the exponent and limbs), in the same run.

    int mpf_out_raw (FILE *f, mpf_t X) {
        int expt; mpz_t Z; size_t nz;

        /* Write the mpf exponent first, then alias the mpf's limb
           array as an mpz so mpz_out_raw() can serialize the limbs. */
        expt = X->_mp_exp;
        fwrite(&expt, sizeof(int), 1, f);
        nz = X->_mp_size;
        Z->_mp_alloc = nz;
        Z->_mp_size = nz;
        Z->_mp_d = X->_mp_d;
        return (mpz_out_raw(f, Z) + sizeof(int));
    }

    And that's called this way.

    /* Open the destination file in binary write mode */
    FILE *destination = fopen("PI-Binary.bin", "wb");
    if (!destination) {
        perror("Error opening PI-Binary.bin file");
    } else {
        mpf_out_raw(destination, qi); /* qi happens to hold 100 million digits of PI */
        fflush(destination);
        fclose(destination);
    }

    You can do them in the same run.

    The 7,7,7,7,7,7,7,7,7,2 sequence was detected in a 32 million digit run
    of SuperPi 1.5 XS. The 100 million digit computation is too large
    for SuperPI, and pgmp-chudnovsky.c (with OpenMP) was
    used for that, with a little extra code thrown in so I could get
    the floating point storage in raw format. It takes a bit more
    than one minute to generate 100 million digits (16 cores). The
    compression attempt was done on a single core, to ensure the best
    attempt at compression with 7ZIP.

    The computational order O() of the pi method used is covered here.

    https://en.wikipedia.org/wiki/Chudnovsky_algorithm

    Paul
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Axel Reichert@mail@axel-reichert.de to comp.lang.c,sci.math.num-analysis on Tue Feb 24 07:08:52 2026
    From Newsgroup: comp.lang.c

    David Brown <david.brown@hesbynett.no> writes:

    Of course, confirming that the hexadecimal digits of pi are random
    enough to pass such a test does not ensure that the decimal digits
    would do so too.

    I was puzzled by the "Of course": To me, this is not intuitively clear.
    Is there any easy (not too technical) way to "see this"/make it
    plausible? My gut feeling (wrongly?) said that the base should not
    affect the randomness of a numerical pattern. "Of course" I am aware
    (and have taught dozens of numerical beginners) that, say, 0.1 in base 10
    has a non-terminating representation in base 2, but neither representation is "random".

    Pointers or simple counter-examples highly welcome!

    Axel
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to sci.math.num-analysis,comp.lang.c on Tue Feb 24 10:24:34 2026
    From Newsgroup: comp.lang.c

    On 24/02/2026 07:08, Axel Reichert wrote:
    David Brown <david.brown@hesbynett.no> writes:

    Of course, confirming that the hexadecimal digits of pi are random
    enough to pass such a test does not ensure that the decimal digits
    would do so too.

    I was puzzled by the "Of course": To me, this is not intuitively clear.
    Is there any easy (not too technical) way to "see this"/make it
    plausible? My gut feeling (wrongly?) said that the base should not
    affect the randomness of a numerical pattern. "Of course" I am aware
    (and have taught dozens of numerical beginners) that, say, 0.1 in base 10
    has a non-terminating representation in base 2, but neither representation is "random".

    Pointers or simple counter-examples highly welcome!

    Axel

    Numbers can be normal in some bases and not in others. This is easy to
    see if we pick related bases, such as base 2 and base 16. For example,
    let x be 1/3. Then x is 0.0101010101... in base 2. Each digit occurs
    there with equal frequency, so it is at least simply normal in base 2.
    But in base 16, it is 0.55555555..., which is clearly very far from normal.

    I think (but I am not sure) that if a number is normal in base B, then
    it will be normal in any power of B. For bases that are not powers of
    a common base, things are not as clear (and my simple example shows
    that digit statistics can change completely between related bases).

    Almost all (in the technical mathematical sense) real numbers are normal
    in all bases. And lots of numbers (including pi) are believed to be
    normal, and have been checked to various lengths in many bases. But it
    is extremely difficult to prove that any given number is normal, unless
    it can be seen from its construction.

    Being normal in a base is not sufficient for the digits to form a good pseudo-random sequence, but it is a necessary condition for a uniformly distributed random sequence.

    (I know you set the follow-ups to exclude comp.lang.c, since it is
    off-topic in that group, but I added it again as I don't follow sci.math.num-analysis, so I would not see any replies. People have
    different opinions on the pros and cons of limiting follow-up groups.
    My opinion is that it is better to leave all groups there as long as
    there are people from different groups in the discussion, even if it is
    moving off-topic for a group, because limiting follow-up groups can lead
    to fragmentation. It is better for comp.lang.c to have one off-topic
    thread than to have multiple threads that are part of the same
    discussion but appear separately as groups are added and removed. If
    the discussion goes on for long, and becomes dominated by s.m.n regulars
    and of no interest to c.l.c regulars, then it becomes time to move it
    over to just that one group. Others can have different opinions on such matters.)

    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Axel Reichert@mail@axel-reichert.de to sci.math.num-analysis,comp.lang.c on Thu Feb 26 19:13:41 2026
    From Newsgroup: comp.lang.c

    David Brown <david.brown@hesbynett.no> writes:

    Numbers can be normal in some bases and not in others. This is easy to
    see if we pick related bases, such as base 2 and base 16. For example,
    let x be 1/3. Then x is 0.0101010101... in base 2. Each digit occurs
    there with equal frequency, so it is at least simply normal in base 2.
    But in base 16, it is 0.55555555..., which is clearly very far from normal.

    I had to look up "normal" on Wikipedia and was in for a delightful read.
    Thanks for the pointer and the nice explanation!

    Best regards

    Axel
    --- Synchronet 3.21c-Linux NewsLink 1.2