On 13.06.2025 15:56, Michael S wrote:
A significant part of x86 installed base (all Intel Core CPUs starting
from gen 6 up to gen 9 and their Xeon contemporaries) has extension
named Intel MPX that was invented exactly for that purpose. But it didn't
work particularly well. Compiler people never liked it, but despite
that it was supported by several generations of gcc and probably by
clang as well.
This does not really sound like something "readily available", unless you
are suggesting that I migrate to a Linux kernel from 10 years ago, switch
to gcc 5.0 and use outdated hardware.
The proper solution to your problem is to stop using memory-unsafe
language for complex application programming. It's not that successful
use of unsafe languages for complex application programming is
impossible. The practice proved many times that it can be done. But
only by a very good team. Your team is not good enough.
Just to clarify: I didn’t post here seeking help with a simple out-of-bounds
issue, nor was I here to vent. I’ve been wrangling C code in complex, high-performance systems for over a decade - I’m managing just fine. Code improvement is a continual, non-negotiable process in our line of work, but fires happen occasionally nonetheless. While fixing the issue, I started wondering about how faults like this could be located faster, that is assuming they do slip into production - because in spite of the testing process, some faults will inevitably get to customers.
A crash that happens closer to the source of the problem (same compilation unit) would significantly ease the debugging effort. I figured it was a
topic worth sharing, in the spirit of sparking some constructive
discussions.
IIUC in your example the array was global, so compiler knew its
bound and in principle could generate bounds checks. But
I am not aware of a C compiler which actually generates such
checks.
That said, detecting out-of-bounds array access is no panacea. Memory corruption can arise from various sources, such as dangling pointers or poorly managed pointer arithmetic.
Hence why I was looking in the direction
of the MMU. All compilation units of a program share the same set of TLBs.
I figured there might perhaps be a way to isolate a given compilation unit
in different TLBs, effectively sandboxing its memory, then make this unit communicate with the rest of the program via shm when shared memory
accesses are needed.
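A page-granular approximation of this idea can be sketched with guard pages. The sketch below assumes a POSIX system with `mmap`, `mprotect` and `MAP_ANONYMOUS` (Linux, the BSDs); the names `guarded_alloc` and `guarded_free` are hypothetical. It does not sandbox a compilation unit; it only places one buffer between two inaccessible pages, so that a write just past the end faults at the offending instruction instead of silently corrupting a neighbour.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns `size` writable bytes whose end abuts a
 * PROT_NONE guard page; another guard page precedes the data pages.
 * `size` should be a multiple of the element alignment, because the
 * buffer is right-aligned against the high guard page.
 * Returns NULL on failure. */
void *guarded_alloc(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || size == 0)
        return NULL;
    size_t pg = (size_t)page;
    size_t data_pages = (size + pg - 1) / pg;
    size_t total = (data_pages + 2) * pg;      /* low guard + data + high guard */

    /* Reserve everything as inaccessible, then open up the middle. */
    unsigned char *base = mmap(NULL, total, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    if (mprotect(base + pg, data_pages * pg, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, total);
        return NULL;
    }
    /* Right-align so that buf[size] is the first byte of the high guard. */
    return base + pg + data_pages * pg - size;
}

/* Releases a buffer obtained from guarded_alloc() with the same size. */
void guarded_free(void *buf, size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(buf != NULL && page > 0);
    size_t pg = (size_t)page;
    size_t data_pages = (size + pg - 1) / pg;
    uintptr_t end = (uintptr_t)buf + size;     /* start of the high guard page */
    unsigned char *base = (unsigned char *)(end - (data_pages + 1) * pg);
    munmap(base, (data_pages + 2) * pg);
}
```

Overruns past the end are caught exactly; underruns are caught only once they reach the low guard page, since the buffer does not start on a page boundary unless its size is a multiple of the page size.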
Mateusz Viste <mateusz@not.gonna.tell> wrote:
That said, detecting out-of-bounds array access is no panacea. Memory
corruption can arise from various sources, such as dangling pointers or
poorly managed pointer arithmetic.
AFAICS there is no reason for explicit pointer arithmetic in well
written C programs.
Implicit pointer arithmetic (coming from array
indexing) is done by compiler so should be no problem. Like in
On 2025-06-15, Waldek Hebisch <antispam@fricas.org> wrote:
Mateusz Viste <mateusz@not.gonna.tell> wrote:
That said, detecting out-of-bounds array access is no panacea. Memory
corruption can arise from various sources, such as dangling pointers or
poorly managed pointer arithmetic.
AFAICS there is no reason for explicit pointer arithmetic in well
written C programs.
LOL, you heard it here.
Implicit pointer arithmetic (coming from array
indexing) is done by compiler so should be no problem. Like in
Array indexing *is* pointer arithmetic.
Are you not aware of this equivalence?
(E1)[(E2)] <---> *((E1) + (E2))
In fact, let's draw the commutative diagram
(E1)[(E2)] <---> *((E1) + (E2))
^ ^
| |
| |
v v
(E2)[(E1)] <---> *((E2) + (E1))
You're not saying anything here other than that you like the p[i]
/notation/ better than *(p + i), and &p[i] better than p + i.
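The equivalence in the diagram can be checked directly. A minimal sketch (the function names `same_element` and `show_equivalence` are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when all four spellings designate the same object. */
int same_element(int *p, size_t i)
{
    return &p[i] == p + i
        && &i[p] == p + i          /* (E2)[(E1)] is legal, if unusual */
        && &*(p + i) == &p[i]
        && &*(i + p) == &p[i];
}

/* The same identities on values rather than addresses. */
void show_equivalence(void)
{
    int a[4] = {10, 20, 30, 40};
    int *p = a;

    assert(p[2] == *(p + 2));      /* (E1)[(E2)] <---> *((E1) + (E2)) */
    assert(2[p] == *(2 + p));      /* (E2)[(E1)] <---> *((E2) + (E1)) */
    assert(p[2] == 2[p]);          /* the vertical arrows             */
    assert(same_element(p, 2));
}
```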
Kaz Kylheku <643-408-1753@kylheku.com> wrote:...
You're not saying anything here other than that you like the p[i]
/notation/ better than *(p + i), and &p[i] better than p + i.
The indexing notation at least has a chance of being automatically
checked (in cases when the compiler/checker knows the array size). With
arbitrary user-written pointer arithmetic there is no hope of automatic
checking.
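Where the array declaration is in scope, the bound is recoverable with `sizeof`, so such a check can also be spelled by hand. A minimal sketch; the macros `ARRAY_LEN`, `IN_BOUNDS`, `AT` and the function `update_my_socks_checked` are hypothetical names, not something from the thread. For the automatic variant, both GCC and Clang accept `-fsanitize=bounds`, which instruments indexing of arrays whose size is known at the point of access.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array (gives a wrong answer for a pointer). */
#define ARRAY_LEN(arr)    (sizeof(arr) / sizeof((arr)[0]))

/* Nonzero when i is a valid index for arr. A negative i converts to a
 * huge size_t and is therefore rejected as well. */
#define IN_BOUNDS(arr, i) ((size_t)(i) < ARRAY_LEN(arr))

/* Checked lvalue access; aborts through assert() unless NDEBUG is set.
 * Note that i is evaluated twice. */
#define AT(arr, i)        (*(assert(IN_BOUNDS(arr, i)), &(arr)[i]))

static int *socks[0xffff];         /* declaration from the thread: 65535 slots */

/* Checked variant of the thread's function: returns 0 on success,
 * -1 when the masked index has no slot. */
int update_my_socks_checked(int *sock, int val)
{
    unsigned idx = (unsigned)val & 0xffffu;

    if (!IN_BOUNDS(socks, idx))
        return -1;
    AT(socks, idx) = sock;
    return 0;
}
```

With the declaration as posted, the call with `val == 0xffff` is rejected instead of writing one element past the end.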
This might not be a strictly C question, but it definitely concerns all
C programmers.
Earlier today, I fixed an out-of-bounds write bug. An obvious issue:
static int *socks[0xffff];
void update_my_socks(int *sock, int val) {
socks[val & 0xffff] = sock;
}
<snip>
Imagine an alternate universe in which array declarations took the
form (borrowed from Unisys ALGOL):
array_name[lower_bound : upper_bound]
Mateusz Viste <mateusz@not.gonna.tell> wrote:
That said, detecting out-of-bounds array access is no panacea. Memory
corruption can arise from various sources, such as dangling pointers or
poorly managed pointer arithmetic.
AFAICS there is no reason for explicit pointer arithmetic in well
written C programs.
Implicit pointer arithmetic (coming from array
indexing) is done by compiler so should be no problem.
Sure. Or some people prefer to single-step with a debugger. Such
people can make their lives a little easier by surrounding the
buffer with sentinel soldiers, setting the sentinel soldiers to a
magic number, and putting a watch on them both - the buffer high
soldier and the buffer low soldier.
antispam@fricas.org (Waldek Hebisch) writes:
Mateusz Viste <mateusz@not.gonna.tell> wrote:
That said, detecting out-of-bounds array access is no panacea. Memory
corruption can arise from various sources, such as dangling pointers or
poorly managed pointer arithmetic.
AFAICS there is no reason for explicit pointer arithmetic in well
written C programs.
This assertion is in effect a No True Scotsman statement.
Implicit pointer arithmetic (coming from array
indexing) is done by compiler so should be no problem.
Even if there is no direct manipulation ("pointer arithmetic") of
pointer variables, access can be checked only if array bounds
information is available, and in many cases it isn't. The reason is
(among other things) C doesn't have array parameters; what it does
have instead is pointer parameters. At the point in the code when
an "array" access is to be done, the information needed to check
that an index value is in bounds just isn't available. The culprit
here is not explicit pointer arithmetic, but lacking the information
needed to do a bounds check. That lack is inherent in how the C
language works with respect to arrays and pointer conversion.
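One common workaround for the missing information is to pass the bound along with the pointer. The sketch below is only an illustration of that idea; `struct int_span`, `SPAN_OF`, `span_at` and `span_store` are hypothetical names, and nothing in the language enforces their use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the bound travels with the address, so
 * the callee has what it needs to check an index. */
struct int_span {
    int   *data;
    size_t len;
};

/* Build a span from a true array while its size is still known.
 * Must not be applied to a pointer (sizeof would be wrong). */
#define SPAN_OF(arr) ((struct int_span){ (arr), sizeof(arr) / sizeof((arr)[0]) })

/* Returns a pointer to element i, or NULL when i is out of bounds. */
int *span_at(struct int_span s, size_t i)
{
    assert(s.data != NULL || s.len == 0);
    return i < s.len ? &s.data[i] : NULL;
}

/* Example callee: stores value at index i; 0 on success, -1 otherwise. */
int span_store(struct int_span s, size_t i, int value)
{
    int *slot = span_at(s, i);

    if (slot == NULL)
        return -1;
    *slot = value;
    return 0;
}
```

A caller would write `span_store(SPAN_OF(table), i, v)` at the point where `table` is still an array; once it has decayed to a plain `int *`, the length can no longer be recovered.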
On Thu, 12 Jun 2025 19:15:26 +0100, Richard Heathfield wrote:
Sure. Or some people prefer to single-step with a debugger. Such
people can make their lives a little easier by surrounding the
buffer with sentinel soldiers, setting the sentinel soldiers to a
magic number, and putting a watch on them both - the buffer high
soldier and the buffer low soldier.
I think that when a write goes out of bounds, it often lands on the
memory at the array's two boundaries... but there are cases where the
boundaries are left intact and memory is still written outside the
array, somewhere else.
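The sentinel arrangement quoted above can be sketched as follows. The names (`struct guarded_buf`, `SENTINEL_MAGIC`, `BUF_SIZE`) and the magic value are arbitrary choices for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SENTINEL_MAGIC 0xDEADBEEFu     /* arbitrary, unlikely to occur by chance */
#define BUF_SIZE       64

/* Struct members are laid out in declaration order, so `low` sits
 * before the buffer and `high` after it. */
struct guarded_buf {
    uint32_t      low;                 /* the buffer low soldier  */
    unsigned char data[BUF_SIZE];
    uint32_t      high;                /* the buffer high soldier */
};

void guarded_buf_init(struct guarded_buf *g)
{
    g->low = SENTINEL_MAGIC;
    memset(g->data, 0, sizeof g->data);
    g->high = SENTINEL_MAGIC;
}

/* Returns 1 while both sentinels still hold the magic number. */
int guarded_buf_intact(const struct guarded_buf *g)
{
    return g->low == SENTINEL_MAGIC && g->high == SENTINEL_MAGIC;
}

/* Intended to be called at module boundaries in debug builds. */
void guarded_buf_check(const struct guarded_buf *g)
{
    assert(guarded_buf_intact(g));
}
```

In a debugger, a watchpoint on `low` and `high` (for example with gdb's `watch` command) stops the program at the write itself. As the reply points out, a stray write that lands beyond the sentinels goes unnoticed by this scheme.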
This might not be a strictly C question, but it definitely concerns all
C programmers.
Earlier today, I fixed an out-of-bounds write bug. An obvious issue:
static int *socks[0xffff];
void update_my_socks(int *sock, int val) {
socks[val & 0xffff] = sock;
}
While the presented issue is common knowledge for anyone familiar with
C, *locating* the bug was challenging. The program did not crash at the moment of the out-of-bounds write but much later - somewhere entirely different, in a different object file that maintained a static pointer
for tracking a position in a linked list. To my surprise, the pointer
was randomly reset to NULL about once a week, causing a segfault.
Tracing this back to an unrelated out-of-bounds write elsewhere in the
code was tedious, to say the least.
This raises a question: how can such corruptions be detected sooner? Protected mode prevents interference between programs but doesn’t
safeguard a program from corrupting itself. Is there a way to enforce
memory protection between module files of the same program? After all,
static objects shouldn't be accessible outside their compilation unit.
How would you approach this?
Mateusz
On 14.06.2025 01:31, Tim Rentsch wrote:
It isn't wrong to think of bitwise-and as masking-in (or possibly
masking-out) of certain bits, but it still isn't a modulo. A
modulo operation is what is desired;
By "different viewpoints," I meant that while you approach the
problem by applying a modulo operation to the index so it fits the
array size, I tend to think in terms of ensuring the index
correctly maps to a location within an n-bit address space.
Naturally, the array should accommodate the maximum possible index
for the given address space, and that's where the original code
fell short. And you're absolutely right that hardcoded values are
problematic: the size of the array should have been linked with
the n-bit address space expectation.
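A minimal sketch of what linking the two could look like, following the n-bit index view described above (the modulo view would instead reduce the index by the element count). The macro names are hypothetical, and `static_assert` assumes C11.

```c
#include <assert.h>
#include <stddef.h>

/* One constant drives both the array size and the index mask. */
#define SOCK_INDEX_BITS 16u
#define SOCK_COUNT      ((size_t)1 << SOCK_INDEX_BITS)   /* 65536 slots */
#define SOCK_INDEX_MASK (SOCK_COUNT - 1)                 /* 0xffff      */

static int *socks[SOCK_COUNT];

/* Compile-time guard: every masked index must have a slot. */
static_assert(sizeof socks / sizeof socks[0] == SOCK_INDEX_MASK + 1,
              "socks[] must cover the whole masked index range");

void update_my_socks(int *sock, int val)
{
    /* Keeps the low SOCK_INDEX_BITS bits, also for negative val. */
    socks[(size_t)val & SOCK_INDEX_MASK] = sock;
}
```

Changing `SOCK_INDEX_BITS` now resizes the array and the mask together, so the off-by-one between `0xffff` elements and a `0xffff` mask cannot reappear.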