Thiago Adams <thiago.adams@gmail.com> writes:
speaking on signed x unsigned,
u8"a" in C11 had the type char [N]. Normally char is signed
I would have said "commonly" rather than "normally". Not an
important point.
in C23 it is char8_t [N], where char8_t is unsigned char.
when converting code from C11 to C23 we have an error here:
const char* s = u8"";
I generally cast "char*" to "unsigned char*" when handling
something with UTF-8. I am not using u8""; I just use " " with UTF-8
encoded source code and I just assume const char* is UTF-8.
That raises another issue.
The <uchar.h> header was introduced in C11. In C11 and C17,
that header defines char16_t and char32_t. C23 introduces char8_t.
There doesn't seem to be any way, other than checking the value of
__STDC_VERSION__, to determine whether char8_t is defined or not.
There are no *_MIN or *_MAX macros for these types, either in
<uchar.h> or in <limits.h>. A test program I just wrote would have
been a little simpler if I could have used `#ifdef CHAR8_MAX`.
Here's the test program:

#include <stdio.h>
#include <uchar.h>

#define TYPEOF(x) \
    (_Generic(x, \
        char: "char", \
        signed char: "signed char", \
        unsigned char: "unsigned char", \
        short: "short", \
        unsigned short: "unsigned short", \
        int: "int", \
        unsigned int: "unsigned int", \
        long: "long", \
        unsigned long: "unsigned long", \
        long long: "long long", \
        unsigned long long: "unsigned long long"))

int main(void) {
    printf("__STDC_VERSION__ = %ldL\n", __STDC_VERSION__);
    printf("u8\"a\"[0] is of type %s\n",
           TYPEOF(u8"a"[0]));
#if __STDC_VERSION__ >= 202311L
    printf("char8_t is %s\n", TYPEOF((char8_t)0));
#endif
    printf("char16_t is %s\n", TYPEOF((char16_t)0));
    printf("char32_t is %s\n", TYPEOF((char32_t)0));
}

Its output with `gcc -std=c17`:
__STDC_VERSION__ = 201710L
u8"a"[0] is of type char
char16_t is unsigned short
char32_t is unsigned int
Its output with `gcc -std=c23`:
__STDC_VERSION__ = 202311L
u8"a"[0] is of type unsigned char
char8_t is unsigned char
char16_t is unsigned short
char32_t is unsigned int
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...]
The <uchar.h> header was introduced in C11. In C11 and C17,
that header defines char16_t and char32_t. C23 introduces char8_t.
There doesn't seem to be any way, other than checking the value of
__STDC_VERSION__, to determine whether char8_t is defined or not.
There are no *_MIN or *_MAX macros for these types, either in
<uchar.h> or in <limits.h>. A test program I just wrote would have
been a little simpler if I could have used `#ifdef CHAR8_MAX`.
Since C23 defines char8_t to be the same type as unsigned char,
it seems better to just define it when it isn't there:
#include <limits.h>
#if CHAR_BIT == 8 && __STDC_VERSION__ < 202311
typedef unsigned char char8_t;
#endif
But before C23, u8"a" is a syntax error.
On 12/15/2025 7:27 PM, Keith Thompson wrote:
...
But before C23, u8"a" is a syntax error.
u8"a" was introduced in C11.
u8'a' was introduced in C23.
speaking on signed x unsigned,
u8"a" in C11 had the type char [N]. Normally char is signed
in C23 it is char8_t [N], where char8_t is unsigned char.
when converting code from C11 to C23 we have an error here:
const char* s = u8"";
I generally cast "char*" to "unsigned char*" when handling something
with UTF-8. I am not using u8""; I just use " " with UTF-8 encoded source code
and I just assume const char* is UTF-8.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...]
The <uchar.h> header was introduced in C11. In C11 and C17,
that header defines char16_t and char32_t. C23 introduces char8_t.
There doesn't seem to be any way, other than checking the value of
__STDC_VERSION__, to determine whether char8_t is defined or not.
There are no *_MIN or *_MAX macros for these types, either in
<uchar.h> or in <limits.h>. A test program I just wrote would have
been a little simpler if I could have used `#ifdef CHAR8_MAX`.
[...]
Since C23 defines char8_t to be the same type as unsigned char,
it seems better to just define it when it isn't there:
#include <limits.h>
#if CHAR_BIT == 8 && __STDC_VERSION__ < 202311
typedef unsigned char char8_t;
#endif
Yes. And the test for CHAR_BIT may not be necessary, depending on
the programmer's intent. char8_t is the same type as unsigned char
even if CHAR_BIT > 8.
Similarly, char16_t and char32_t are the same type as
uint_least16_t and uint_least32_t, respectively.