Enable soft-fp for -m64 only.
libgcc/ChangeLog:
* config.host: Include makefiles t-softfp for -m64.
* config/s390/sfp-exceptions.c: New file.
* config/s390/sfp-machine.h: New file.
* config/s390/t-softfp: New file.
aarch64 for some strange reason unconditionally enables -Werror for libgcc
building and this particular file for some strange reason contains
a useless static forward declaration of a function only defined in the
#if defined __sun__ && defined __svr4__
block and not otherwise (with __attribute__((constructor))).
And we warn (with -Werror error) in the non-__sun__/__svr4__ case because
it declares a static function that is never defined.
The forward declaration makes no sense to me, for static functions
forward declarations shouldn't be needed even for -Wstrict-prototypes,
and AFAIK we don't warn on static __attribute__((constructor)) void foo (void) {}
being unused. And the function isn't used before being defined.
So, the following patch just removes the forward declaration.
2025-08-06 Jakub Jelinek <jakub@redhat.com>
PR libgcc/121397
* enable-execute-stack-mprotect.c (check_enabling): Remove useless
forward declaration.
Update FMV features to latest ACLE spec of 2024Q4 - several features have been
removed or merged. Add FMV support for CSSC and MOPS. Preserve the ordering
in enum CPUFeatures.
gcc:
* common/config/aarch64/cpuinfo.h: Remove unused features, add FEAT_CSSC
and FEAT_MOPS.
* config/aarch64/aarch64-option-extensions.def: Remove FMV support
for RPRES, use PULL rather than AES, add FMV support for CSSC and MOPS.
libgcc:
* config/aarch64/cpuinfo.c (__init_cpu_features_constructor):
Remove unused features, add support for CSSC and MOPS.
Cleanup HWCAP defines - rather than including hwcap.h and then repeating it
using ifndef, just define the HWCAPs we need exactly as in hwcap.h.
libgcc:
* config/aarch64/cpuinfo.c: Cleanup HWCAP defines.
This optional header is used to bring in the definition of the
struct __ifunc_arg_t type. Since it has been added to glibc only
recently, the previous implementation had to check whether this
header is present and, if not, it provide its own definition.
This creates dead code because either one of these two parts would
not be tested. The ABI specification for ifunc resolvers allows to
create own ABI-compatible definition for this type, which is the
right way of doing it.
In addition to improving consistency, the new approach also helps
with addition of new fields to struct __ifunc_arg_t type without
the need to work-around situations when the definition imported
from the header lacks these new fields.
ABI allows to define as many hwcap fields in this struct as needed,
provided that at runtime we only access the fields that are permitted
by the _size value.
gcc/
* config/aarch64/aarch64.cc (build_ifunc_arg_type):
Add new fields _hwcap3 and _hwcap4.
libatomic/
* config/linux/aarch64/host-config.h (__ifunc_arg_t):
Remove sys/ifunc.h and add new fields _hwcap3 and _hwcap4.
libgcc/
* config/aarch64/cpuinfo.c (__ifunc_arg_t): Likewise.
(__init_cpu_features): obtain and assign values for the
fields _hwcap3 and _hwcap4.
(__init_cpu_features_constructor): check _size in the
arg argument.
SME uses a lazy save system to manage ZA. The idea is that,
if a function with ZA state wants to call a "normal" function,
it can leave its state in ZA and instead set up a lazy save buffer.
If, unexpectedly, that normal function contains a nested use of ZA,
that nested use of ZA must commit the lazy save first.
This lazy save system uses a special system register called TPIDR2_EL0.
See:
https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#66the-za-lazy-saving-scheme
for details.
The ABI specifies that, on entry to an exception handler, the following
things must be true:
* PSTATE.SM must be 0 (the processor must be in non-streaming mode)
* PSTATE.ZA must be 0 (ZA must be off)
* TPIDR2_EL0 must be 0 (there must be no uncommitted lazy save)
This is normally done by making _Unwind_RaiseException & friends
commit any lazy save before they unwind. This also has the side
effect of ensuring that TPIDR2_EL0 is never left pointing to a
lazy save buffer that has been unwound.
However, things get more complicated with signals. If:
(a) a signal is raised while ZA is dormant (that is, while there is an
uncommitted lazy save);
(b) the signal handler throws an exception; and
(c) that exception is caught outside the signal handler
something must ensure that the lazy save from (a) is committed.
This would be simple if the signal handler was entered with ZA and
TPIDR2_EL0 intact. However, for various good reasons that are out
of scope here, this is not done. Instead, Linux now clears both
TPIDR2_EL0 and PSTATE.ZA before entering a signal handler, see:
https://lore.kernel.org/all/20250417190113.3778111-1-mark.rutland@arm.com/
for details.
Therefore, it is the unwinder that must simulate a commit of the lazy
save from (a). It can do this by reading the previous values of
TPIDR2_EL0 and ZA from the sigcontext.
The SME-related sigcontext structures were only added to linux's
asm/sigcontext.h relatively recently and we can't rely on GCC being
built against such recent kernel header files. The patch therefore uses
defines relevant macros if they are not defined and provide types that
comply with ABI layout of the corresponding linux types.
The patch includes some ugly casting in an attempt to support big-endian
ILP32, even though SME on big-endian ILP32 linux should never be a thing.
We can remove it if we also remove ILP32 support from GCC.
Co-authored-by: Yury Khrustalev <yury.khrustalev@arm.com>
Reviewed-by: Tamar Christina <tamar.christina@arm.com>
gcc/
* doc/sourcebuild.texi (aarch64_sme_hw): Document.
gcc/testsuite/
* lib/target-supports.exp (add_options_for_aarch64_sme)
(check_effective_target_aarch64_sme_hw): New procedures.
* g++.target/aarch64/sme/sme_throw_1.C: New test.
* g++.target/aarch64/sme/sme_throw_2.C: Likewise.
libgcc/
* config/aarch64/linux-unwind.h (aarch64_fallback_frame_state):
If a signal was raised while there was an uncommitted lazy save,
commit the save as part of the unwind process.
This dates back to the creation of top-level `libgcc` in
fa9585134f. I strongly suspect that this
does nothing.
Andrew Pinksi adds:
> So looking into this further, MACHMODE_H used part of LIBGCC_DEPS
> because of TM_H and r0-78222-gfa9585134f6f58 moved away from including
> tm.h from libgcc. It was copied over unused.
It is indeed used then.
(For background context, my overall goal here is hoping libgcc can depend on
fewer/no stuff that is generated by `gcc/Makefile`. This is me trying to
pluck some low-hanging fruit -- this is the only direct mention of
`insn-modes.h` in libgcc.)
libgcc/ChangeLog:
* Makefile.in: Delete dead `MACHMODE_H` variable
For aarch64, libgcc is built with -Werror, after the latest
-Wunused-but-set* commit (r16-2258-g0eac9cfee8cb0b21d), a new warning
showed up:
```
../../../gcc/libgcc/config/libbid/bid_binarydecimal.c: In function
‘__binary32_to_bid128’:
../../../gcc/libgcc/config/libbid/bid_binarydecimal.c:130:31: error:
variable ‘c3’ set but not used [-Werror=unused-but-set-variable=]
130 | { unsigned long long c0,c1,c2,c3; \
| ^~
../../../gcc/libgcc/config/libbid/bid_binarydecimal.c:146842:5: note:
in expansion of macro ‘__mul_10x256_to_256’
146842 | __mul_10x256_to_256 (z.w[5], z.w[4], z.w[3], z.w[2],
z.w[5], z.w[4],
| ^~~~~~~~~~~~~~~~~~~
```
This fixes it by casting c3 to void after the last __mul_10x64 in
__mul_10x256_to_256 macro to mark it as being "used".
libgcc/config/libbid/ChangeLog:
* bid_binarydecimal.c (__mul_10x256_to_256): Mark c3 as being
used.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
This patch fixes SFtype to UDWtype (aka float to unsigned long long)
conversion on targets without DFmode like e.g. H8/300H. It solely relies
on SFtype->UWtype and UWtype->UDWtype conversions/casts. The existing code
in line 2218 (counter = a) assigns/casts a float which is *always* not lesser
than Wtype_MAXp1_F to an UWtype int which of course does not have enough
capacity.
PR target/116363
libgcc/ChangeLog:
* libgcc2.c (__fixunssfDI): Fix SFtype to UDWtype conversion for targets
without LIBGCC2_HAS_DF_MODE defined
The following patch fixes
FAIL: gcc.dg/dfp/bitint-1.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-2.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-3.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-4.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-5.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-6.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-8.c (test for excess errors)
FAIL: gcc.dg/dfp/int128-1.c (test for excess errors)
FAIL: gcc.dg/dfp/int128-2.c (test for excess errors)
FAIL: gcc.dg/dfp/int128-4.c (test for excess errors)
on s390x-linux (with the 3 not yet posted patches).
The patch does multiple things:
1) the routines were written for the DFP BID (binary integer decimal)
format which is used on all arches but powerpc*/s390* (those use
DPD - densely packed decimal format); as most of the code is actually
the same for both BID and DPD formats, I haven't copied the sources
+ slightly modified them, but added the DPD support directly, + renaming
of the exported symbols from __bid_* prefixed to __dpd_* prefixed that
GCC expects on the DPD targets
2) while testing that I've found some big-endian issues in the existing
support
3) testing also revealed that in some cases __builtin_clzll (~msb) was
called with msb set to all ones, so invoking UB; apparently on aarch64
and x86 we were lucky and got some value that happened to work well,
but that wasn't the case on s390x
For 1), the patch uses two ~ 2KB tables to speed up the decoding/encoding.
I haven't found such tables in what is added into libgcc.a, though they
are in libdecnumber/bid/bid2dpd_dpd2bid.h, but there they are just huge
and next to other huge tables - there is d2b which is like __dpd_d2bbitint
in the patch but it uses 64-bit entries rather than 16-bit, then there is
d2b2 with 64-bit entries like in d2b all multiplied by 1000, then d2b3
similarly multiplied by 1000000, then d2b4 similarly multiplied by
1000000000, then d2b5 similarly multiplied by 1000000000000ULL and
d2b6 similarly multipled by 1000000000000000ULL. Arguably it can
save some of the multiplications, but on the other side accesses memory
which is unlikely in the caches, and the 2048 bytes in the patch vs.
24 times more for d2b is IMHO significant.
For b2d, libdecnumber/bid/bid2dpd_dpd2bid.h has again b2d table like
__dpd_b2dbitint in the patch, except that it has 64-bit entries rather
than 16-bit (this time 1000 entries), but then has b2d2 which has the
same entries shifted left by 10, then b2d3 shifted left by 20, b2d4 shifted
left by 30 and b2d5 shifted left by 40. I can understand for d2b paying
memory cost to speed up multiplications, but don't understand paying
extra 4 * 8 * 1000 bytes (+ 6 * 1000 bytes for b2d not using ushort)
just to avoid shifts.
2025-05-27 Jakub Jelinek <jakub@redhat.com>
* config/t-softfp (softfp_bid_list): Don't guard with
$(enable_decimal_float) == bid.
* soft-fp/bitint.h (__bid_pow10bitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_pow10bitint.
(__dpd_d2bbitint, __dpd_b2dbitint): Declare.
* soft-fp/bitintpow10.c (__dpd_d2bbitint, __dpd_b2dbitint): New
variables.
* soft-fp/fixsdbitint.c (__bid_fixsdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixsdbitint.
Add DPD support. Fix big-endian support.
* soft-fp/fixddbitint.c (__bid_fixddbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixddbitint.
Add DPD support. Fix big-endian support.
* soft-fp/fixtdbitint.c (__bid_fixtdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixtdbitint.
Add DPD support. Fix big-endian support.
* soft-fp/fixsdti.c (__bid_fixsdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixsdbitint.
(__bid_fixsdti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_fixsdti.
* soft-fp/fixddti.c (__bid_fixddbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixddbitint.
(__bid_fixddti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_fixddti.
* soft-fp/fixtdti.c (__bid_fixtdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixtdbitint.
(__bid_fixtdti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_fixtdti.
* soft-fp/fixunssdti.c (__bid_fixsdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixsdbitint.
(__bid_fixunssdti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine
to __dpd_fixunssdti.
* soft-fp/fixunsddti.c (__bid_fixddbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixddbitint.
(__bid_fixunsddti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine
to __dpd_fixunsddti.
* soft-fp/fixunstdti.c (__bid_fixtdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixtdbitint.
(__bid_fixunstdti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine
to __dpd_fixunstdti.
* soft-fp/floatbitintsd.c (__bid_floatbitintsd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitintsd.
Add DPD support. Avoid calling __builtin_clzll with 0 argument. Fix
big-endian support.
* soft-fp/floatbitintdd.c (__bid_floatbitintdd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitintdd.
Add DPD support. Avoid calling __builtin_clzll with 0 argument. Fix
big-endian support.
* soft-fp/floatbitinttd.c (__bid_floatbitinttd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitinttd.
Add DPD support. Avoid calling __builtin_clzll with 0 argument. Fix
big-endian support.
* soft-fp/floattisd.c (__bid_floatbitintsd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitintsd.
(__bid_floattisd): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_floattisd.
* soft-fp/floattidd.c (__bid_floatbitintdd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitintdd.
(__bid_floattidd): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_floattidd.
* soft-fp/floattitd.c (__bid_floatbitinttd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitinttd.
(__bid_floattitd): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_floattitd.
* soft-fp/floatuntisd.c (__bid_floatbitintsd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitintsd.
(__bid_floatuntisd): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine
to __dpd_floatuntisd.
* soft-fp/floatuntidd.c (__bid_floatbitintdd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitintdd.
(__bid_floatuntidd): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine
to __dpd_floatuntidd.
* soft-fp/floatuntitd.c (__bid_floatbitinttd): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_floatbitinttd.
(__bid_floatuntitd): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine
to __dpd_floatuntitd.
f7_exp limited exponents to 512, but 1023 * ln2 ≈ 709,
hence 1024 is a correct limit.
libgcc/config/avr/libf7/
PR target/120441
* libf7.c (f7_exp): Limit aa->expo to 10 (not to 9).
This is similar to d6d7afcdbc about the posix
and win32 thread model.
Signed-off-by: LIU Hao <lh_mouse@126.com>
Signed-off-by: Jonathan Yong <10walls@gmail.com>
libgcc/ChangeLog:
* config.host: Enable mcf thread model for aarch64-*-mingw*.
* config/i386/t-mingw-mcfgthread: Move to...
* config/mingw/t-mingw-mcfgthread: ...here.
gthr-vxworks-thread.c calls memset in __ghtread_cond_signal, but it
fails ot include <string.h>, where this function is declared, and GCC
14 rejects calls of undeclared functions. Include the required
header.
for libgcc/ChangeLog
* config/gthr-vxworks-thread.c: Include string.h for memset.
When adding _BitInt support I was hoping all or most of arches would
implement it already for GCC 14. That didn't happen and with
new hosts adding support for _BitInt for GCC 16 (s390x-linux and as was
posted today loongarch-linux too), we need the _BitInt support functions
exported on those arches at GCC_16.0.0 rather than GCC_14.0.0 which
shouldn't be changed anymore.
The following patch does that. Both arches were already exporting
some of the _BitInt related symbols in their specific map files, this
just moves the remaining ones there as well.
2025-05-20 Jakub Jelinek <jakub@redhat.com>
* libgcc-std.ver.in (GCC_14.0.0): Remove bitint related exports
from here.
* config/i386/libgcc-glibc.ver (GCC_14.0.0): Add them here.
* config/i386/libgcc-darwin.ver (GCC_14.0.0): Likewise.
* config/i386/libgcc-sol2.ver (GCC_14.0.0): Likewise.
* config/aarch64/libgcc-softfp.ver (GCC_14.0.0): Likewise.
The big-endian _BitInt support in libgcc was written without any
testing and so I haven't discovered I've made one mistake in it
(in multiple places).
The bitint_reduce_prec function attempts to optimize inputs
which have some larger precision but at runtime they are found
to need smaller number of limbs.
For little-endian that is handled just by returning smaller
precision (or negative precision for signed), but for
big-endian we need to adjust the passed in limb pointer so that
when it returns smaller precision the argument still contains
the least significant limbs for the returned precision.
2025-05-20 Jakub Jelinek <jakub@redhat.com>
* libgcc2.c (bitint_reduce_prec): For big endian
__LIBGCC_BITINT_ORDER__ use ++*p and --*p instead of
++p and --p.
* soft-fp/bitint.h (bitint_reduce_prec): Likewise.
From 6462f1e6a2565c5d4756036d9bc2f39dce9bd768 Mon Sep 17 00:00:00 2001
From: QBos07 <qubos@outlook.de>
Date: Sat, 10 May 2025 16:56:28 +0000
Subject: [PATCH] libgcc SH: fix alignment for relaxation
when relaxation is enabled we can not infer the alignment
from the position as that may change. This should not change
non-relaxed builds as its allready aligned there. This was
the missing piece to building an entire toolchain with -mrelax
Credit goes to Oleg Endo: https://sourceware.org/bugzilla/show_bug.cgi?id=3298#c4
libgcc/
* config/sh/lib1funcs.S (ashiftrt_r4_32): Increase alignment.
(movemem): Force alignment of the mova intruction.
The Intel Decimal Floating-Point Math Library is available as open-source on Netlib[1].
[1] https://www.netlib.org/misc/intel/
libgcc/config/libbid/ChangeLog:
* bid128_string.c (MIN_DIGITS): New macro.
(bid128_from_string): Bug fix. Conversion from very long input
string to decimal.
As per architecture, SuperH has a reversed NaN signalling bit
vs IEEE754-2008, it also has a NaN propgation rule similar to
MIPS style.
Use mips style float format and mode for all float types, and
correct sfp-machine header accordingly.
PR target/111814
gcc/ChangeLog:
* config/sh/sh-modes.def (RESET_FLOAT_FORMAT): Use mips format.
(FLOAT_MODE): Use mips mode.
libgcc/ChangeLog:
* config/sh/sfp-machine.h (_FP_NANFRAC_B): Reverse signaling bit.
(_FP_NANFRAC_H): Likewise.
(_FP_NANFRAC_S): Likewise.
(_FP_NANFRAC_D): Likewise.
(_FP_NANFRAC_Q): Likewise.
(_FP_KEEPNANFRACP): Enable for target.
(_FP_QNANNEGATEDP): Enable for target.
(_FP_CHOOSENAN): Port from MIPS.
gcc/testsuite/ChangeLog:
* gcc.target/sh/pr111814.c: New test.
This resolves GCN:
ld: error: undefined symbol: _Unwind_RaiseException
>>> referenced by eh_throw.cc:93 ([...]/source-gcc/libstdc++-v3/libsupc++/eh_throw.cc:93)
>>> eh_throw.o:(__cxa_throw) in archive /srv/data/tschwinge/amd-instinct2/gcc/build/submit-light-target_gcn/build-gcc/amdgcn-amdhsa/gfx908/libstdc++-v3/src/.libs/libstdc++.a
[...]
collect2: error: ld returned 1 exit status
..., and/or:
ld: error: undefined symbol: _Unwind_Resume_or_Rethrow
>>> referenced by eh_throw.cc:129 ([...]/source-gcc/libstdc++-v3/libsupc++/eh_throw.cc:129)
>>> eh_throw.o:(__cxa_rethrow) in archive /srv/data/tschwinge/amd-instinct2/gcc/build/submit-light-target_gcn/build-gcc/amdgcn-amdhsa/gfx908/libstdc++-v3/src/.libs/libstdc++.a
[...]
collect2: error: ld returned 1 exit status
..., and nvptx:
unresolved symbol _Unwind_RaiseException
collect2: error: ld returned 1 exit status
..., or:
unresolved symbol _Unwind_Resume_or_Rethrow
collect2: error: ld returned 1 exit status
For both GCN, nvptx, this each progresses ~25 'check-gcc-c++',
and ~10 'check-target-libstdc++-v3' test cases:
[-FAIL:-]{+PASS:+} [...] (test for excess errors)
..., with (if applicable, for most of them):
[-UNRESOLVED:-]{+PASS:+} [...] [-compilation failed to produce executable-]{+execution test+}
..., or some 'FAIL: [...] execution test' where these test cases now FAIL when
attempting to use these interfaces, or, if applicable, FAIL due to run-time
'GCC/nvptx: sorry, unimplemented: dynamic stack allocation not supported'.
libgcc/
* config/gcn/unwind-gcn.c (_Unwind_RaiseException)
(_Unwind_Resume_or_Rethrow): New.
* config/nvptx/unwind-nvptx.c (_Unwind_RaiseException)
(_Unwind_Resume_or_Rethrow): Likewise.
This resolves GCN:
ld: error: undefined symbol: _Unwind_DeleteException
>>> referenced by eh_catch.cc:109 ([...]/source-gcc/libstdc++-v3/libsupc++/eh_catch.cc:109)
>>> eh_catch.o:(__cxa_end_catch) in archive [...]/build-gcc/amdgcn-amdhsa/libstdc++-v3/src/.libs/libstdc++.a
[...]
collect2: error: ld returned 1 exit status
..., and nvptx:
unresolved symbol _Unwind_DeleteException
collect2: error: ld returned 1 exit status
For both GCN, nvptx, this each progresses ~100 'check-gcc-c++',
and ~500 'check-target-libstdc++-v3' test cases:
[-FAIL:-]{+PASS:+} [...] (test for excess errors)
..., with (if applicable, for most of them):
[-UNRESOLVED:-]{+PASS:+} [...] [-compilation failed to produce executable-]{+execution test+}
..., or just a few 'FAIL: [...] execution test' where these test cases now
FAIL for unrelated reasons, or, if applicable, FAIL due to run-time
'GCC/nvptx: sorry, unimplemented: dynamic stack allocation not supported'.
libgcc/
* config/gcn/unwind-gcn.c (_Unwind_DeleteException): New.
* config/nvptx/unwind-nvptx.c (_Unwind_DeleteException): Likewise.
Follow-up to commit 1146410c0f
"nvptx: Support '-mfake-ptx-alloca'". '-mfake-ptx-alloca' is applicable only
for configurations where PTX 'alloca' is not supported, where target libraries
are built with it enabled (that is, libstdc++, libgfortran).
This change progresses:
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++26 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++26 [-compilation failed to produce executable-]{+execution test+}
UNSUPPORTED: g++.dg/tree-ssa/pr20458.C -std=gnu++98: exception handling not supported
..., and "enables" a few test cases:
FAIL: g++.old-deja/g++.other/sibcall1.C -std=gnu++17 (test for excess errors)
[Etc.]
FAIL: g++.old-deja/g++.other/unchanging1.C -std=gnu++17 (test for excess errors)
[Etc.]
..., which now (unrelatedly to 'alloca', and in the same way as configurations
where PTX 'alloca' is supported) FAIL due to:
unresolved symbol _Unwind_DeleteException
collect2: error: ld returned 1 exit status
Most importantly, it progresses ~830 libstdc++ test cases:
[-FAIL:-]{+PASS:+} [...] (test for excess errors)
..., with (if applicable, for most of them):
[-UNRESOLVED:-]{+PASS:+} [...] [-compilation failed to produce executable-]{+execution test+}
..., or just a few 'FAIL: [...] execution test' where these test cases also
FAIL in configurations where PTX 'alloca' is supported, or ~120 instances of
'FAIL: [...] execution test' due to run-time
'GCC/nvptx: sorry, unimplemented: dynamic stack allocation not supported'.
This change also resolves the cases noted in
commit bac2d8a246
"nvptx: Build libgfortran with '-mfake-ptx-alloca' [PR107635]":
| With '-mfake-ptx-alloca', libgfortran again succeeds to build, and compared
| to before, we've got only a small number of regressions due to nvptx 'ld'
| complaining about 'unresolved symbol __GCC_nvptx__PTX_alloca_not_supported':
|
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/codimension_2.f90 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
[-FAIL:-]{+PASS:+} gfortran.dg/coarray/codimension_2.f90 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
| [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib -O2 -lcaf_single [-execution test-]{+compilation failed to produce executable+}
[-FAIL:-]{+PASS:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib -O2 -lcaf_single [-compilation failed to produce executable-]{+execution test+}
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
| [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib -O2 -lcaf_single [-execution test-]{+compilation failed to produce executable+}
[-FAIL:-]{+PASS:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib -O2 -lcaf_single [-compilation failed to produce executable-]{+execution test+}
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
| [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib -O2 -lcaf_single [-execution test-]{+compilation failed to produce executable+}
[-FAIL:-]{+PASS:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib -O2 -lcaf_single (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib -O2 -lcaf_single [-compilation failed to produce executable-]{+execution test+}
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray_43.f90 -O (test for excess errors)
[-FAIL:-]{+PASS:+} gfortran.dg/coarray_43.f90 -O (test for excess errors)
..., and further progresses:
[-FAIL:-]{+PASS:+} gfortran.dg/coarray_lib_comm_1.f90 -O0 (test for excess errors)
[-UNRESOLVED:-]{+FAIL:+} gfortran.dg/coarray_lib_comm_1.f90 -O0 [-compilation failed to produce executable-]{+execution test+}
[Etc.]
..., which now (unrelatedly to 'alloca', and in the same way as configurations
where PTX 'alloca' is supported) FAILs due to:
error : Prototype doesn't match for '_gfortran_caf_transfer_between_remotes' in 'input file 9 at offset 159897', first defined in 'input file 9 at offset 159897'
error : Prototype doesn't match for '_gfortran_caf_stop_numeric' in 'input file 9 at offset 159897', first defined in 'input file 9 at offset 159897'
nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300)
gcc/
* config/nvptx/nvptx.opt (-mfake-ptx-alloca): Update.
gcc/testsuite/
* gcc.target/nvptx/alloca-2-O0_-mfake-ptx-alloca.c: Adjust.
libgcc/
* config/nvptx/alloca.c: New.
* config/nvptx/t-nvptx (LIB2ADD): Add it.
When MUL is not available, then the __umulhisi3 and __mulhisi3
functions can use __mulhisi3_helper. This improves code size,
stack footprint and runtime on AVRrc.
libgcc/
* config/avr/lib1funcs.S (__mulhisi3, __umulhisi3): Use
__mulhisi3_helper for better performance on AVRrc.
__umulhisi3 had an "rcall 1f" to save 6 bytes, which is an unreasonable
size gain vs. cycle cost. Just use the same code on all devices with MUL,
irrespective of program memory size.
libgcc/
* config/avr/lib1funcs.S (__umulhisi3) [Have MUL]: Reduce call
depth by 1.
There are many objects / functions that are not available on AVRrc,
the reduced core. The old way to exclude some objects for AVRrc
did not work properly since it tested for MULTIFLAGS.
This does not work for, say MULTIFLAGS = "-mmcu=avrtiny -mdouble=64".
This patch uses $(findstring avrtiny,$(MULTIDIR)) in the condition.
libgcc/
* config/avr/t-avr (LIB1ASMFUNCS, LIB2FUNCS_EXCLUDE):
Properly handle avrtiny.
libgcc/config/avr/libf7/
* t-libf7 (libgcc-objects): Only add objects when building
for non-AVRrc.
Change AArch64 cpuinfo to follow the latest updates to the FMV spec [1]:
Remove FEAT_PREDRES and FEAT_LS64*. Preserve the ordering in enum CPUFeatures.
[1] https://github.com/ARM-software/acle/pull/382
gcc:
* common/config/aarch64/cpuinfo.h: Remove FEAT_PREDRES and FEAT_LS64*.
* config/aarch64/aarch64-option-extensions.def: Remove FMV support
for PREDRES.
libgcc:
* config/aarch64/cpuinfo.c (__init_cpu_features_constructor):
Remove FEAT_PREDRES and FEAT_LS64* support.
The following testcase shows a bug in unwind-dw2-btree.h.
In short, the header provides lock-free btree data structure (so no parent
link on nodes, both insertion and deletion are done in top-down walks
with some locking of just a few nodes at a time so that lookups can notice
concurrent modifications and retry, non-leaf (inner) nodes contain keys
which are initially the base address of the left-most leaf entry of the
following child (or all ones if there is none) minus one, insertion ensures
balancing of the tree to ensure [d/2, d] entries filled through aggressive
splitting if it sees a full tree while walking, deletion performs various
operations like merging neighbour trees, merging into parent or moving some
nodes from neighbour to the current one).
What differs from the textbook implementations is mostly that the leaf nodes
don't include just address as a key, but address range, address + size
(where we don't insert any ranges with zero size) and the lookups can be
performed for any address in the [address, address + size) range. The keys
on inner nodes are still just address-1, so the child covers all nodes
where addr <= key unless it is covered already in children to the left.
The user (static executables or JIT) should always ensure there is no
overlap in between any of the ranges.
In the testcase a bunch of insertions are done, always followed by one
removal, followed by one insertion of a range slightly different from the
removed one. E.g. in the first case [&code[0x50], &code[0x59]] range
is removed and then we insert [&code[0x4c], &code[0x53]] range instead.
This is valid, it doesn't overlap anything. But the problem is that some
non-leaf (inner) one used the &code[0x4f] key (after the 11 insertions
completely correctly). On removal, nothing adjusts the keys on the parent
nodes (it really can't in the top-down only walk, the keys could be many nodes
above it and unlike insertion, removal only knows the start address, doesn't
know the removed size and so will discover it only when reaching the leaf
node which contains it; plus even if it knew the address and size, it still
doesn't know what the second left-most leaf node will be (i.e. the one after
removal)). And on insertion, if nodes aren't split at a level, nothing
adjusts the inner keys either. If a range is inserted and is either fully
bellow key (keys are - 1, so having address + size - 1 being equal to key is
fine) or fully after key (i.e. address > key), it works just fine, but if
the key is in a middle of the range like in this case, &code[0x4f] is in the
middle of the [&code[0x4c], &code[0x53]] range, then insertion works fine
(we only use size on the leaf nodes), and lookup of the addresses below
the key work fine too (i.e. [&code[0x4c], &code[0x4f]] will succeed).
The problem is with lookups after the key (i.e. [&code[0x50, &code[0x53]]),
the lookup looks for them in different children of the btree and doesn't
find an entry and returns NULL.
As users need to ensure non-overlapping entries at any time, the following
patch fixes it by adjusting keys during insertion where we know not just
the address but also size; if we find during the top-down walk a key
which is in the middle of the range being inserted, we simply increase the
key to be equal to address + size - 1 of the range being inserted.
There can't be any existing leaf nodes overlapping the range in correct
programs and the btree rebalancing done on deletion ensures we don't have
any empty nodes which would also cause problems.
The patch adjusts the keys in two spots, once for the current node being
walked (the last hunk in the header, with large comment trying to explain
it) and once during inner node splitting in a parent node if we'd otherwise
try to add that key in the middle of the range being inserted into the
parent node (in that case it would be missed in the last hunk).
The testcase covers both of those spots, so succeeds with GCC 12 (which
didn't have btrees) and fails with vanilla GCC trunk and also fails if
either the
if (fence < base + size - 1)
fence = iter->content.children[slot].separator = base + size - 1;
or
if (left_fence >= target && left_fence < target + size - 1)
left_fence = target + size - 1;
hunk is removed (of course, only with the current node sizes, i.e. up to
15 children of inner nodes and up to 10 entries in leaf nodes).
2025-03-10 Jakub Jelinek <jakub@redhat.com>
Michael Leuchtenburg <michael@slashhome.org>
PR libgcc/119151
* unwind-dw2-btree.h (btree_split_inner): Add size argument. If
left_fence is in the middle of [target,target + size - 1] range,
increase it to target + size - 1.
(btree_insert): Adjust btree_split_inner caller. If fence is smaller
than base + size - 1, increase it and separator of the slot to
base + size - 1.
* gcc.dg/pr119151.c: New test.
Studying unwind-dw2-btree.h was really hard for me because
the formatting is wrong or weird in many ways all around the code
and that kept distracting my attention.
That includes all kinds of things, including wrong indentation, using
{} around single statement substatements, excessive use of ()s around
some parts of expressions which don't increase code clarity, no space
after dot in comments, some comments not starting with capital letters,
some not ending with dot, adding {} around some parts of code without
any obvious reason (and when it isn't done in a similar neighboring
function) or ( at the end of line without any reason.
The following patch fixes the formatting issues I found, no functional
changes.
2025-03-10 Jakub Jelinek <jakub@redhat.com>
* unwind-dw2-btree.h: Formatting fixes.
When INT_TYPE_SIZE < BITS_PER_WORD gcc emits a call to an external ffs()
implementation instead of a call to "__builtin_ffs()" – see function
init_optabs() in <SRCROOT>/gcc/optabs-libfuncs.cc. External ffs()
(which is usually the one from newlib) in turn calls __builtin_ffs()
what causes infinite recursion and stack overflow. This patch overrides
default gcc bahaviour for H8/300H (and newer) and provides a generic
ffs() implementation for HImode.
PR target/114222
gcc/ChangeLog:
* config/h8300/h8300.cc (h8300_init_libfuncs): For HImode override
calls to external ffs() (from newlib) with calls to __ffshi2() from
libgcc. The implementation of ffs() in newlib calls __builtin_ffs()
what causes infinite recursion and finally a stack overflow.
libgcc/ChangeLog:
* config/h8300/t-h8300: Add __ffshi2().
* config/h8300/ffshi2.c: New file.
When gcc is built for x86_64-linux-musl target, stack unwinding from
within signal handler stops at the innermost signal frame. The reason
for this behaviro is that the signal trampoline is not accompanied with
appropiate CFI directives, and the fallback path in libgcc to recognize
it by the code sequence is only enabled for glibc except 2.0. The
latter is motivated by the lack of sys/ucontext.h in that glibc version.
Given that all relevant libc-s ship sys/ucontext.h for over a decade,
and that other arches aren't shy of unconditionally using it, follow
suit and remove the preprocessor condition, too.
libgcc/ChangeLog:
* config/i386/linux-unwind.h: Remove preprocessor
condition to enable fallback path for all libc-s.
Signed-off-by: Roman Kagan <rkagan@amazon.de>
Due to the presence of R_LARCH_B26 in
/usr/lib/gcc/loongarch64-linux-gnu/14/crtbeginS.o, its addressing
range is [PC-128MiB, PC+128MiB-4]. This means that when the code
segment size exceeds 128MB, linking with lld will definitely fail
(ld will not fail because the order of the two is different).
The linking order:
lld: crtbeginS.o + .text + .plt
ld : .plt + crtbeginS.o + .text
To solve this issue, add '-mcmodel=extreme' when compiling crtbeginS.o.
PR target/118844
libgcc/ChangeLog:
* config/loongarch/t-crtstuff: Add '-mcmodel=extreme'
to CRTSTUFF_T_CFLAGS_S.
As discussed from RISC-V C-API PR #101 [1], As discussed in #96, current
interface is insufficient to support some cases, like a vendor buying a
CPU IP from the upstream vendor but using their own mvendorid and custom
features from the upstream vendor. In this case, we might need to add
these extensions for each downstream vendor many times. Thus, making
__riscv_vendor_feature_bits guarded by mvendorid is not a good idea. So,
drop __riscv_vendor_feature_bits for now, and we should have time to
discuss a better solution.
[1] https://github.com/riscv-non-isa/riscv-c-api-doc/pull/101
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
gcc/ChangeLog:
* config/riscv/riscv-feature-bits.h (RISCV_VENDOR_FEATURE_BITS_LENGTH): Drop.
(struct riscv_vendor_feature_bits): Drop.
libgcc/ChangeLog:
* config/riscv/feature_bits.c (RISCV_VENDOR_FEATURE_BITS_LENGTH): Drop.
(__init_riscv_features_bits_linux): Drop.
Add crtbeginT.o to extra_parts on FreeBSD. This ensures we use GCC's
crt objects for static linking. Otherwise it could mix crtbeginT.o
from the base system with libgcc's crtend.o, possibly leading to
segfaults.
libgcc:
PR target/118685
* config.host (*-*-freebsd*): Add crtbeginT.o to extra_parts.
Signed-off-by: Dimitry Andric <dimitry@andric.com>
2025-02-07 Peter Bergner <bergner@linux.ibm.com>
libgcc/
PR target/117674
* config/rs6000/linux-unwind.h (ppc_backchain_fallback): Add cast to
avoid comparison between pointer and integer warning.
This patch adds built-in functions __builtin_avr_strlen_flash,
__builtin_avr_strlen_flashx and __builtin_avr_strlen_memx.
Purpose is that higher-level functions can use __builtin_constant_p
on strlen without raising a diagnostic due to -Waddr-space-convert.
gcc/
* config/avr/builtins.def (STRLEN_FLASH, STRLEN_FLASHX)
(STRLEN_MEMX): New DEF_BUILTIN's.
* config/avr/avr.cc (avr_ftype_strlen): New static function.
(avr_builtin_supported_p): New built-ins are not for AVR_TINY.
(avr_init_builtins) <strlen_flash_node, strlen_flashx_node,
strlen_memx_node>: Provide new fntypes.
(avr_fold_builtin) [AVR_BUILTIN_STRLEN_FLASH]
[AVR_BUILTIN_STRLEN_FLASHX, AVR_BUILTIN_STRLEN_MEMX]: Fold if
possible.
* doc/extend.texi (AVR Built-in Functions): Document
__builtin_avr_strlen_flash, __builtin_avr_strlen_flashx,
__builtin_avr_strlen_memx.
libgcc/
* config/avr/t-avr (LIB1ASMFUNCS): Add _strlen_memx.
* config/avr/lib1funcs.S <L_strlen_memx, __strlen_memx>: Implement.
The arm-none-eabi port provides some alternative implementations of
__sync_synchronize for different implementations of the architecture.
These can be selected using one of -specs=sync-{none,dmb,cp15dmb}.specs.
These specs fragments fail, however, when LTO is used because they
unconditionally add a --defsym=__sync_synchronize=<implementation> to
the linker arguments and that fails if libgcc is not added to the list
of libraries.
Fix this by only adding the defsym if libgcc will be passed to the
linker.
libgcc/
PR target/118642
* config/arm/sync-none.specs (link): Only add the defsym if
libgcc will be used.
* config/arm/sync-dmb.specs: Likewise.
* config/arm/sync-cp15dmb.specs: Likewise.
Fix __extenddfxf2:
* Remove bogus denorm handling block which would never execute --
the converted exp value is always positive as EXCESSX > EXCESSD.
* Compute the whole significand in dl instead of doing part of it in
ldl.
* Mask off exponent from dl.l.upper so the denorm shift test
works.
* Insert the hidden one bit into dl.l.upper as needed.
Fix __truncxfdf2 denorm handling. All that is required is to shift the
significand right by the correct amount; it already has all of the
necessary bits set including the explicit one. Compute the shift
amount, then perform the wide shift across both elements of the
significand.
Fix __fixxfsi:
* The value was off by a factor of two as the significand contains
32 bits, not 31 so we need to shift by one more than the equivalent
code in __fixdfsi.
* Simplify the code having realized that the lower 32 bits of the
significand can never appear in the results.
Return positive qNaN instead of negative. For floats, qNaN is 0x7fff_ffff. For
doubles, qNaN is 0x7fff_ffff_ffff_ffff.
Return correctly signed zero on float and double divide underflow. This means
that Ld$underflow now expects d7 to contain the sign bit, just like the other
return paths.
libgcc/
* config/m68k/fpgnulib.c (extenddfxf2): Simplify code by removing code
that should never execute. Fix denorm shift test and insert hidden bit
as needed.
(__truncxfdf2): Properly compue and shift the significant right.
* config/m68k/lb1sf68.S (__fixxfsi): Correct shift counts and simplify.
(QUIET_NAN): Make it a positive quiet NaN and fix return values to inject
sign properly.
In the OpenRISC build we get the following warning:
ld: warning: __modsi3_s.o: missing .note.GNU-stack section implies executable stack
ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
Fix this by adding a .note.GNU-stack to indicate the stack does not need to be
executable for the lib1funcs.
Note, this is also needed for the upcoming glibc 2.41.
libgcc/
* config/or1k/lib1funcs.S: Add .note.GNU-stack section on linux.
This patch adds __flashx as a new named address space that allocates
objects in .progmemx.data. The handling is mostly the same or similar
to that of 24-bit space __memx, except that the asm routines are
simpler and more efficient. Loads are emit inline when ELPMX or
LPMX is available. The address space uses a 24-bit addresses even
on devices with a program memory size of 64 KiB or less.
PR target/118001
gcc/
* doc/extend.texi (AVR Named Address Spaces): Document __flashx.
* config/avr/avr.h (ADDR_SPACE_FLASHX): New enum value.
* config/avr/avr-protos.h (avr_out_fload, avr_mem_flashx_p)
(avr_fload_libgcc_p, avr_load_libgcc_mem_p)
(avr_load_libgcc_insn_p): New.
* config/avr/avr.cc (avr_addrspace): Add ADDR_SPACE_FLASHX.
(avr_decl_flashx_p, avr_mem_flashx_p, avr_fload_libgcc_p)
(avr_load_libgcc_mem_p, avr_load_libgcc_insn_p, avr_out_fload):
New functions.
(avr_adjust_insn_length) [ADJUST_LEN_FLOAD]: Handle case.
(avr_progmem_p) [avr_decl_flashx_p]: return 2.
(avr_addr_space_legitimate_address_p) [ADDR_SPACE_FLASHX]:
Has same behavior like ADDR_SPACE_MEMX.
(avr_addr_space_convert): Use pointer sizes rather then ASes.
(avr_addr_space_contains): New function.
(avr_convert_to_type): Use it.
(avr_emit_cpymemhi): Handle ADDR_SPACE_FLASHX.
* config/avr/avr.md (adjust_len) <fload>: New attr value.
(gen_load<mode>_libgcc): Renamed from load<mode>_libgcc.
(xload8<mode>_A): Iterate over MOVMODE rather than over ALL1.
(fxmov<mode>_A): New from xloadv<mode>_A.
(xmov<mode>_8): New from xload<mode>_A.
(fmov<mode>): New insns.
(fxload<mode>_A): New from xload<mode>_A.
(fxload_<mode>_libgcc): New from xload_<mode>_libgcc.
(*fxload_<mode>_libgcc): New from *xload_<mode>_libgcc.
(mov<mode>) [avr_mem_flashx_p]: Hande ADDR_SPACE_FLASHX.
(cpymemx_<mode>): Make sure the address space is not lost
when splitting.
(*cpymemx_<mode>) [ADDR_SPACE_FLASHX]: Use __movmemf_<mode> for asm.
(*ashlqi.1.zextpsi_split): New combine pattern.
* config/avr/predicates.md (nox_general_operand): Don't match
when avr_mem_flashx_p is true.
* config/avr/avr-passes.cc (AVR_LdSt_Props):
ADDR_SPACE_FLASHX has no post_inc.
gcc/testsuite/
* gcc.target/avr/torture/addr-space-1.h [AVR_HAVE_ELPM]:
Use a function to bump .progmemx.data to a high address.
* gcc.target/avr/torture/addr-space-2.h: Same.
* gcc.target/avr/torture/addr-space-1-fx.c: New test.
* gcc.target/avr/torture/addr-space-2-fx.c: New test.
libgcc/
* config/avr/t-avr (LIB1ASMFUNCS): Add _fload_1, _fload_2,
_fload_3, _fload_4, _movmemf.
* config/avr/lib1funcs.S (.branch_plus): New .macro.
(__xload_1, __xload_2, __xload_3, __xload_4): When the address is
located in flash, then forward to...
(__fload_1, __fload_2, __fload_3, __fload_4): ...these new
functions, respectively.
(__movmemx_hi): When the address is located in flash, forward to...
(__movmemf_hi): ...this new function.
Unlike crtoffload{begin,end}.o which just define some symbols at the start/end
of the various .gnu.offload* sections, crtoffloadtable.o contains
const void *const __OFFLOAD_TABLE__[]
__attribute__ ((__visibility__ ("hidden"))) =
{
&__offload_func_table, &__offload_funcs_end,
&__offload_var_table, &__offload_vars_end,
&__offload_ind_func_table, &__offload_ind_funcs_end,
};
The problem is that linking this into PIEs or shared libraries doesn't
work when it is compiled without -fpic/-fpie - __OFFLOAD_TABLE__ for non-PIC
code is put into .rodata section, but it really needs relocations, so for
PIC it should go into .data.rel.ro/.data.rel.ro.local.
As I think we don't want .data.rel.ro section in non-PIE binaries, this patch
follows the path of e.g. crtbegin.o vs. crtbeginS.o and adds crtoffloadtableS.o
next to crtoffloadtable.o, where crtoffloadtableS.o is compiled with -fpic.
2024-11-30 Jakub Jelinek <jakub@redhat.com>
PR libgomp/117851
gcc/
* lto-wrapper.cc (find_crtoffloadtable): Add PIE_OR_SHARED argument,
search for crtoffloadtableS.o rather than crtoffloadtable.o if
true.
(run_gcc): Add pie_or_shared variable. If OPT_pie or OPT_shared or
OPT_static_pie is seen, set pie_or_shared to true, if OPT_no_pie is
seen, set pie_or_shared to false. Pass it to find_crtoffloadtable.
libgcc/
* configure.ac (extra_parts): Add crtoffloadtableS.o.
* Makefile.in (crtoffloadtableS$(objext)): New goal.
* configure: Regenerated.
Including the "arm_acle.h" header in aarch64-unwind.h requires
stdint.h to be present and it may not be available during the
first stage of cross-compilation of GCC.
When cross-building GCC for the aarch64-none-linux-gnu target
(on any supporting host) using the 3-stage bootstrap build
process when we build native compiler from source, libgcc fails
to compile due to missing header that has not been installed yet.
This could be worked around but it's better to fix the issue.
libgcc/ChangeLog:
* config/aarch64/aarch64-unwind.h (_CHKFEAT_GCS): Add.
While these haven't shown up in my tester (not configs I test) and I think
we're likely going to be deprecating the nds32 target. we might as well go
ahead and fix them.
I'm going to include this under the pr117628 umbrella.
PR target/117628
libgcc/
* config/arm/freebsd-atomic.c (bool): Remove unnecessary typedef.
* config/arm/linux-atomic-64bit.c: Likewise.
* config/arm/linux-atomic.c: Likewise.
* config/nds32/linux-atomic.c: Likewise.
* config/nios2/linux-atomic.c: Likewise.
csky fails to build libgcc after the c23 changes because it has a typedef for
bool. AFAICT it's internal to the file, so removing the typedef isn't an ABI
change.
Similiarly for c6x which includes unwind-arm-common.inc. I suspect most, if
not all of the arm-v7 and older targets are failing to build right now.
I've built and regression tested both csky-linux-gnu and c6x-elf with this
change. OK for the trunk?
PR target/117628
libgcc/
* config/csky/linux-atomic.c (bool): Remove unnecessary typedef.
* unwind-arm-common.inc (bool): Similarly.
In C23, bool is now a keyword. So, doing a typedef for it is invalid.
2024-11-17 John David Anglin <danglin@gcc.gnu.org>
libgcc/ChangeLog:
PR target/117627
* config/pa/linux-atomic.c: Remove typedef for bool type.
Since r15-5327, GNU-C23 is being used as C language default.
libf7.h doesn't assume headers like stdbool.h are present
and defines bool, true and false on its own.
libgcc/config/avr/libf7/
* libf7.h (bool, true, false): Don't define in C23 or higher.