Tamar Christina 25c8a8d431 AArch64: Implement widen_[us]sum using [US]ADDW[TB] for SVE2 [PR122069]
SVE2 adds [US]ADDW[TB], which we can use when we have to do a single-step
widening addition.  This is useful, for instance, when the value to be widened
does not come from a load.  For example, for

int foo2_int(unsigned short *x, unsigned short * restrict y) {
  int sum = 0;
  for (int i = 0; i < 8000; i++)
    {
      x[i] = x[i] + y[i];
      sum += x[i];
    }
  return sum;
}

we used to generate

.L6:
        ld1h    z1.h, p7/z, [x0, x2, lsl 1]
        ld1h    z29.h, p7/z, [x1, x2, lsl 1]
        add     z29.h, z29.h, z1.h
        punpklo p6.h, p7.b
        uunpklo z0.s, z29.h
        add     z31.s, p6/m, z31.s, z0.s
        punpkhi p6.h, p7.b
        uunpkhi z30.s, z29.h
        add     z31.s, p6/m, z31.s, z30.s
        st1h    z29.h, p7, [x0, x2, lsl 1]
        add     x2, x2, x4
        whilelo p7.h, w2, w3
        b.any   .L6
        ptrue   p7.b, all
        uaddv   d31, p7, z31.s

but with +sve2

.L12:
        ld1h    z30.h, p7/z, [x0, x2, lsl 1]
        ld1h    z29.h, p7/z, [x1, x2, lsl 1]
        add     z30.h, z30.h, z29.h
        uaddwb  z31.s, z31.s, z30.h
        uaddwt  z31.s, z31.s, z30.h
        st1h    z30.h, p7, [x0, x2, lsl 1]
        mov     x3, x2
        inch    x2
        cmp     w2, w4
        bls     .L12
        inch    x3
        uaddv   d31, p7, z31.s
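
As a rough illustration of why the two-instruction form is equivalent, the
loop above can be modelled lane by lane in plain C.  This is only a sketch
under stated assumptions, not the code GCC generates: the names uaddwb_model,
uaddwt_model, foo2_ref and foo2_model are hypothetical, the vector length is
fixed at 128 bits (real SVE vector lengths are implementation-defined), and
predication is not modelled, so the element count must be a multiple of the
lane count.  UADDWB adds the even-numbered ("bottom") narrow elements to the
wide accumulator and UADDWT adds the odd-numbered ("top") ones, so the pair
together accumulates every narrow element exactly once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed vector length for this sketch: 128 bits = 8 x 16-bit lanes.  */
#define NARROW_LANES 8
#define WIDE_LANES (NARROW_LANES / 2)

/* Model of UADDWB: zero-extend the even ("bottom") 16-bit elements and
   add them to the 32-bit accumulator lanes.  */
static void
uaddwb_model (uint32_t acc[WIDE_LANES], const uint16_t in[NARROW_LANES])
{
  for (size_t i = 0; i < WIDE_LANES; i++)
    acc[i] += in[2 * i];
}

/* Model of UADDWT: same, but for the odd ("top") 16-bit elements.  */
static void
uaddwt_model (uint32_t acc[WIDE_LANES], const uint16_t in[NARROW_LANES])
{
  for (size_t i = 0; i < WIDE_LANES; i++)
    acc[i] += in[2 * i + 1];
}

/* Scalar reference: the loop from the example above, with the trip
   count made a parameter.  */
static int
foo2_ref (uint16_t *x, const uint16_t *y, size_t n)
{
  int sum = 0;
  for (size_t i = 0; i < n; i++)
    {
      x[i] = (uint16_t) (x[i] + y[i]);
      sum += x[i];
    }
  return sum;
}

/* Lane-wise model of the +sve2 loop.  N must be a multiple of
   NARROW_LANES because this sketch does not model predication.  */
static int
foo2_model (uint16_t *x, const uint16_t *y, size_t n)
{
  assert (n % NARROW_LANES == 0);
  uint32_t acc[WIDE_LANES] = { 0 };

  for (size_t base = 0; base < n; base += NARROW_LANES)
    {
      uint16_t v[NARROW_LANES];
      for (size_t i = 0; i < NARROW_LANES; i++)
        v[i] = (uint16_t) (x[base + i] + y[base + i]);  /* add (.h) */
      uaddwb_model (acc, v);                            /* uaddwb   */
      uaddwt_model (acc, v);                            /* uaddwt   */
      for (size_t i = 0; i < NARROW_LANES; i++)
        x[base + i] = v[i];                             /* st1h     */
    }

  /* Final horizontal reduction, standing in for uaddv.  */
  uint32_t sum = 0;
  for (size_t i = 0; i < WIDE_LANES; i++)
    sum += acc[i];
  return (int) sum;
}
```

Calling foo2_ref and foo2_model on identical copies of the same input should
give the same return value and leave the same contents in x, since the order
in which lanes are accumulated does not affect an integer sum.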

gcc/ChangeLog:

	PR middle-end/122069
	* config/aarch64/aarch64-sve2.md: (widen_ssum<mode><Vnarrow>3): New.
	(widen_usum<mode><Vnarrow>3): New.
	* config/aarch64/iterators.md (Vnarrow): New, to match VNARROW.

gcc/testsuite/ChangeLog:

	PR middle-end/122069
	* gcc.target/aarch64/sve2/pr122069_1.c: New test.
	* gcc.target/aarch64/sve2/pr122069_2.c: New test.
2025-10-18 08:24:18 +01:00
.forgejo
.github
INSTALL
c++tools Daily bump. 2025-06-03 00:18:06 +00:00
config Daily bump. 2025-10-05 16:50:51 +00:00
contrib Daily bump. 2025-10-17 00:18:48 +00:00
fixincludes Daily bump. 2025-08-29 00:19:55 +00:00
gcc AArch64: Implement widen_[us]sum using [US]ADDW[TB] for SVE2 [PR122069] 2025-10-18 08:24:18 +01:00
gnattools Daily bump. 2025-06-23 00:16:33 +00:00
gotools
include Daily bump. 2025-10-17 00:18:48 +00:00
libada Update copyright years. 2025-01-02 11:59:57 +01:00
libatomic Daily bump. 2025-10-10 00:21:51 +00:00
libbacktrace Daily bump. 2025-10-05 16:50:51 +00:00
libcc1 Daily bump. 2025-10-10 00:21:51 +00:00
libcody Update Copyright year in ChangeLog files 2025-01-02 11:13:18 +01:00
libcpp Daily bump. 2025-10-14 00:20:06 +00:00
libdecnumber Update copyright years. 2025-01-02 11:59:57 +01:00
libffi Daily bump. 2025-10-05 16:50:51 +00:00
libgcc Daily bump. 2025-10-10 00:21:51 +00:00
libgcobol Daily bump. 2025-10-11 00:21:09 +00:00
libgfortran Daily bump. 2025-10-05 16:50:51 +00:00
libgm2 Daily bump. 2025-10-08 00:20:55 +00:00
libgo runtime: avoid libc memmove and memclr 2025-07-08 15:49:16 -07:00
libgomp Daily bump. 2025-10-17 00:18:48 +00:00
libgrust Daily bump. 2025-10-05 16:50:51 +00:00
libiberty Daily bump. 2025-10-05 16:50:51 +00:00
libitm Daily bump. 2025-10-05 16:50:51 +00:00
libobjc Daily bump. 2025-10-05 16:50:51 +00:00
libphobos Daily bump. 2025-10-05 16:50:51 +00:00
libquadmath Daily bump. 2025-10-05 16:50:51 +00:00
libsanitizer Daily bump. 2025-10-05 16:50:51 +00:00
libssp Daily bump. 2025-10-05 16:50:51 +00:00
libstdc++-v3 Daily bump. 2025-10-18 00:18:06 +00:00
libvtv Daily bump. 2025-10-05 16:50:51 +00:00
lto-plugin Daily bump. 2025-10-05 16:50:51 +00:00
maintainer-scripts Daily bump. 2025-09-02 00:19:26 +00:00
zlib Daily bump. 2025-10-05 16:50:51 +00:00
.b4-config
.dir-locals.el
.editorconfig toplevel: unify the GCC and GDB/binutils .editorconfig files 2025-10-01 15:42:59 +01:00
.gitattributes
.gitignore Rust: Move 'libformat_parser' build into the GCC build directory 2025-08-05 16:36:43 +02:00
ABOUT-NLS
COPYING
COPYING.LIB
COPYING.RUNTIME
COPYING3
COPYING3.LIB
ChangeLog Daily bump. 2025-10-16 00:21:56 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
MAINTAINERS MAINTAINERS: Add myself as an aarch64 port reviewer 2025-09-04 14:35:30 +00:00
Makefile.def PR81358: Enable automatic linking of libatomic. 2025-10-09 07:26:51 +00:00
Makefile.in [PATCH] Makefile.tpl: remove an extra \; from find command 2025-10-15 11:32:21 -06:00
Makefile.tpl [PATCH] Makefile.tpl: remove an extra \; from find command 2025-10-15 11:32:21 -06:00
README
SECURITY.txt Remove Debian from SECURITY.txt 2024-11-19 12:27:33 +01:00
ar-lib
compile
config-ml.in *: Fix patch email address 2025-10-11 11:08:01 +02:00
config.guess
config.rpath
config.sub
configure PR81358: Enable automatic linking of libatomic. 2025-10-09 07:26:51 +00:00
configure.ac PR81358: Enable automatic linking of libatomic. 2025-10-09 07:26:51 +00:00
depcomp
install-sh
libtool-ldflags
libtool.m4 Sync toplevel files from binutils-gdb 2025-10-02 15:00:06 +08:00
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
lt~obsolete.m4
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
symlink-tree *: Fix patch email address 2025-10-11 11:08:01 +02:00
test-driver
ylwrap

README

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.