The following patch implements CWG3053 approved in Kona, where it is now
valid not just to #define likely(a) or #define unlikely(a, b, c) but also
to #undef likely or #undef unlikely.
2025-11-10 Jakub Jelinek <jakub@redhat.com>
libcpp/
* directives.cc: Implement CWG3053.
(do_undef): Don't pedwarn or warn about #undef likely or #undef
unlikely.
gcc/testsuite/
* g++.dg/warn/Wkeyword-macro-4.C: Don't diagnose for #undef likely
or #undef unlikely.
* g++.dg/warn/Wkeyword-macro-5.C: Likewise.
* g++.dg/warn/Wkeyword-macro-9.C: Likewise.
* g++.dg/warn/Wkeyword-macro-8.C: Likewise.
* g++.dg/warn/Wkeyword-macro-10.C: Likewise.
It is permissible to define macros prior to including a PCH, as long as
these definitions are disjoint from or identical to the macros in the
PCH. The PCH loading process replaces all libcpp data structures with those
from the PCH, so it is necessary to remember the extra macros separately and
then restore them after loading the PCH, which all is handled by
cpp_save_state() and cpp_read_state() in libcpp/pch.cc. The restoration
process consists of pushing a buffer containing the macro definition and
then lexing it from there, similar to how a command-line -D option is
processed. The current implementation does not attempt to set up the
line_map for this process, and so the locations assigned to the macros are
often not meaningful. (Similar to what happened in the past with lexing the
tokens out of a _Pragma string, lexing out of a buffer rather than a file
produces "sorta" reasonable locations that are often close enough, but not
reliably correct.)
Fix that up by remembering enough additional information (more or less, an
expanded_location for each macro definition) to produce a reasonable
location for the newly restored macros.
One issue that came up is the treatment of command-line-defined macros. From
the perspective of the generic line_map data structures, the command-line
location is not distinguishable from other locations; it's just an ordinary
location created by the front ends with a fake file name by convention. (At
the moment, it is always the string `<command-line>', subject to
translation.) Since libcpp needs to assign macros to that location, it
needs to know what location to use, so I added a new member
line_maps::cmdline_location for the front ends to set, similar to how
line_maps::builtin_location is handled.
This revealed a small issue, in c-opts.cc we have:
/* All command line defines must have the same location. */
cpp_force_token_locations (parse_in, line_table->highest_line);
But contrary to the comment, all command line defines don't actually end up
with the same location anymore. This is because libcpp/lex.cc has been
expanded (r6-4873) to include range information on the returned
locations. That logic has never been respecting the request of
cpp_force_token_locations. I believe this was not intentional, and so I have
corrected that here. Prior to this patch, the range logic has been leading
to command-line macros all having similar locations in the same line map (or
ad-hoc locations based from there for sufficiently long tokens); with this
change, they all have exactly the same location and that location is
recorded in line_maps::cmdline_location.
With that change, then it works fine for pch.cc to restore macros whether
they came from the command-line or from the main file.
gcc/c-family/ChangeLog:
PR preprocessor/105608
* c-opts.cc (c_finish_options): Set new member
line_table->cmdline_location.
* c-pch.cc (c_common_read_pch): Adapt linemap usage to changes in
libcpp pch.cc; it is now possible that the linemap is in a different
file after returning from cpp_read_state().
libcpp/ChangeLog:
PR preprocessor/105608
* include/line-map.h: Add new member CMDLINE_LOCATION.
* lex.cc (get_location_for_byte_range_in_cur_line): Do not expand
the token location to include range information if token location
override was requested.
(warn_about_normalization): Likewise.
(_cpp_lex_direct): Likewise.
* pch.cc (struct saved_macro): New local struct.
(struct save_macro_data): Change DEFNS vector to hold saved_macro
rather than uchar*.
(save_macros): Adapt to remember the location information for each
saved macro in addition to the definition.
(cpp_prepare_state): Likewise.
(cpp_read_state): Use the saved location information to generate
proper locations for the restored macros.
gcc/testsuite/ChangeLog:
PR preprocessor/105608
* g++.dg/pch/line-map-3.C: Remove xfails.
* g++.dg/pch/line-map-4.C: New test.
* g++.dg/pch/line-map-4.Hs: New test.
In traditional CPP mode (-save-temps, -no-integrated-cpp, etc.), the
compilation directory is conveyed to cc1 using a line such as:
# <line> "/path/name//"
This string literal can contain escape sequences, for instance, if the
original source file was compiled in "/tmp/a\b", then this line will be:
# <line> "/tmp/a\\b//"
So reading the compilation directory must decode escape sequences. This
last part is currently missing and this patch implements it.
libcpp/
* init.cc (read_original_directory): Attempt to decode escape
sequences with cpp_interpret_string_notranslate.
The following patch updates GCC from Unicode 16.0.0 to 17.0.0.
I've followed what the README says and updated also one script from
glibc, but that needed another Unicode file - HangulSyllableType.txt -
around as well, so I'm adding it.
I've added one new test to named-universal-char-escape-1.c for
randomly chosen character from new CJK block.
Note, Unicode 17.0.0 authors forgot to adjust the 4-8 table, I've filed
bugreports about that but the UnicodeData.txt changes for the range ends
and the new range seems to match e.g. what is in the glyph tables, so
the patch follows UnicodeData.txt and not 4-8 table here.
Another thing was that makeuname2c.cc didn't handle correctly when
the size of the generated string table modulo 77 was 76 or 77, in which
case it forgot to emit a semicolon after the string literal and so failed
to compile.
And as can be seen in the emoji-data.txt diff, some properties like
Extended_Pictographic have been removed from certain characters, e.g.
from the Mahjong cards characters except U+1F004, and one libstdc++
test was testing that property exactly on U+1F000. Dunno why that was
changed, but U+1F004 is the only colored one among tons of black and white
ones.
2025-10-08 Jakub Jelinek <jakub@redhat.com>
contrib/
* unicode/README: Add HangulSyllableType.txt file to the
list as newest utf8_gen.py from glibc now needs it. Adjust
git commit hash and change unicode 16 version to 17.
* unicode/from_glibc/utf8_gen.py: Updated from glibc.
* unicode/DerivedCoreProperties.txt: Updated from Unicode 17.0.0.
* unicode/emoji-data.txt: Likewise.
* unicode/PropList.txt: Likewise.
* unicode/GraphemeBreakProperty.txt: Likewise.
* unicode/DerivedNormalizationProps.txt: Likewise.
* unicode/NameAliases.txt: Likewise.
* unicode/UnicodeData.txt: Likewise.
* unicode/EastAsianWidth.txt: Likewise.
* unicode/DerivedGeneralCategory.txt: Likewise.
* unicode/HangulSyllableType.txt: New file.
gcc/testsuite/
* c-c++-common/cpp/named-universal-char-escape-1.c: Add test for
\N{CJK UNIFIED IDEOGRAPH-3340E}.
libcpp/
* makeucnid.cc (write_copyright): Adjust copyright year.
* makeuname2c.cc (generated_ranges): Adjust end points for a couple
of ranges based on UnicodeData.txt Last changes and add a whole new
CJK UNIFIED IDEOGRAPH- entry. None of these changes are in the 4-8
table, but clearly it has just been forgotten.
(write_copyright): Adjust copyright year.
(write_dict): Fix up condition when to print semicolon.
* generated_cpp_wcwidth.h: Regenerate.
* ucnid.h: Regenerate.
* uname2c.h: Regenerate.
libstdc++-v3/
* include/bits/unicode-data.h: Regenerate.
* testsuite/ext/unicode/properties.cc: Test __is_extended_pictographic
on U+1F004 rather than U+1F000.
SARIF "fix" objects SHOULD have a "description" property (§3.55.2) that
describes the proposed fix, but currently GCC's SARIF output doesn't
support this, and we don't capture this anywhere internally as we build
fix-it hints in the compiler.
Currently we can have zero or more instances of fixit_hint associated
with a diagnostic, each representing an edit of a range of the source
code. Ideally we would have an internal API that allowed for associating
multiple fixes with a diagnostic, each with a description worded in terms
of the source language (e.g. "Fix 'colour' mispelling of field 'color'"),
and each consisting of multiple edited ranges.
For now, this patch extends the sarif output sink so that it
autogenerates descriptions of fix-it hints for simple cases of
insertion, deletion, and replacement of a single range
(e.g. "Replace 'colour' with 'color'").
gcc/ChangeLog:
PR diagnostics/121986
* diagnostics/sarif-sink.cc: Include "intl.h".
(sarif_builder::make_message_describing_fix_it_hint): New.
(sarif_builder::make_fix_object): Attempt to auto-generate a
description for fix-it hints.
gcc/testsuite/ChangeLog:
PR diagnostics/121986
* gcc.dg/sarif-output/extra-semicolon.c: New test.
* gcc.dg/sarif-output/extra-semicolon.py: New test.
* gcc.dg/sarif-output/missing-semicolon.py: Verify the description
of the insertion fix-it hint.
* libgdiagnostics.dg/test-fix-it-hint-c.py: Verify the description
of the replacement fix-it hint.
libcpp/ChangeLog:
PR diagnostics/121986
* include/rich-location.h (fixit_hint::deletion_p): New accessor.
(fixit_hint::replacement_p): New accessor.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
The following patch implements the
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3457.htm
paper without the first 3 lines in Recommended practice.
Seems GCC behavior already matches the expected behavior except for
diagnostics of more than 2147483648 __COUNTER__ expansions, so the
patch adds a diagnostic for that (but not testcase because
#define A __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__
#define B A A A A A A A A
#define C B B B B B B B B
#define D C C C C C C C C
#define E D D D D D D D D
#define F E E E E E E E E
#define G F F F F F F F F
#define H G G G G G G G G
#define I H H H H H H H H
#define J I I I I I I I I
J J J J
__COUNTER__
just takes too long to preprocess).
Plus I've included all the snippets from the paper into one testcase.
2025-09-01 Jakub Jelinek <jakub@redhat.com>
* macro.cc: Implement C2Y N3457 - The __COUNTER__ predefined macro.
(_cpp_builtin_macro_text): Diagnose if __COUNTER__ reaches
2147483648 value.
* gcc.dg/cpp/c2y-counter-1.c: New test.
We already warn on #undef or pedwarn on #define (but not on #define
after #undef) of some builtin macros mentioned in cpp.predefined.
The C++26 P2843R3 paper changes it from (compile time) undefined behavior
to ill-formed. The following patch arranges for warning (for #undef)
and pedwarn (on #define) for the remaining cpp.predefined macros.
__cpp_* feature test macros only for C++20 which added some of them
to cpp.predefined, in earlier C++ versions it was just an extension and
for pedantic diagnostic I think we don't need to diagnose anything,
__STDCPP_* and __cplusplus macros for all C++ versions where they appeared.
Like the earlier posted -Wkeyword-macro diagnostics (which is done
regardless whether the identifier is defined as a macro or not, obviously
most likely none of the keywords are defined as macros initially), this
one also warns on #undef when a macro isn't defined or later #define
after #undef.
2025-08-15 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/120778
PR target/121520
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Implement C++26 DR 2581. Add
cpp_define_warn lambda and use it as well as cpp_warn where needed.
In the if (c_dialect_cxx ()) block with __cpp_* predefinitions add
cpp_define lambda. Formatting fixes.
gcc/c/
* c-decl.cc (c_init_decl_processing): Use cpp_warn instead of
cpp_lookup and NODE_WARN bit setting.
gcc/cp/
* lex.cc (cxx_init): Remove warn_on lambda. Use cpp_warn instead of
cpp_lookup and NODE_WARN bit setting or warn_on.
gcc/testsuite/
* g++.dg/DRs/dr2581-1.C: New test.
* g++.dg/DRs/dr2581-2.C: New test.
* c-c++-common/cpp/pr92296-2.c: Expect warnings also on defining
special macros after undefining them.
libcpp/
* include/cpplib.h (struct cpp_options): Add
suppress_builtin_macro_warnings member.
(cpp_warn): New inline functions.
* init.cc (cpp_create_reader): Clear suppress_builtin_macro_warnings.
(cpp_init_builtins): Call cpp_warn on __cplusplus, __STDC__,
__STDC_VERSION__, __STDC_MB_MIGHT_NEQ_WC__ and
__STDCPP_STRICT_POINTER_SAFETY__ when appropriate.
* directives.cc (do_undef): Warn on undefining NODE_WARN macros if
not cpp_keyword_p. Don't emit any NODE_WARN related diagnostics
if CPP_OPTION (pfile, suppress_builtin_macro_warnings).
(cpp_define, _cpp_define_builtin, cpp_undef): Temporarily set
CPP_OPTION (pfile, suppress_builtin_macro_warnings) around
run_directive calls.
* macro.cc (_cpp_create_definition): Warn on defining NODE_WARN
macros if they weren't previously defined and not cpp_keyword_p.
Ignore NODE_WARN for diagnostics if
CPP_OPTION (pfile, suppress_builtin_macro_warnings).
The following patch introduces a -Wkeyword-macro warning that clang has
since 2014 to implement part of C++26 P2843R3 Preprocessing is never undefined
paper.
The relevant change in the paper is moving [macro.names]/2 paragraph to
https://eel.is/c++draft/cpp.replace.general#9 :
"A translation unit shall not #define or #undef names lexically identical to
keywords, to the identifiers listed in Table 4, or to the attribute-tokens
described in [dcl.attr], except that the names likely and unlikely may be
defined as function-like macros."
Now, my understanding of the paper is that in [macro.names] and surrounding
sections the word shall bears different meaning from [cpp.replace.general],
where only the latter location implies ill-formed, diagnostic required.
The warning in clang when introduced diagnosed all #define/#undef directives
on keywords, but shortly after introduction has been changed not to
diagnose #undef at all (with "#undef a keyword is generally harmless but used
often in configuration scripts" message) and later on even the #define
part tweaked - not warn about say
#define inline
(or const, extern, static), or
#define keyword keyword
or
#define keyword __keyword
or
#define keyword __keyword__
Later on the warning has been moved to be only pedantic diagnostic unless
requested by users. Clearly some code in the wild does e.g.
#define private public
and similar games, or e.g. Linux kernel (sure, C) does
#define inline __inline__ __attribute__((__always_inline__))
etc.
Now, I believe at least with the current C++26 wording such exceptions
aren't allowed (unless it is changed to IFNDR). But given that this is just
pedantic stuff, the following patch makes the warning off by default for
C and C++ before C++26 and even for C++26 it enables it by default only
if -pedantic/-pedantic-errors (in that case it pedwarns, otherwise it
warns). And it diagnoses both #define and #undef without exceptions.
From what I can see, all the current NODE_WARN cases are macros starting
with __ with one exception (_Pragma). As the NODE_* flags seem to be a
limited resource, I chose to just use NODE_WARN as well and differentiate
on the node names (if they don't start with __ or _P, they are considered
to be -Wkeyword-macro registered ones, otherwise old NODE_WARN cases,
typically builtin macros or __STDC* macros).
2025-08-07 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/120778
gcc/
* doc/invoke.texi (Wkeyword-macro): Document.
gcc/c-family/
* c.opt (Wkeyword-macro): New option.
* c.opt.urls: Regenerate.
* c-common.h (cxx_dialect): Comment formatting fix.
* c-opts.cc (c_common_post_options): Default to
-Wkeyword-macro for C++26 if pedantic.
gcc/c/
* c-decl.cc (c_init_decl_processing): Mark cpp nodes corresponding
to keywords as NODE_WARN if warn_keyword_macro.
gcc/cp/
* lex.cc (cxx_init): Mark cpp nodes corresponding
to keywords, identifiers with special meaning and standard
attribute identifiers as NODE_WARN if warn_keyword_macro.
gcc/testsuite/
* gcc.dg/Wkeyword-macro-1.c: New test.
* gcc.dg/Wkeyword-macro-2.c: New test.
* gcc.dg/Wkeyword-macro-3.c: New test.
* gcc.dg/Wkeyword-macro-4.c: New test.
* gcc.dg/Wkeyword-macro-5.c: New test.
* gcc.dg/Wkeyword-macro-6.c: New test.
* gcc.dg/Wkeyword-macro-7.c: New test.
* gcc.dg/Wkeyword-macro-8.c: New test.
* gcc.dg/Wkeyword-macro-9.c: New test.
* g++.dg/warn/Wkeyword-macro-1.C: New test.
* g++.dg/warn/Wkeyword-macro-2.C: New test.
* g++.dg/warn/Wkeyword-macro-3.C: New test.
* g++.dg/warn/Wkeyword-macro-4.C: New test.
* g++.dg/warn/Wkeyword-macro-5.C: New test.
* g++.dg/warn/Wkeyword-macro-6.C: New test.
* g++.dg/warn/Wkeyword-macro-7.C: New test.
* g++.dg/warn/Wkeyword-macro-8.C: New test.
* g++.dg/warn/Wkeyword-macro-9.C: New test.
* g++.dg/warn/Wkeyword-macro-10.C: New test.
* g++.dg/opt/pr82577.C: Don't #define register to nothing for
C++17 and later. Instead define reg macro to nothing for C++17
and later or to register and use it instead of register.
* g++.dg/modules/atom-preamble-3.C: Add -Wno-keyword-macro to
dg-additional-options.
* g++.dg/template/sfinae17.C (static_assert): Rename macro to ...
(my_static_assert): ... this.
(main): Use my_static_assert instead of static_assert.
libcpp/
* include/cpplib.h (struct cpp_options): Add cpp_warn_keyword_macro.
(enum cpp_warning_reason): Add CPP_W_KEYWORD_MACRO enumerator.
(cpp_keyword_p): New inline function.
* directives.cc (do_undef): Support -Wkeyword-macro diagnostics.
* macro.cc (warn_of_redefinition): Ignore NODE_WARN flag on nodes
registered for -Wkeyword-macro.
(_cpp_create_definition): Support -Wkeyword-macro diagnostics.
Formatting fixes.
Another easy part from the paper.
Part of the CWG2579 has been already done in an earlier paper (with
test commits by Marek) and the remaining part is implemented correctly,
we diagnose as error when token pasting doesn't form a valid token.
Except that message
pasting """" and """" does not give a valid preprocessing token
looked weird and so I've updated the message to use %< and %> instead
of \" quoting.
2025-08-05 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/120778
* macro.cc (paste_tokens): Use %< and %> instead of \" in
diagnostics around %.*s.
* g++.dg/DRs/dr2579.C: New test.
* c-c++-common/cpp/va-opt-6.c: Expect ' rather than \" around
tokens in incorrect pasting diagnostics.
* gcc.dg/c23-attr-syntax-6.c: Likewise.
* gcc.dg/cpp/paste12.c: Likewise.
* gcc.dg/cpp/paste12-2.c: Likewise.
* gcc.dg/cpp/paste14.c: Likewise.
* gcc.dg/cpp/paste14-2.c: Likewise.
This is another case which changed from compile time undefined behavior
to ill-formed, diagnostic required. Now, we warn on this, so pedantically
that is good enough, maybe all we need is a testcase, but the following
patch changes it to a pedwarn for C++26.
2025-08-04 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/120778
* macro.cc (stringify_arg): For C++26 emit a pedarn instead of warning
for \ at the end of stringification.
* g++.dg/DRs/dr2578.C: New test.
My changes for "Module Declarations Shouldn’t be Macros" paper broke
the following testcase. The backup handling intentionally tries to
drop CPP_PRAGMA_EOL token if things go wrong, which is desirable for the
case where we haven't committed to the module preprocessing directive
(i.e. changed the first token to the magic one). In that case there is
no preprocessing directive start and so CPP_PRAGMA_EOL would be wrong.
If there is a premature new-line after we've changed the first token though,
we shouldn't drop CPP_PRAGMA_EOL, because otherwise we ICE in the FE.
While clang++ and MSVC accept the testcase, in my reading it is incorrect
at least in the C++23 and newer wordings and I think the changes have been
a DR, https://eel.is/c++draft/cpp.module has no exception for new-lines
and https://eel.is/c++draft/cpp.pre#1.sentence-2 says that new-line (unless
deleted during phase 2 when after backslash) ends the preprocessing
directive.
The patch arranges for eol being set only in the not_module case.
2025-08-03 Jakub Jelinek <jakub@redhat.com>
PR c++/120845
libcpp/
* lex.cc (cpp_maybe_module_directive): Move eol variable declaration
to the start of the function, initialize to false and only set it to
peek->type == CPP_PRAGMA_EOL in the not_module case. Formatting fix.
gcc/testsuite/
* g++.dg/modules/cpp-21.C: New test.
The P2843R3 Preprocessing is never undefined paper contains comments
that various compilers handle comma operators in preprocessor expressions
incorrectly and I think they are right.
In both C and C++ the grammar uses constant-expression non-terminal
for #if/#elif and in both C and C++ that NT is conditional-expression,
so
#if 1, 2
is IMHO clearly wrong in both languages.
C89 then says for constant-expression
"Constant expressions shall not contain assignment, increment, decrement,
function-call, or comma operators, except when they are contained within the
operand of a sizeof operator."
Because all the remaining identifiers in the #if/#elif expression are
replaced with 0 I think assignments, increment, decrement and function-call
aren't that big deal because (0 = 1) or ++4 etc. are all invalid, but
for comma expressions I think it matters. In r0-56429 PR456 Joseph has
added !CPP_OPTION (pfile, c99) to handle that correctly.
Then C99 changed that to:
"Constant expressions shall not contain assignment, increment, decrement, function-call,
or comma operators, except when they are contained within a subexpression that is not
evaluated."
That made for C99+
#if 1 || (1, 2)
etc. valid but
#if (1, 2)
is still invalid, ditto
#if 1 ? 1, 2 : 3
In C++ I can't find anything like that though, and as can be seen on say
int a[(1, 2)];
int b[1 ? 1, 2 : 3];
being accepted by C++ and rejected by C while
int c[1, 2];
int d[1 ? 2 : 3, 4];
being rejected in both C and C++, so I think for C++ it is indeed just the
grammar that prevents #if 1, 2. When it is the second operand of ?: or
inside of () the grammar just uses expression and that allows comma
operator.
So, the following patch uses different decisions for C++ when to diagnose
comma operator in preprocessor expressions, for C++ tracks if it is inside
of () (obviously () around #embed clauses don't count unless one uses
limit ((1, 2)) etc.) or inside of the second ?: operand and allows comma
operator there and disallows elsewhere.
BTW, I wonder if anything in the standard disallows <=> in the preprocessor
expressions. Say
#if (0 <=> 1) < 0
etc.
#include <compare>
constexpr int a = (0 <=> 1) < 0;
is valid (but not valid without #include <compare>) and the expressions
don't use any identifiers.
2025-07-30 Jakub Jelinek <jakub@redhat.com>
PR c++/120778
* internal.h (struct lexer_state): Add comma_ok member.
* expr.cc (_cpp_parse_expr): Initialize it to 0, increment on
CPP_OPEN_PAREN and CPP_QUERY, decrement on CPP_CLOSE_PAREN
and CPP_COLON.
(num_binary_op): For C++ pedwarn on comma operator if
pfile->state.comma_ok is 0 instead of !c99 or skip_eval.
* g++.dg/cpp/if-comma-1.C: New test.
To respect the #pragma diagnostic lines in libstdc++ headers when compiling
with module std, we need to represent them in the module.
I think it's reasonable to give serializers direct access to the underlying
data, as here with get_classification_history. This is a different approach
from how Jakub made PCH streaming members of diagnostic_option_classifier,
but it seems to me that modules handling belongs in module.cc.
libcpp/ChangeLog:
* line-map.cc (linemap_location_from_module_p): Add.
* include/line-map.h: Declare it.
gcc/ChangeLog:
* diagnostic.h (diagnostic_option_classifier): Friend
diagnostic_context.
(diagnostic_context::get_classification_history): New.
gcc/cp/ChangeLog:
* module.cc (module_state::write_diagnostic_classification): New.
(module_state::write_begin): Call it.
(module_state::read_diagnostic_classification): New.
(module_state::read_initial): Call it.
(dk_string, dump_dc_change): New.
gcc/testsuite/ChangeLog:
* g++.dg/modules/warn-spec-3_a.C: New test.
* g++.dg/modules/warn-spec-3_b.C: New test.
* g++.dg/modules/warn-spec-3_c.C: New test.
This patch to the "experimental-html" diagnostic sink:
* adds use of the PatternFly 3 CSS library (via an optional link
in the generated html to a copy in a CDN)
* uses PatternFly's "alert" pattern to show severities for diagnostics,
properly nesting "note" diagnostics for diagnostic groups.
Example:
before: https://dmalcolm.fedorapeople.org/gcc/2025-06-10/before/diagnostic-ranges.c.html
after: https://dmalcolm.fedorapeople.org/gcc/2025-06-10/after/diagnostic-ranges.c.html
* adds initial support for logical locations and physical locations
* adds initial support for multi-level nested diagnostics such as those
for C++ concepts diagnostics. Ideally this would show a clickable
disclosure widget to expand/collapse a level, but for now it uses
nested <ul> elements with <li> for the child diagnostics.
Example:
before: https://dmalcolm.fedorapeople.org/gcc/2025-06-10/before/nested-diagnostics-1.C.html
after: https://dmalcolm.fedorapeople.org/gcc/2025-06-10/after/nested-diagnostics-1.C.html
gcc/ChangeLog:
PR other/116792
* diagnostic-format-html.cc: Include "diagnostic-path.h" and
"diagnostic-client-data-hooks.h".
(html_builder::m_logical_loc_mgr): New field.
(html_builder::m_cur_nesting_levels): New field.
(html_builder::m_last_logical_location): New field.
(html_builder::m_last_location): New field.
(html_builder::m_last_expanded_location): New field.
(HTML_STYLE): Add "white-space: pre;" to .source and .annotation.
Add "gcc-quoted-text" CSS class.
(html_builder::html_builder): Initialize the new fields. If CSS
is enabled, add CDN links to PatternFly 3 stylesheets.
(html_builder::add_stylesheet): New.
(html_builder::on_report_diagnostic): Add "alert" param to
make_element_for_diagnostic, setting it by default, but unsetting
it for nested diagnostics below the top level. Use
add_at_nesting_level for nested diagnostics.
(add_nesting_level_attr): New.
(html_builder::add_at_nesting_level): New.
(get_pf_class_for_alert_div): New.
(get_pf_class_for_alert_icon): New.
(get_label_for_logical_location_kind): New.
(add_labelled_value): New.
(html_builder::make_element_for_diagnostic): Add leading comment.
Add "alert" param. Drop class="gcc-diagnostic" from <div> tag,
instead adding the class for a PatternFly 3 alert if "alert" is
true, and adding a <span> with an alert icon, both according to
the diagnostic severity. Add a severity prefix to the message for
alerts. Add any metadata/option text as suffixes to the message.
Show any logical location. Show any physical location. Don't
show the locus if the last location is unchanged within the
diagnostic_group. Wrap any execution path element in a
<div id="execution-path"> and add a label to it. Wrap any
generated patch in a <div id="suggested-fix"> and add a label
to it.
(selftest::test_simple_log): Update expected HTML.
gcc/testsuite/ChangeLog:
PR other/116792
* gcc.dg/html-output/missing-semicolon.py: Update for changes
to diagnostic elements.
* gcc.dg/format/diagnostic-ranges-html.py: Likewise.
* gcc.dg/plugin/diagnostic-test-metadata-html.py: Likewise. Drop
out-of-date comment.
* gcc.dg/plugin/diagnostic-test-paths-2.py: Likewise.
* gcc.dg/plugin/diagnostic-test-paths-4.py: Likewise. Drop
out-of-date comment.
* gcc.dg/plugin/diagnostic-test-show-locus.py: Likewise.
* lib/htmltest.py (get_diag_by_index): Update to use search by id.
(get_message_within_diag): Update to use search by class.
libcpp/ChangeLog:
PR other/116792
* include/line-map.h (typedef expanded_location): Convert to...
(struct expanded_location): ...this.
(operator==): New decl, for expanded_location.
(operator!=): Likewise.
* line-map.cc (operator==): New decl, for expanded_location.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
The backport of the PR108900 fix to 14 branch broke building chromium
because static_assert (__LINE__ == expected_line_number, ""); now triggers
as the __LINE__ values are off by one.
This isn't the case on the trunk and 15 branch because we've switched
to 64-bit location_t and so one actually needs far longer header files
to trigger it.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120061#c11https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120061#c12
contain (large) testcases in patch form which show on the 14 branch
that the first one used to fail before the PR108900 backport and now
works correctly, while the second one attempts to match the chromium
behavior and it used to pass before the PR108900 backport and now it
FAILs.
The two testcases show rare problematic cases, because
do_include_common -> parse_include -> check_eol -> check_eol_1 ->
cpp_get_token_1 -> _cpp_lex_token -> _cpp_lex_direct -> linemap_line_start
triggers there
/* Allocate the new line_map. However, if the current map only has a
single line we can sometimes just increase its column_bits instead. */
if (line_delta < 0
|| last_line != ORDINARY_MAP_STARTING_LINE_NUMBER (map)
|| SOURCE_COLUMN (map, highest) >= (1U << (column_bits - range_bits))
|| ( /* We can't reuse the map if the line offset is sufficiently
large to cause overflow when computing location_t values. */
(to_line - ORDINARY_MAP_STARTING_LINE_NUMBER (map))
>= (((uint64_t) 1)
<< (CHAR_BIT * sizeof (linenum_type) - column_bits)))
|| range_bits < map->m_range_bits)
map = linemap_check_ordinary
(const_cast <line_map *>
(linemap_add (set, LC_RENAME,
ORDINARY_MAP_IN_SYSTEM_HEADER_P (map),
ORDINARY_MAP_FILE_NAME (map),
to_line)));
and so creates a new ordinary map on the line right after the
(problematic) #include line.
Now, in the spot that r14-11679-g8a884140c2bcb7 patched,
pfile->line_table->highest_location in all 3 tests (also
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120061#c13
) is before the decrement the start of the line after the #include line and so
the decrement is really desirable in that case to put highest_location
somewhere on the line where the #include actually is.
But at the same time it is also undesirable, because if we do decrement it,
then linemap_add LC_ENTER called from _cpp_do_file_change will then
/* Generate a start_location above the current highest_location.
If possible, make the low range bits be zero. */
location_t start_location = set->highest_location + 1;
unsigned range_bits = 0;
if (start_location < LINE_MAP_MAX_LOCATION_WITH_COLS)
range_bits = set->default_range_bits;
start_location += (1 << range_bits) - 1;
start_location &= ~((1 << range_bits) - 1);
linemap_assert (!LINEMAPS_ORDINARY_USED (set)
|| (start_location
>= MAP_START_LOCATION (LINEMAPS_LAST_ORDINARY_MAP (set))));
and we can end up with the new LC_ENTER ordinary map having the same
start_location as the preceding LC_RENAME one.
Next thing that happens is computation of included_from:
if (reason == LC_ENTER)
{
if (set->depth == 0)
map->included_from = 0;
else
/* The location of the end of the just-closed map. */
map->included_from
= (((map[0].start_location - 1 - map[-1].start_location)
& ~((1 << map[-1].m_column_and_range_bits) - 1))
+ map[-1].start_location);
The normal case (e.g. with the testcase included at the start of this comment) is
that map[-1] starts somewhere earlier and so map->included_from computation above
nicely computes location_t which expands to the start of the #include line.
With r14-11679 reverted, for #c11 as well as #c12
map[0].start_location == map[-1].start_location above, and so it is
((location_t) -1 & ~((1 << map[-1].m_column_and_range_bits) - 1)))
+ map[-1].start_location,
which happens to be start of the #include line.
For #c11 map[0].start_location is 0x500003a0 and map[-1] has
m_column_and_range_bits 7 and map[-2] has m_column_and_range_bits 12 and
map[0].included_from is set to 0x50000320.
For #c12 map[0].start_location is 0x606c0402 and map[-2].start_location is
0x606c0400 and m_column_and_range_bits is 0 for all 3 maps.
map[0].included_from is set to 0x606c0401.
The last important part is again in linemap_add when doing LC_LEAVE:
/* (MAP - 1) points to the map we are leaving. The
map from which (MAP - 1) got included should be the map
that comes right before MAP in the same file. */
from = linemap_included_from_linemap (set, map - 1);
/* A TO_FILE of NULL is special - we use the natural values. */
if (to_file == NULL)
{
to_file = ORDINARY_MAP_FILE_NAME (from);
to_line = SOURCE_LINE (from, from[1].start_location);
sysp = ORDINARY_MAP_IN_SYSTEM_HEADER_P (from);
}
Here it wants to compute the right to_line which ought to be the line after
the #include directive.
On the #c11 testcase that doesn't work correctly though, because
map[-1].included_from is 0x50000320, from[0] for that is LC_ENTER with
start_location 0x4080 and m_column_and_range_bits 12 but note that we've
earlier computed map[-1].start_location + (-1 & 0xffffff80) and so only
decreased by 7 bits, so to_line is still on the line with #include and not
after it. In the #c12 that doesn't happen, all the ordinary maps involved
there had 0 m_column_and_range_bits and so this computes correct line.
Below is a fix for the trunk including testcases using the
location_overflow_plugin hack to simulate the bugs without needing huge
files (in the 14 case it is just 330KB and almost 10MB, but in the 15
case it would need to be far bigger).
The pre- r15-9018 trunk has
FAIL: gcc.dg/plugin/location-overflow-test-pr116047.c -fplugin=./location_overflow_plugin.so scan-file static_assert[^\n\r]*6[^\n\r]*== 6
and current trunk
FAIL: gcc.dg/plugin/location-overflow-test-pr116047.c -fplugin=./location_overflow_plugin.so scan-file static_assert[^\n\r]*6[^\n\r]*== 6
FAIL: gcc.dg/plugin/location-overflow-test-pr120061.c -fplugin=./location_overflow_plugin.so scan-file static_assert[^\n\r]*5[^\n\r]*== 5
and with the patch everything PASSes.
I'll post afterwards a 14 version of the patch.
The patch reverts the r15-9018 change, because it is incorrect,
we really need to decrement it even when crossing ordinary map
boundaries, so that the location is not on the line after the #include
line but somewhere on the #include line. It also patches two spots
in linemap_add mentioned above to make sure we get correct locations
both in the included_from location_t when doing LC_ENTER (second
line-map.cc hunk) and when doing LC_LEAVE to compute the right to_line
(first line-map.cc hunk), both in presence of an added LC_RENAME
with the same start_location as the following LC_ENTER (i.e. the
problematic cases).
The LC_ENTER hunk is mostly to ensure included_form location_t is
at the start of the #include line (column 0), without it we can
decrease include_from not enough and end up at some random column
in the middle of the line, because it is masking away
map[-1].m_column_and_range_bits bits even when in the end the resulting
include_from location_t will be found in map[-2] map with perhaps
different m_column_and_range_bits. That alone doesn't fix the bug
though.
The more important is the LC_LEAVE hunk and the problem there is
caused by linemap_line_start not actually doing
r = set->highest_line + (line_delta << map->m_column_and_range_bits);
when adding a new map (the LC_RENAME one because we need to switch to
different number of directly encoded ranges, or columns, etc.).
So, in the original PR108900 case that
to_line = SOURCE_LINE (from, from[1].start_location);
doesn't do the right thing, from there is the last < 0x50000000 map
with m_column_and_range_bits 12, from[1] is the first one above it
and map[-1].included_from is the correct location of column 0 on
the #include line, but as the new LC_RENAME map has been created without
actually increasing highest_location to be on the new line (we've just
set to_line of the new LC_RENAME map to the correct line),
to_line = SOURCE_LINE (from, from[1].start_location);
stays on the same source line. I've tried to just replace that with
to_line = SOURCE_LINE (from, linemap_included_from (map - 1)) + 1;
i.e. just find out the #include line from map[-1].included_from and
add 1 to it, unfortunately that breaks the
c-c++-common/cpp/line-4.c
test where we expect to stay on the same 0 line for LC_LEAVE from
<command line> and gcc.dg/cpp/trad/Wunused.c, gcc.dg/cpp/trad/builtins.c
and c-c++-common/analyzer/named-constants-via-macros-traditional.c tests
all with -traditional-cpp preprocessing where to_line is also off-by-one
from the expected one.
So, this patch instead conditionalizes it, uses the
to_line = SOURCE_LINE (from, linemap_included_from (map - 1)) + 1;
way only if from[1] is a LC_RENAME map (rather than the usual
LC_ENTER one), that should limit it to the problematic cases of when
parse_include peeked after EOL and had to create LC_RENAME map with
the same start_location as the LC_ENTER after it.
Some further justification for the LC_ENTER hunk, using the
https://gcc.gnu.org/pipermail/gcc-patches/2025-May/682774.html testcase
(old is 14 before r14-11679, vanilla current 14 and new with the 14 patch)
I get
$ /usr/src/gcc-14/obj/gcc/cc1.old -quiet -std=c23 pr116047.c -nostdinc
In file included from pr116047-1.h:327677:21,
from pr116047.c:4:
pr116047-2.h:1:1: error: unknown type name ‘a’
1 | a b c;
| ^
pr116047-2.h:1:5: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘c’
1 | a b c;
| ^
pr116047-1.h:327677:1: error: static assertion failed: ""
327677 | #include "pr116047-2.h"
| ^~~~~~~~~~~~~
$ /usr/src/gcc-14/obj/gcc/cc1.vanilla -quiet -std=c23 pr116047.c -nostdinc
In file included from pr116047-1.h:327678,
from pr116047.c:4:
pr116047-2.h:1:1: error: unknown type name ‘a’
1 | a b c;
| ^
pr116047-2.h:1:5: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘c’
1 | a b c;
| ^
$ /usr/src/gcc-14/obj/gcc/cc1.new -quiet -std=c23 pr116047.c -nostdinc
In file included from pr116047-1.h:327677,
from pr116047.c:4:
pr116047-2.h:1:1: error: unknown type name ‘a’
1 | a b c;
| ^
pr116047-2.h:1:5: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘c’
1 | a b c;
| ^
pr116047-1.h has on lines 327677+327678:
#include "pr116047-2.h"
static_assert (__LINE__ == 327678, "");
so the static_assert failure is something that was dealt mainly in the
LC_LEAVE hunk and files.cc reversion, but please have a look at the
In file included from lines.
14.2 emits correct line (#include "pr116047-2.h" is indeed on line
327677) but some random column in there (which is not normally printed
for smaller headers; 21 is the . before extension in the filename).
Current trunk emits incorrect line (327678 instead of 327677, clearly
it didn't decrement).
And the patched compiler emits the right line with no column, as would
be printed if I remove e.g. 300000 newlines from the file.
2025-05-07 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/108900
PR preprocessor/116047
PR preprocessor/120061
* files.cc (_cpp_stack_file): Revert 2025-03-28 change.
* line-map.cc (linemap_add): Use
SOURCE_LINE (from, linemap_included_from (map - 1)) + 1; instead of
SOURCE_LINE (from, from[1].start_location); to compute to_line
for LC_LEAVE. For LC_ENTER included_from computation, look at
map[-2] or even lower if map[-1] has the same start_location as
map[0].
* gcc.dg/plugin/plugin.exp: Add location-overflow-test-pr116047.c
and location-overflow-test-pr120061.c.
* gcc.dg/plugin/location_overflow_plugin.cc (plugin_init): Don't error
on unknown values, instead just break. Handle 0x4fHHHHHH arguments
differently.
* gcc.dg/plugin/location-overflow-test-pr116047.c: New test.
* gcc.dg/plugin/location-overflow-test-pr116047-1.h: New test.
* gcc.dg/plugin/location-overflow-test-pr116047-2.h: New test.
* gcc.dg/plugin/location-overflow-test-pr120061.c: New test.
* gcc.dg/plugin/location-overflow-test-pr120061-1.h: New test.
* gcc.dg/plugin/location-overflow-test-pr120061-2.h: New test.
The warning for -Wunknown-pragmas is issued at the location provided by
libcpp to the def_pragma() callback. This location is
cpp_reader::directive_line, which is a location for the start of the line
only; it is also not a valid location in case the unknown pragma was lexed
from a _Pragma string. These factors make it impossible to suppress
-Wunknown-pragmas via _Pragma("GCC diagnostic...") directives on the same
source line, as in the PR and the test case. Address that by issuing the
warning at a better location returned by cpp_get_diagnostic_override_loc().
libcpp already maintains this location to handle _Pragma-related diagnostics
internally; it was needed also to make a publicly accessible version of it.
gcc/c-family/ChangeLog:
PR c/118838
* c-lex.cc (cb_def_pragma): Call cpp_get_diagnostic_override_loc()
to get a valid location at which to issue -Wunknown-pragmas, in case
it was triggered from a _Pragma.
libcpp/ChangeLog:
PR c/118838
* errors.cc (cpp_get_diagnostic_override_loc): New function.
* include/cpplib.h (cpp_get_diagnostic_override_loc): Declare.
gcc/testsuite/ChangeLog:
PR c/118838
* c-c++-common/cpp/pragma-diagnostic-loc-2.c: New test.
* g++.dg/gomp/macro-4.C: Adjust expected output.
* gcc.dg/gomp/macro-4.c: Likewise.
* gcc.dg/cpp/Wunknown-pragmas-1.c: Likewise.