libcpp: Improve locations for macros defined prior to PCH include [PR105608]

It is permissible to define macros prior to including a PCH, as long as
these definitions are disjoint from or identical to the macros in the
PCH. The PCH loading process replaces all libcpp data structures with those
from the PCH, so it is necessary to remember the extra macros separately and
then restore them after loading the PCH, which all is handled by
cpp_save_state() and cpp_read_state() in libcpp/pch.cc. The restoration
process consists of pushing a buffer containing the macro definition and
then lexing it from there, similar to how a command-line -D option is
processed. The current implementation does not attempt to set up the
line_map for this process, and so the locations assigned to the macros are
often not meaningful. (Similar to what happened in the past with lexing the
tokens out of a _Pragma string, lexing out of a buffer rather than a file
produces "sorta" reasonable locations that are often close enough, but not
reliably correct.)

Fix that up by remembering enough additional information (more or less, an
expanded_location for each macro definition) to produce a reasonable
location for the newly restored macros.

One issue that came up is the treatment of command-line-defined macros. From
the perspective of the generic line_map data structures, the command-line
location is not distinguishable from other locations; it's just an ordinary
location created by the front ends with a fake file name by convention. (At
the moment, it is always the string `<command-line>', subject to
translation.)  Since libcpp needs to assign macros to that location, it
needs to know what location to use, so I added a new member
line_maps::cmdline_location for the front ends to set, similar to how
line_maps::builtin_location is handled.

This revealed a small issue, in c-opts.cc we have:

    /* All command line defines must have the same location.  */
      cpp_force_token_locations (parse_in, line_table->highest_line);

But contrary to the comment, all command line defines don't actually end up
with the same location anymore. This is because libcpp/lex.cc has been
expanded (r6-4873) to include range information on the returned
locations. That logic has never been respecting the request of
cpp_force_token_locations. I believe this was not intentional, and so I have
corrected that here. Prior to this patch, the range logic has been leading
to command-line macros all having similar locations in the same line map (or
ad-hoc locations based from there for sufficiently long tokens); with this
change, they all have exactly the same location and that location is
recorded in line_maps::cmdline_location.

With that change, then it works fine for pch.cc to restore macros whether
they came from the command-line or from the main file.

gcc/c-family/ChangeLog:

	PR preprocessor/105608
	* c-opts.cc (c_finish_options): Set new member
	line_table->cmdline_location.
	* c-pch.cc (c_common_read_pch): Adapt linemap usage to changes in
	libcpp pch.cc; it is now possible that the linemap is in a different
	file after returning from cpp_read_state().

libcpp/ChangeLog:

	PR preprocessor/105608
	* include/line-map.h: Add new member CMDLINE_LOCATION.
	* lex.cc (get_location_for_byte_range_in_cur_line): Do not expand
	the token location to include range information if token location
	override was requested.
	(warn_about_normalization): Likewise.
	(_cpp_lex_direct): Likewise.
	* pch.cc (struct saved_macro): New local struct.
	(struct save_macro_data): Change DEFNS vector to hold saved_macro
	rather than uchar*.
	(save_macros): Adapt to remember the location information for each
	saved macro in addition to the definition.
	(cpp_prepare_state): Likewise.
	(cpp_read_state): Use the saved location information to generate
	proper locations for the restored macros.

gcc/testsuite/ChangeLog:

	PR preprocessor/105608
	* g++.dg/pch/line-map-3.C: Remove xfails.
	* g++.dg/pch/line-map-4.C: New test.
	* g++.dg/pch/line-map-4.Hs: New test.
This commit is contained in:
Lewis Hyatt 2025-07-30 19:20:55 -04:00 committed by Lewis Hyatt
parent f8881d56b6
commit d538348556
No known key found for this signature in database
8 changed files with 132 additions and 50 deletions

View File

@ -1700,6 +1700,7 @@ c_finish_options (void)
bool cxx_assert_seen_p = false;
/* All command line defines must have the same location. */
line_table->cmdline_location = line_table->highest_line;
cpp_force_token_locations (parse_in, line_table->highest_line);
for (size_t i = 0; i < deferred_count; i++)
{

View File

@ -347,18 +347,18 @@ c_common_read_pch (cpp_reader *pfile, const char *name,
rebuild_location_adhoc_htab (line_table);
line_table->trace_includes = saved_trace_includes;
/* Set the current location to the line containing the #include (or the
#pragma GCC pch_preprocess) for the purpose of assigning locations to any
macros that are about to be restored. */
linemap_add (line_table, LC_ENTER, 0, saved_loc.file,
saved_loc.line > 1 ? saved_loc.line - 1 : saved_loc.line);
/* Set the line_map current location to the start of the file, so that things
remain in order after cpp_read_state() re-adds any macros that were defined
prior to calling gt_pch_restore(). */
linemap_add (line_table, LC_ENTER, saved_loc.sysp, saved_loc.file, 0);
timevar_push (TV_PCH_CPP_RESTORE);
cpp_result = cpp_read_state (pfile, name, f, smd);
/* Set the current location to the line following the #include, where we
were prior to processing the PCH. */
linemap_line_start (line_table, saved_loc.line, 0);
linemap_add (line_table, LC_RENAME, saved_loc.sysp, saved_loc.file,
saved_loc.line);
timevar_pop (TV_PCH_CPP_RESTORE);

View File

@ -1,23 +1,5 @@
#define UNUSED_MACRO /* { dg-error "UNUSED_MACRO" "" { xfail *-*-* } } */
#include "line-map-3.H" /* { dg-bogus "-:UNUSED_MACRO" "" { xfail *-*-* } } */
/* { dg-do compile } */
/* { dg-additional-options "-Werror=unused-macros" } */
/* PR preprocessor/105608 */
/* This test case is currently xfailed and requires work in libcpp/pch.cc to
resolve. Currently, the macro location is incorrectly assigned to line 2
of the header file when read via PCH, because libcpp doesn't try to
assign locations relative to the newly loaded line map after restoring
the PCH. */
/* In PCH mode we also complain incorrectly about the command line macro -Dwith_PCH
added by dejagnu; that warning would get suppressed if the macro location were
correctly restored by libcpp to reflect that it was a command line macro. */
/* { dg-bogus "-:with_PCH" "" { xfail *-*-* } 2 } */
/* The reason we used -Werror was to prevent pch.exp from rerunning without PCH;
in that case we would get unnecessary XPASS outputs since the test does work
fine without PCH. Once the bug is fixed, remove the -Werror and switch to
dg-warning. */
/* { dg-regexp {[^[:space:]]*: some warnings being treated as errors} } */
/* { dg-do compile } */
/* { dg-additional-options "-Wunused-macros" } */
#define UNUSED_MACRO /* { dg-warning "-:UNUSED_MACRO" "" } */
#include "line-map-3.H" /* { dg-bogus "-:UNUSED_MACRO" "" } */

View File

@ -0,0 +1,5 @@
/* PR preprocessor/105608 */
/* { dg-do compile } */
#define INT int /* { dg-error "-:declaration does not declare anything" } */
#include "line-map-4.H"
INT; /* { dg-note "in expansion of macro" } */

View File

@ -0,0 +1 @@
/* This space intentionally left blank. */

View File

@ -878,6 +878,12 @@ public:
built-in tokens. */
location_t builtin_location;
/* The special location value to be used for tokens originating on the
command line. This is currently only needed by the C-family front ends
for PCH support; if it would be used for another purpose in the future,
then other libcpp-using front ends may need to set it as well. */
location_t cmdline_location;
/* The default value of range_bits in ordinary line maps. */
unsigned int default_range_bits;

View File

@ -1353,6 +1353,8 @@ get_location_for_byte_range_in_cur_line (cpp_reader *pfile,
const unsigned char *const start,
size_t num_bytes)
{
if (pfile->forced_token_location)
return pfile->forced_token_location;
gcc_checking_assert (num_bytes > 0);
/* CPP_BUF_COLUMN and linemap_position_for_column both refer
@ -2035,6 +2037,7 @@ warn_about_normalization (cpp_reader *pfile,
/* If possible, create a location range for the token. */
if (loc >= RESERVED_LOCATION_COUNT
&& token->type != CPP_EOF
&& !pfile->forced_token_location
/* There must be no line notes to process. */
&& (!(pfile->buffer->cur
>= pfile->buffer->notes[pfile->buffer->cur_note].pos
@ -4399,7 +4402,8 @@ _cpp_lex_direct (cpp_reader *pfile)
/* Potentially convert the location of the token to a range. */
if (result->src_loc >= RESERVED_LOCATION_COUNT
&& result->type != CPP_EOF)
&& result->type != CPP_EOF
&& !pfile->forced_token_location)
{
/* Ensure that any line notes are processed, so that we have the
correct physical line/column for the end-point of the token even

View File

@ -730,9 +730,20 @@ cpp_valid_state (cpp_reader *r, const char *name, int fd)
/* Save all the existing macros. */
namespace {
struct saved_macro
{
uchar *defn;
location_t loc;
expanded_location xloc;
};
} /* unnamed namespace */
struct save_macro_data
{
uchar **defns;
saved_macro *defns;
size_t count;
size_t array_size;
char **saved_pragmas;
@ -762,14 +773,29 @@ save_macros (cpp_reader *r, cpp_hashnode *h, void *data_p)
if (data->count == data->array_size)
{
data->array_size *= 2;
data->defns = XRESIZEVEC (uchar *, data->defns, (data->array_size));
data->defns = XRESIZEVEC (saved_macro, data->defns, data->array_size);
}
const uchar * defn = cpp_macro_definition (r, h);
size_t defnlen = ustrlen (defn);
data->defns[data->count] = (uchar *) xmemdup (defn, defnlen, defnlen + 2);
data->defns[data->count][defnlen] = '\n';
const auto d = data->defns + data->count;
d->defn = (uchar *) xmemdup (defn, defnlen, defnlen + 2);
d->defn[defnlen] = '\n';
d->loc = h->value.macro->line;
d->xloc = {};
if (d->loc == r->line_table->cmdline_location)
{
/* The cmdline_location may be different when it comes time to
restore the macros, so use 0 to indicate this. */
d->loc = 0;
}
else if (d->loc != r->line_table->builtin_location)
{
const auto map = linemap_lookup (r->line_table, d->loc);
gcc_assert (map);
d->xloc = linemap_expand_location (r->line_table, map, d->loc);
}
data->count++;
}
@ -785,9 +811,22 @@ cpp_prepare_state (cpp_reader *r, struct save_macro_data **data)
struct save_macro_data *d = XNEW (struct save_macro_data);
d->array_size = 512;
d->defns = XNEWVEC (uchar *, d->array_size);
d->defns = XNEWVEC (saved_macro, d->array_size);
d->count = 0;
cpp_forall_identifiers (r, save_macros, d);
/* Sort the saved macros in order, so the line map gets built up
in chronological order as expected when we load them back in. */
static line_maps *line_table;
line_table = r->line_table;
qsort (d->defns, d->count, sizeof (saved_macro),
[] (const void *x1, const void *x2)
{
const auto d1 = static_cast<const saved_macro *> (x1);
const auto d2 = static_cast<const saved_macro *> (x2);
return linemap_compare_locations (line_table, d2->loc, d1->loc);
});
d->saved_pragmas = _cpp_save_pragma_names (r);
*data = d;
}
@ -820,16 +859,16 @@ cpp_read_state (cpp_reader *r, const char *name, FILE *f,
r->state.prevent_expansion = 1;
r->state.angled_headers = 0;
const auto old_directive_line = r->directive_line;
/* Run through the carefully-saved macros, insert them. */
const char *cur_fname = nullptr;
for (i = 0; i < data->count; i++)
{
cpp_hashnode *h;
size_t namelen;
uchar *defn;
namelen = ustrcspn (data->defns[i], "( \n");
h = cpp_lookup (r, data->defns[i], namelen);
defn = data->defns[i] + namelen;
const auto d = data->defns + i;
const size_t namelen = ustrcspn (d->defn, "( \n");
cpp_hashnode *const h = cpp_lookup (r, d->defn, namelen);
const auto defn = d->defn + namelen;
/* The PCH file is valid, so we know that if there is a definition
from the PCH file it must be the same as the one we had
@ -839,28 +878,72 @@ cpp_read_state (cpp_reader *r, const char *name, FILE *f,
if (cpp_push_buffer (r, defn, ustrchr (defn, '\n') - defn, true)
!= NULL)
{
_cpp_clean_line (r);
if (d->loc == r->line_table->builtin_location)
r->directive_line = r->line_table->builtin_location;
else if (d->loc == 0)
/* This was a command-line-defined macro, preserve that
aspect now. */
r->directive_line = r->line_table->cmdline_location;
else
{
/* In practice, almost all of the locations ought to be in the
main file, since a PCH cannot be loaded after including
other headers. The (probably sole) exception is that
implicit preincludes can be loaded, so if an implicit
preinclude such as /usr/include/stdc-predef.h has changed
since the PCH was created (in a way not invalidating the
PCH), we can end up here with locations that are not in the
main file. In that case, we may be adding them to the
line_map out of order (the implicit preinclude already
being represented in the line_map we have loaded from the
PCH), but it is probably not going to cause any observable
issue given the constraints on what can appear in a header
before loading a PCH. */
if (!cur_fname || strcmp (cur_fname, d->xloc.file))
{
linemap_add (r->line_table, LC_RENAME, d->xloc.sysp,
d->xloc.file, d->xloc.line);
cur_fname = d->xloc.file;
}
r->directive_line = linemap_line_start (r->line_table,
d->xloc.line, 127);
}
/* ??? Using r->line_table->highest_line is not ideal here, but we
do need to use some location that is relative to the new line
map just loaded, not the old one that was in effect when these
macros were lexed. The proper fix is to remember the file name
and line number where each macro was defined, and then add
these locations into the new line map. See PR105608. */
if (!_cpp_create_definition (r, h, r->line_table->highest_line))
/* Even when the macro was defined in the main file, we have to
use cpp_force_token_locations here, so all tokens in the macro
will have location assigned to the directive line with no
column information. We no longer know even which line the
tokens appeared on originally now, much less the column, since
we only stored the macro definition string after lexing it. */
cpp_force_token_locations (r, r->directive_line);
_cpp_clean_line (r);
if (!_cpp_create_definition (r, h, r->directive_line))
abort ();
cpp_stop_forcing_token_locations (r);
_cpp_pop_buffer (r);
if (d->loc < RESERVED_LOCATION_COUNT)
/* Macros defined on the command line or as builtins are always
implicitly used. It is necessary to note this here or else
-Wunused-macros will complain, because these locations do
return TRUE for MAIN_FILE_P(). For a command line
definition, d->loc was set to 0 when we saved it, so it does
satisfy the check performed here. */
h->value.macro->used = true;
}
else
abort ();
}
free (data->defns[i]);
free (data->defns[i].defn);
}
r->state = old_state;
r->directive_line = old_directive_line;
_cpp_restore_pragma_names (r, data->saved_pragmas);
XDELETEVEC (data->defns);
free (data);
if (deps_restore (r->deps, f, CPP_OPTION (r, restore_pch_deps) ? name : NULL)