mirror of git://gcc.gnu.org/git/gcc.git
Merge of HSA
2016-01-19 Martin Jambor <mjambor@suse.cz>
Martin Liska <mliska@suse.cz>
Michael Matz <matz@suse.de>
libgomp/
* plugin/Makefrag.am: Add HSA plugin requirements.
* plugin/configfrag.ac (HSA_RUNTIME_INCLUDE): New variable.
(HSA_RUNTIME_LIB): Likewise.
(HSA_RUNTIME_CPPFLAGS): Likewise.
(HSA_RUNTIME_INCLUDE): New substitution.
(HSA_RUNTIME_LIB): Likewise.
(HSA_RUNTIME_LDFLAGS): Likewise.
(hsa-runtime): New configure option.
(hsa-runtime-include): Likewise.
(hsa-runtime-lib): Likewise.
(PLUGIN_HSA): New substitution variable.
Fill HSA_RUNTIME_INCLUDE and HSA_RUNTIME_LIB according to the new
configure options.
(PLUGIN_HSA_CPPFLAGS): Likewise.
(PLUGIN_HSA_LDFLAGS): Likewise.
(PLUGIN_HSA_LIBS): Likewise.
Check that we have access to HSA run-time.
* libgomp-plugin.h (offload_target_type): New element
OFFLOAD_TARGET_TYPE_HSA.
* libgomp.h (gomp_target_task): New fields firstprivate_copies and
args.
(bool gomp_create_target_task): Updated.
(gomp_device_descr): Extra parameter of run_func and async_run_func,
new field can_run_func.
* libgomp_g.h (GOMP_target_ext): Update prototype.
* oacc-host.c (host_run): Added a new parameter args.
* target.c (calculate_firstprivate_requirements): New function.
(copy_firstprivate_data): Likewise.
(gomp_target_fallback_firstprivate): Use them.
(gomp_target_unshare_firstprivate): New function.
(gomp_get_target_fn_addr): Allow returning NULL for shared memory
devices.
(GOMP_target): Do host fallback for all shared memory devices. Do not
pass any args to plugins.
(GOMP_target_ext): Introduce device-specific argument parameter args.
Allow host fallback if device shares memory. Do not remap data if
device has shared memory.
(gomp_target_task_fn): Likewise. Also treat shared memory devices
like host fallback for mappings.
(GOMP_target_data): Treat shared memory devices like host fallback.
(GOMP_target_data_ext): Likewise.
(GOMP_target_update): Likewise.
(GOMP_target_update_ext): Likewise. Also pass NULL as args to
gomp_create_target_task.
(GOMP_target_enter_exit_data): Likewise.
(omp_target_alloc): Treat shared memory devices like host fallback.
(omp_target_free): Likewise.
(omp_target_is_present): Likewise.
(omp_target_memcpy): Likewise.
(omp_target_memcpy_rect): Likewise.
(omp_target_associate_ptr): Likewise.
(gomp_load_plugin_for_device): Also load can_run.
* task.c (GOMP_PLUGIN_target_task_completion): Free
firstprivate_copies.
(gomp_create_target_task): Accept new argument args and store it to
ttask.
* plugin/plugin-hsa.c: New file.
gcc/
* Makefile.in (OBJS): Add new source files.
(GTFILES): Add hsa.c.
* common.opt (disable_hsa): New variable.
(-Whsa): New warning.
* config.in (ENABLE_HSA): New.
* configure.ac: Treat hsa differently from other accelerators.
(OFFLOAD_TARGETS): Define ENABLE_OFFLOADING according to
$enable_offloading.
(ENABLE_HSA): Define ENABLE_HSA according to $enable_hsa.
* doc/install.texi (Configuration): Document --with-hsa-runtime,
--with-hsa-runtime-include, --with-hsa-runtime-lib and
--with-hsa-kmt-lib.
* doc/invoke.texi (-Whsa): Document.
(hsa-gen-debug-stores): Likewise.
* lto-wrapper.c (compile_images_for_offload_targets): Do not attempt
to invoke offload compiler for hsa accelerator.
* opts.c (common_handle_option): Determine whether HSA offloading
should be performed.
* params.def (PARAM_HSA_GEN_DEBUG_STORES): New parameter.
* builtin-types.def (BT_FN_VOID_UINT_PTR_INT_PTR): New.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT): Removed.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR): New.
* gimple-low.c (lower_stmt): Also handle GIMPLE_OMP_GRID_BODY.
* gimple-pretty-print.c (dump_gimple_omp_for): Also handle
GF_OMP_FOR_KIND_GRID_LOOP.
(dump_gimple_omp_block): Also handle GIMPLE_OMP_GRID_BODY.
(pp_gimple_stmt_1): Likewise.
* gimple-walk.c (walk_gimple_stmt): Likewise.
* gimple.c (gimple_build_omp_grid_body): New function.
(gimple_copy): Also handle GIMPLE_OMP_GRID_BODY.
* gimple.def (GIMPLE_OMP_GRID_BODY): New.
* gimple.h (enum gf_mask): Added GF_OMP_PARALLEL_GRID_PHONY,
GF_OMP_FOR_KIND_GRID_LOOP, GF_OMP_FOR_GRID_PHONY and
GF_OMP_TEAMS_GRID_PHONY.
(gimple_statement_omp_single_layout): Updated comments.
(gimple_build_omp_grid_body): New function.
(gimple_has_substatements): Also handle GIMPLE_OMP_GRID_BODY.
(gimple_omp_for_grid_phony): New function.
(gimple_omp_for_set_grid_phony): Likewise.
(gimple_omp_parallel_grid_phony): Likewise.
(gimple_omp_parallel_set_grid_phony): Likewise.
(gimple_omp_teams_grid_phony): Likewise.
(gimple_omp_teams_set_grid_phony): Likewise.
(gimple_return_set_retbnd): Also handle GIMPLE_OMP_GRID_BODY.
* omp-builtins.def (BUILT_IN_GOMP_OFFLOAD_REGISTER): New.
(BUILT_IN_GOMP_OFFLOAD_UNREGISTER): Likewise.
(BUILT_IN_GOMP_TARGET): Updated type.
* omp-low.c: Include symbol-summary.h, hsa.h and params.h.
(adjust_for_condition): New function.
(get_omp_for_step_from_incr): Likewise.
(extract_omp_for_data): Moved parts to adjust_for_condition and
get_omp_for_step_from_incr.
(build_outer_var_ref): Handle GIMPLE_OMP_GRID_BODY.
(fixup_child_record_type): Bail out if receiver_decl is NULL.
(scan_sharing_clauses): Handle OMP_CLAUSE__GRIDDIM_.
(scan_omp_parallel): Do not create child functions for phony
constructs.
(check_omp_nesting_restrictions): Handle GIMPLE_OMP_GRID_BODY.
(scan_omp_1_op): Checking assert we are not remapping to
ERROR_MARK. Also handle GIMPLE_OMP_GRID_BODY.
(parallel_needs_hsa_kernel_p): New function.
(expand_parallel_call): Register appropriate parallel child
functions as HSA kernels.
(grid_launch_attributes_trees): New type.
(grid_attr_trees): New variable.
(grid_create_kernel_launch_attr_types): New function.
(grid_insert_store_range_dim): Likewise.
(grid_get_kernel_launch_attributes): Likewise.
(get_target_argument_identifier_1): Likewise.
(get_target_argument_identifier): Likewise.
(get_target_argument_value): Likewise.
(push_target_argument_according_to_value): Likewise.
(get_target_arguments): Likewise.
(expand_omp_target): Call get_target_arguments instead of looking
up for teams and thread limit.
(grid_expand_omp_for_loop): New function.
(grid_arg_decl_map): New type.
(grid_remap_kernel_arg_accesses): New function.
(grid_expand_target_kernel_body): New function.
(expand_omp): Call it.
(lower_omp_for): Do not emit phony constructs.
(lower_omp_taskreg): Do not emit phony constructs but create for them
a temporary variable receiver_decl.
(lower_omp_taskreg): Do not emit phony constructs.
(lower_omp_teams): Likewise.
(lower_omp_grid_body): New function.
(lower_omp_1): Call it.
(grid_reg_assignment_to_local_var_p): New function.
(grid_seq_only_contains_local_assignments): Likewise.
(grid_find_single_omp_among_assignments_1): Likewise.
(grid_find_single_omp_among_assignments): Likewise.
(grid_find_ungridifiable_statement): Likewise.
(grid_target_follows_gridifiable_pattern): Likewise.
(grid_remap_prebody_decls): Likewise.
(grid_copy_leading_local_assignments): Likewise.
(grid_process_kernel_body_copy): Likewise.
(grid_attempt_target_gridification): Likewise.
(grid_gridify_all_targets_stmt): Likewise.
(grid_gridify_all_targets): Likewise.
(execute_lower_omp): Call grid_gridify_all_targets.
(make_gimple_omp_edges): Handle GIMPLE_OMP_GRID_BODY.
* tree-core.h (omp_clause_code): Added OMP_CLAUSE__GRIDDIM_.
(tree_omp_clause): Added union field dimension.
* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE__GRIDDIM_.
* tree.c (omp_clause_num_ops): Added number of arguments of
OMP_CLAUSE__GRIDDIM_.
(omp_clause_code_name): Added name of OMP_CLAUSE__GRIDDIM_.
(walk_tree_1): Handle OMP_CLAUSE__GRIDDIM_.
* tree.h (OMP_CLAUSE_GRIDDIM_DIMENSION): New.
(OMP_CLAUSE_SET_GRIDDIM_DIMENSION): Likewise.
(OMP_CLAUSE_GRIDDIM_SIZE): Likewise.
(OMP_CLAUSE_GRIDDIM_GROUP): Likewise.
* passes.def: Schedule pass_ipa_hsa and pass_gen_hsail.
* tree-pass.h (make_pass_gen_hsail): Declare.
(make_pass_ipa_hsa): Likewise.
* ipa-hsa.c: New file.
* lto-section-in.c (lto_section_name): Add hsa section name.
* lto-streamer.h (lto_section_type): Add hsa section.
* timevar.def (TV_IPA_HSA): New.
* hsa-brig-format.h: New file.
* hsa-brig.c: New file.
* hsa-dump.c: Likewise.
* hsa-gen.c: Likewise.
* hsa.c: Likewise.
* hsa.h: Likewise.
* toplev.c (compile_file): Call hsa_output_brig.
* hsa-regalloc.c: New file.
gcc/fortran/
* types.def (BT_FN_VOID_UINT_PTR_INT_PTR): New.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT): Removed.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR): New.
gcc/lto/
* lto-partition.c: Include "hsa.h"
(add_symbol_to_partition_1): Put hsa implementations into the
same partition as host implementations.
liboffloadmic/
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_async_run): New
unused parameter.
(GOMP_OFFLOAD_run): Likewise.
include/
* gomp-constants.h (GOMP_DEVICE_HSA): New macro.
(GOMP_VERSION_HSA): Likewise.
(GOMP_TARGET_ARG_DEVICE_MASK): Likewise.
(GOMP_TARGET_ARG_DEVICE_ALL): Likewise.
(GOMP_TARGET_ARG_SUBSEQUENT_PARAM): Likewise.
(GOMP_TARGET_ARG_ID_MASK): Likewise.
(GOMP_TARGET_ARG_NUM_TEAMS): Likewise.
(GOMP_TARGET_ARG_THREAD_LIMIT): Likewise.
(GOMP_TARGET_ARG_VALUE_SHIFT): Likewise.
(GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES): Likewise.
From-SVN: r232549
parent 2bedb645f2
commit b2b4005150

132 gcc/ChangeLog
@@ -1,3 +1,135 @@
+2016-01-19 Martin Jambor <mjambor@suse.cz>
+Martin Liska <mliska@suse.cz>
+Michael Matz <matz@suse.de>
+
+* Makefile.in (OBJS): Add new source files.
+(GTFILES): Add hsa.c.
+* common.opt (disable_hsa): New variable.
+(-Whsa): New warning.
+* config.in (ENABLE_HSA): New.
+* configure.ac: Treat hsa differently from other accelerators.
+(OFFLOAD_TARGETS): Define ENABLE_OFFLOADING according to
+$enable_offloading.
+(ENABLE_HSA): Define ENABLE_HSA according to $enable_hsa.
+* doc/install.texi (Configuration): Document --with-hsa-runtime,
+--with-hsa-runtime-include, --with-hsa-runtime-lib and
+--with-hsa-kmt-lib.
+* doc/invoke.texi (-Whsa): Document.
+(hsa-gen-debug-stores): Likewise.
+* lto-wrapper.c (compile_images_for_offload_targets): Do not attempt
+to invoke offload compiler for hsa acclerator.
+* opts.c (common_handle_option): Determine whether HSA offloading
+should be performed.
+* params.def (PARAM_HSA_GEN_DEBUG_STORES): New parameter.
+* builtin-types.def (BT_FN_VOID_UINT_PTR_INT_PTR): New.
+(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT): Removed.
+(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR): New.
+* gimple-low.c (lower_stmt): Also handle GIMPLE_OMP_GRID_BODY.
+* gimple-pretty-print.c (dump_gimple_omp_for): Also handle
+GF_OMP_FOR_KIND_GRID_LOOP.
+(dump_gimple_omp_block): Also handle GIMPLE_OMP_GRID_BODY.
+(pp_gimple_stmt_1): Likewise.
+* gimple-walk.c (walk_gimple_stmt): Likewise.
+* gimple.c (gimple_build_omp_grid_body): New function.
+(gimple_copy): Also handle GIMPLE_OMP_GRID_BODY.
+* gimple.def (GIMPLE_OMP_GRID_BODY): New.
+* gimple.h (enum gf_mask): Added GF_OMP_PARALLEL_GRID_PHONY,
+GF_OMP_FOR_KIND_GRID_LOOP, GF_OMP_FOR_GRID_PHONY and
+GF_OMP_TEAMS_GRID_PHONY.
+(gimple_statement_omp_single_layout): Updated comments.
+(gimple_build_omp_grid_body): New function.
+(gimple_has_substatements): Also handle GIMPLE_OMP_GRID_BODY.
+(gimple_omp_for_grid_phony): New function.
+(gimple_omp_for_set_grid_phony): Likewise.
+(gimple_omp_parallel_grid_phony): Likewise.
+(gimple_omp_parallel_set_grid_phony): Likewise.
+(gimple_omp_teams_grid_phony): Likewise.
+(gimple_omp_teams_set_grid_phony): Likewise.
+(gimple_return_set_retbnd): Also handle GIMPLE_OMP_GRID_BODY.
+* omp-builtins.def (BUILT_IN_GOMP_OFFLOAD_REGISTER): New.
+(BUILT_IN_GOMP_OFFLOAD_UNREGISTER): Likewise.
+(BUILT_IN_GOMP_TARGET): Updated type.
+* omp-low.c: Include symbol-summary.h, hsa.h and params.h.
+(adjust_for_condition): New function.
+(get_omp_for_step_from_incr): Likewise.
+(extract_omp_for_data): Moved parts to adjust_for_condition and
+get_omp_for_step_from_incr.
+(build_outer_var_ref): Handle GIMPLE_OMP_GRID_BODY.
+(fixup_child_record_type): Bail out if receiver_decl is NULL.
+(scan_sharing_clauses): Handle OMP_CLAUSE__GRIDDIM_.
+(scan_omp_parallel): Do not create child functions for phony
+constructs.
+(check_omp_nesting_restrictions): Handle GIMPLE_OMP_GRID_BODY.
+(scan_omp_1_op): Checking assert we are not remapping to
+ERROR_MARK. Also also handle GIMPLE_OMP_GRID_BODY.
+(parallel_needs_hsa_kernel_p): New function.
+(expand_parallel_call): Register apprpriate parallel child
+functions as HSA kernels.
+(grid_launch_attributes_trees): New type.
+(grid_attr_trees): New variable.
+(grid_create_kernel_launch_attr_types): New function.
+(grid_insert_store_range_dim): Likewise.
+(grid_get_kernel_launch_attributes): Likewise.
+(get_target_argument_identifier_1): Likewise.
+(get_target_argument_identifier): Likewise.
+(get_target_argument_value): Likewise.
+(push_target_argument_according_to_value): Likewise.
+(get_target_arguments): Likewise.
+(expand_omp_target): Call get_target_arguments instead of looking
+up for teams and thread limit.
+(grid_expand_omp_for_loop): New function.
+(grid_arg_decl_map): New type.
+(grid_remap_kernel_arg_accesses): New function.
+(grid_expand_target_kernel_body): New function.
+(expand_omp): Call it.
+(lower_omp_for): Do not emit phony constructs.
+(lower_omp_taskreg): Do not emit phony constructs but create for them
+a temporary variable receiver_decl.
+(lower_omp_taskreg): Do not emit phony constructs.
+(lower_omp_teams): Likewise.
+(lower_omp_grid_body): New function.
+(lower_omp_1): Call it.
+(grid_reg_assignment_to_local_var_p): New function.
+(grid_seq_only_contains_local_assignments): Likewise.
+(grid_find_single_omp_among_assignments_1): Likewise.
+(grid_find_single_omp_among_assignments): Likewise.
+(grid_find_ungridifiable_statement): Likewise.
+(grid_target_follows_gridifiable_pattern): Likewise.
+(grid_remap_prebody_decls): Likewise.
+(grid_copy_leading_local_assignments): Likewise.
+(grid_process_kernel_body_copy): Likewise.
+(grid_attempt_target_gridification): Likewise.
+(grid_gridify_all_targets_stmt): Likewise.
+(grid_gridify_all_targets): Likewise.
+(execute_lower_omp): Call grid_gridify_all_targets.
+(make_gimple_omp_edges): Handle GIMPLE_OMP_GRID_BODY.
+* tree-core.h (omp_clause_code): Added OMP_CLAUSE__GRIDDIM_.
+(tree_omp_clause): Added union field dimension.
+* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE__GRIDDIM_.
+* tree.c (omp_clause_num_ops): Added number of arguments of
+OMP_CLAUSE__GRIDDIM_.
+(omp_clause_code_name): Added name of OMP_CLAUSE__GRIDDIM_.
+(walk_tree_1): Handle OMP_CLAUSE__GRIDDIM_.
+* tree.h (OMP_CLAUSE_GRIDDIM_DIMENSION): New.
+(OMP_CLAUSE_SET_GRIDDIM_DIMENSION): Likewise.
+(OMP_CLAUSE_GRIDDIM_SIZE): Likewise.
+(OMP_CLAUSE_GRIDDIM_GROUP): Likewise.
+* passes.def: Schedule pass_ipa_hsa and pass_gen_hsail.
+* tree-pass.h (make_pass_gen_hsail): Declare.
+(make_pass_ipa_hsa): Likewise.
+* ipa-hsa.c: New file.
+* lto-section-in.c (lto_section_name): Add hsa section name.
+* lto-streamer.h (lto_section_type): Add hsa section.
+* timevar.def (TV_IPA_HSA): New.
+* hsa-brig-format.h: New file.
+* hsa-brig.c: New file.
+* hsa-dump.c: Likewise.
+* hsa-gen.c: Likewise.
+* hsa.c: Likewise.
+* hsa.h: Likewise.
+* toplev.c (compile_file): Call hsa_output_brig.
+* hsa-regalloc.c: New file.
+
 2016-01-18 Jeff Law <law@redhat.com>
 
 PR tree-optimization/69320

@@ -1297,6 +1297,11 @@ OBJS = \
 graphite-sese-to-poly.o \
 gtype-desc.o \
 haifa-sched.o \
+hsa.o \
+hsa-gen.o \
+hsa-regalloc.o \
+hsa-brig.o \
+hsa-dump.o \
 hw-doloop.o \
 hwint.o \
 ifcvt.o \
@@ -1321,6 +1326,7 @@ OBJS = \
 ipa-icf.o \
 ipa-icf-gimple.o \
 ipa-reference.o \
+ipa-hsa.o \
 ipa-ref.o \
 ipa-utils.o \
 ipa.o \
@@ -2404,6 +2410,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
 $(srcdir)/sancov.c \
 $(srcdir)/ipa-devirt.c \
 $(srcdir)/internal-fn.h \
+$(srcdir)/hsa.c \
 @all_gtfiles@
 
 # Compute the list of GT header files from the corresponding C sources,

@@ -478,6 +478,8 @@ DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_LONGPTR_LONGPTR_LONGPTR,
 DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_ULLPTR_ULLPTR_ULLPTR,
 BT_BOOL, BT_UINT, BT_PTR_ULONGLONG, BT_PTR_ULONGLONG,
 BT_PTR_ULONGLONG)
+DEF_FUNCTION_TYPE_4 (BT_FN_VOID_UINT_PTR_INT_PTR, BT_VOID, BT_INT, BT_PTR,
+BT_INT, BT_PTR)
 
 DEF_FUNCTION_TYPE_5 (BT_FN_INT_STRING_INT_SIZE_CONST_STRING_VALIST_ARG,
 BT_INT, BT_STRING, BT_INT, BT_SIZE, BT_CONST_STRING,
@@ -555,10 +557,9 @@ DEF_FUNCTION_TYPE_9 (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT,
 BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR,
 BT_PTR_FN_VOID_PTR_PTR, BT_LONG, BT_LONG,
 BT_BOOL, BT_UINT, BT_PTR, BT_INT)
-
-DEF_FUNCTION_TYPE_10 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT,
-BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR,
-BT_PTR, BT_PTR, BT_UINT, BT_PTR, BT_INT, BT_INT)
+DEF_FUNCTION_TYPE_9 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR,
+BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR,
+BT_PTR, BT_PTR, BT_UINT, BT_PTR, BT_PTR)
 
 DEF_FUNCTION_TYPE_11 (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_UINT_LONG_INT_LONG_LONG_LONG,
 BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR,

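The new `BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR` entry lists a `void` return type followed by nine argument types. As a rough illustration only, the same shape can be written as a plain C function-pointer type. The names `target_ext_like_fn`, `record_call` and `call_through_typedef` are hypothetical and do not appear in GCC; only the order of the type codes is taken from the hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical typedef mirroring the type codes of the DEF_FUNCTION_TYPE_9
   entry: BT_VOID return, then BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, three
   BT_PTR, BT_UINT and two more BT_PTR arguments.  */
typedef void (*target_ext_like_fn) (int, void (*) (void *), size_t,
                                    void *, void *, void *,
                                    unsigned int, void *, void *);

static int last_device = -1;
static size_t last_count = 0;
static void *last_trailing_ptr = NULL;

/* Illustrative callee with the nine-argument shape; it only records a few
   of its arguments so that a caller can observe the call.  */
static void
record_call (int device, void (*fn) (void *), size_t count,
             void *a, void *b, void *c,
             unsigned int flags, void *d, void *e)
{
  (void) fn; (void) a; (void) b; (void) c; (void) flags; (void) d;
  last_device = device;
  last_count = count;
  last_trailing_ptr = e;
}

/* Calls through the typedef to show that the two signatures agree.  */
static int
call_through_typedef (int device, size_t count, void *trailing)
{
  target_ext_like_fn f = record_call;
  f (device, NULL, count, NULL, NULL, NULL, 0u, NULL, trailing);
  return last_device;
}
```

The removed `DEF_FUNCTION_TYPE_10` variant ended in two `BT_INT` arguments where this one ends in a single `BT_PTR`, which lines up with the commit message entry for `GOMP_target_ext` introducing a device-specific `args` parameter.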
@@ -239,6 +239,10 @@ Inserts call to __sanitizer_cov_trace_pc into every basic block.
 Variable
 bool dump_base_name_prefixed = false
 
+; Flag whether HSA generation has been explicitely disabled
+Variable
+bool flag_disable_hsa = false
+
 ###
 Driver
 
@@ -593,6 +597,10 @@ Wfree-nonheap-object
 Common Var(warn_free_nonheap_object) Init(1) Warning
 Warn when attempting to free a non-heap object.
 
+Whsa
+Common Var(warn_hsa) Init(1) Warning
+Warn when a function cannot be expanded to HSAIL.
+
 Winline
 Common Var(warn_inline) Warning
 Warn when an inlined function cannot be inlined.

@@ -144,6 +144,12 @@
 #endif
 
 
+/* Define this to enable support for generating HSAIL. */
+#ifndef USED_FOR_TARGET
+#undef ENABLE_HSA
+#endif
+
+
 /* Define if gcc should always pass --build-id to linker. */
 #ifndef USED_FOR_TARGET
 #undef ENABLE_LD_BUILDID

@@ -7700,6 +7700,13 @@ fi
 
 for tgt in `echo $enable_offload_targets | sed 's/,/ /g'`; do
 tgt=`echo $tgt | sed 's/=.*//'`
+
+if echo "$tgt" | grep "^hsa" > /dev/null ; then
+enable_hsa=1
+else
+enable_offloading=1
+fi
+
 if test x"$offload_targets" = x; then
 offload_targets=$tgt
 else
@@ -7711,7 +7718,7 @@ cat >>confdefs.h <<_ACEOF
 #define OFFLOAD_TARGETS "$offload_targets"
 _ACEOF
 
-if test x"$offload_targets" != x; then
+if test x"$enable_offloading" != x; then
 
 $as_echo "#define ENABLE_OFFLOADING 1" >>confdefs.h
 
@@ -7721,6 +7728,12 @@ $as_echo "#define ENABLE_OFFLOADING 0" >>confdefs.h
 
 fi
 
+if test x"$enable_hsa" = x1 ; then
+
+$as_echo "#define ENABLE_HSA 1" >>confdefs.h
+
+fi
+
 
 # Check whether --with-multilib-list was given.
 if test "${with_multilib_list+set}" = set; then :
@@ -18406,7 +18419,7 @@ else
 lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
 lt_status=$lt_dlunknown
 cat > conftest.$ac_ext <<_LT_EOF
-#line 18409 "configure"
+#line 18422 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -18512,7 +18525,7 @@ else
 lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
 lt_status=$lt_dlunknown
 cat > conftest.$ac_ext <<_LT_EOF
-#line 18515 "configure"
+#line 18528 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H

@@ -940,6 +940,13 @@ AC_SUBST(accel_dir_suffix)
 
 for tgt in `echo $enable_offload_targets | sed 's/,/ /g'`; do
 tgt=`echo $tgt | sed 's/=.*//'`
+
+if echo "$tgt" | grep "^hsa" > /dev/null ; then
+enable_hsa=1
+else
+enable_offloading=1
+fi
+
 if test x"$offload_targets" = x; then
 offload_targets=$tgt
 else
@@ -948,7 +955,7 @@ for tgt in `echo $enable_offload_targets | sed 's/,/ /g'`; do
 done
 AC_DEFINE_UNQUOTED(OFFLOAD_TARGETS, "$offload_targets",
 [Define to offload targets, separated by commas.])
-if test x"$offload_targets" != x; then
+if test x"$enable_offloading" != x; then
 AC_DEFINE(ENABLE_OFFLOADING, 1,
 [Define this to enable support for offloading.])
 else
@@ -956,6 +963,11 @@ else
 [Define this to enable support for offloading.])
 fi
 
+if test x"$enable_hsa" = x1 ; then
+AC_DEFINE(ENABLE_HSA, 1,
+[Define this to enable support for generating HSAIL.])
+fi
+
 AC_ARG_WITH(multilib-list,
 [AS_HELP_STRING([--with-multilib-list], [select multilibs (AArch64, SH and x86-64 only)])],
 :,

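The loop added to the configure script walks the comma-separated offload target list, drops any `=path` suffix, and sets `enable_hsa` for names starting with `hsa` and `enable_offloading` for everything else. The same decision can be sketched in C for readers who prefer it to the shell form. `struct offload_flags` and `classify_offload_targets` are hypothetical names made up for this illustration, and the target strings in the usage note are example inputs only.

```c
#include <assert.h>
#include <string.h>

/* Result of scanning a comma-separated offload target list.  */
struct offload_flags
{
  int enable_hsa;         /* Set when a target name starts with "hsa".  */
  int enable_offloading;  /* Set when any other target is listed.  */
};

/* Hypothetical helper mirroring the shell loop: split LIST at commas and
   classify each non-empty entry by its prefix.  */
static struct offload_flags
classify_offload_targets (const char *list)
{
  struct offload_flags flags = { 0, 0 };
  const char *p = list;

  while (*p != '\0')
    {
      size_t len = strcspn (p, ",");  /* Length of this list entry.  */

      /* Any "=path" part follows the target name, so testing the first
         three characters of the entry is enough to recognise "hsa".  */
      if (len > 0)
        {
          if (len >= 3 && strncmp (p, "hsa", 3) == 0)
            flags.enable_hsa = 1;
          else
            flags.enable_offloading = 1;
        }
      p += len;
      if (*p == ',')
        p++;
    }
  return flags;
}
```

With this sketch, a list containing only `hsa` sets `enable_hsa` and leaves `enable_offloading` clear, which corresponds to the changed test that now defines `ENABLE_OFFLOADING` from `$enable_offloading` rather than from a non-empty `$offload_targets`.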
@@ -1992,6 +1992,28 @@ specifying paths @var{path1}, @dots{}, @var{pathN}.
 % @var{srcdir}/configure \
 --enable-offload-target=i686-unknown-linux-gnu=/path/to/i686/compiler,x86_64-pc-linux-gnu
 @end smallexample
+
+If @samp{hsa} is specified as one of the targets, the compiler will be
+built with support for HSA GPU accelerators. Because the same
+compiler will emit the accelerator code, no path should be specified.
+
+@item --with-hsa-runtime=@var{pathname}
+@itemx --with-hsa-runtime-include=@var{pathname}
+@itemx --with-hsa-runtime-lib=@var{pathname}
+
+If you configure GCC with HSA offloading but do not have the HSA
+run-time library installed in a standard location then you can
+explicitly specify the directory where they are installed. The
+@option{--with-hsa-runtime=@/@var{hsainstalldir}} option is a
+shorthand for
+@option{--with-hsa-runtime-lib=@/@var{hsainstalldir}/lib} and
+@option{--with-hsa-runtime-include=@/@var{hsainstalldir}/include}.
+
+@item --with-hsa-kmt-lib=@var{pathname}
+
+If you configure GCC with HSA offloading but do not have the HSA
+KMT library installed in a standard location then you can
+explicitly specify the directory where it resides.
 @end table
 
 @subheading Cross-Compiler-Specific Options

@@ -305,7 +305,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wunused-but-set-parameter -Wunused-but-set-variable @gol
 -Wuseless-cast -Wvariadic-macros -Wvector-operation-performance @gol
 -Wvla -Wvolatile-register-var -Wwrite-strings @gol
--Wzero-as-null-pointer-constant}
+-Wzero-as-null-pointer-constant -Whsa}
 
 @item C and Objective-C-only Warning Options
 @gccoptlist{-Wbad-function-cast -Wmissing-declarations @gol
@@ -5693,6 +5693,10 @@ Suppress warnings when a positional initializer is used to initialize
 a structure that has been marked with the @code{designated_init}
 attribute.
 
+@item -Whsa
+Issue a warning when HSAIL cannot be emitted for the compiled function or
+OpenMP construct.
+
 @end table
 
 @node Debugging Options
@@ -9508,6 +9512,12 @@ dynamic, guided, auto, runtime). The default is static.
 Maximum depth of recursion when querying properties of SSA names in things
 like fold routines. One level of recursion corresponds to following a
 use-def chain.
+
+@item hsa-gen-debug-stores
+Enable emission of special debug stores within HSA kernels which are
+then read and reported by libgomp plugin. Generation of these stores
+is disabled by default, use @option{--param hsa-gen-debug-stores=1} to
+enable it.
 @end table
 @end table
 

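The documentation hunks describe `-Whsa`, which warns when HSAIL cannot be emitted for a function or OpenMP construct, and the `hsa-gen-debug-stores` parameter. A small OpenMP target loop of the kind such a warning would apply to might look like the sketch below. The function names `saxpy` and `run_saxpy_example` are made up for this illustration, and whether this particular loop would actually be turned into an HSA kernel is an assumption, not something the diff states.

```c
#include <assert.h>

#define N 1024

/* Scales X by A and adds it into Y.  With -fopenmp the loop is an OpenMP
   target region; without it the pragma is ignored and the loop runs on the
   host, so the function computes the same result either way.  */
static void
saxpy (float a, const float *x, float *y)
{
#pragma omp target teams distribute parallel for map(to: x[0:N]) map(tofrom: y[0:N])
  for (int i = 0; i < N; i++)
    y[i] = a * x[i] + y[i];
}

/* Fills two arrays, runs saxpy and returns one element of the result.  */
static float
run_saxpy_example (void)
{
  static float x[N], y[N];

  for (int i = 0; i < N; i++)
    {
      x[i] = 1.0f;
      y[i] = 2.0f;
    }
  saxpy (3.0f, x, y);
  return y[N - 1];
}
```

On a compiler configured with `hsa` among its offload targets, building such a file with `-fopenmp` and adding `--param hsa-gen-debug-stores=1` would request the debug stores described in the documentation. The `Init(1)` in the `Whsa` option record suggests the warning is enabled by default.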
@@ -1,3 +1,9 @@
+2016-01-19 Martin Jambor <mjambor@suse.cz>
+
+* types.def (BT_FN_VOID_UINT_PTR_INT_PTR): New.
+(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT): Removed.
+(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR): New.
+
 2016-01-15 Paul Thomas <pault@gcc.gnu.org>
 
 PR fortran/64324

@@ -159,6 +159,8 @@ DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_LONGPTR_LONGPTR_LONGPTR,
 DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_ULLPTR_ULLPTR_ULLPTR,
 BT_BOOL, BT_UINT, BT_PTR_ULONGLONG, BT_PTR_ULONGLONG,
 BT_PTR_ULONGLONG)
+DEF_FUNCTION_TYPE_4 (BT_FN_VOID_UINT_PTR_INT_PTR, BT_VOID, BT_INT, BT_PTR,
+BT_INT, BT_PTR)
 
 DEF_FUNCTION_TYPE_5 (BT_FN_VOID_OMPFN_PTR_UINT_UINT_UINT,
 BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, BT_UINT,
@@ -220,10 +222,9 @@ DEF_FUNCTION_TYPE_9 (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT,
 BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR,
 BT_PTR_FN_VOID_PTR_PTR, BT_LONG, BT_LONG,
 BT_BOOL, BT_UINT, BT_PTR, BT_INT)
-
-DEF_FUNCTION_TYPE_10 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT,
+DEF_FUNCTION_TYPE_9 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR,
 BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR,
-BT_PTR, BT_PTR, BT_UINT, BT_PTR, BT_INT, BT_INT)
+BT_PTR, BT_PTR, BT_UINT, BT_PTR, BT_PTR)
 
 DEF_FUNCTION_TYPE_11 (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_UINT_LONG_INT_LONG_LONG_LONG,
 BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR,

@ -358,6 +358,7 @@ lower_stmt (gimple_stmt_iterator *gsi, struct lower_data *data)
|
||||||
case GIMPLE_OMP_TASK:
|
case GIMPLE_OMP_TASK:
|
||||||
case GIMPLE_OMP_TARGET:
|
case GIMPLE_OMP_TARGET:
|
||||||
case GIMPLE_OMP_TEAMS:
|
case GIMPLE_OMP_TEAMS:
|
||||||
|
case GIMPLE_OMP_GRID_BODY:
|
||||||
data->cannot_fallthru = false;
|
data->cannot_fallthru = false;
|
||||||
lower_omp_directive (gsi, data);
|
lower_omp_directive (gsi, data);
|
||||||
data->cannot_fallthru = false;
|
data->cannot_fallthru = false;
|
||||||
|
|
|
||||||
|
|
@ -1187,6 +1187,9 @@ dump_gimple_omp_for (pretty_printer *buffer, gomp_for *gs, int spc, int flags)
|
||||||
case GF_OMP_FOR_KIND_CILKSIMD:
|
case GF_OMP_FOR_KIND_CILKSIMD:
|
||||||
pp_string (buffer, "#pragma simd");
|
pp_string (buffer, "#pragma simd");
|
||||||
break;
|
break;
|
||||||
|
case GF_OMP_FOR_KIND_GRID_LOOP:
|
||||||
|
pp_string (buffer, "#pragma omp for grid_loop");
|
||||||
|
break;
|
||||||
default:
|
default:
|
||||||
gcc_unreachable ();
|
gcc_unreachable ();
|
||||||
}
|
}
|
||||||
|
|
@ -1494,6 +1497,9 @@ dump_gimple_omp_block (pretty_printer *buffer, gimple *gs, int spc, int flags)
|
||||||
case GIMPLE_OMP_SECTION:
|
case GIMPLE_OMP_SECTION:
|
||||||
pp_string (buffer, "#pragma omp section");
|
pp_string (buffer, "#pragma omp section");
|
||||||
break;
|
break;
|
||||||
|
case GIMPLE_OMP_GRID_BODY:
|
||||||
|
pp_string (buffer, "#pragma omp gridified body");
|
||||||
|
break;
|
||||||
default:
|
default:
|
||||||
gcc_unreachable ();
|
gcc_unreachable ();
|
||||||
}
|
}
|
||||||
|
|
@ -2301,6 +2307,7 @@ pp_gimple_stmt_1 (pretty_printer *buffer, gimple *gs, int spc, int flags)
|
||||||
case GIMPLE_OMP_MASTER:
|
case GIMPLE_OMP_MASTER:
|
||||||
case GIMPLE_OMP_TASKGROUP:
|
case GIMPLE_OMP_TASKGROUP:
|
||||||
case GIMPLE_OMP_SECTION:
|
case GIMPLE_OMP_SECTION:
|
||||||
|
case GIMPLE_OMP_GRID_BODY:
|
||||||
dump_gimple_omp_block (buffer, gs, spc, flags);
|
dump_gimple_omp_block (buffer, gs, spc, flags);
|
||||||
break;
|
break;
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -655,6 +655,7 @@ walk_gimple_stmt (gimple_stmt_iterator *gsi, walk_stmt_fn callback_stmt,
|
||||||
case GIMPLE_OMP_SINGLE:
|
case GIMPLE_OMP_SINGLE:
|
||||||
case GIMPLE_OMP_TARGET:
|
case GIMPLE_OMP_TARGET:
|
||||||
case GIMPLE_OMP_TEAMS:
|
case GIMPLE_OMP_TEAMS:
|
||||||
|
case GIMPLE_OMP_GRID_BODY:
|
||||||
ret = walk_gimple_seq_mod (gimple_omp_body_ptr (stmt), callback_stmt,
|
ret = walk_gimple_seq_mod (gimple_omp_body_ptr (stmt), callback_stmt,
|
||||||
callback_op, wi);
|
callback_op, wi);
|
||||||
if (ret)
|
if (ret)
|
||||||
|
|
|
||||||
14
gcc/gimple.c
|
|
@ -954,6 +954,19 @@ gimple_build_omp_master (gimple_seq body)
|
||||||
return p;
|
return p;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/* Build a GIMPLE_OMP_GRID_BODY statement.
|
||||||
|
|
||||||
|
BODY is the sequence of statements to be executed by the kernel. */
|
||||||
|
|
||||||
|
gimple *
|
||||||
|
gimple_build_omp_grid_body (gimple_seq body)
|
||||||
|
{
|
||||||
|
gimple *p = gimple_alloc (GIMPLE_OMP_GRID_BODY, 0);
|
||||||
|
if (body)
|
||||||
|
gimple_omp_set_body (p, body);
|
||||||
|
|
||||||
|
return p;
|
||||||
|
}
|
||||||
|
|
||||||
/* Build a GIMPLE_OMP_TASKGROUP statement.
|
/* Build a GIMPLE_OMP_TASKGROUP statement.
|
||||||
|
|
||||||
|
|
@ -1807,6 +1820,7 @@ gimple_copy (gimple *stmt)
|
||||||
case GIMPLE_OMP_SECTION:
|
case GIMPLE_OMP_SECTION:
|
||||||
case GIMPLE_OMP_MASTER:
|
case GIMPLE_OMP_MASTER:
|
||||||
case GIMPLE_OMP_TASKGROUP:
|
case GIMPLE_OMP_TASKGROUP:
|
||||||
|
case GIMPLE_OMP_GRID_BODY:
|
||||||
copy_omp_body:
|
copy_omp_body:
|
||||||
new_seq = gimple_seq_copy (gimple_omp_body (stmt));
|
new_seq = gimple_seq_copy (gimple_omp_body (stmt));
|
||||||
gimple_omp_set_body (copy, new_seq);
|
gimple_omp_set_body (copy, new_seq);
|
||||||
|
|
|
||||||
|
|
@ -376,6 +376,10 @@ DEFGSCODE(GIMPLE_OMP_TEAMS, "gimple_omp_teams", GSS_OMP_SINGLE_LAYOUT)
|
||||||
CLAUSES is an OMP_CLAUSE chain holding the associated clauses. */
|
CLAUSES is an OMP_CLAUSE chain holding the associated clauses. */
|
||||||
DEFGSCODE(GIMPLE_OMP_ORDERED, "gimple_omp_ordered", GSS_OMP_SINGLE_LAYOUT)
|
DEFGSCODE(GIMPLE_OMP_ORDERED, "gimple_omp_ordered", GSS_OMP_SINGLE_LAYOUT)
|
||||||
|
|
||||||
|
/* GIMPLE_OMP_GRID_BODY <BODY> represents a parallel loop lowered for execution
|
||||||
|
on a GPU. It is an artificial statement created by omp lowering. */
|
||||||
|
DEFGSCODE(GIMPLE_OMP_GRID_BODY, "gimple_omp_gpukernel", GSS_OMP)
|
||||||
|
|
||||||
/* GIMPLE_PREDICT <PREDICT, OUTCOME> specifies a hint for branch prediction.
|
/* GIMPLE_PREDICT <PREDICT, OUTCOME> specifies a hint for branch prediction.
|
||||||
|
|
||||||
PREDICT is one of the predictors from predict.def.
|
PREDICT is one of the predictors from predict.def.
|
||||||
|
|
|
||||||
65
gcc/gimple.h
|
|
@ -146,6 +146,7 @@ enum gf_mask {
|
||||||
GF_CALL_CTRL_ALTERING = 1 << 7,
|
GF_CALL_CTRL_ALTERING = 1 << 7,
|
||||||
GF_CALL_WITH_BOUNDS = 1 << 8,
|
GF_CALL_WITH_BOUNDS = 1 << 8,
|
||||||
GF_OMP_PARALLEL_COMBINED = 1 << 0,
|
GF_OMP_PARALLEL_COMBINED = 1 << 0,
|
||||||
|
GF_OMP_PARALLEL_GRID_PHONY = 1 << 1,
|
||||||
GF_OMP_TASK_TASKLOOP = 1 << 0,
|
GF_OMP_TASK_TASKLOOP = 1 << 0,
|
||||||
GF_OMP_FOR_KIND_MASK = (1 << 4) - 1,
|
GF_OMP_FOR_KIND_MASK = (1 << 4) - 1,
|
||||||
GF_OMP_FOR_KIND_FOR = 0,
|
GF_OMP_FOR_KIND_FOR = 0,
|
||||||
|
|
@ -153,12 +154,14 @@ enum gf_mask {
|
||||||
GF_OMP_FOR_KIND_TASKLOOP = 2,
|
GF_OMP_FOR_KIND_TASKLOOP = 2,
|
||||||
GF_OMP_FOR_KIND_CILKFOR = 3,
|
GF_OMP_FOR_KIND_CILKFOR = 3,
|
||||||
GF_OMP_FOR_KIND_OACC_LOOP = 4,
|
GF_OMP_FOR_KIND_OACC_LOOP = 4,
|
||||||
|
GF_OMP_FOR_KIND_GRID_LOOP = 5,
|
||||||
/* Flag for SIMD variants of OMP_FOR kinds. */
|
/* Flag for SIMD variants of OMP_FOR kinds. */
|
||||||
GF_OMP_FOR_SIMD = 1 << 3,
|
GF_OMP_FOR_SIMD = 1 << 3,
|
||||||
GF_OMP_FOR_KIND_SIMD = GF_OMP_FOR_SIMD | 0,
|
GF_OMP_FOR_KIND_SIMD = GF_OMP_FOR_SIMD | 0,
|
||||||
GF_OMP_FOR_KIND_CILKSIMD = GF_OMP_FOR_SIMD | 1,
|
GF_OMP_FOR_KIND_CILKSIMD = GF_OMP_FOR_SIMD | 1,
|
||||||
GF_OMP_FOR_COMBINED = 1 << 4,
|
GF_OMP_FOR_COMBINED = 1 << 4,
|
||||||
GF_OMP_FOR_COMBINED_INTO = 1 << 5,
|
GF_OMP_FOR_COMBINED_INTO = 1 << 5,
|
||||||
|
GF_OMP_FOR_GRID_PHONY = 1 << 6,
|
||||||
GF_OMP_TARGET_KIND_MASK = (1 << 4) - 1,
|
GF_OMP_TARGET_KIND_MASK = (1 << 4) - 1,
|
||||||
GF_OMP_TARGET_KIND_REGION = 0,
|
GF_OMP_TARGET_KIND_REGION = 0,
|
||||||
GF_OMP_TARGET_KIND_DATA = 1,
|
GF_OMP_TARGET_KIND_DATA = 1,
|
||||||
|
|
@ -172,6 +175,7 @@ enum gf_mask {
|
||||||
GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA = 9,
|
GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA = 9,
|
||||||
GF_OMP_TARGET_KIND_OACC_DECLARE = 10,
|
GF_OMP_TARGET_KIND_OACC_DECLARE = 10,
|
||||||
GF_OMP_TARGET_KIND_OACC_HOST_DATA = 11,
|
GF_OMP_TARGET_KIND_OACC_HOST_DATA = 11,
|
||||||
|
GF_OMP_TEAMS_GRID_PHONY = 1 << 0,
|
||||||
|
|
||||||
/* True on a GIMPLE_OMP_RETURN statement if the return does not require
|
/* True on a GIMPLE_OMP_RETURN statement if the return does not require
|
||||||
a thread synchronization via some sort of barrier. The exact barrier
|
a thread synchronization via some sort of barrier. The exact barrier
|
||||||
|
|
@ -733,7 +737,7 @@ struct GTY((tag("GSS_OMP_SINGLE_LAYOUT")))
|
||||||
{
|
{
|
||||||
/* [ WORD 1-7 ] : base class */
|
/* [ WORD 1-7 ] : base class */
|
||||||
|
|
||||||
/* [ WORD 7 ] */
|
/* [ WORD 8 ] */
|
||||||
tree clauses;
|
tree clauses;
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
@ -1454,6 +1458,7 @@ gomp_task *gimple_build_omp_task (gimple_seq, tree, tree, tree, tree,
|
||||||
tree, tree);
|
tree, tree);
|
||||||
gimple *gimple_build_omp_section (gimple_seq);
|
gimple *gimple_build_omp_section (gimple_seq);
|
||||||
gimple *gimple_build_omp_master (gimple_seq);
|
gimple *gimple_build_omp_master (gimple_seq);
|
||||||
|
gimple *gimple_build_omp_grid_body (gimple_seq);
|
||||||
gimple *gimple_build_omp_taskgroup (gimple_seq);
|
gimple *gimple_build_omp_taskgroup (gimple_seq);
|
||||||
gomp_continue *gimple_build_omp_continue (tree, tree);
|
gomp_continue *gimple_build_omp_continue (tree, tree);
|
||||||
gomp_ordered *gimple_build_omp_ordered (gimple_seq, tree);
|
gomp_ordered *gimple_build_omp_ordered (gimple_seq, tree);
|
||||||
|
|
@ -1714,6 +1719,7 @@ gimple_has_substatements (gimple *g)
|
||||||
case GIMPLE_OMP_CRITICAL:
|
case GIMPLE_OMP_CRITICAL:
|
||||||
case GIMPLE_WITH_CLEANUP_EXPR:
|
case GIMPLE_WITH_CLEANUP_EXPR:
|
||||||
case GIMPLE_TRANSACTION:
|
case GIMPLE_TRANSACTION:
|
||||||
|
case GIMPLE_OMP_GRID_BODY:
|
||||||
return true;
|
return true;
|
||||||
|
|
||||||
default:
|
default:
|
||||||
|
|
@ -5079,6 +5085,24 @@ gimple_omp_for_set_pre_body (gimple *gs, gimple_seq pre_body)
|
||||||
omp_for_stmt->pre_body = pre_body;
|
omp_for_stmt->pre_body = pre_body;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/* Return the kernel_phony of OMP_FOR statement. */
|
||||||
|
|
||||||
|
static inline bool
|
||||||
|
gimple_omp_for_grid_phony (const gomp_for *omp_for)
|
||||||
|
{
|
||||||
|
return (gimple_omp_subcode (omp_for) & GF_OMP_FOR_GRID_PHONY) != 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Set kernel_phony flag of OMP_FOR to VALUE. */
|
||||||
|
|
||||||
|
static inline void
|
||||||
|
gimple_omp_for_set_grid_phony (gomp_for *omp_for, bool value)
|
||||||
|
{
|
||||||
|
if (value)
|
||||||
|
omp_for->subcode |= GF_OMP_FOR_GRID_PHONY;
|
||||||
|
else
|
||||||
|
omp_for->subcode &= ~GF_OMP_FOR_GRID_PHONY;
|
||||||
|
}
|
||||||
|
|
||||||
/* Return the clauses associated with OMP_PARALLEL GS. */
|
/* Return the clauses associated with OMP_PARALLEL GS. */
|
||||||
|
|
||||||
|
|
@ -5165,6 +5189,24 @@ gimple_omp_parallel_set_data_arg (gomp_parallel *omp_parallel_stmt,
|
||||||
omp_parallel_stmt->data_arg = data_arg;
|
omp_parallel_stmt->data_arg = data_arg;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/* Return the kernel_phony flag of OMP_PARALLEL_STMT. */
|
||||||
|
|
||||||
|
static inline bool
|
||||||
|
gimple_omp_parallel_grid_phony (const gomp_parallel *stmt)
|
||||||
|
{
|
||||||
|
return (gimple_omp_subcode (stmt) & GF_OMP_PARALLEL_GRID_PHONY) != 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Set kernel_phony flag of OMP_PARALLEL_STMT to VALUE. */
|
||||||
|
|
||||||
|
static inline void
|
||||||
|
gimple_omp_parallel_set_grid_phony (gomp_parallel *stmt, bool value)
|
||||||
|
{
|
||||||
|
if (value)
|
||||||
|
stmt->subcode |= GF_OMP_PARALLEL_GRID_PHONY;
|
||||||
|
else
|
||||||
|
stmt->subcode &= ~GF_OMP_PARALLEL_GRID_PHONY;
|
||||||
|
}
|
||||||
|
|
||||||
/* Return the clauses associated with OMP_TASK GS. */
|
/* Return the clauses associated with OMP_TASK GS. */
|
||||||
|
|
||||||
|
|
@ -5638,6 +5680,24 @@ gimple_omp_teams_set_clauses (gomp_teams *omp_teams_stmt, tree clauses)
|
||||||
omp_teams_stmt->clauses = clauses;
|
omp_teams_stmt->clauses = clauses;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/* Return the kernel_phony flag of an OMP_TEAMS_STMT. */
|
||||||
|
|
||||||
|
static inline bool
|
||||||
|
gimple_omp_teams_grid_phony (const gomp_teams *omp_teams_stmt)
|
||||||
|
{
|
||||||
|
return (gimple_omp_subcode (omp_teams_stmt) & GF_OMP_TEAMS_GRID_PHONY) != 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Set kernel_phony flag of an OMP_TEAMS_STMT to VALUE. */
|
||||||
|
|
||||||
|
static inline void
|
||||||
|
gimple_omp_teams_set_grid_phony (gomp_teams *omp_teams_stmt, bool value)
|
||||||
|
{
|
||||||
|
if (value)
|
||||||
|
omp_teams_stmt->subcode |= GF_OMP_TEAMS_GRID_PHONY;
|
||||||
|
else
|
||||||
|
omp_teams_stmt->subcode &= ~GF_OMP_TEAMS_GRID_PHONY;
|
||||||
|
}
|
||||||
|
|
||||||
/* Return the clauses associated with OMP_SECTIONS GS. */
|
/* Return the clauses associated with OMP_SECTIONS GS. */
|
||||||
|
|
||||||
|
|
@ -6002,7 +6062,8 @@ gimple_return_set_retbnd (gimple *gs, tree retval)
|
||||||
case GIMPLE_OMP_RETURN: \
|
case GIMPLE_OMP_RETURN: \
|
||||||
case GIMPLE_OMP_ATOMIC_LOAD: \
|
case GIMPLE_OMP_ATOMIC_LOAD: \
|
||||||
case GIMPLE_OMP_ATOMIC_STORE: \
|
case GIMPLE_OMP_ATOMIC_STORE: \
|
||||||
case GIMPLE_OMP_CONTINUE
|
case GIMPLE_OMP_CONTINUE: \
|
||||||
|
case GIMPLE_OMP_GRID_BODY
|
||||||
|
|
||||||
static inline bool
|
static inline bool
|
||||||
is_gimple_omp (const gimple *stmt)
|
is_gimple_omp (const gimple *stmt)
|
||||||
|
|
|
||||||
File diff suppressed because it is too large
|
|
@ -0,0 +1,719 @@
|
||||||
|
/* HSAIL IL Register allocation and out-of-SSA.
|
||||||
|
Copyright (C) 2013-2016 Free Software Foundation, Inc.
|
||||||
|
Contributed by Michael Matz <matz@suse.de>
|
||||||
|
|
||||||
|
This file is part of GCC.
|
||||||
|
|
||||||
|
GCC is free software; you can redistribute it and/or modify
|
||||||
|
it under the terms of the GNU General Public License as published by
|
||||||
|
the Free Software Foundation; either version 3, or (at your option)
|
||||||
|
any later version.
|
||||||
|
|
||||||
|
GCC is distributed in the hope that it will be useful,
|
||||||
|
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||||
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||||
|
GNU General Public License for more details.
|
||||||
|
|
||||||
|
You should have received a copy of the GNU General Public License
|
||||||
|
along with GCC; see the file COPYING3. If not see
|
||||||
|
<http://www.gnu.org/licenses/>. */
|
||||||
|
|
||||||
|
#include "config.h"
|
||||||
|
#include "system.h"
|
||||||
|
#include "coretypes.h"
|
||||||
|
#include "tm.h"
|
||||||
|
#include "is-a.h"
|
||||||
|
#include "vec.h"
|
||||||
|
#include "tree.h"
|
||||||
|
#include "dominance.h"
|
||||||
|
#include "cfg.h"
|
||||||
|
#include "cfganal.h"
|
||||||
|
#include "function.h"
|
||||||
|
#include "bitmap.h"
|
||||||
|
#include "dumpfile.h"
|
||||||
|
#include "cgraph.h"
|
||||||
|
#include "print-tree.h"
|
||||||
|
#include "cfghooks.h"
|
||||||
|
#include "symbol-summary.h"
|
||||||
|
#include "hsa.h"
|
||||||
|
|
||||||
|
|
||||||
|
/* Process a PHI node PHI of basic block BB as a part of naive out-of-SSA. */
|
||||||
|
|
||||||
|
static void
|
||||||
|
naive_process_phi (hsa_insn_phi *phi)
|
||||||
|
{
|
||||||
|
unsigned count = phi->operand_count ();
|
||||||
|
for (unsigned i = 0; i < count; i++)
|
||||||
|
{
|
||||||
|
gcc_checking_assert (phi->get_op (i));
|
||||||
|
hsa_op_base *op = phi->get_op (i);
|
||||||
|
hsa_bb *hbb;
|
||||||
|
edge e;
|
||||||
|
|
||||||
|
if (!op)
|
||||||
|
break;
|
||||||
|
|
||||||
|
e = EDGE_PRED (phi->m_bb, i);
|
||||||
|
if (single_succ_p (e->src))
|
||||||
|
hbb = hsa_bb_for_bb (e->src);
|
||||||
|
else
|
||||||
|
{
|
||||||
|
basic_block old_dest = e->dest;
|
||||||
|
hbb = hsa_init_new_bb (split_edge (e));
|
||||||
|
|
||||||
|
/* If switch insn used this edge, fix jump table. */
|
||||||
|
hsa_bb *source = hsa_bb_for_bb (e->src);
|
||||||
|
hsa_insn_sbr *sbr;
|
||||||
|
if (source->m_last_insn
|
||||||
|
&& (sbr = dyn_cast <hsa_insn_sbr *> (source->m_last_insn)))
|
||||||
|
sbr->replace_all_labels (old_dest, hbb->m_bb);
|
||||||
|
}
|
||||||
|
|
||||||
|
hsa_build_append_simple_mov (phi->m_dest, op, hbb);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Naive out-of SSA. */
|
||||||
|
|
||||||
|
static void
|
||||||
|
naive_outof_ssa (void)
|
||||||
|
{
|
||||||
|
basic_block bb;
|
||||||
|
|
||||||
|
hsa_cfun->m_in_ssa = false;
|
||||||
|
|
||||||
|
FOR_ALL_BB_FN (bb, cfun)
|
||||||
|
{
|
||||||
|
hsa_bb *hbb = hsa_bb_for_bb (bb);
|
||||||
|
hsa_insn_phi *phi;
|
||||||
|
|
||||||
|
for (phi = hbb->m_first_phi;
|
||||||
|
phi;
|
||||||
|
phi = phi->m_next ? as_a <hsa_insn_phi *> (phi->m_next) : NULL)
|
||||||
|
naive_process_phi (phi);
|
||||||
|
|
||||||
|
/* Zap PHI nodes, they will be deallocated when everything else will. */
|
||||||
|
hbb->m_first_phi = NULL;
|
||||||
|
hbb->m_last_phi = NULL;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return register class number for the given HSA TYPE. 0 means the 'c' one
|
||||||
|
bit register class, 1 means 's' 32 bit class, 2 stands for 'd' 64 bit class
|
||||||
|
and 3 for 'q' 128 bit class. */
|
||||||
|
|
||||||
|
static int
|
||||||
|
m_reg_class_for_type (BrigType16_t type)
|
||||||
|
{
|
||||||
|
switch (type)
|
||||||
|
{
|
||||||
|
case BRIG_TYPE_B1:
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_TYPE_U8:
|
||||||
|
case BRIG_TYPE_U16:
|
||||||
|
case BRIG_TYPE_U32:
|
||||||
|
case BRIG_TYPE_S8:
|
||||||
|
case BRIG_TYPE_S16:
|
||||||
|
case BRIG_TYPE_S32:
|
||||||
|
case BRIG_TYPE_F16:
|
||||||
|
case BRIG_TYPE_F32:
|
||||||
|
case BRIG_TYPE_B8:
|
||||||
|
case BRIG_TYPE_B16:
|
||||||
|
case BRIG_TYPE_B32:
|
||||||
|
case BRIG_TYPE_U8X4:
|
||||||
|
case BRIG_TYPE_S8X4:
|
||||||
|
case BRIG_TYPE_U16X2:
|
||||||
|
case BRIG_TYPE_S16X2:
|
||||||
|
case BRIG_TYPE_F16X2:
|
||||||
|
return 1;
|
||||||
|
|
||||||
|
case BRIG_TYPE_U64:
|
||||||
|
case BRIG_TYPE_S64:
|
||||||
|
case BRIG_TYPE_F64:
|
||||||
|
case BRIG_TYPE_B64:
|
||||||
|
case BRIG_TYPE_U8X8:
|
||||||
|
case BRIG_TYPE_S8X8:
|
||||||
|
case BRIG_TYPE_U16X4:
|
||||||
|
case BRIG_TYPE_S16X4:
|
||||||
|
case BRIG_TYPE_F16X4:
|
||||||
|
case BRIG_TYPE_U32X2:
|
||||||
|
case BRIG_TYPE_S32X2:
|
||||||
|
case BRIG_TYPE_F32X2:
|
||||||
|
return 2;
|
||||||
|
|
||||||
|
case BRIG_TYPE_B128:
|
||||||
|
case BRIG_TYPE_U8X16:
|
||||||
|
case BRIG_TYPE_S8X16:
|
||||||
|
case BRIG_TYPE_U16X8:
|
||||||
|
case BRIG_TYPE_S16X8:
|
||||||
|
case BRIG_TYPE_F16X8:
|
||||||
|
case BRIG_TYPE_U32X4:
|
||||||
|
case BRIG_TYPE_U64X2:
|
||||||
|
case BRIG_TYPE_S32X4:
|
||||||
|
case BRIG_TYPE_S64X2:
|
||||||
|
case BRIG_TYPE_F32X4:
|
||||||
|
case BRIG_TYPE_F64X2:
|
||||||
|
return 3;
|
||||||
|
|
||||||
|
default:
|
||||||
|
gcc_unreachable ();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* If the Ith operand of INSN is or contains a register (in an address),
|
||||||
|
return the address of that register operand. If not return NULL. */
|
||||||
|
|
||||||
|
static hsa_op_reg **
|
||||||
|
insn_reg_addr (hsa_insn_basic *insn, int i)
|
||||||
|
{
|
||||||
|
hsa_op_base *op = insn->get_op (i);
|
||||||
|
if (!op)
|
||||||
|
return NULL;
|
||||||
|
hsa_op_reg *reg = dyn_cast <hsa_op_reg *> (op);
|
||||||
|
if (reg)
|
||||||
|
return (hsa_op_reg **) insn->get_op_addr (i);
|
||||||
|
hsa_op_address *addr = dyn_cast <hsa_op_address *> (op);
|
||||||
|
if (addr && addr->m_reg)
|
||||||
|
return &addr->m_reg;
|
||||||
|
return NULL;
|
||||||
|
}
|
||||||
|
|
||||||
|
struct m_reg_class_desc
|
||||||
|
{
|
||||||
|
unsigned next_avail, max_num;
|
||||||
|
unsigned used_num, max_used;
|
||||||
|
uint64_t used[2];
|
||||||
|
char cl_char;
|
||||||
|
};
|
||||||
|
|
||||||
|
/* Rewrite the instructions in BB to observe spilled live ranges.
|
||||||
|
CLASSES is the global register class state. */
|
||||||
|
|
||||||
|
static void
|
||||||
|
rewrite_code_bb (basic_block bb, struct m_reg_class_desc *classes)
|
||||||
|
{
|
||||||
|
hsa_bb *hbb = hsa_bb_for_bb (bb);
|
||||||
|
hsa_insn_basic *insn, *next_insn;
|
||||||
|
|
||||||
|
for (insn = hbb->m_first_insn; insn; insn = next_insn)
|
||||||
|
{
|
||||||
|
next_insn = insn->m_next;
|
||||||
|
unsigned count = insn->operand_count ();
|
||||||
|
for (unsigned i = 0; i < count; i++)
|
||||||
|
{
|
||||||
|
gcc_checking_assert (insn->get_op (i));
|
||||||
|
hsa_op_reg **regaddr = insn_reg_addr (insn, i);
|
||||||
|
|
||||||
|
if (regaddr)
|
||||||
|
{
|
||||||
|
hsa_op_reg *reg = *regaddr;
|
||||||
|
if (reg->m_reg_class)
|
||||||
|
continue;
|
||||||
|
gcc_assert (reg->m_spill_sym);
|
||||||
|
|
||||||
|
int cl = m_reg_class_for_type (reg->m_type);
|
||||||
|
hsa_op_reg *tmp, *tmp2;
|
||||||
|
if (insn->op_output_p (i))
|
||||||
|
tmp = hsa_spill_out (insn, reg, &tmp2);
|
||||||
|
else
|
||||||
|
tmp = hsa_spill_in (insn, reg, &tmp2);
|
||||||
|
|
||||||
|
*regaddr = tmp;
|
||||||
|
|
||||||
|
tmp->m_reg_class = classes[cl].cl_char;
|
||||||
|
tmp->m_hard_num = (char) (classes[cl].max_num + i);
|
||||||
|
if (tmp2)
|
||||||
|
{
|
||||||
|
gcc_assert (cl == 0);
|
||||||
|
tmp2->m_reg_class = classes[1].cl_char;
|
||||||
|
tmp2->m_hard_num = (char) (classes[1].max_num + i);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Dump current function to dump file F, with info specific
|
||||||
|
to register allocation. */
|
||||||
|
|
||||||
|
void
|
||||||
|
dump_hsa_cfun_regalloc (FILE *f)
|
||||||
|
{
|
||||||
|
basic_block bb;
|
||||||
|
|
||||||
|
fprintf (f, "\nHSAIL IL for %s\n", hsa_cfun->m_name);
|
||||||
|
|
||||||
|
FOR_ALL_BB_FN (bb, cfun)
|
||||||
|
{
|
||||||
|
hsa_bb *hbb = (struct hsa_bb *) bb->aux;
|
||||||
|
bitmap_print (dump_file, hbb->m_livein, "m_livein ", "\n");
|
||||||
|
dump_hsa_bb (f, hbb);
|
||||||
|
bitmap_print (dump_file, hbb->m_liveout, "m_liveout ", "\n");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Given the global register allocation state CLASSES and a
|
||||||
|
register REG, try to give it a hardware register. If successful,
|
||||||
|
store that hardreg in REG and return it, otherwise return -1.
|
||||||
|
Also changes CLASSES to accommodate for the allocated register. */
|
||||||
|
|
||||||
|
static int
|
||||||
|
try_alloc_reg (struct m_reg_class_desc *classes, hsa_op_reg *reg)
|
||||||
|
{
|
||||||
|
int cl = m_reg_class_for_type (reg->m_type);
|
||||||
|
int ret = -1;
|
||||||
|
if (classes[1].used_num + classes[2].used_num * 2 + classes[3].used_num * 4
|
||||||
|
>= 128 - 5)
|
||||||
|
return -1;
|
||||||
|
if (classes[cl].used_num < classes[cl].max_num)
|
||||||
|
{
|
||||||
|
unsigned int i;
|
||||||
|
classes[cl].used_num++;
|
||||||
|
if (classes[cl].used_num > classes[cl].max_used)
|
||||||
|
classes[cl].max_used = classes[cl].used_num;
|
||||||
|
for (i = 0; i < classes[cl].used_num; i++)
|
||||||
|
if (! (classes[cl].used[i / 64] & (((uint64_t)1) << (i & 63))))
|
||||||
|
break;
|
||||||
|
ret = i;
|
||||||
|
classes[cl].used[i / 64] |= (((uint64_t)1) << (i & 63));
|
||||||
|
reg->m_reg_class = classes[cl].cl_char;
|
||||||
|
reg->m_hard_num = i;
|
||||||
|
}
|
||||||
|
return ret;
|
||||||
|
}
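
The early return in try_alloc_reg is a budget check that weighs each register class by its width in 32-bit units. A minimal sketch of that arithmetic follows, assuming the 's', 'd' and 'q' classes share one pool of 128 such units; the `Usage` struct and `budget_exhausted` are hypothetical names, not part of the patch, and the patch does not say why five units are held back.

```cpp
#include <cassert>

// Hypothetical counters: live registers in the 's' (32-bit),
// 'd' (64-bit) and 'q' (128-bit) classes.  The one-bit 'c' class
// is not part of this check in the patch either.
struct Usage
{
  unsigned s, d, q;
};

// Mirrors the condition "s + d * 2 + q * 4 >= 128 - 5" from
// try_alloc_reg: true means no further register may be allocated.
bool
budget_exhausted (const Usage &u)
{
  return u.s + u.d * 2 + u.q * 4 >= 128 - 5;
}
```

Because the weights are per class, a single 'q' allocation can exhaust the budget even when the 'q' class itself still has free slots.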
|
||||||
|
|
||||||
|
/* Free up hardregs used by REG, into allocation state CLASSES. */
|
||||||
|
|
||||||
|
static void
|
||||||
|
free_reg (struct m_reg_class_desc *classes, hsa_op_reg *reg)
|
||||||
|
{
|
||||||
|
int cl = m_reg_class_for_type (reg->m_type);
|
||||||
|
int ret = reg->m_hard_num;
|
||||||
|
gcc_assert (reg->m_reg_class == classes[cl].cl_char);
|
||||||
|
classes[cl].used_num--;
|
||||||
|
classes[cl].used[ret / 64] &= ~(((uint64_t)1) << (ret & 63));
|
||||||
|
}
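
try_alloc_reg and free_reg together treat `used` as a 128-bit occupancy map. The sketch below isolates that bookkeeping under stated assumptions: `RegClass`, `alloc_slot` and `free_slot` are hypothetical stand-ins for the patch's types and functions, and `max_num` is assumed never to exceed 128 so that `i / 64` stays within the two-word array.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reduction of m_reg_class_desc to the fields the
// occupancy map needs.
struct RegClass
{
  unsigned used_num = 0;
  unsigned max_num = 0;      // Assumed to be at most 128.
  uint64_t used[2] = {0, 0};
};

// Return the lowest free slot and mark it used, or -1 if the class
// is full.  After the increment only used_num - 1 bits are set, so
// one of the first used_num slots is free and the scan always
// stops early.
int
alloc_slot (RegClass &c)
{
  if (c.used_num >= c.max_num)
    return -1;
  c.used_num++;
  unsigned i;
  for (i = 0; i < c.used_num; i++)
    if (!(c.used[i / 64] & (static_cast<uint64_t> (1) << (i & 63))))
      break;
  c.used[i / 64] |= static_cast<uint64_t> (1) << (i & 63);
  return static_cast<int> (i);
}

// Release SLOT so that a later alloc_slot can hand it out again.
void
free_slot (RegClass &c, int slot)
{
  c.used_num--;
  c.used[slot / 64] &= ~(static_cast<uint64_t> (1) << (slot & 63));
}
```

Freed slots are reused lowest-first, which keeps the highest hard register number in use as small as the allocation order allows.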
|
||||||
|
|
||||||
|
/* Note that the live range for REG ends at least at END. */
|
||||||
|
|
||||||
|
static void
|
||||||
|
note_lr_end (hsa_op_reg *reg, int end)
|
||||||
|
{
|
||||||
|
if (reg->m_lr_end < end)
|
||||||
|
reg->m_lr_end = end;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Note that the live range for REG starts at least at BEGIN. */
|
||||||
|
|
||||||
|
static void
|
||||||
|
note_lr_begin (hsa_op_reg *reg, int begin)
|
||||||
|
{
|
||||||
|
if (reg->m_lr_begin > begin)
|
||||||
|
reg->m_lr_begin = begin;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Given two registers A and B, return -1, 0 or 1 if A's live range
|
||||||
|
starts before, at or after B's live range. */
|
||||||
|
|
||||||
|
static int
|
||||||
|
cmp_begin (const void *a, const void *b)
|
||||||
|
{
|
||||||
|
const hsa_op_reg * const *rega = (const hsa_op_reg * const *)a;
|
||||||
|
const hsa_op_reg * const *regb = (const hsa_op_reg * const *)b;
|
||||||
|
int ret;
|
||||||
|
if (rega == regb)
|
||||||
|
return 0;
|
||||||
|
ret = (*rega)->m_lr_begin - (*regb)->m_lr_begin;
|
||||||
|
if (ret)
|
||||||
|
return ret;
|
||||||
|
return ((*rega)->m_order - (*regb)->m_order);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Given two registers REGA and REGB, return true if REGA's
|
||||||
|
live range ends after REGB's. This results in a sorting order
|
||||||
|
with earlier end points at the end. */
|
||||||
|
|
||||||
|
static bool
|
||||||
|
cmp_end (hsa_op_reg * const &rega, hsa_op_reg * const &regb)
|
||||||
|
{
|
||||||
|
int ret;
|
||||||
|
if (rega == regb)
|
||||||
|
return false;
|
||||||
|
ret = (regb)->m_lr_end - (rega)->m_lr_end;
|
||||||
|
if (ret)
|
||||||
|
return ret < 0;
|
||||||
|
return (((regb)->m_order - (rega)->m_order)) < 0;
|
||||||
|
}
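
The ordering defined by cmp_end can be illustrated with standard containers. This is only a sketch: `Interval`, `ends_after` and `insert_active` are hypothetical names, and `std::lower_bound` stands in for the `lower_bound` member of GCC's own vec that the patch uses.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the live-range fields of hsa_op_reg.
struct Interval
{
  int begin, end, order;
};

// Same ordering as cmp_end: A sorts before B if A ends later;
// ties are broken by the higher order number.
bool
ends_after (const Interval &a, const Interval &b)
{
  if (a.end != b.end)
    return a.end > b.end;
  return a.order > b.order;
}

// Insert IV so that ACTIVE stays sorted with the earliest end last.
void
insert_active (std::vector<Interval> &active, const Interval &iv)
{
  auto pos = std::lower_bound (active.begin (), active.end (), iv,
                               ends_after);
  active.insert (pos, iv);
}
```

With this order the interval that ends first sits at the back of the vector and can be popped cheaply, while the one that ends last sits at index 0, which is where spill_at_interval looks for its candidate.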
|
||||||
|
|
||||||
|
/* Expire all old intervals in ACTIVE (a per-regclass vector),
|
||||||
|
that is, those that end before the interval REG starts. Give
|
||||||
|
back the resources thus freed to the state CLASSES. */
|
||||||
|
|
||||||
|
static void
|
||||||
|
expire_old_intervals (hsa_op_reg *reg, vec<hsa_op_reg*> *active,
|
||||||
|
struct m_reg_class_desc *classes)
|
||||||
|
{
|
||||||
|
for (int i = 0; i < 4; i++)
|
||||||
|
while (!active[i].is_empty ())
|
||||||
|
{
|
||||||
|
hsa_op_reg *a = active[i].pop ();
|
||||||
|
if (a->m_lr_end > reg->m_lr_begin)
|
||||||
|
{
|
||||||
|
active[i].quick_push (a);
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
free_reg (classes, a);
|
||||||
|
}
|
||||||
|
}
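
The expiry step for a single register class can be sketched as follows, assuming the active list is kept with the earliest-ending interval last. `Interval` and `expire_old` are hypothetical names; the patch pops an element and pushes it back when it is still live, which has the same effect as peeking at the back first.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the live-range fields of hsa_op_reg.
struct Interval
{
  int begin, end;
};

// Drop every interval that ends at or before CURRENT_BEGIN and
// return how many were dropped.  ACTIVE must be sorted so that the
// earliest end is at the back.
int
expire_old (std::vector<Interval> &active, int current_begin)
{
  int freed = 0;
  while (!active.empty ())
    {
      // Stop at the first interval that is still live; everything
      // in front of it ends even later.
      if (active.back ().end > current_begin)
        break;
      active.pop_back ();
      ++freed;
    }
  return freed;
}
```

In the patch each dropped interval is additionally passed to free_reg, so its hard register becomes available to the interval that is about to be allocated.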
|
||||||
|
|
||||||
|
/* The interval REG didn't get a hardreg. Spill it or one of those
|
||||||
|
from ACTIVE (if the latter, then REG will become allocated to the
|
||||||
|
hardreg that formerly was used by it). */
|
||||||
|
|
||||||
|
static void
|
||||||
|
spill_at_interval (hsa_op_reg *reg, vec<hsa_op_reg*> *active)
|
||||||
|
{
|
||||||
|
int cl = m_reg_class_for_type (reg->m_type);
|
||||||
|
gcc_assert (!active[cl].is_empty ());
|
||||||
|
hsa_op_reg *cand = active[cl][0];
|
||||||
|
if (cand->m_lr_end > reg->m_lr_end)
|
||||||
|
{
|
||||||
|
reg->m_reg_class = cand->m_reg_class;
|
||||||
|
reg->m_hard_num = cand->m_hard_num;
|
||||||
|
active[cl].ordered_remove (0);
|
||||||
|
unsigned place = active[cl].lower_bound (reg, cmp_end);
|
||||||
|
active[cl].quick_insert (place, reg);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
cand = reg;
|
||||||
|
|
||||||
|
gcc_assert (!cand->m_spill_sym);
|
||||||
|
BrigType16_t type = cand->m_type;
|
||||||
|
if (type == BRIG_TYPE_B1)
|
||||||
|
type = BRIG_TYPE_U8;
|
||||||
|
cand->m_reg_class = 0;
|
||||||
|
cand->m_spill_sym = hsa_get_spill_symbol (type);
|
||||||
|
cand->m_spill_sym->m_name_number = cand->m_order;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Given the global register state CLASSES allocate all HSA virtual
|
||||||
|
registers either to hardregs or to a spill symbol. */
|
||||||
|
|
||||||
|
static void
|
||||||
|
linear_scan_regalloc (struct m_reg_class_desc *classes)
|
||||||
|
{
|
||||||
|
/* Compute liveness. */
|
||||||
|
bool changed;
|
||||||
|
int i, n;
|
||||||
|
int insn_order;
|
||||||
|
int *bbs = XNEWVEC (int, n_basic_blocks_for_fn (cfun));
|
||||||
|
bitmap work = BITMAP_ALLOC (NULL);
|
||||||
|
vec<hsa_op_reg*> ind2reg = vNULL;
|
||||||
|
vec<hsa_op_reg*> active[4] = {vNULL, vNULL, vNULL, vNULL};
|
||||||
|
hsa_insn_basic *m_last_insn;
|
||||||
|
|
||||||
|
/* We will need the reverse post order for linearization,
|
||||||
|
and the post order for liveness analysis, which is the same
|
||||||
|
backward. */
|
||||||
|
n = pre_and_rev_post_order_compute (NULL, bbs, true);
|
||||||
|
ind2reg.safe_grow_cleared (hsa_cfun->m_reg_count);
|
||||||
|
|
||||||
|
/* Give all instructions a linearized number, at the same time
|
||||||
|
build a mapping from register index to register. */
|
||||||
|
insn_order = 1;
|
||||||
|
for (i = 0; i < n; i++)
|
||||||
|
{
|
||||||
|
basic_block bb = BASIC_BLOCK_FOR_FN (cfun, bbs[i]);
|
||||||
|
hsa_bb *hbb = hsa_bb_for_bb (bb);
|
||||||
|
hsa_insn_basic *insn;
|
||||||
|
for (insn = hbb->m_first_insn; insn; insn = insn->m_next)
|
||||||
|
{
|
||||||
|
unsigned opi;
|
||||||
|
insn->m_number = insn_order++;
|
||||||
|
for (opi = 0; opi < insn->operand_count (); opi++)
|
||||||
|
{
|
||||||
|
gcc_checking_assert (insn->get_op (opi));
|
||||||
|
hsa_op_reg **regaddr = insn_reg_addr (insn, opi);
|
||||||
|
if (regaddr)
|
||||||
|
ind2reg[(*regaddr)->m_order] = *regaddr;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Initialize all live ranges to [after-end, 0). */
|
||||||
|
for (i = 0; i < hsa_cfun->m_reg_count; i++)
|
||||||
|
if (ind2reg[i])
|
||||||
|
ind2reg[i]->m_lr_begin = insn_order, ind2reg[i]->m_lr_end = 0;
|
||||||
|
|
||||||
|
/* Classic liveness analysis, as long as something changes:
|
||||||
|
m_liveout is union (m_livein of successors)
|
||||||
|
m_livein is m_liveout minus defs plus uses. */
|
||||||
|
do
|
||||||
|
{
|
||||||
|
changed = false;
|
||||||
|
for (i = n - 1; i >= 0; i--)
|
||||||
|
{
|
||||||
|
edge e;
|
||||||
|
edge_iterator ei;
|
||||||
|
basic_block bb = BASIC_BLOCK_FOR_FN (cfun, bbs[i]);
|
||||||
|
hsa_bb *hbb = hsa_bb_for_bb (bb);
|
||||||
|
|
||||||
|
/* Union of successors m_livein (or empty if none). */
|
||||||
|
bool first = true;
|
||||||
|
FOR_EACH_EDGE (e, ei, bb->succs)
|
||||||
|
if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
|
||||||
|
{
|
||||||
|
hsa_bb *succ = hsa_bb_for_bb (e->dest);
|
||||||
|
if (first)
|
||||||
|
{
|
||||||
|
bitmap_copy (work, succ->m_livein);
|
||||||
|
first = false;
|
||||||
|
}
|
||||||
|
else
|
||||||
|
bitmap_ior_into (work, succ->m_livein);
|
||||||
|
}
|
||||||
|
if (first)
|
||||||
|
bitmap_clear (work);
|
||||||
|
|
||||||
|
bitmap_copy (hbb->m_liveout, work);
|
||||||
|
|
||||||
|
/* Remove defs, include uses in a backward insn walk. */
|
||||||
|
hsa_insn_basic *insn;
|
||||||
|
for (insn = hbb->m_last_insn; insn; insn = insn->m_prev)
|
||||||
|
{
|
||||||
|
unsigned opi;
|
||||||
|
unsigned ndefs = insn->input_count ();
|
||||||
|
for (opi = 0; opi < ndefs && insn->get_op (opi); opi++)
|
||||||
|
{
|
||||||
|
gcc_checking_assert (insn->get_op (opi));
|
||||||
|
hsa_op_reg **regaddr = insn_reg_addr (insn, opi);
|
||||||
|
if (regaddr)
|
||||||
|
bitmap_clear_bit (work, (*regaddr)->m_order);
|
||||||
|
}
|
||||||
|
for (; opi < insn->operand_count (); opi++)
|
||||||
|
{
|
||||||
|
gcc_checking_assert (insn->get_op (opi));
|
||||||
|
hsa_op_reg **regaddr = insn_reg_addr (insn, opi);
|
||||||
|
if (regaddr)
|
||||||
|
bitmap_set_bit (work, (*regaddr)->m_order);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Note if that changed something. */
|
||||||
|
if (bitmap_ior_into (hbb->m_livein, work))
|
||||||
|
changed = true;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
while (changed);
|
||||||
|
|
||||||
|
/* Make one pass through all instructions in linear order,
|
||||||
|
noting and merging possible live range start and end points. */
|
||||||
|
m_last_insn = NULL;
|
||||||
|
for (i = n - 1; i >= 0; i--)
|
||||||
|
{
|
||||||
|
basic_block bb = BASIC_BLOCK_FOR_FN (cfun, bbs[i]);
|
||||||
|
hsa_bb *hbb = hsa_bb_for_bb (bb);
|
||||||
|
hsa_insn_basic *insn;
|
||||||
|
int after_end_number;
|
||||||
|
unsigned bit;
|
||||||
|
bitmap_iterator bi;
|
||||||
|
|
||||||
|
if (m_last_insn)
|
||||||
|
after_end_number = m_last_insn->m_number;
|
||||||
|
else
|
||||||
|
after_end_number = insn_order;
|
||||||
|
/* Everything live-out in this BB has at least an end point
|
||||||
|
after us. */
|
||||||
|
EXECUTE_IF_SET_IN_BITMAP (hbb->m_liveout, 0, bit, bi)
|
||||||
|
note_lr_end (ind2reg[bit], after_end_number);
|
||||||
|
|
||||||
|
for (insn = hbb->m_last_insn; insn; insn = insn->m_prev)
|
||||||
|
{
|
||||||
|
unsigned opi;
|
||||||
|
unsigned ndefs = insn->input_count ();
|
||||||
|
for (opi = 0; opi < insn->operand_count (); opi++)
|
||||||
|
{
|
||||||
|
gcc_checking_assert (insn->get_op (opi));
|
||||||
|
hsa_op_reg **regaddr = insn_reg_addr (insn, opi);
|
||||||
|
if (regaddr)
|
||||||
|
{
|
||||||
|
hsa_op_reg *reg = *regaddr;
|
||||||
|
if (opi < ndefs)
|
||||||
|
note_lr_begin (reg, insn->m_number);
|
||||||
|
else
|
||||||
|
note_lr_end (reg, insn->m_number);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Everything live-in in this BB has a start point before
|
||||||
|
our first insn. */
|
||||||
|
int before_start_number;
|
||||||
|
if (hbb->m_first_insn)
|
||||||
|
before_start_number = hbb->m_first_insn->m_number;
|
||||||
|
else
|
||||||
|
before_start_number = after_end_number;
|
||||||
|
before_start_number--;
|
||||||
|
EXECUTE_IF_SET_IN_BITMAP (hbb->m_livein, 0, bit, bi)
|
||||||
|
note_lr_begin (ind2reg[bit], before_start_number);
|
||||||
|
|
||||||
|
if (hbb->m_first_insn)
|
||||||
|
m_last_insn = hbb->m_first_insn;
|
||||||
|
}
|
||||||
|
|
||||||
|
for (i = 0; i < hsa_cfun->m_reg_count; i++)
|
||||||
|
if (ind2reg[i])
|
||||||
|
{
|
||||||
|
/* All regs that still have their start after all code actually
|
||||||
|
are defined at the start of the routine (prologue). */
|
||||||
|
if (ind2reg[i]->m_lr_begin == insn_order)
|
||||||
|
ind2reg[i]->m_lr_begin = 0;
|
||||||
|
/* All regs that have no use but a def will have lr_end == 0,
|
||||||
|
they are actually live from def until after the insn they are
|
||||||
|
defined in. */
|
||||||
|
if (ind2reg[i]->m_lr_end == 0)
|
||||||
|
ind2reg[i]->m_lr_end = ind2reg[i]->m_lr_begin + 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Sort all intervals by increasing start point. */
|
||||||
|
gcc_assert (ind2reg.length () == (size_t) hsa_cfun->m_reg_count);
|
||||||
|
|
||||||
|
#ifdef ENABLE_CHECKING
|
||||||
|
for (unsigned i = 0; i < ind2reg.length (); i++)
|
||||||
|
gcc_assert (ind2reg[i]);
|
||||||
|
#endif
|
||||||
|
|
||||||
|
ind2reg.qsort (cmp_begin);
|
||||||
|
for (i = 0; i < 4; i++)
|
||||||
|
active[i].reserve_exact (hsa_cfun->m_reg_count);
|
||||||
|
|
||||||
|
/* Now comes the linear scan allocation. */
|
||||||
|
for (i = 0; i < hsa_cfun->m_reg_count; i++)
|
||||||
|
{
|
||||||
|
hsa_op_reg *reg = ind2reg[i];
|
||||||
|
if (!reg)
|
||||||
|
continue;
|
||||||
|
expire_old_intervals (reg, active, classes);
|
||||||
|
int cl = m_reg_class_for_type (reg->m_type);
|
||||||
|
if (try_alloc_reg (classes, reg) >= 0)
|
||||||
|
{
|
||||||
|
unsigned place = active[cl].lower_bound (reg, cmp_end);
|
||||||
|
active[cl].quick_insert (place, reg);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
spill_at_interval (reg, active);
|
||||||
|
|
||||||
|
/* Some interesting dumping as we go. */
|
||||||
|
if (dump_file)
|
||||||
|
{
|
||||||
|
fprintf (dump_file, " reg%d: [%5d, %5d)->",
|
||||||
|
reg->m_order, reg->m_lr_begin, reg->m_lr_end);
|
||||||
|
if (reg->m_reg_class)
|
||||||
|
fprintf (dump_file, "$%c%i", reg->m_reg_class, reg->m_hard_num);
|
||||||
|
else
|
||||||
|
fprintf (dump_file, "[%%__%s_%i]",
|
||||||
|
hsa_seg_name (reg->m_spill_sym->m_segment),
|
||||||
|
reg->m_spill_sym->m_name_number);
|
||||||
|
for (int cl = 0; cl < 4; cl++)
|
||||||
|
{
|
||||||
|
bool first = true;
|
||||||
|
hsa_op_reg *r;
|
||||||
|
fprintf (dump_file, " {");
|
||||||
|
for (int j = 0; active[cl].iterate (j, &r); j++)
|
||||||
|
if (first)
|
||||||
|
{
|
||||||
|
fprintf (dump_file, "%d", r->m_order);
|
||||||
|
first = false;
|
||||||
|
}
|
||||||
|
else
|
||||||
|
fprintf (dump_file, ", %d", r->m_order);
|
||||||
|
fprintf (dump_file, "}");
|
||||||
|
}
|
||||||
|
fprintf (dump_file, "\n");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
BITMAP_FREE (work);
|
||||||
|
free (bbs);
|
||||||
|
|
||||||
|
if (dump_file)
|
||||||
|
{
|
||||||
|
fprintf (dump_file, "------- After liveness: -------\n");
|
||||||
|
dump_hsa_cfun_regalloc (dump_file);
|
||||||
|
fprintf (dump_file, " ----- Intervals:\n");
|
||||||
|
for (i = 0; i < hsa_cfun->m_reg_count; i++)
|
||||||
|
{
|
||||||
|
hsa_op_reg *reg = ind2reg[i];
|
||||||
|
if (!reg)
|
||||||
|
continue;
|
||||||
|
fprintf (dump_file, " reg%d: [%5d, %5d)->", reg->m_order,
|
||||||
|
reg->m_lr_begin, reg->m_lr_end);
|
||||||
|
if (reg->m_reg_class)
|
||||||
|
fprintf (dump_file, "$%c%i\n", reg->m_reg_class, reg->m_hard_num);
|
||||||
|
else
|
||||||
|
fprintf (dump_file, "[%%__%s_%i]\n",
|
||||||
|
hsa_seg_name (reg->m_spill_sym->m_segment),
|
||||||
|
reg->m_spill_sym->m_name_number);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
for (i = 0; i < 4; i++)
|
||||||
|
active[i].release ();
|
||||||
|
ind2reg.release ();
|
||||||
|
}
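The allocation loop above follows the usual linear scan shape: intervals are sorted by start point, expired intervals release their registers, and an interval that finds no free register is handed to a spill routine. A minimal, self-contained sketch of that shape follows. It assumes half-open intervals `[begin, end)` as printed in the dump, a single register class, and the textbook heuristic of spilling whichever interval ends last. `Interval` and `linear_scan` are hypothetical names, and the heuristic is an assumption, since `expire_old_intervals` and `spill_at_interval` are not shown in this part of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical live interval [begin, end) for one pseudo register.
struct Interval
{
  int begin;
  int end;
  int hard_reg;   // -1 while unassigned or after being spilled
  bool spilled;
  Interval (int b, int e) : begin (b), end (e), hard_reg (-1), spilled (false) {}
};

// Assign one of NUM_REGS registers to each interval, or mark it spilled.
void
linear_scan (std::vector<Interval> &ivs, int num_regs)
{
  assert (num_regs >= 0);
  std::sort (ivs.begin (), ivs.end (),
             [] (const Interval &a, const Interval &b)
             { return a.begin < b.begin; });

  std::vector<Interval *> active;   // kept sorted by increasing end point
  std::vector<int> free_regs;
  for (int r = num_regs - 1; r >= 0; --r)
    free_regs.push_back (r);

  for (Interval &iv : ivs)
    {
      // Expire intervals that end at or before the start of this one.
      while (!active.empty () && active.front ()->end <= iv.begin)
        {
          free_regs.push_back (active.front ()->hard_reg);
          active.erase (active.begin ());
        }

      if (!free_regs.empty ())
        {
          iv.hard_reg = free_regs.back ();
          free_regs.pop_back ();
        }
      else if (!active.empty () && active.back ()->end > iv.end)
        {
          // Assumed heuristic: steal the register of the interval that
          // ends last and spill that interval instead.
          Interval *victim = active.back ();
          active.pop_back ();
          iv.hard_reg = victim->hard_reg;
          victim->hard_reg = -1;
          victim->spilled = true;
        }
      else
        {
          iv.spilled = true;
          continue;
        }

      // Insert into the active list, keeping it sorted by end point.
      auto pos = std::lower_bound (active.begin (), active.end (), &iv,
                                   [] (const Interval *a, const Interval *b)
                                   { return a->end < b->end; });
      active.insert (pos, &iv);
    }
}
```

Keeping the active list sorted by end point mirrors the `lower_bound (reg, cmp_end)` / `quick_insert` pair in the loop above: expiry only ever inspects the front of the list, and the spill candidate sits at the back.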
|
||||||
|
|
||||||
|
/* Entry point for register allocation. */
|
||||||
|
|
||||||
|
static void
|
||||||
|
regalloc (void)
|
||||||
|
{
|
||||||
|
basic_block bb;
|
||||||
|
m_reg_class_desc classes[4];
|
||||||
|
|
||||||
|
/* If there are no registers used in the function, exit right away. */
|
||||||
|
if (hsa_cfun->m_reg_count == 0)
|
||||||
|
return;
|
||||||
|
|
||||||
|
memset (classes, 0, sizeof (classes));
|
||||||
|
classes[0].next_avail = 0;
|
||||||
|
classes[0].max_num = 7;
|
||||||
|
classes[0].cl_char = 'c';
|
||||||
|
classes[1].cl_char = 's';
|
||||||
|
classes[2].cl_char = 'd';
|
||||||
|
classes[3].cl_char = 'q';
|
||||||
|
|
||||||
|
for (int i = 1; i < 4; i++)
|
||||||
|
{
|
||||||
|
classes[i].next_avail = 0;
|
||||||
|
classes[i].max_num = 20;
|
||||||
|
}
|
||||||
|
|
||||||
|
linear_scan_regalloc (classes);
|
||||||
|
|
||||||
|
FOR_ALL_BB_FN (bb, cfun)
|
||||||
|
rewrite_code_bb (bb, classes);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Out of SSA and register allocation on HSAIL IL. */
|
||||||
|
|
||||||
|
void
|
||||||
|
hsa_regalloc (void)
|
||||||
|
{
|
||||||
|
naive_outof_ssa ();
|
||||||
|
|
||||||
|
if (dump_file)
|
||||||
|
{
|
||||||
|
fprintf (dump_file, "------- After out-of-SSA: -------\n");
|
||||||
|
dump_hsa_cfun (dump_file);
|
||||||
|
}
|
||||||
|
|
||||||
|
regalloc ();
|
||||||
|
|
||||||
|
if (dump_file)
|
||||||
|
{
|
||||||
|
fprintf (dump_file, "------- After register allocation: -------\n");
|
||||||
|
dump_hsa_cfun (dump_file);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
@ -0,0 +1,947 @@
|
||||||
|
/* Implementation of commonly needed HSAIL related functions and methods.
|
||||||
|
Copyright (C) 2013-2016 Free Software Foundation, Inc.
|
||||||
|
Contributed by Martin Jambor <mjambor@suse.cz> and
|
||||||
|
Martin Liska <mliska@suse.cz>.
|
||||||
|
|
||||||
|
This file is part of GCC.
|
||||||
|
|
||||||
|
GCC is free software; you can redistribute it and/or modify
|
||||||
|
it under the terms of the GNU General Public License as published by
|
||||||
|
the Free Software Foundation; either version 3, or (at your option)
|
||||||
|
any later version.
|
||||||
|
|
||||||
|
GCC is distributed in the hope that it will be useful,
|
||||||
|
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||||
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||||
|
GNU General Public License for more details.
|
||||||
|
|
||||||
|
You should have received a copy of the GNU General Public License
|
||||||
|
along with GCC; see the file COPYING3. If not see
|
||||||
|
<http://www.gnu.org/licenses/>. */
|
||||||
|
|
||||||
|
#include "config.h"
|
||||||
|
#include "system.h"
|
||||||
|
#include "coretypes.h"
|
||||||
|
#include "tm.h"
|
||||||
|
#include "is-a.h"
|
||||||
|
#include "hash-set.h"
|
||||||
|
#include "hash-map.h"
|
||||||
|
#include "vec.h"
|
||||||
|
#include "tree.h"
|
||||||
|
#include "dumpfile.h"
|
||||||
|
#include "gimple-pretty-print.h"
|
||||||
|
#include "diagnostic-core.h"
|
||||||
|
#include "alloc-pool.h"
|
||||||
|
#include "cgraph.h"
|
||||||
|
#include "print-tree.h"
|
||||||
|
#include "stringpool.h"
|
||||||
|
#include "symbol-summary.h"
|
||||||
|
#include "hsa.h"
|
||||||
|
#include "internal-fn.h"
|
||||||
|
#include "ctype.h"
|
||||||
|
|
||||||
|
/* Structure containing intermediate HSA representation of the generated
|
||||||
|
function. */
|
||||||
|
class hsa_function_representation *hsa_cfun;
|
||||||
|
|
||||||
|
/* Element of the mapping vector between a host decl and an HSA kernel. */
|
||||||
|
|
||||||
|
struct GTY(()) hsa_decl_kernel_map_element
|
||||||
|
{
|
||||||
|
/* The decl of the host function. */
|
||||||
|
tree decl;
|
||||||
|
/* Name of the HSA kernel in BRIG. */
|
||||||
|
char * GTY((skip)) name;
|
||||||
|
/* Size of OMP data, if the kernel contains a kernel dispatch. */
|
||||||
|
unsigned omp_data_size;
|
||||||
|
/* True if the function is a gridified kernel. */
|
||||||
|
bool gridified_kernel_p;
|
||||||
|
};
|
||||||
|
|
||||||
|
/* Mapping between decls and corresponding HSA kernels in this compilation
|
||||||
|
unit. */
|
||||||
|
|
||||||
|
static GTY (()) vec<hsa_decl_kernel_map_element, va_gc>
|
||||||
|
*hsa_decl_kernel_mapping;
|
||||||
|
|
||||||
|
/* Mapping between decls and corresponding HSA kernels
|
||||||
|
called by the function. */
|
||||||
|
hash_map <tree, vec <const char *> *> *hsa_decl_kernel_dependencies;
|
||||||
|
|
||||||
|
/* Hash table used to look up a symbol for a decl. */
|
||||||
|
hash_table <hsa_noop_symbol_hasher> *hsa_global_variable_symbols;
|
||||||
|
|
||||||
|
/* HSA summaries. */
|
||||||
|
hsa_summary_t *hsa_summaries = NULL;
|
||||||
|
|
||||||
|
/* HSA number of threads. */
|
||||||
|
hsa_symbol *hsa_num_threads = NULL;
|
||||||
|
|
||||||
|
/* HSA functions that cannot be expanded to HSAIL. */
|
||||||
|
hash_set <tree> *hsa_failed_functions = NULL;
|
||||||
|
|
||||||
|
/* True if compilation unit-wide data are already allocated and initialized. */
|
||||||
|
static bool compilation_unit_data_initialized;
|
||||||
|
|
||||||
|
/* Return true if FNDECL represents an HSA-callable function. */
|
||||||
|
|
||||||
|
bool
|
||||||
|
hsa_callable_function_p (tree fndecl)
|
||||||
|
{
|
||||||
|
return (lookup_attribute ("omp declare target", DECL_ATTRIBUTES (fndecl))
|
||||||
|
&& !lookup_attribute ("oacc function", DECL_ATTRIBUTES (fndecl)));
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Allocate HSA structures that are used when dealing with different
|
||||||
|
functions. */
|
||||||
|
|
||||||
|
void
|
||||||
|
hsa_init_compilation_unit_data (void)
|
||||||
|
{
|
||||||
|
if (compilation_unit_data_initialized)
|
||||||
|
return;
|
||||||
|
|
||||||
|
compilation_unit_data_initialized = true;
|
||||||
|
|
||||||
|
hsa_global_variable_symbols = new hash_table <hsa_noop_symbol_hasher> (8);
|
||||||
|
hsa_failed_functions = new hash_set <tree> ();
|
||||||
|
hsa_emitted_internal_decls = new hash_table <hsa_internal_fn_hasher> (2);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Free data structures that are used when dealing with different
|
||||||
|
functions. */
|
||||||
|
|
||||||
|
void
|
||||||
|
hsa_deinit_compilation_unit_data (void)
|
||||||
|
{
|
||||||
|
gcc_assert (compilation_unit_data_initialized);
|
||||||
|
|
||||||
|
delete hsa_failed_functions;
|
||||||
|
delete hsa_emitted_internal_decls;
|
||||||
|
|
||||||
|
for (hash_table <hsa_noop_symbol_hasher>::iterator it
|
||||||
|
= hsa_global_variable_symbols->begin ();
|
||||||
|
it != hsa_global_variable_symbols->end ();
|
||||||
|
++it)
|
||||||
|
{
|
||||||
|
hsa_symbol *sym = *it;
|
||||||
|
delete sym;
|
||||||
|
}
|
||||||
|
|
||||||
|
delete hsa_global_variable_symbols;
|
||||||
|
|
||||||
|
if (hsa_num_threads)
|
||||||
|
{
|
||||||
|
delete hsa_num_threads;
|
||||||
|
hsa_num_threads = NULL;
|
||||||
|
}
|
||||||
|
|
||||||
|
compilation_unit_data_initialized = false;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return true if we are generating code for the large HSA machine model. */
|
||||||
|
|
||||||
|
bool
|
||||||
|
hsa_machine_large_p (void)
|
||||||
|
{
|
||||||
|
/* FIXME: I suppose this is technically wrong but should work for me now. */
|
||||||
|
return (GET_MODE_BITSIZE (Pmode) == 64);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return true if we are using the full HSA profile. */
|
||||||
|
|
||||||
|
bool
|
||||||
|
hsa_full_profile_p (void)
|
||||||
|
{
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return true if a register in operand number OPNUM of instruction
|
||||||
|
is an output. False if it is an input. */
|
||||||
|
|
||||||
|
bool
|
||||||
|
hsa_insn_basic::op_output_p (unsigned opnum)
|
||||||
|
{
|
||||||
|
switch (m_opcode)
|
||||||
|
{
|
||||||
|
case HSA_OPCODE_PHI:
|
||||||
|
case BRIG_OPCODE_CBR:
|
||||||
|
case BRIG_OPCODE_SBR:
|
||||||
|
case BRIG_OPCODE_ST:
|
||||||
|
case BRIG_OPCODE_SIGNALNORET:
|
||||||
|
/* FIXME: There are probably missing cases here, double check. */
|
||||||
|
return false;
|
||||||
|
case BRIG_OPCODE_EXPAND:
|
||||||
|
/* Example: expand_v4_b32_b128 (dest0, dest1, dest2, dest3), src0. */
|
||||||
|
return opnum < operand_count () - 1;
|
||||||
|
default:
|
||||||
|
return opnum == 0;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return true if OPCODE is a floating-point bit instruction opcode. */
|
||||||
|
|
||||||
|
bool
|
||||||
|
hsa_opcode_floating_bit_insn_p (BrigOpcode16_t opcode)
|
||||||
|
{
|
||||||
|
switch (opcode)
|
||||||
|
{
|
||||||
|
case BRIG_OPCODE_NEG:
|
||||||
|
case BRIG_OPCODE_ABS:
|
||||||
|
case BRIG_OPCODE_CLASS:
|
||||||
|
case BRIG_OPCODE_COPYSIGN:
|
||||||
|
return true;
|
||||||
|
default:
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return the number of destination operands for this INSN. */
|
||||||
|
|
||||||
|
unsigned
|
||||||
|
hsa_insn_basic::input_count ()
|
||||||
|
{
|
||||||
|
switch (m_opcode)
|
||||||
|
{
|
||||||
|
default:
|
||||||
|
return 1;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_NOP:
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_EXPAND:
|
||||||
|
return 2;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_LD:
|
||||||
|
/* ld_v[234] not yet handled. */
|
||||||
|
return 1;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_ST:
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_ATOMICNORET:
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_SIGNAL:
|
||||||
|
return 1;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_SIGNALNORET:
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_MEMFENCE:
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_RDIMAGE:
|
||||||
|
case BRIG_OPCODE_LDIMAGE:
|
||||||
|
case BRIG_OPCODE_STIMAGE:
|
||||||
|
case BRIG_OPCODE_QUERYIMAGE:
|
||||||
|
case BRIG_OPCODE_QUERYSAMPLER:
|
||||||
|
sorry ("HSA image ops not handled");
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_CBR:
|
||||||
|
case BRIG_OPCODE_BR:
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_SBR:
|
||||||
|
return 0; /* ??? */
|
||||||
|
|
||||||
|
case BRIG_OPCODE_WAVEBARRIER:
|
||||||
|
return 0; /* ??? */
|
||||||
|
|
||||||
|
case BRIG_OPCODE_BARRIER:
|
||||||
|
case BRIG_OPCODE_ARRIVEFBAR:
|
||||||
|
case BRIG_OPCODE_INITFBAR:
|
||||||
|
case BRIG_OPCODE_JOINFBAR:
|
||||||
|
case BRIG_OPCODE_LEAVEFBAR:
|
||||||
|
case BRIG_OPCODE_RELEASEFBAR:
|
||||||
|
case BRIG_OPCODE_WAITFBAR:
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_LDF:
|
||||||
|
return 1;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_ACTIVELANECOUNT:
|
||||||
|
case BRIG_OPCODE_ACTIVELANEID:
|
||||||
|
case BRIG_OPCODE_ACTIVELANEMASK:
|
||||||
|
case BRIG_OPCODE_ACTIVELANEPERMUTE:
|
||||||
|
return 1; /* ??? */
|
||||||
|
|
||||||
|
case BRIG_OPCODE_CALL:
|
||||||
|
case BRIG_OPCODE_SCALL:
|
||||||
|
case BRIG_OPCODE_ICALL:
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_RET:
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_ALLOCA:
|
||||||
|
return 1;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_CLEARDETECTEXCEPT:
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_SETDETECTEXCEPT:
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_PACKETCOMPLETIONSIG:
|
||||||
|
case BRIG_OPCODE_PACKETID:
|
||||||
|
case BRIG_OPCODE_CASQUEUEWRITEINDEX:
|
||||||
|
case BRIG_OPCODE_LDQUEUEREADINDEX:
|
||||||
|
case BRIG_OPCODE_LDQUEUEWRITEINDEX:
|
||||||
|
case BRIG_OPCODE_STQUEUEREADINDEX:
|
||||||
|
case BRIG_OPCODE_STQUEUEWRITEINDEX:
|
||||||
|
return 1; /* ??? */
|
||||||
|
|
||||||
|
case BRIG_OPCODE_ADDQUEUEWRITEINDEX:
|
||||||
|
return 1;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_DEBUGTRAP:
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_OPCODE_GROUPBASEPTR:
|
||||||
|
case BRIG_OPCODE_KERNARGBASEPTR:
|
||||||
|
return 1; /* ??? */
|
||||||
|
|
||||||
|
case HSA_OPCODE_ARG_BLOCK:
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
case BRIG_KIND_DIRECTIVE_COMMENT:
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return the number of source operands for this INSN. */
|
||||||
|
|
||||||
|
unsigned
|
||||||
|
hsa_insn_basic::num_used_ops ()
|
||||||
|
{
|
||||||
|
gcc_checking_assert (input_count () <= operand_count ());
|
||||||
|
|
||||||
|
return operand_count () - input_count ();
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Set alignment to VALUE. */
|
||||||
|
|
||||||
|
void
|
||||||
|
hsa_insn_mem::set_align (BrigAlignment8_t value)
|
||||||
|
{
|
||||||
|
/* TODO: Perhaps remove this dump later on: */
|
||||||
|
if (dump_file && (dump_flags & TDF_DETAILS) && value < m_align)
|
||||||
|
{
|
||||||
|
fprintf (dump_file, "Decreasing alignment to %u in instruction ", value);
|
||||||
|
dump_hsa_insn (dump_file, this);
|
||||||
|
}
|
||||||
|
m_align = value;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return size of HSA type T in bits. */
|
||||||
|
|
||||||
|
unsigned
|
||||||
|
hsa_type_bit_size (BrigType16_t t)
|
||||||
|
{
|
||||||
|
switch (t)
|
||||||
|
{
|
||||||
|
case BRIG_TYPE_B1:
|
||||||
|
return 1;
|
||||||
|
|
||||||
|
case BRIG_TYPE_U8:
|
||||||
|
case BRIG_TYPE_S8:
|
||||||
|
case BRIG_TYPE_B8:
|
||||||
|
return 8;
|
||||||
|
|
||||||
|
case BRIG_TYPE_U16:
|
||||||
|
case BRIG_TYPE_S16:
|
||||||
|
case BRIG_TYPE_B16:
|
||||||
|
case BRIG_TYPE_F16:
|
||||||
|
return 16;
|
||||||
|
|
||||||
|
case BRIG_TYPE_U32:
|
||||||
|
case BRIG_TYPE_S32:
|
||||||
|
case BRIG_TYPE_B32:
|
||||||
|
case BRIG_TYPE_F32:
|
||||||
|
case BRIG_TYPE_U8X4:
|
||||||
|
case BRIG_TYPE_U16X2:
|
||||||
|
case BRIG_TYPE_S8X4:
|
||||||
|
case BRIG_TYPE_S16X2:
|
||||||
|
case BRIG_TYPE_F16X2:
|
||||||
|
return 32;
|
||||||
|
|
||||||
|
case BRIG_TYPE_U64:
|
||||||
|
case BRIG_TYPE_S64:
|
||||||
|
case BRIG_TYPE_F64:
|
||||||
|
case BRIG_TYPE_B64:
|
||||||
|
case BRIG_TYPE_U8X8:
|
||||||
|
case BRIG_TYPE_U16X4:
|
||||||
|
case BRIG_TYPE_U32X2:
|
||||||
|
case BRIG_TYPE_S8X8:
|
||||||
|
case BRIG_TYPE_S16X4:
|
||||||
|
case BRIG_TYPE_S32X2:
|
||||||
|
case BRIG_TYPE_F16X4:
|
||||||
|
case BRIG_TYPE_F32X2:
|
||||||
|
|
||||||
|
return 64;
|
||||||
|
|
||||||
|
case BRIG_TYPE_B128:
|
||||||
|
case BRIG_TYPE_U8X16:
|
||||||
|
case BRIG_TYPE_U16X8:
|
||||||
|
case BRIG_TYPE_U32X4:
|
||||||
|
case BRIG_TYPE_U64X2:
|
||||||
|
case BRIG_TYPE_S8X16:
|
||||||
|
case BRIG_TYPE_S16X8:
|
||||||
|
case BRIG_TYPE_S32X4:
|
||||||
|
case BRIG_TYPE_S64X2:
|
||||||
|
case BRIG_TYPE_F16X8:
|
||||||
|
case BRIG_TYPE_F32X4:
|
||||||
|
case BRIG_TYPE_F64X2:
|
||||||
|
return 128;
|
||||||
|
|
||||||
|
default:
|
||||||
|
gcc_assert (hsa_seen_error ());
|
||||||
|
return t;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return BRIG bit-type with BITSIZE length. */
|
||||||
|
|
||||||
|
BrigType16_t
|
||||||
|
hsa_bittype_for_bitsize (unsigned bitsize)
|
||||||
|
{
|
||||||
|
switch (bitsize)
|
||||||
|
{
|
||||||
|
case 1:
|
||||||
|
return BRIG_TYPE_B1;
|
||||||
|
case 8:
|
||||||
|
return BRIG_TYPE_B8;
|
||||||
|
case 16:
|
||||||
|
return BRIG_TYPE_B16;
|
||||||
|
case 32:
|
||||||
|
return BRIG_TYPE_B32;
|
||||||
|
case 64:
|
||||||
|
return BRIG_TYPE_B64;
|
||||||
|
case 128:
|
||||||
|
return BRIG_TYPE_B128;
|
||||||
|
default:
|
||||||
|
gcc_unreachable ();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return BRIG unsigned int type with BITSIZE length. */
|
||||||
|
|
||||||
|
BrigType16_t
|
||||||
|
hsa_uint_for_bitsize (unsigned bitsize)
|
||||||
|
{
|
||||||
|
switch (bitsize)
|
||||||
|
{
|
||||||
|
case 8:
|
||||||
|
return BRIG_TYPE_U8;
|
||||||
|
case 16:
|
||||||
|
return BRIG_TYPE_U16;
|
||||||
|
case 32:
|
||||||
|
return BRIG_TYPE_U32;
|
||||||
|
case 64:
|
||||||
|
return BRIG_TYPE_U64;
|
||||||
|
default:
|
||||||
|
gcc_unreachable ();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return BRIG float type with BITSIZE length. */
|
||||||
|
|
||||||
|
BrigType16_t
|
||||||
|
hsa_float_for_bitsize (unsigned bitsize)
|
||||||
|
{
|
||||||
|
switch (bitsize)
|
||||||
|
{
|
||||||
|
case 16:
|
||||||
|
return BRIG_TYPE_F16;
|
||||||
|
case 32:
|
||||||
|
return BRIG_TYPE_F32;
|
||||||
|
case 64:
|
||||||
|
return BRIG_TYPE_F64;
|
||||||
|
default:
|
||||||
|
gcc_unreachable ();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return HSA bit-type with the same size as the type T. */
|
||||||
|
|
||||||
|
BrigType16_t
|
||||||
|
hsa_bittype_for_type (BrigType16_t t)
|
||||||
|
{
|
||||||
|
return hsa_bittype_for_bitsize (hsa_type_bit_size (t));
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return true if and only if TYPE is a floating point number type. */
|
||||||
|
|
||||||
|
bool
|
||||||
|
hsa_type_float_p (BrigType16_t type)
|
||||||
|
{
|
||||||
|
switch (type & BRIG_TYPE_BASE_MASK)
|
||||||
|
{
|
||||||
|
case BRIG_TYPE_F16:
|
||||||
|
case BRIG_TYPE_F32:
|
||||||
|
case BRIG_TYPE_F64:
|
||||||
|
return true;
|
||||||
|
default:
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return true if and only if TYPE is an integer number type. */
|
||||||
|
|
||||||
|
bool
|
||||||
|
hsa_type_integer_p (BrigType16_t type)
|
||||||
|
{
|
||||||
|
switch (type & BRIG_TYPE_BASE_MASK)
|
||||||
|
{
|
||||||
|
case BRIG_TYPE_U8:
|
||||||
|
case BRIG_TYPE_U16:
|
||||||
|
case BRIG_TYPE_U32:
|
||||||
|
case BRIG_TYPE_U64:
|
||||||
|
case BRIG_TYPE_S8:
|
||||||
|
case BRIG_TYPE_S16:
|
||||||
|
case BRIG_TYPE_S32:
|
||||||
|
case BRIG_TYPE_S64:
|
||||||
|
return true;
|
||||||
|
default:
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return true if and only if TYPE is a bit-type. */
|
||||||
|
|
||||||
|
bool
|
||||||
|
hsa_btype_p (BrigType16_t type)
|
||||||
|
{
|
||||||
|
switch (type & BRIG_TYPE_BASE_MASK)
|
||||||
|
{
|
||||||
|
case BRIG_TYPE_B8:
|
||||||
|
case BRIG_TYPE_B16:
|
||||||
|
case BRIG_TYPE_B32:
|
||||||
|
case BRIG_TYPE_B64:
|
||||||
|
case BRIG_TYPE_B128:
|
||||||
|
return true;
|
||||||
|
default:
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
/* Return the HSA alignment encoding for an alignment of N bits. */
|
||||||
|
|
||||||
|
BrigAlignment8_t
|
||||||
|
hsa_alignment_encoding (unsigned n)
|
||||||
|
{
|
||||||
|
gcc_assert (n >= 8 && !(n & (n - 1)));
|
||||||
|
if (n >= 256)
|
||||||
|
return BRIG_ALIGNMENT_32;
|
||||||
|
|
||||||
|
switch (n)
|
||||||
|
{
|
||||||
|
case 8:
|
||||||
|
return BRIG_ALIGNMENT_1;
|
||||||
|
case 16:
|
||||||
|
return BRIG_ALIGNMENT_2;
|
||||||
|
case 32:
|
||||||
|
return BRIG_ALIGNMENT_4;
|
||||||
|
case 64:
|
||||||
|
return BRIG_ALIGNMENT_8;
|
||||||
|
case 128:
|
||||||
|
return BRIG_ALIGNMENT_16;
|
||||||
|
default:
|
||||||
|
gcc_unreachable ();
|
||||||
|
}
|
||||||
|
}
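The function above takes an alignment in bits, checks that it is a power of two of at least one byte, and saturates everything from 256 bits upward to the 32-byte encoding. A small standalone sketch of the same checks is given below. `ByteAlign` and `alignment_encoding` are hypothetical names, and the enumerator values are plain byte counts chosen for illustration, not the real BRIG enumerator values.

```cpp
#include <cassert>

// Hypothetical stand-in for the BRIG alignment enumerators; the
// underlying values here are simply byte counts.
enum class ByteAlign : unsigned
{
  A1 = 1, A2 = 2, A4 = 4, A8 = 8, A16 = 16, A32 = 32
};

// Map an alignment in BITS to a byte alignment, saturating at 32 bytes.
ByteAlign
alignment_encoding (unsigned bits)
{
  // At least one byte, and a power of two (exactly one bit set).
  assert (bits >= 8 && (bits & (bits - 1)) == 0);

  if (bits >= 256)
    return ByteAlign::A32;

  // The remaining inputs are 8, 16, 32, 64 and 128 bits.
  return static_cast<ByteAlign> (bits / 8);
}
```

The expression `n & (n - 1)` clears the lowest set bit of `n`, so it is zero exactly when `n` has a single bit set, which is the power-of-two test used in the assertion.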
|
||||||
|
|
||||||
|
/* Return natural alignment of HSA TYPE. */
|
||||||
|
|
||||||
|
BrigAlignment8_t
|
||||||
|
hsa_natural_alignment (BrigType16_t type)
|
||||||
|
{
|
||||||
|
return hsa_alignment_encoding (hsa_type_bit_size (type & ~BRIG_TYPE_ARRAY));
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Call the correct destructor of an HSA instruction. */
|
||||||
|
|
||||||
|
void
|
||||||
|
hsa_destroy_insn (hsa_insn_basic *insn)
|
||||||
|
{
|
||||||
|
if (hsa_insn_phi *phi = dyn_cast <hsa_insn_phi *> (insn))
|
||||||
|
phi->~hsa_insn_phi ();
|
||||||
|
else if (hsa_insn_br *br = dyn_cast <hsa_insn_br *> (insn))
|
||||||
|
br->~hsa_insn_br ();
|
||||||
|
else if (hsa_insn_cmp *cmp = dyn_cast <hsa_insn_cmp *> (insn))
|
||||||
|
cmp->~hsa_insn_cmp ();
|
||||||
|
else if (hsa_insn_mem *mem = dyn_cast <hsa_insn_mem *> (insn))
|
||||||
|
mem->~hsa_insn_mem ();
|
||||||
|
else if (hsa_insn_atomic *atomic = dyn_cast <hsa_insn_atomic *> (insn))
|
||||||
|
atomic->~hsa_insn_atomic ();
|
||||||
|
else if (hsa_insn_seg *seg = dyn_cast <hsa_insn_seg *> (insn))
|
||||||
|
seg->~hsa_insn_seg ();
|
||||||
|
else if (hsa_insn_call *call = dyn_cast <hsa_insn_call *> (insn))
|
||||||
|
call->~hsa_insn_call ();
|
||||||
|
else if (hsa_insn_arg_block *block = dyn_cast <hsa_insn_arg_block *> (insn))
|
||||||
|
block->~hsa_insn_arg_block ();
|
||||||
|
else if (hsa_insn_sbr *sbr = dyn_cast <hsa_insn_sbr *> (insn))
|
||||||
|
sbr->~hsa_insn_sbr ();
|
||||||
|
else if (hsa_insn_comment *comment = dyn_cast <hsa_insn_comment *> (insn))
|
||||||
|
comment->~hsa_insn_comment ();
|
||||||
|
else
|
||||||
|
insn->~hsa_insn_basic ();
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Call the correct destructor of an HSA operand. */
|
||||||
|
|
||||||
|
void
|
||||||
|
hsa_destroy_operand (hsa_op_base *op)
|
||||||
|
{
|
||||||
|
if (hsa_op_code_list *list = dyn_cast <hsa_op_code_list *> (op))
|
||||||
|
list->~hsa_op_code_list ();
|
||||||
|
else if (hsa_op_operand_list *list = dyn_cast <hsa_op_operand_list *> (op))
|
||||||
|
list->~hsa_op_operand_list ();
|
||||||
|
else if (hsa_op_reg *reg = dyn_cast <hsa_op_reg *> (op))
|
||||||
|
reg->~hsa_op_reg ();
|
||||||
|
else if (hsa_op_immed *immed = dyn_cast <hsa_op_immed *> (op))
|
||||||
|
immed->~hsa_op_immed ();
|
||||||
|
else
|
||||||
|
op->~hsa_op_base ();
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Create a mapping between the original function DECL and kernel name NAME. */
|
||||||
|
|
||||||
|
void
|
||||||
|
hsa_add_kern_decl_mapping (tree decl, char *name, unsigned omp_data_size,
|
||||||
|
bool gridified_kernel_p)
|
||||||
|
{
|
||||||
|
hsa_decl_kernel_map_element dkm;
|
||||||
|
dkm.decl = decl;
|
||||||
|
dkm.name = name;
|
||||||
|
dkm.omp_data_size = omp_data_size;
|
||||||
|
dkm.gridified_kernel_p = gridified_kernel_p;
|
||||||
|
vec_safe_push (hsa_decl_kernel_mapping, dkm);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return the number of kernel decl name mappings. */
|
||||||
|
|
||||||
|
unsigned
|
||||||
|
hsa_get_number_decl_kernel_mappings (void)
|
||||||
|
{
|
||||||
|
return vec_safe_length (hsa_decl_kernel_mapping);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return the decl in the Ith kernel decl name mapping. */
|
||||||
|
|
||||||
|
tree
|
||||||
|
hsa_get_decl_kernel_mapping_decl (unsigned i)
|
||||||
|
{
|
||||||
|
return (*hsa_decl_kernel_mapping)[i].decl;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return the name in the Ith kernel decl name mapping. */
|
||||||
|
|
||||||
|
char *
|
||||||
|
hsa_get_decl_kernel_mapping_name (unsigned i)
|
||||||
|
{
|
||||||
|
return (*hsa_decl_kernel_mapping)[i].name;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return the OMP data size in the Ith kernel decl name mapping. */
|
||||||
|
|
||||||
|
unsigned
|
||||||
|
hsa_get_decl_kernel_mapping_omp_size (unsigned i)
|
||||||
|
{
|
||||||
|
return (*hsa_decl_kernel_mapping)[i].omp_data_size;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return true if the function in the Ith kernel decl name mapping is a gridified kernel. */
|
||||||
|
|
||||||
|
bool
|
||||||
|
hsa_get_decl_kernel_mapping_gridified (unsigned i)
|
||||||
|
{
|
||||||
|
return (*hsa_decl_kernel_mapping)[i].gridified_kernel_p;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Free the mapping between original decls and kernel names. */
|
||||||
|
|
||||||
|
void
|
||||||
|
hsa_free_decl_kernel_mapping (void)
|
||||||
|
{
|
||||||
|
if (hsa_decl_kernel_mapping == NULL)
|
||||||
|
return;
|
||||||
|
|
||||||
|
for (unsigned i = 0; i < hsa_decl_kernel_mapping->length (); ++i)
|
||||||
|
free ((*hsa_decl_kernel_mapping)[i].name);
|
||||||
|
ggc_free (hsa_decl_kernel_mapping);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Add new kernel dependency. */
|
||||||
|
|
||||||
|
void
|
||||||
|
hsa_add_kernel_dependency (tree caller, const char *called_function)
|
||||||
|
{
|
||||||
|
if (hsa_decl_kernel_dependencies == NULL)
|
||||||
|
hsa_decl_kernel_dependencies = new hash_map<tree, vec<const char *> *> ();
|
||||||
|
|
||||||
|
vec <const char *> *s = NULL;
|
||||||
|
vec <const char *> **slot = hsa_decl_kernel_dependencies->get (caller);
|
||||||
|
if (slot == NULL)
|
||||||
|
{
|
||||||
|
s = new vec <const char *> ();
|
||||||
|
hsa_decl_kernel_dependencies->put (caller, s);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
s = *slot;
|
||||||
|
|
||||||
|
s->safe_push (called_function);
|
||||||
|
}
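The function above is a get-or-create lookup: the first dependency recorded for a caller allocates its vector, and later ones append to it. The same pattern can be sketched with standard containers as follows. `DependencyMap` and `add_kernel_dependency` are hypothetical names, and callers are keyed by a string here rather than by a `tree`.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical mapping from a caller name to the kernels it calls.
typedef std::unordered_map<std::string, std::vector<std::string> >
  DependencyMap;

// Record that CALLER depends on CALLED, creating the list on first use.
void
add_kernel_dependency (DependencyMap &deps, const std::string &caller,
                       const std::string &called)
{
  // operator[] default-constructs an empty vector for a new key,
  // which replaces the explicit NULL-slot branch.
  deps[caller].push_back (called);
}
```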
|
||||||
|
|
||||||
|
/* Modify the name P in-place so that it is a valid HSA identifier. */
|
||||||
|
|
||||||
|
void
|
||||||
|
hsa_sanitize_name (char *p)
|
||||||
|
{
|
||||||
|
for (; *p; p++)
|
||||||
|
if (*p == '.' || *p == '-')
|
||||||
|
*p = '_';
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Clone the name P, prepend an ampersand and sanitize the name. */
|
||||||
|
|
||||||
|
char *
|
||||||
|
hsa_brig_function_name (const char *p)
|
||||||
|
{
|
||||||
|
unsigned len = strlen (p);
|
||||||
|
char *buf = XNEWVEC (char, len + 2);
|
||||||
|
|
||||||
|
buf[0] = '&';
|
||||||
|
buf[len + 1] = '\0';
|
||||||
|
memcpy (buf + 1, p, len);
|
||||||
|
|
||||||
|
hsa_sanitize_name (buf);
|
||||||
|
return buf;
|
||||||
|
}
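Taken together, the two helpers above turn a host symbol name into a BRIG function name by prefixing `&` and replacing every `.` and `-` with `_`. A minimal sketch of that transformation using `std::string` instead of a manually allocated buffer is shown below. `sanitize_name` and `brig_function_name` are hypothetical stand-ins, not the functions from the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for hsa_sanitize_name: replace '.' and '-'
// with '_' and return the result.
std::string
sanitize_name (std::string name)
{
  for (char &c : name)
    if (c == '.' || c == '-')
      c = '_';
  return name;
}

// Hypothetical stand-in for hsa_brig_function_name: prefix '&', then
// sanitize.  The '&' itself is unaffected by sanitization.
std::string
brig_function_name (const std::string &name)
{
  return sanitize_name ("&" + name);
}
```

For instance, `brig_function_name ("foo.bar-baz")` yields `&foo_bar_baz`. Unlike the buffer returned by the original helper, the returned string manages its own storage, so there is nothing for the caller to free.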
|
||||||
|
|
||||||
|
/* Return the name of DECL, or a generated anonymous name if it has none. */
|
||||||
|
|
||||||
|
const char *
|
||||||
|
hsa_get_declaration_name (tree decl)
|
||||||
|
{
|
||||||
|
if (!DECL_NAME (decl))
|
||||||
|
{
|
||||||
|
char buf[64];
|
||||||
|
snprintf (buf, 64, "__hsa_anonymous_%u", DECL_UID (decl));
|
||||||
|
const char *ggc_str = ggc_strdup (buf);
|
||||||
|
return ggc_str;
|
||||||
|
}
|
||||||
|
|
||||||
|
tree name_tree;
|
||||||
|
if (TREE_CODE (decl) == FUNCTION_DECL
|
||||||
|
|| (TREE_CODE (decl) == VAR_DECL && is_global_var (decl)))
|
||||||
|
name_tree = DECL_ASSEMBLER_NAME (decl);
|
||||||
|
else
|
||||||
|
name_tree = DECL_NAME (decl);
|
||||||
|
|
||||||
|
const char *name = IDENTIFIER_POINTER (name_tree);
|
||||||
|
/* User-defined assembly names have a prepended asterisk. */
|
||||||
|
if (name[0] == '*')
|
||||||
|
name++;
|
||||||
|
|
||||||
|
return name;
|
||||||
|
}
|
||||||
|
|
||||||
|
void
|
||||||
|
hsa_summary_t::link_functions (cgraph_node *gpu, cgraph_node *host,
|
||||||
|
hsa_function_kind kind, bool gridified_kernel_p)
|
||||||
|
{
|
||||||
|
hsa_function_summary *gpu_summary = get (gpu);
|
||||||
|
hsa_function_summary *host_summary = get (host);
|
||||||
|
|
||||||
|
gpu_summary->m_kind = kind;
|
||||||
|
host_summary->m_kind = kind;
|
||||||
|
|
||||||
|
gpu_summary->m_gpu_implementation_p = true;
|
||||||
|
host_summary->m_gpu_implementation_p = false;
|
||||||
|
|
||||||
|
gpu_summary->m_gridified_kernel_p = gridified_kernel_p;
|
||||||
|
host_summary->m_gridified_kernel_p = gridified_kernel_p;
|
||||||
|
|
||||||
|
gpu_summary->m_binded_function = host;
|
||||||
|
host_summary->m_binded_function = gpu;
|
||||||
|
|
||||||
|
tree gdecl = gpu->decl;
|
||||||
|
DECL_ATTRIBUTES (gdecl)
|
||||||
|
= tree_cons (get_identifier ("flatten"), NULL_TREE,
|
||||||
|
DECL_ATTRIBUTES (gdecl));
|
||||||
|
|
||||||
|
tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (gdecl);
|
||||||
|
if (fn_opts == NULL_TREE)
|
||||||
|
fn_opts = optimization_default_node;
|
||||||
|
fn_opts = copy_node (fn_opts);
|
||||||
|
TREE_OPTIMIZATION (fn_opts)->x_flag_tree_loop_vectorize = false;
|
||||||
|
TREE_OPTIMIZATION (fn_opts)->x_flag_tree_slp_vectorize = false;
|
||||||
|
DECL_FUNCTION_SPECIFIC_OPTIMIZATION (gdecl) = fn_opts;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Add a HOST function to HSA summaries. */
|
||||||
|
|
||||||
|
void
|
||||||
|
hsa_register_kernel (cgraph_node *host)
|
||||||
|
{
|
||||||
|
if (hsa_summaries == NULL)
|
||||||
|
hsa_summaries = new hsa_summary_t (symtab);
|
||||||
|
hsa_function_summary *s = hsa_summaries->get (host);
|
||||||
|
s->m_kind = HSA_KERNEL;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Add a pair of functions to HSA summaries. GPU is an HSA implementation of
|
||||||
|
a HOST function. */
|
||||||
|
|
||||||
|
void
|
||||||
|
hsa_register_kernel (cgraph_node *gpu, cgraph_node *host)
|
||||||
|
{
|
||||||
|
if (hsa_summaries == NULL)
|
||||||
|
hsa_summaries = new hsa_summary_t (symtab);
|
||||||
|
hsa_summaries->link_functions (gpu, host, HSA_KERNEL, true);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Return true if expansion of the current HSA function has already failed. */
|
||||||
|
|
||||||
|
bool
|
||||||
|
hsa_seen_error (void)
|
||||||
|
{
|
||||||
|
return hsa_cfun->m_seen_error;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Mark current HSA function as failed. */
|
||||||
|
|
||||||
|
void
|
||||||
|
hsa_fail_cfun (void)
|
||||||
|
{
|
||||||
|
hsa_failed_functions->add (hsa_cfun->m_decl);
|
||||||
|
hsa_cfun->m_seen_error = true;
|
||||||
|
}
|
||||||
|
|
||||||
|
char *
|
||||||
|
hsa_internal_fn::name ()
|
||||||
|
{
|
||||||
|
char *name = xstrdup (internal_fn_name (m_fn));
|
||||||
|
for (char *ptr = name; *ptr; ptr++)
|
||||||
|
*ptr = TOLOWER (*ptr);
|
||||||
|
|
||||||
|
const char *suffix = NULL;
|
||||||
|
if (m_type_bit_size == 32)
|
||||||
|
suffix = "f";
|
||||||
|
|
||||||
|
if (suffix)
|
||||||
|
{
|
||||||
|
char *name2 = concat (name, suffix, NULL);
|
||||||
|
free (name);
|
||||||
|
name = name2;
|
||||||
|
}
|
||||||
|
|
||||||
|
hsa_sanitize_name (name);
|
||||||
|
return name;
|
||||||
|
}
|
||||||
|
|
||||||
|
unsigned
|
||||||
|
hsa_internal_fn::get_arity ()
|
||||||
|
{
|
||||||
|
switch (m_fn)
|
||||||
|
{
|
||||||
|
case IFN_ACOS:
|
||||||
|
case IFN_ASIN:
|
||||||
|
case IFN_ATAN:
|
||||||
|
case IFN_COS:
|
||||||
|
case IFN_EXP:
|
||||||
|
case IFN_EXP10:
|
||||||
|
case IFN_EXP2:
|
||||||
|
case IFN_EXPM1:
|
||||||
|
case IFN_LOG:
|
||||||
|
case IFN_LOG10:
|
||||||
|
case IFN_LOG1P:
|
||||||
|
case IFN_LOG2:
|
||||||
|
case IFN_LOGB:
|
||||||
|
case IFN_SIGNIFICAND:
|
||||||
|
case IFN_SIN:
|
||||||
|
case IFN_SQRT:
|
||||||
|
case IFN_TAN:
|
||||||
|
case IFN_CEIL:
|
||||||
|
case IFN_FLOOR:
|
||||||
|
case IFN_NEARBYINT:
|
||||||
|
case IFN_RINT:
|
||||||
|
case IFN_ROUND:
|
||||||
|
case IFN_TRUNC:
|
||||||
|
return 1;
|
||||||
|
case IFN_ATAN2:
|
||||||
|
case IFN_COPYSIGN:
|
||||||
|
case IFN_FMOD:
|
||||||
|
case IFN_POW:
|
||||||
|
case IFN_REMAINDER:
|
||||||
|
case IFN_SCALB:
|
||||||
|
case IFN_LDEXP:
|
||||||
|
return 2;
|
||||||
|
break;
|
||||||
|
case IFN_CLRSB:
|
||||||
|
case IFN_CLZ:
|
||||||
|
case IFN_CTZ:
|
||||||
|
case IFN_FFS:
|
||||||
|
case IFN_PARITY:
|
||||||
|
case IFN_POPCOUNT:
|
||||||
|
default:
|
||||||
|
/* As we produce a sorry message for unknown internal functions,
|
||||||
|
reaching this label is definitely a bug. */
|
||||||
|
gcc_unreachable ();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
BrigType16_t
|
||||||
|
hsa_internal_fn::get_argument_type (int n)
|
||||||
|
{
|
||||||
|
switch (m_fn)
|
||||||
|
{
|
||||||
|
case IFN_ACOS:
|
||||||
|
case IFN_ASIN:
|
||||||
|
case IFN_ATAN:
|
||||||
|
case IFN_COS:
|
||||||
|
case IFN_EXP:
|
||||||
|
case IFN_EXP10:
|
||||||
|
case IFN_EXP2:
|
||||||
|
case IFN_EXPM1:
|
||||||
|
case IFN_LOG:
|
||||||
|
case IFN_LOG10:
|
||||||
|
case IFN_LOG1P:
|
||||||
|
case IFN_LOG2:
|
||||||
|
case IFN_LOGB:
|
||||||
|
case IFN_SIGNIFICAND:
|
||||||
|
case IFN_SIN:
|
||||||
|
case IFN_SQRT:
|
||||||
|
case IFN_TAN:
|
||||||
|
case IFN_CEIL:
|
||||||
|
case IFN_FLOOR:
|
||||||
|
case IFN_NEARBYINT:
|
||||||
|
case IFN_RINT:
|
||||||
|
case IFN_ROUND:
|
||||||
|
case IFN_TRUNC:
|
||||||
|
case IFN_ATAN2:
|
||||||
|
case IFN_COPYSIGN:
|
||||||
|
case IFN_FMOD:
|
||||||
|
case IFN_POW:
|
||||||
|
case IFN_REMAINDER:
|
||||||
|
case IFN_SCALB:
|
||||||
|
return hsa_float_for_bitsize (m_type_bit_size);
|
||||||
|
case IFN_LDEXP:
|
||||||
|
{
|
||||||
|
if (n == -1 || n == 0)
|
||||||
|
return hsa_float_for_bitsize (m_type_bit_size);
|
||||||
|
else
|
||||||
|
return BRIG_TYPE_S32;
|
||||||
|
}
|
||||||
|
default:
|
||||||
|
/* As we produce a sorry message for unknown internal functions,
|
||||||
|
reaching this label is definitely a bug. */
|
||||||
|
gcc_unreachable ();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
#include "gt-hsa.h"
|
||||||
|
|
@ -0,0 +1,331 @@
|
||||||
|
/* Interprocedural HSA pass: creation of HSA clones of functions.
|
||||||
|
Copyright (C) 2015-2016 Free Software Foundation, Inc.
|
||||||
|
Contributed by Martin Liska <mliska@suse.cz>
|
||||||
|
|
||||||
|
This file is part of GCC.
|
||||||
|
|
||||||
|
GCC is free software; you can redistribute it and/or modify it under
|
||||||
|
the terms of the GNU General Public License as published by the Free
|
||||||
|
Software Foundation; either version 3, or (at your option) any later
|
||||||
|
version.
|
||||||
|
|
||||||
|
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
|
||||||
|
WARRANTY; without even the implied warranty of MERCHANTABILITY or
|
||||||
|
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
|
||||||
|
for more details.
|
||||||
|
|
||||||
|
You should have received a copy of the GNU General Public License
|
||||||
|
along with GCC; see the file COPYING3. If not see
|
||||||
|
<http://www.gnu.org/licenses/>. */
|
||||||
|
|
||||||
|
/* Interprocedural HSA pass is responsible for creation of HSA clones.
|
||||||
|
For all these HSA clones, we emit HSAIL instructions and pass processing
|
||||||
|
is terminated. */
|
||||||
|
|
||||||
|
#include "config.h"
|
||||||
|
#include "system.h"
|
||||||
|
#include "coretypes.h"
|
||||||
|
#include "tm.h"
|
||||||
|
#include "is-a.h"
|
||||||
|
#include "hash-set.h"
|
||||||
|
#include "vec.h"
|
||||||
|
#include "tree.h"
|
||||||
|
#include "tree-pass.h"
|
||||||
|
#include "function.h"
|
||||||
|
#include "basic-block.h"
|
||||||
|
#include "gimple.h"
|
||||||
|
#include "dumpfile.h"
|
||||||
|
#include "gimple-pretty-print.h"
|
||||||
|
#include "tree-streamer.h"
|
||||||
|
#include "stringpool.h"
|
||||||
|
#include "cgraph.h"
|
||||||
|
#include "print-tree.h"
|
||||||
|
#include "symbol-summary.h"
|
||||||
|
#include "hsa.h"
|
||||||
|
|
||||||
|
namespace {
|
||||||
|
|
||||||
|
/* If NODE is not versionable, warn about not emitting HSAIL and return false.
|
||||||
|
Otherwise return true. */
|
||||||
|
|
||||||
|
static bool
|
||||||
|
check_warn_node_versionable (cgraph_node *node)
|
||||||
|
{
|
||||||
|
if (!node->local.versionable)
|
||||||
|
{
|
||||||
|
warning_at (EXPR_LOCATION (node->decl), OPT_Whsa,
|
||||||
|
"could not emit HSAIL for function %s: function cannot be "
|
||||||
|
"cloned", node->name ());
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* The function creates HSA clones for all functions that were either
|
||||||
|
marked as HSA kernels or are callable HSA functions. Apart from that,
|
||||||
|
we redirect all edges that come from an HSA clone and end in another
|
||||||
|
HSA clone to connect these two functions. */
|
||||||
|
|
||||||
|
static unsigned int
|
||||||
|
process_hsa_functions (void)
|
||||||
|
{
|
||||||
|
struct cgraph_node *node;
|
||||||
|
|
||||||
|
if (hsa_summaries == NULL)
|
||||||
|
hsa_summaries = new hsa_summary_t (symtab);
|
||||||
|
|
||||||
|
FOR_EACH_DEFINED_FUNCTION (node)
|
||||||
|
{
|
||||||
|
hsa_function_summary *s = hsa_summaries->get (node);
|
||||||
|
|
||||||
|
/* A linked function is skipped. */
|
||||||
|
if (s->m_binded_function != NULL)
|
||||||
|
continue;
|
||||||
|
|
||||||
|
if (s->m_kind != HSA_NONE)
|
||||||
|
{
|
||||||
|
if (!check_warn_node_versionable (node))
|
||||||
|
continue;
|
||||||
|
cgraph_node *clone
|
||||||
|
= node->create_virtual_clone (vec <cgraph_edge *> (),
|
||||||
|
NULL, NULL, "hsa");
|
||||||
|
TREE_PUBLIC (clone->decl) = TREE_PUBLIC (node->decl);
|
||||||
|
|
||||||
|
clone->force_output = true;
|
||||||
|
hsa_summaries->link_functions (clone, node, s->m_kind, false);
|
||||||
|
|
||||||
|
if (dump_file)
|
||||||
|
fprintf (dump_file, "Created a new HSA clone: %s, type: %s\n",
|
||||||
|
clone->name (),
|
||||||
|
s->m_kind == HSA_KERNEL ? "kernel" : "function");
|
||||||
|
}
|
||||||
|
else if (hsa_callable_function_p (node->decl))
|
||||||
|
{
|
||||||
|
if (!check_warn_node_versionable (node))
|
||||||
|
continue;
|
||||||
|
cgraph_node *clone
|
||||||
|
= node->create_virtual_clone (vec <cgraph_edge *> (),
|
||||||
|
NULL, NULL, "hsa");
|
||||||
|
TREE_PUBLIC (clone->decl) = TREE_PUBLIC (node->decl);
|
||||||
|
|
||||||
|
if (!cgraph_local_p (node))
|
||||||
|
clone->force_output = true;
|
||||||
|
hsa_summaries->link_functions (clone, node, HSA_FUNCTION, false);
|
||||||
|
|
||||||
|
if (dump_file)
|
||||||
|
fprintf (dump_file, "Created a new HSA function clone: %s\n",
|
||||||
|
clone->name ());
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Redirect all edges that are between HSA clones. */
|
||||||
|
FOR_EACH_DEFINED_FUNCTION (node)
|
||||||
|
{
|
||||||
|
cgraph_edge *e = node->callees;
|
||||||
|
|
||||||
|
while (e)
|
||||||
|
{
|
||||||
|
hsa_function_summary *src = hsa_summaries->get (node);
|
||||||
|
if (src->m_kind != HSA_NONE && src->m_gpu_implementation_p)
|
||||||
|
{
|
||||||
|
hsa_function_summary *dst = hsa_summaries->get (e->callee);
|
||||||
|
if (dst->m_kind != HSA_NONE && !dst->m_gpu_implementation_p)
|
||||||
|
{
|
||||||
|
e->redirect_callee (dst->m_binded_function);
|
||||||
|
if (dump_file)
|
||||||
|
fprintf (dump_file,
|
||||||
|
"Redirecting edge to HSA function: %s->%s\n",
|
||||||
|
xstrdup_for_dump (e->caller->name ()),
|
||||||
|
xstrdup_for_dump (e->callee->name ()));
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
e = e->next_callee;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Iterate all HSA functions and stream out HSA function summary. */
|
||||||
|
|
||||||
|
static void
|
||||||
|
ipa_hsa_write_summary (void)
|
||||||
|
{
|
||||||
|
struct bitpack_d bp;
|
||||||
|
struct cgraph_node *node;
|
||||||
|
struct output_block *ob;
|
||||||
|
unsigned int count = 0;
|
||||||
|
lto_symtab_encoder_iterator lsei;
|
||||||
|
lto_symtab_encoder_t encoder;
|
||||||
|
|
||||||
|
if (!hsa_summaries)
|
||||||
|
return;
|
||||||
|
|
||||||
|
ob = create_output_block (LTO_section_ipa_hsa);
|
||||||
|
encoder = ob->decl_state->symtab_node_encoder;
|
||||||
|
ob->symbol = NULL;
|
||||||
|
for (lsei = lsei_start_function_in_partition (encoder); !lsei_end_p (lsei);
|
||||||
|
lsei_next_function_in_partition (&lsei))
|
||||||
|
{
|
||||||
|
node = lsei_cgraph_node (lsei);
|
||||||
|
hsa_function_summary *s = hsa_summaries->get (node);
|
||||||
|
|
||||||
|
if (s->m_kind != HSA_NONE)
|
||||||
|
count++;
|
||||||
|
}
|
||||||
|
|
||||||
|
streamer_write_uhwi (ob, count);
|
||||||
|
|
||||||
|
/* Process all of the functions. */
|
||||||
|
for (lsei = lsei_start_function_in_partition (encoder); !lsei_end_p (lsei);
|
||||||
|
lsei_next_function_in_partition (&lsei))
|
||||||
|
{
|
||||||
|
node = lsei_cgraph_node (lsei);
|
||||||
|
hsa_function_summary *s = hsa_summaries->get (node);
|
||||||
|
|
||||||
|
if (s->m_kind != HSA_NONE)
|
||||||
|
{
|
||||||
|
encoder = ob->decl_state->symtab_node_encoder;
|
||||||
|
int node_ref = lto_symtab_encoder_encode (encoder, node);
|
||||||
|
streamer_write_uhwi (ob, node_ref);
|
||||||
|
|
||||||
|
bp = bitpack_create (ob->main_stream);
|
||||||
|
bp_pack_value (&bp, s->m_kind, 2);
|
||||||
|
bp_pack_value (&bp, s->m_gpu_implementation_p, 1);
|
||||||
|
bp_pack_value (&bp, s->m_binded_function != NULL, 1);
|
||||||
|
streamer_write_bitpack (&bp);
|
||||||
|
if (s->m_binded_function)
|
||||||
|
stream_write_tree (ob, s->m_binded_function->decl, true);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
streamer_write_char_stream (ob->main_stream, 0);
|
||||||
|
produce_asm (ob, NULL);
|
||||||
|
destroy_output_block (ob);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Read section in file FILE_DATA of length LEN with data DATA. */
|
||||||
|
|
||||||
|
static void
|
||||||
|
ipa_hsa_read_section (struct lto_file_decl_data *file_data, const char *data,
|
||||||
|
size_t len)
|
||||||
|
{
|
||||||
|
const struct lto_function_header *header
|
||||||
|
= (const struct lto_function_header *) data;
|
||||||
|
const int cfg_offset = sizeof (struct lto_function_header);
|
||||||
|
const int main_offset = cfg_offset + header->cfg_size;
|
||||||
|
const int string_offset = main_offset + header->main_size;
|
||||||
|
struct data_in *data_in;
|
||||||
|
unsigned int i;
|
||||||
|
unsigned int count;
|
||||||
|
|
||||||
|
lto_input_block ib_main ((const char *) data + main_offset,
|
||||||
|
header->main_size, file_data->mode_table);
|
||||||
|
|
||||||
|
data_in
|
||||||
|
= lto_data_in_create (file_data, (const char *) data + string_offset,
|
||||||
|
header->string_size, vNULL);
|
||||||
|
count = streamer_read_uhwi (&ib_main);
|
||||||
|
|
||||||
|
for (i = 0; i < count; i++)
|
||||||
|
{
|
||||||
|
unsigned int index;
|
||||||
|
struct cgraph_node *node;
|
||||||
|
lto_symtab_encoder_t encoder;
|
||||||
|
|
||||||
|
index = streamer_read_uhwi (&ib_main);
|
||||||
|
encoder = file_data->symtab_node_encoder;
|
||||||
|
node = dyn_cast<cgraph_node *> (lto_symtab_encoder_deref (encoder,
|
||||||
|
index));
|
||||||
|
gcc_assert (node->definition);
|
||||||
|
hsa_function_summary *s = hsa_summaries->get (node);
|
||||||
|
|
||||||
|
struct bitpack_d bp = streamer_read_bitpack (&ib_main);
|
||||||
|
s->m_kind = (hsa_function_kind) bp_unpack_value (&bp, 2);
|
||||||
|
s->m_gpu_implementation_p = bp_unpack_value (&bp, 1);
|
||||||
|
bool has_tree = bp_unpack_value (&bp, 1);
|
||||||
|
|
||||||
|
if (has_tree)
|
||||||
|
{
|
||||||
|
tree decl = stream_read_tree (&ib_main, data_in);
|
||||||
|
s->m_binded_function = cgraph_node::get_create (decl);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
lto_free_section_data (file_data, LTO_section_ipa_hsa, NULL, data,
|
||||||
|
len);
|
||||||
|
lto_data_in_delete (data_in);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Load streamed HSA functions summary and assign the summary to a function. */
|
||||||
|
|
||||||
|
static void
|
||||||
|
ipa_hsa_read_summary (void)
|
||||||
|
{
|
||||||
|
struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data ();
|
||||||
|
struct lto_file_decl_data *file_data;
|
||||||
|
unsigned int j = 0;
|
||||||
|
|
||||||
|
if (hsa_summaries == NULL)
|
||||||
|
hsa_summaries = new hsa_summary_t (symtab);
|
||||||
|
|
||||||
|
while ((file_data = file_data_vec[j++]))
|
||||||
|
{
|
||||||
|
size_t len;
|
||||||
|
const char *data = lto_get_section_data (file_data, LTO_section_ipa_hsa,
|
||||||
|
NULL, &len);
|
||||||
|
|
||||||
|
if (data)
|
||||||
|
ipa_hsa_read_section (file_data, data, len);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
const pass_data pass_data_ipa_hsa =
|
||||||
|
{
|
||||||
|
IPA_PASS, /* type */
|
||||||
|
"hsa", /* name */
|
||||||
|
OPTGROUP_NONE, /* optinfo_flags */
|
||||||
|
TV_IPA_HSA, /* tv_id */
|
||||||
|
0, /* properties_required */
|
||||||
|
0, /* properties_provided */
|
||||||
|
0, /* properties_destroyed */
|
||||||
|
0, /* todo_flags_start */
|
||||||
|
TODO_dump_symtab, /* todo_flags_finish */
|
||||||
|
};
|
||||||
|
|
||||||
|
class pass_ipa_hsa : public ipa_opt_pass_d
|
||||||
|
{
|
||||||
|
public:
|
||||||
|
pass_ipa_hsa (gcc::context *ctxt)
|
||||||
|
: ipa_opt_pass_d (pass_data_ipa_hsa, ctxt,
|
||||||
|
NULL, /* generate_summary */
|
||||||
|
ipa_hsa_write_summary, /* write_summary */
|
||||||
|
ipa_hsa_read_summary, /* read_summary */
|
||||||
|
ipa_hsa_write_summary, /* write_optimization_summary */
|
||||||
|
ipa_hsa_read_summary, /* read_optimization_summary */
|
||||||
|
NULL, /* stmt_fixup */
|
||||||
|
0, /* function_transform_todo_flags_start */
|
||||||
|
NULL, /* function_transform */
|
||||||
|
NULL) /* variable_transform */
|
||||||
|
{}
|
||||||
|
|
||||||
|
/* opt_pass methods: */
|
||||||
|
virtual bool gate (function *);
|
||||||
|
|
||||||
|
virtual unsigned int execute (function *) { return process_hsa_functions (); }
|
||||||
|
|
||||||
|
}; // class pass_ipa_hsa
|
||||||
|
|
||||||
|
bool
|
||||||
|
pass_ipa_hsa::gate (function *)
|
||||||
|
{
|
||||||
|
return hsa_gen_requested_p ();
|
||||||
|
}
|
||||||
|
|
||||||
|
} // anon namespace
|
||||||
|
|
||||||
|
ipa_opt_pass_d *
|
||||||
|
make_pass_ipa_hsa (gcc::context *ctxt)
|
||||||
|
{
|
||||||
|
return new pass_ipa_hsa (ctxt);
|
||||||
|
}
|
||||||
|
|
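The summary streaming in `ipa_hsa_write_summary` and `ipa_hsa_read_section` above packs three fields into four bits: the kind (2 bits), the GPU-implementation flag (1 bit) and a "has bound function" flag (1 bit). The following is a minimal standalone sketch of that round trip, not GCC's bitpack implementation. The type and function names are hypothetical, the enum values are assumed for illustration, and the least-significant-bit-first layout is an assumption as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for hsa_function_kind; the numeric values are
   assumed for illustration only.  */
enum hsa_kind_sketch { SK_NONE = 0, SK_KERNEL = 1, SK_FUNCTION = 2 };

/* Hypothetical mirror of the three streamed summary fields.  */
struct summary_bits
{
  enum hsa_kind_sketch kind;   /* Streamed in 2 bits.  */
  bool gpu_implementation_p;   /* Streamed in 1 bit.  */
  bool has_bound_function;     /* Streamed in 1 bit.  */
};

/* Pack the fields in the same order the writer uses: kind first, then the
   two flags.  Bit positions (LSB first) are an assumption.  */
uint8_t
pack_summary (const struct summary_bits *s)
{
  uint8_t word = 0;
  word |= (uint8_t) (s->kind & 0x3);
  word |= (uint8_t) ((s->gpu_implementation_p ? 1 : 0) << 2);
  word |= (uint8_t) ((s->has_bound_function ? 1 : 0) << 3);
  return word;
}

/* Unpack in the same order, as the reader must.  */
struct summary_bits
unpack_summary (uint8_t word)
{
  struct summary_bits s;
  s.kind = (enum hsa_kind_sketch) (word & 0x3);
  s.gpu_implementation_p = ((word >> 2) & 0x1) != 0;
  s.has_bound_function = ((word >> 3) & 0x1) != 0;
  return s;
}
```

The point of the sketch is that reader and writer must agree on field order and widths; the tree for the bound function is only streamed when the last flag is set.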
@ -51,7 +51,8 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] =
|
||||||
"ipcp_trans",
|
"ipcp_trans",
|
||||||
"icf",
|
"icf",
|
||||||
"offload_table",
|
"offload_table",
|
||||||
"mode_table"
|
"mode_table",
|
||||||
|
"hsa"
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -244,6 +244,7 @@ enum lto_section_type
|
||||||
LTO_section_ipa_icf,
|
LTO_section_ipa_icf,
|
||||||
LTO_section_offload_table,
|
LTO_section_offload_table,
|
||||||
LTO_section_mode_table,
|
LTO_section_mode_table,
|
||||||
|
LTO_section_ipa_hsa,
|
||||||
LTO_N_SECTION_TYPES /* Must be last. */
|
LTO_N_SECTION_TYPES /* Must be last. */
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -736,6 +736,7 @@ compile_images_for_offload_targets (unsigned in_argc, char *in_argv[],
|
||||||
return;
|
return;
|
||||||
unsigned num_targets = parse_env_var (target_names, &names, NULL);
|
unsigned num_targets = parse_env_var (target_names, &names, NULL);
|
||||||
|
|
||||||
|
int next_name_entry = 0;
|
||||||
const char *compiler_path = getenv ("COMPILER_PATH");
|
const char *compiler_path = getenv ("COMPILER_PATH");
|
||||||
if (!compiler_path)
|
if (!compiler_path)
|
||||||
goto out;
|
goto out;
|
||||||
|
|
@ -745,13 +746,19 @@ compile_images_for_offload_targets (unsigned in_argc, char *in_argv[],
|
||||||
offload_names = XCNEWVEC (char *, num_targets + 1);
|
offload_names = XCNEWVEC (char *, num_targets + 1);
|
||||||
for (unsigned i = 0; i < num_targets; i++)
|
for (unsigned i = 0; i < num_targets; i++)
|
||||||
{
|
{
|
||||||
offload_names[i]
|
/* HSA does not use LTO-like streaming or a separate offload compiler,
|
||||||
|
so skip it. */
|
||||||
|
if (strcmp (names[i], "hsa") == 0)
|
||||||
|
continue;
|
||||||
|
|
||||||
|
offload_names[next_name_entry]
|
||||||
= compile_offload_image (names[i], compiler_path, in_argc, in_argv,
|
= compile_offload_image (names[i], compiler_path, in_argc, in_argv,
|
||||||
compiler_opts, compiler_opt_count,
|
compiler_opts, compiler_opt_count,
|
||||||
linker_opts, linker_opt_count);
|
linker_opts, linker_opt_count);
|
||||||
if (!offload_names[i])
|
if (!offload_names[next_name_entry])
|
||||||
fatal_error (input_location,
|
fatal_error (input_location,
|
||||||
"problem with building target image for %s\n", names[i]);
|
"problem with building target image for %s\n", names[i]);
|
||||||
|
next_name_entry++;
|
||||||
}
|
}
|
||||||
|
|
||||||
out:
|
out:
|
||||||
|
|
|
||||||
|
|
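The lto-wrapper hunk above replaces the loop index `i` with a separate `next_name_entry` counter so that skipping `"hsa"` leaves no NULL hole in `offload_names`. A minimal sketch of the same compaction pattern, with a hypothetical function name and plain string copying standing in for `compile_offload_image`:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy every entry of NAMES except "hsa" into OUT,
   keeping OUT dense.  OUT must have room for COUNT + 1 pointers; the slot
   after the last written entry is set to NULL.  Returns the number of
   entries written.  */
size_t
filter_offload_names (const char *const *names, size_t count,
                      const char **out)
{
  size_t next = 0;   /* Plays the role of next_name_entry.  */
  for (size_t i = 0; i < count; i++)
    {
      if (strcmp (names[i], "hsa") == 0)
        continue;    /* Skipped entries do not advance NEXT.  */
      out[next++] = names[i];
    }
  out[next] = NULL;
  return next;
}
```

Indexing the output with the input index instead would leave a NULL entry in the middle of the array, which a consumer that stops at the first NULL would treat as the end of the list.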
@ -1,3 +1,10 @@
|
||||||
|
2016-01-19 Martin Liska <mliska@suse.cz>
|
||||||
|
Martin Jambor <mjambor@suse.cz>
|
||||||
|
|
||||||
|
* lto-partition.c: Include "hsa.h".
|
||||||
|
(add_symbol_to_partition_1): Put hsa implementations into the
|
||||||
|
same partition as host implementations.
|
||||||
|
|
||||||
2016-01-12 Jan Hubicka <hubicka@ucw.cz>
|
2016-01-12 Jan Hubicka <hubicka@ucw.cz>
|
||||||
|
|
||||||
PR lto/69003
|
PR lto/69003
|
||||||
|
|
|
||||||
|
|
@ -34,6 +34,7 @@ along with GCC; see the file COPYING3. If not see
|
||||||
#include "ipa-prop.h"
|
#include "ipa-prop.h"
|
||||||
#include "ipa-inline.h"
|
#include "ipa-inline.h"
|
||||||
#include "lto-partition.h"
|
#include "lto-partition.h"
|
||||||
|
#include "hsa.h"
|
||||||
|
|
||||||
vec<ltrans_partition> ltrans_partitions;
|
vec<ltrans_partition> ltrans_partitions;
|
||||||
|
|
||||||
|
|
@ -170,6 +171,24 @@ add_symbol_to_partition_1 (ltrans_partition part, symtab_node *node)
|
||||||
Therefore put it into the same partition. */
|
Therefore put it into the same partition. */
|
||||||
if (cnode->instrumented_version)
|
if (cnode->instrumented_version)
|
||||||
add_symbol_to_partition_1 (part, cnode->instrumented_version);
|
add_symbol_to_partition_1 (part, cnode->instrumented_version);
|
||||||
|
|
||||||
|
/* Add the HSA function associated with the symbol. */
|
||||||
|
if (hsa_summaries != NULL)
|
||||||
|
{
|
||||||
|
hsa_function_summary *s = hsa_summaries->get (cnode);
|
||||||
|
if (s->m_kind == HSA_KERNEL)
|
||||||
|
{
|
||||||
|
/* Add binded function. */
|
||||||
|
bool added = add_symbol_to_partition_1 (part,
|
||||||
|
s->m_binded_function);
|
||||||
|
gcc_assert (added);
|
||||||
|
if (symtab->dump_file)
|
||||||
|
fprintf (symtab->dump_file,
|
||||||
|
"adding an HSA function (host/gpu) to the "
|
||||||
|
"partition: %s\n",
|
||||||
|
s->m_binded_function->name ());
|
||||||
|
}
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
add_references_to_partition (part, node);
|
add_references_to_partition (part, node);
|
||||||
|
|
|
||||||
|
|
@ -340,8 +340,13 @@ DEF_GOMP_BUILTIN (BUILT_IN_GOMP_SINGLE_COPY_START, "GOMP_single_copy_start",
|
||||||
BT_FN_PTR, ATTR_NOTHROW_LEAF_LIST)
|
BT_FN_PTR, ATTR_NOTHROW_LEAF_LIST)
|
||||||
DEF_GOMP_BUILTIN (BUILT_IN_GOMP_SINGLE_COPY_END, "GOMP_single_copy_end",
|
DEF_GOMP_BUILTIN (BUILT_IN_GOMP_SINGLE_COPY_END, "GOMP_single_copy_end",
|
||||||
BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
|
BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
|
||||||
|
DEF_GOMP_BUILTIN (BUILT_IN_GOMP_OFFLOAD_REGISTER, "GOMP_offload_register_ver",
|
||||||
|
BT_FN_VOID_UINT_PTR_INT_PTR, ATTR_NOTHROW_LIST)
|
||||||
|
DEF_GOMP_BUILTIN (BUILT_IN_GOMP_OFFLOAD_UNREGISTER,
|
||||||
|
"GOMP_offload_unregister_ver",
|
||||||
|
BT_FN_VOID_UINT_PTR_INT_PTR, ATTR_NOTHROW_LIST)
|
||||||
DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TARGET, "GOMP_target_ext",
|
DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TARGET, "GOMP_target_ext",
|
||||||
BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT,
|
BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR,
|
||||||
ATTR_NOTHROW_LIST)
|
ATTR_NOTHROW_LIST)
|
||||||
DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TARGET_DATA, "GOMP_target_data_ext",
|
DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TARGET_DATA, "GOMP_target_data_ext",
|
||||||
BT_FN_VOID_INT_SIZE_PTR_PTR_PTR, ATTR_NOTHROW_LIST)
|
BT_FN_VOID_INT_SIZE_PTR_PTR_PTR, ATTR_NOTHROW_LIST)
|
||||||
|
|
|
||||||
1456
gcc/omp-low.c
File diff suppressed because it is too large
31
gcc/opts.c
|
|
@ -1916,8 +1916,35 @@ common_handle_option (struct gcc_options *opts,
|
||||||
break;
|
break;
|
||||||
|
|
||||||
case OPT_foffload_:
|
case OPT_foffload_:
|
||||||
/* Deferred. */
|
{
|
||||||
break;
|
const char *p = arg;
|
||||||
|
opts->x_flag_disable_hsa = true;
|
||||||
|
while (*p != 0)
|
||||||
|
{
|
||||||
|
const char *comma = strchr (p, ',');
|
||||||
|
|
||||||
|
if ((strncmp (p, "disable", 7) == 0)
|
||||||
|
&& (p[7] == ',' || p[7] == '\0'))
|
||||||
|
{
|
||||||
|
opts->x_flag_disable_hsa = true;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
|
||||||
|
if ((strncmp (p, "hsa", 3) == 0)
|
||||||
|
&& (p[3] == ',' || p[3] == '\0'))
|
||||||
|
{
|
||||||
|
#ifdef ENABLE_HSA
|
||||||
|
opts->x_flag_disable_hsa = false;
|
||||||
|
#else
|
||||||
|
sorry ("HSA has not been enabled during configuration");
|
||||||
|
#endif
|
||||||
|
}
|
||||||
|
if (!comma)
|
||||||
|
break;
|
||||||
|
p = comma + 1;
|
||||||
|
}
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
|
||||||
#ifndef ACCEL_COMPILER
|
#ifndef ACCEL_COMPILER
|
||||||
case OPT_foffload_abi_:
|
case OPT_foffload_abi_:
|
||||||
|
|
|
||||||
|
|
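The `OPT_foffload_` handling in the hunk above walks a comma-separated list: HSA starts out disabled, an `hsa` entry enables it, and a `disable` entry disables it and stops the scan. The following standalone sketch isolates that logic so it can be exercised outside the compiler. The function name is hypothetical, and the `ENABLE_HSA` / `sorry` branch is left out.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper mirroring the loop in common_handle_option: return
   true when the -foffload= argument ARG leaves HSA disabled.  */
bool
hsa_disabled_by_offload_arg (const char *arg)
{
  bool disable_hsa = true;   /* Any -foffload= option starts by disabling.  */
  const char *p = arg;

  while (*p != 0)
    {
      const char *comma = strchr (p, ',');

      /* An exact "disable" entry wins and ends the scan.  */
      if (strncmp (p, "disable", 7) == 0 && (p[7] == ',' || p[7] == '\0'))
        {
          disable_hsa = true;
          break;
        }

      /* An exact "hsa" entry re-enables HSA.  */
      if (strncmp (p, "hsa", 3) == 0 && (p[3] == ',' || p[3] == '\0'))
        disable_hsa = false;

      if (!comma)
        break;
      p = comma + 1;
    }
  return disable_hsa;
}
```

Checking the character after the matched prefix is what keeps a longer entry that merely begins with `hsa` from being mistaken for the `hsa` target.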
@ -1183,6 +1183,11 @@ DEFPARAM (PARAM_MAX_RTL_IF_CONVERSION_INSNS,
|
||||||
"Maximum number of insns in a basic block to consider for RTL "
|
"Maximum number of insns in a basic block to consider for RTL "
|
||||||
"if-conversion.",
|
"if-conversion.",
|
||||||
10, 0, 99)
|
10, 0, 99)
|
||||||
|
|
||||||
|
DEFPARAM (PARAM_HSA_GEN_DEBUG_STORES,
|
||||||
|
"hsa-gen-debug-stores",
|
||||||
|
"Level of hsa debug stores verbosity",
|
||||||
|
0, 0, 1)
|
||||||
/*
|
/*
|
||||||
|
|
||||||
Local variables:
|
Local variables:
|
||||||
|
|
|
||||||
|
|
@ -151,6 +151,7 @@ along with GCC; see the file COPYING3. If not see
|
||||||
NEXT_PASS (pass_ipa_cp);
|
NEXT_PASS (pass_ipa_cp);
|
||||||
NEXT_PASS (pass_ipa_cdtor_merge);
|
NEXT_PASS (pass_ipa_cdtor_merge);
|
||||||
NEXT_PASS (pass_target_clone);
|
NEXT_PASS (pass_target_clone);
|
||||||
|
NEXT_PASS (pass_ipa_hsa);
|
||||||
NEXT_PASS (pass_ipa_inline);
|
NEXT_PASS (pass_ipa_inline);
|
||||||
NEXT_PASS (pass_ipa_pure_const);
|
NEXT_PASS (pass_ipa_pure_const);
|
||||||
NEXT_PASS (pass_ipa_reference);
|
NEXT_PASS (pass_ipa_reference);
|
||||||
|
|
@ -386,6 +387,7 @@ along with GCC; see the file COPYING3. If not see
|
||||||
NEXT_PASS (pass_nrv);
|
NEXT_PASS (pass_nrv);
|
||||||
NEXT_PASS (pass_cleanup_cfg_post_optimizing);
|
NEXT_PASS (pass_cleanup_cfg_post_optimizing);
|
||||||
NEXT_PASS (pass_warn_function_noreturn);
|
NEXT_PASS (pass_warn_function_noreturn);
|
||||||
|
NEXT_PASS (pass_gen_hsail);
|
||||||
|
|
||||||
NEXT_PASS (pass_expand);
|
NEXT_PASS (pass_expand);
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -97,6 +97,7 @@ DEFTIMEVAR (TV_WHOPR_WPA_IO , "whopr wpa I/O")
|
||||||
DEFTIMEVAR (TV_WHOPR_PARTITIONING , "whopr partitioning")
|
DEFTIMEVAR (TV_WHOPR_PARTITIONING , "whopr partitioning")
|
||||||
DEFTIMEVAR (TV_WHOPR_LTRANS , "whopr ltrans")
|
DEFTIMEVAR (TV_WHOPR_LTRANS , "whopr ltrans")
|
||||||
DEFTIMEVAR (TV_IPA_REFERENCE , "ipa reference")
|
DEFTIMEVAR (TV_IPA_REFERENCE , "ipa reference")
|
||||||
|
DEFTIMEVAR (TV_IPA_HSA , "ipa HSA")
|
||||||
DEFTIMEVAR (TV_IPA_PROFILE , "ipa profile")
|
DEFTIMEVAR (TV_IPA_PROFILE , "ipa profile")
|
||||||
DEFTIMEVAR (TV_IPA_AUTOFDO , "auto profile")
|
DEFTIMEVAR (TV_IPA_AUTOFDO , "auto profile")
|
||||||
DEFTIMEVAR (TV_IPA_PURE_CONST , "ipa pure const")
|
DEFTIMEVAR (TV_IPA_PURE_CONST , "ipa pure const")
|
||||||
|
|
|
||||||
|
|
@ -75,6 +75,7 @@ along with GCC; see the file COPYING3. If not see
|
||||||
#include "gcse.h"
|
#include "gcse.h"
|
||||||
#include "tree-chkp.h"
|
#include "tree-chkp.h"
|
||||||
#include "omp-low.h"
|
#include "omp-low.h"
|
||||||
|
#include "hsa.h"
|
||||||
|
|
||||||
#if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO)
|
#if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO)
|
||||||
#include "dbxout.h"
|
#include "dbxout.h"
|
||||||
|
|
@ -518,6 +519,8 @@ compile_file (void)
|
||||||
|
|
||||||
omp_finish_file ();
|
omp_finish_file ();
|
||||||
|
|
||||||
|
hsa_output_brig ();
|
||||||
|
|
||||||
output_shared_constant_pool ();
|
output_shared_constant_pool ();
|
||||||
output_object_blocks ();
|
output_object_blocks ();
|
||||||
finish_tm_clone_pairs ();
|
finish_tm_clone_pairs ();
|
||||||
|
|
|
||||||
|
|
@ -458,7 +458,11 @@ enum omp_clause_code {
|
||||||
OMP_CLAUSE_VECTOR_LENGTH,
|
OMP_CLAUSE_VECTOR_LENGTH,
|
||||||
|
|
||||||
/* OpenACC clause: tile ( size-expr-list ). */
|
/* OpenACC clause: tile ( size-expr-list ). */
|
||||||
OMP_CLAUSE_TILE
|
OMP_CLAUSE_TILE,
|
||||||
|
|
||||||
|
/* OpenMP internal-only clause to specify grid dimensions of a gridified
|
||||||
|
kernel. */
|
||||||
|
OMP_CLAUSE__GRIDDIM_
|
||||||
};
|
};
|
||||||
|
|
||||||
#undef DEFTREESTRUCT
|
#undef DEFTREESTRUCT
|
||||||
|
|
@ -1375,6 +1379,9 @@ struct GTY(()) tree_omp_clause {
|
||||||
enum tree_code reduction_code;
|
enum tree_code reduction_code;
|
||||||
enum omp_clause_linear_kind linear_kind;
|
enum omp_clause_linear_kind linear_kind;
|
||||||
enum tree_code if_modifier;
|
enum tree_code if_modifier;
|
||||||
|
/* The dimension an OMP_CLAUSE__GRIDDIM_ clause of a gridified target
|
||||||
|
construct describes. */
|
||||||
|
unsigned int dimension;
|
||||||
} GTY ((skip)) subcode;
|
} GTY ((skip)) subcode;
|
||||||
|
|
||||||
/* The gimplification of OMP_CLAUSE_REDUCTION_{INIT,MERGE} for omp-low's
|
/* The gimplification of OMP_CLAUSE_REDUCTION_{INIT,MERGE} for omp-low's
|
||||||
|
|
|
||||||
|
|
@ -471,6 +471,7 @@ extern gimple_opt_pass *make_pass_sanopt (gcc::context *ctxt);
|
||||||
extern gimple_opt_pass *make_pass_oacc_kernels (gcc::context *ctxt);
|
extern gimple_opt_pass *make_pass_oacc_kernels (gcc::context *ctxt);
|
||||||
extern simple_ipa_opt_pass *make_pass_ipa_oacc (gcc::context *ctxt);
|
extern simple_ipa_opt_pass *make_pass_ipa_oacc (gcc::context *ctxt);
|
||||||
extern simple_ipa_opt_pass *make_pass_ipa_oacc_kernels (gcc::context *ctxt);
|
extern simple_ipa_opt_pass *make_pass_ipa_oacc_kernels (gcc::context *ctxt);
|
||||||
|
extern gimple_opt_pass *make_pass_gen_hsail (gcc::context *ctxt);
|
||||||
|
|
||||||
/* IPA Passes */
|
/* IPA Passes */
|
||||||
extern simple_ipa_opt_pass *make_pass_ipa_lower_emutls (gcc::context *ctxt);
|
extern simple_ipa_opt_pass *make_pass_ipa_lower_emutls (gcc::context *ctxt);
|
||||||
|
|
@ -495,6 +496,7 @@ extern ipa_opt_pass_d *make_pass_ipa_cp (gcc::context *ctxt);
|
||||||
extern ipa_opt_pass_d *make_pass_ipa_icf (gcc::context *ctxt);
|
extern ipa_opt_pass_d *make_pass_ipa_icf (gcc::context *ctxt);
|
||||||
extern ipa_opt_pass_d *make_pass_ipa_devirt (gcc::context *ctxt);
|
extern ipa_opt_pass_d *make_pass_ipa_devirt (gcc::context *ctxt);
|
||||||
extern ipa_opt_pass_d *make_pass_ipa_reference (gcc::context *ctxt);
|
extern ipa_opt_pass_d *make_pass_ipa_reference (gcc::context *ctxt);
|
||||||
|
extern ipa_opt_pass_d *make_pass_ipa_hsa (gcc::context *ctxt);
|
||||||
extern ipa_opt_pass_d *make_pass_ipa_pure_const (gcc::context *ctxt);
|
extern ipa_opt_pass_d *make_pass_ipa_pure_const (gcc::context *ctxt);
|
||||||
extern simple_ipa_opt_pass *make_pass_ipa_pta (gcc::context *ctxt);
|
extern simple_ipa_opt_pass *make_pass_ipa_pta (gcc::context *ctxt);
|
||||||
extern simple_ipa_opt_pass *make_pass_ipa_tm (gcc::context *ctxt);
|
extern simple_ipa_opt_pass *make_pass_ipa_tm (gcc::context *ctxt);
|
||||||
|
|
|
||||||
|
|
@ -942,6 +942,18 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, int flags)
|
||||||
pp_right_paren (pp);
|
pp_right_paren (pp);
|
||||||
break;
|
break;
|
||||||
|
|
||||||
|
case OMP_CLAUSE__GRIDDIM_:
|
||||||
|
pp_string (pp, "_griddim_(");
|
||||||
|
pp_unsigned_wide_integer (pp, OMP_CLAUSE__GRIDDIM__DIMENSION (clause));
|
||||||
|
pp_colon (pp);
|
||||||
|
dump_generic_node (pp, OMP_CLAUSE__GRIDDIM__SIZE (clause), spc, flags,
|
||||||
|
false);
|
||||||
|
pp_comma (pp);
|
||||||
|
dump_generic_node (pp, OMP_CLAUSE__GRIDDIM__GROUP (clause), spc, flags,
|
||||||
|
false);
|
||||||
|
pp_right_paren (pp);
|
||||||
|
break;
|
||||||
|
|
||||||
default:
|
default:
|
||||||
/* Should never happen. */
|
/* Should never happen. */
|
||||||
dump_generic_node (pp, clause, spc, flags, false);
|
dump_generic_node (pp, clause, spc, flags, false);
|
||||||
|
|
|
||||||
|
|
@ -328,6 +328,7 @@ unsigned const char omp_clause_num_ops[] =
|
||||||
1, /* OMP_CLAUSE_NUM_WORKERS */
|
1, /* OMP_CLAUSE_NUM_WORKERS */
|
||||||
1, /* OMP_CLAUSE_VECTOR_LENGTH */
|
1, /* OMP_CLAUSE_VECTOR_LENGTH */
|
||||||
1, /* OMP_CLAUSE_TILE */
|
1, /* OMP_CLAUSE_TILE */
|
||||||
|
2, /* OMP_CLAUSE__GRIDDIM_ */
|
||||||
};
|
};
|
||||||
|
|
||||||
const char * const omp_clause_code_name[] =
|
const char * const omp_clause_code_name[] =
|
||||||
|
|
@ -398,7 +399,8 @@ const char * const omp_clause_code_name[] =
|
||||||
"num_gangs",
|
"num_gangs",
|
||||||
"num_workers",
|
"num_workers",
|
||||||
"vector_length",
|
"vector_length",
|
||||||
"tile"
|
"tile",
|
||||||
|
"_griddim_"
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
||||||
|
|
@ -11744,6 +11746,7 @@ walk_tree_1 (tree *tp, walk_tree_fn func, void *data,
|
||||||
switch (OMP_CLAUSE_CODE (*tp))
|
switch (OMP_CLAUSE_CODE (*tp))
|
||||||
{
|
{
|
||||||
case OMP_CLAUSE_GANG:
|
case OMP_CLAUSE_GANG:
|
||||||
|
case OMP_CLAUSE__GRIDDIM_:
|
||||||
WALK_SUBTREE (OMP_CLAUSE_OPERAND (*tp, 1));
|
WALK_SUBTREE (OMP_CLAUSE_OPERAND (*tp, 1));
|
||||||
/* FALLTHRU */
|
/* FALLTHRU */
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -1636,6 +1636,14 @@ extern void protected_set_expr_location (tree, location_t);
|
||||||
#define OMP_CLAUSE_TILE_LIST(NODE) \
|
#define OMP_CLAUSE_TILE_LIST(NODE) \
|
||||||
OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 0)
|
OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 0)
|
||||||
|
|
||||||
|
#define OMP_CLAUSE__GRIDDIM__DIMENSION(NODE) \
|
||||||
|
(OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__GRIDDIM_)\
|
||||||
|
->omp_clause.subcode.dimension)
|
||||||
|
#define OMP_CLAUSE__GRIDDIM__SIZE(NODE) \
|
||||||
|
OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__GRIDDIM_), 0)
|
||||||
|
#define OMP_CLAUSE__GRIDDIM__GROUP(NODE) \
|
||||||
|
OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__GRIDDIM_), 1)
|
||||||
|
|
||||||
/* SSA_NAME accessors. */
|
/* SSA_NAME accessors. */
|
||||||
|
|
||||||
/* Returns the IDENTIFIER_NODE giving the SSA name a name or NULL_TREE
|
/* Returns the IDENTIFIER_NODE giving the SSA name a name or NULL_TREE
|
||||||
|
|
|
||||||
|
|
@ -1,3 +1,16 @@
|
||||||
|
2016-01-19 Martin Jambor <mjambor@suse.cz>
|
||||||
|
|
||||||
|
* gomp-constants.h (GOMP_DEVICE_HSA): New macro.
|
||||||
|
(GOMP_VERSION_HSA): Likewise.
|
||||||
|
(GOMP_TARGET_ARG_DEVICE_MASK): Likewise.
|
||||||
|
(GOMP_TARGET_ARG_DEVICE_ALL): Likewise.
|
||||||
|
(GOMP_TARGET_ARG_SUBSEQUENT_PARAM): Likewise.
|
||||||
|
(GOMP_TARGET_ARG_ID_MASK): Likewise.
|
||||||
|
(GOMP_TARGET_ARG_NUM_TEAMS): Likewise.
|
||||||
|
(GOMP_TARGET_ARG_THREAD_LIMIT): Likewise.
|
||||||
|
(GOMP_TARGET_ARG_VALUE_SHIFT): Likewise.
|
||||||
|
(GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES): Likewise.
|
||||||
|
|
||||||
2016-01-07 Mike Frysinger <vapier@gentoo.org>
|
2016-01-07 Mike Frysinger <vapier@gentoo.org>
|
||||||
|
|
||||||
* longlong.h: Change !__SHMEDIA__ to
|
* longlong.h: Change !__SHMEDIA__ to
|
||||||
|
|
|
||||||
|
|
@ -176,6 +176,7 @@ enum gomp_map_kind
|
||||||
#define GOMP_DEVICE_NOT_HOST 4
|
#define GOMP_DEVICE_NOT_HOST 4
|
||||||
#define GOMP_DEVICE_NVIDIA_PTX 5
|
#define GOMP_DEVICE_NVIDIA_PTX 5
|
||||||
#define GOMP_DEVICE_INTEL_MIC 6
|
#define GOMP_DEVICE_INTEL_MIC 6
|
||||||
|
#define GOMP_DEVICE_HSA 7
|
||||||
|
|
||||||
#define GOMP_DEVICE_ICV -1
|
#define GOMP_DEVICE_ICV -1
|
||||||
#define GOMP_DEVICE_HOST_FALLBACK -2
|
#define GOMP_DEVICE_HOST_FALLBACK -2
|
||||||
|
|
@ -201,6 +202,7 @@ enum gomp_map_kind
|
||||||
#define GOMP_VERSION 0
|
#define GOMP_VERSION 0
|
||||||
#define GOMP_VERSION_NVIDIA_PTX 1
|
#define GOMP_VERSION_NVIDIA_PTX 1
|
||||||
#define GOMP_VERSION_INTEL_MIC 0
|
#define GOMP_VERSION_INTEL_MIC 0
|
||||||
|
#define GOMP_VERSION_HSA 0
|
||||||
|
|
||||||
#define GOMP_VERSION_PACK(LIB, DEV) (((LIB) << 16) | (DEV))
|
#define GOMP_VERSION_PACK(LIB, DEV) (((LIB) << 16) | (DEV))
|
||||||
#define GOMP_VERSION_LIB(PACK) (((PACK) >> 16) & 0xffff)
|
#define GOMP_VERSION_LIB(PACK) (((PACK) >> 16) & 0xffff)
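A minimal sketch of how the packed version word in this hunk is put together, assuming only the macros shown above. `pack_version` and `version_dev` are hypothetical helpers written for illustration and are not part of the patch.

```c
#include <assert.h>

/* Copied from the hunk above.  */
#define GOMP_VERSION 0
#define GOMP_VERSION_HSA 0
#define GOMP_VERSION_PACK(LIB, DEV) (((LIB) << 16) | (DEV))
#define GOMP_VERSION_LIB(PACK) (((PACK) >> 16) & 0xffff)

/* Hypothetical helper: pack a library version and a device version,
   checking that each fits into its 16-bit field.  */
static unsigned
pack_version (unsigned lib, unsigned dev)
{
  assert (lib <= 0xffff && dev <= 0xffff);
  return GOMP_VERSION_PACK (lib, dev);
}

/* Hypothetical helper: extract the device version (low 16 bits).  */
static unsigned
version_dev (unsigned pack)
{
  return pack & 0xffff;
}
```

With both `GOMP_VERSION` and `GOMP_VERSION_HSA` at 0, the packed word for the HSA plugin is 0. The library version occupies the upper 16 bits and the device version the lower 16.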
|
||||||
|
|
@ -228,4 +230,30 @@ enum gomp_map_kind
|
||||||
#define GOMP_LAUNCH_OP(X) (((X) >> GOMP_LAUNCH_OP_SHIFT) & 0xffff)
|
#define GOMP_LAUNCH_OP(X) (((X) >> GOMP_LAUNCH_OP_SHIFT) & 0xffff)
|
||||||
#define GOMP_LAUNCH_OP_MAX 0xffff
|
#define GOMP_LAUNCH_OP_MAX 0xffff
|
||||||
|
|
||||||
|
/* Bitmask to apply in order to find out the intended device of a target
|
||||||
|
argument. */
|
||||||
|
#define GOMP_TARGET_ARG_DEVICE_MASK ((1 << 7) - 1)
|
||||||
|
/* The target argument is significant for all devices. */
|
||||||
|
#define GOMP_TARGET_ARG_DEVICE_ALL 0
|
||||||
|
|
||||||
|
/* Flag set when the value is stored in the subsequent element of the
|
||||||
|
device-specific argument values. */
|
||||||
|
#define GOMP_TARGET_ARG_SUBSEQUENT_PARAM (1 << 7)
|
||||||
|
|
||||||
|
/* Bitmask to apply to a target argument to find out the value identifier. */
|
||||||
|
#define GOMP_TARGET_ARG_ID_MASK (((1 << 8) - 1) << 8)
|
||||||
|
/* Target argument index of NUM_TEAMS. */
|
||||||
|
#define GOMP_TARGET_ARG_NUM_TEAMS (1 << 8)
|
||||||
|
/* Target argument index of THREAD_LIMIT. */
|
||||||
|
#define GOMP_TARGET_ARG_THREAD_LIMIT (2 << 8)
|
||||||
|
|
||||||
|
/* If the value is directly embedded in the target argument, it should be at
|
||||||
|
most 16 bits wide and shifted by this many bits. */
|
||||||
|
#define GOMP_TARGET_ARG_VALUE_SHIFT 16
|
||||||
|
|
||||||
|
/* HSA specific data structures. */
|
||||||
|
|
||||||
|
/* Identifiers of device-specific target arguments. */
|
||||||
|
#define GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES (1 << 8)
|
||||||
|
|
||||||
#endif
|
#endif
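The bit layout implied by the new `GOMP_TARGET_ARG_*` macros can be sketched as follows. This is an illustration built only from the definitions and comments in this hunk. The helper functions are hypothetical and do not appear in the patch, and the sketch restricts embedded values to the non-negative range 0..0x7fff for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the hunk above.  */
#define GOMP_TARGET_ARG_DEVICE_MASK ((1 << 7) - 1)
#define GOMP_TARGET_ARG_DEVICE_ALL 0
#define GOMP_TARGET_ARG_SUBSEQUENT_PARAM (1 << 7)
#define GOMP_TARGET_ARG_ID_MASK (((1 << 8) - 1) << 8)
#define GOMP_TARGET_ARG_NUM_TEAMS (1 << 8)
#define GOMP_TARGET_ARG_THREAD_LIMIT (2 << 8)
#define GOMP_TARGET_ARG_VALUE_SHIFT 16

/* Hypothetical helper: build a target argument that carries VALUE
   directly.  Bits 0-6 hold the device, bit 7 stays clear, bits 8-15
   hold the identifier, and the value sits above bit 15.  */
static intptr_t
encode_embedded_arg (int device, int id, int value)
{
  assert ((device & ~GOMP_TARGET_ARG_DEVICE_MASK) == 0);
  assert ((id & ~GOMP_TARGET_ARG_ID_MASK) == 0);
  assert (value >= 0 && value <= 0x7fff);
  return (intptr_t) device | id
	 | ((intptr_t) value << GOMP_TARGET_ARG_VALUE_SHIFT);
}

/* Hypothetical decoders, one per field.  */
static int
arg_device (intptr_t arg)
{
  return (int) (arg & GOMP_TARGET_ARG_DEVICE_MASK);
}

static int
arg_id (intptr_t arg)
{
  return (int) (arg & GOMP_TARGET_ARG_ID_MASK);
}

static int
arg_has_subsequent_value (intptr_t arg)
{
  return (arg & GOMP_TARGET_ARG_SUBSEQUENT_PARAM) != 0;
}

static int
arg_embedded_value (intptr_t arg)
{
  return (int) (arg >> GOMP_TARGET_ARG_VALUE_SHIFT);
}
```

For example, a `THREAD_LIMIT` of 256 meant for all devices encodes as `(256 << 16) | (2 << 8)`. When `GOMP_TARGET_ARG_SUBSEQUENT_PARAM` is set, a consumer would presumably read the value from the next array element instead of the upper bits.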
|
||||||
|
|
|
||||||
|
|
@ -1,3 +1,64 @@
|
||||||
|
2016-01-19 Martin Jambor <mjambor@suse.cz>
|
||||||
|
Martin Liska <mliska@suse.cz>
|
||||||
|
|
||||||
|
* plugin/Makefrag.am: Add HSA plugin requirements.
|
||||||
|
* plugin/configfrag.ac (HSA_RUNTIME_INCLUDE): New variable.
|
||||||
|
(HSA_RUNTIME_LIB): Likewise.
|
||||||
|
(HSA_RUNTIME_CPPFLAGS): Likewise.
|
||||||
|
(HSA_RUNTIME_INCLUDE): New substitution.
|
||||||
|
(HSA_RUNTIME_LIB): Likewise.
|
||||||
|
(HSA_RUNTIME_LDFLAGS): Likewise.
|
||||||
|
(hsa-runtime): New configure option.
|
||||||
|
(hsa-runtime-include): Likewise.
|
||||||
|
(hsa-runtime-lib): Likewise.
|
||||||
|
(PLUGIN_HSA): New substitution variable.
|
||||||
|
Fill HSA_RUNTIME_INCLUDE and HSA_RUNTIME_LIB according to the new
|
||||||
|
configure options.
|
||||||
|
(PLUGIN_HSA_CPPFLAGS): Likewise.
|
||||||
|
(PLUGIN_HSA_LDFLAGS): Likewise.
|
||||||
|
(PLUGIN_HSA_LIBS): Likewise.
|
||||||
|
Check that we have access to HSA run-time.
|
||||||
|
* libgomp-plugin.h (offload_target_type): New element
|
||||||
|
OFFLOAD_TARGET_TYPE_HSA.
|
||||||
|
* libgomp.h (gomp_target_task): New fields firstprivate_copies and
|
||||||
|
args.
|
||||||
|
(bool gomp_create_target_task): Updated.
|
||||||
|
(gomp_device_descr): Extra parameter of run_func and async_run_func,
|
||||||
|
new field can_run_func.
|
||||||
|
* libgomp_g.h (GOMP_target_ext): Update prototype.
|
||||||
|
* oacc-host.c (host_run): Added a new parameter args.
|
||||||
|
* target.c (calculate_firstprivate_requirements): New function.
|
||||||
|
(copy_firstprivate_data): Likewise.
|
||||||
|
(gomp_target_fallback_firstprivate): Use them.
|
||||||
|
(gomp_target_unshare_firstprivate): New function.
|
||||||
|
(gomp_get_target_fn_addr): Allow returning NULL for shared memory
|
||||||
|
devices.
|
||||||
|
(GOMP_target): Do host fallback for all shared memory devices. Do not
|
||||||
|
pass any args to plugins.
|
||||||
|
(GOMP_target_ext): Introduce device-specific argument parameter args.
|
||||||
|
Allow host fallback if device shares memory. Do not remap data if
|
||||||
|
device has shared memory.
|
||||||
|
(gomp_target_task_fn): Likewise. Also treat shared memory devices
|
||||||
|
like host fallback for mappings.
|
||||||
|
(GOMP_target_data): Treat shared memory devices like host fallback.
|
||||||
|
(GOMP_target_data_ext): Likewise.
|
||||||
|
(GOMP_target_update): Likewise.
|
||||||
|
(GOMP_target_update_ext): Likewise. Also pass NULL as args to
|
||||||
|
gomp_create_target_task.
|
||||||
|
(GOMP_target_enter_exit_data): Likewise.
|
||||||
|
(omp_target_alloc): Treat shared memory devices like host fallback.
|
||||||
|
(omp_target_free): Likewise.
|
||||||
|
(omp_target_is_present): Likewise.
|
||||||
|
(omp_target_memcpy): Likewise.
|
||||||
|
(omp_target_memcpy_rect): Likewise.
|
||||||
|
(omp_target_associate_ptr): Likewise.
|
||||||
|
(gomp_load_plugin_for_device): Also load can_run.
|
||||||
|
* task.c (GOMP_PLUGIN_target_task_completion): Free
|
||||||
|
firstprivate_copies.
|
||||||
|
(gomp_create_target_task): Accept new argument args and store it to
|
||||||
|
ttask.
|
||||||
|
* plugin/plugin-hsa.c: New file.
|
||||||
|
|
||||||
2016-01-18 Tom de Vries <tom@codesourcery.com>
|
2016-01-18 Tom de Vries <tom@codesourcery.com>
|
||||||
|
|
||||||
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: New test.
|
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: New test.
|
||||||
|
|
|
||||||
|
|
@ -17,7 +17,7 @@
|
||||||
|
|
||||||
# Plugins for offload execution, Makefile.am fragment.
|
# Plugins for offload execution, Makefile.am fragment.
|
||||||
#
|
#
|
||||||
# Copyright (C) 2014-2015 Free Software Foundation, Inc.
|
# Copyright (C) 2014-2016 Free Software Foundation, Inc.
|
||||||
#
|
#
|
||||||
# Contributed by Mentor Embedded.
|
# Contributed by Mentor Embedded.
|
||||||
#
|
#
|
||||||
|
|
@ -89,7 +89,8 @@ DIST_COMMON = $(top_srcdir)/plugin/Makefrag.am ChangeLog \
|
||||||
$(srcdir)/omp_lib.f90.in $(srcdir)/libgomp_f.h.in \
|
$(srcdir)/omp_lib.f90.in $(srcdir)/libgomp_f.h.in \
|
||||||
$(srcdir)/libgomp.spec.in $(srcdir)/../depcomp
|
$(srcdir)/libgomp.spec.in $(srcdir)/../depcomp
|
||||||
@PLUGIN_NVPTX_TRUE@am__append_1 = libgomp-plugin-nvptx.la
|
@PLUGIN_NVPTX_TRUE@am__append_1 = libgomp-plugin-nvptx.la
|
||||||
@USE_FORTRAN_TRUE@am__append_2 = openacc.f90
|
@PLUGIN_HSA_TRUE@am__append_2 = libgomp-plugin-hsa.la
|
||||||
|
@USE_FORTRAN_TRUE@am__append_3 = openacc.f90
|
||||||
subdir = .
|
subdir = .
|
||||||
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
|
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
|
||||||
am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \
|
am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \
|
||||||
|
|
@ -147,6 +148,17 @@ am__installdirs = "$(DESTDIR)$(toolexeclibdir)" "$(DESTDIR)$(infodir)" \
|
||||||
"$(DESTDIR)$(toolexeclibdir)"
|
"$(DESTDIR)$(toolexeclibdir)"
|
||||||
LTLIBRARIES = $(toolexeclib_LTLIBRARIES)
|
LTLIBRARIES = $(toolexeclib_LTLIBRARIES)
|
||||||
am__DEPENDENCIES_1 =
|
am__DEPENDENCIES_1 =
|
||||||
|
@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_DEPENDENCIES = libgomp.la \
|
||||||
|
@PLUGIN_HSA_TRUE@ $(am__DEPENDENCIES_1)
|
||||||
|
@PLUGIN_HSA_TRUE@am_libgomp_plugin_hsa_la_OBJECTS = \
|
||||||
|
@PLUGIN_HSA_TRUE@ libgomp_plugin_hsa_la-plugin-hsa.lo
|
||||||
|
libgomp_plugin_hsa_la_OBJECTS = $(am_libgomp_plugin_hsa_la_OBJECTS)
|
||||||
|
libgomp_plugin_hsa_la_LINK = $(LIBTOOL) --tag=CC \
|
||||||
|
$(libgomp_plugin_hsa_la_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
|
||||||
|
--mode=link $(CCLD) $(AM_CFLAGS) $(CFLAGS) \
|
||||||
|
$(libgomp_plugin_hsa_la_LDFLAGS) $(LDFLAGS) -o $@
|
||||||
|
@PLUGIN_HSA_TRUE@am_libgomp_plugin_hsa_la_rpath = -rpath \
|
||||||
|
@PLUGIN_HSA_TRUE@ $(toolexeclibdir)
|
||||||
@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_DEPENDENCIES = libgomp.la \
|
@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_DEPENDENCIES = libgomp.la \
|
||||||
@PLUGIN_NVPTX_TRUE@ $(am__DEPENDENCIES_1)
|
@PLUGIN_NVPTX_TRUE@ $(am__DEPENDENCIES_1)
|
||||||
@PLUGIN_NVPTX_TRUE@am_libgomp_plugin_nvptx_la_OBJECTS = \
|
@PLUGIN_NVPTX_TRUE@am_libgomp_plugin_nvptx_la_OBJECTS = \
|
||||||
|
|
@ -187,7 +199,8 @@ FCLD = $(FC)
|
||||||
FCLINK = $(LIBTOOL) --tag=FC $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
|
FCLINK = $(LIBTOOL) --tag=FC $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
|
||||||
--mode=link $(FCLD) $(AM_FCFLAGS) $(FCFLAGS) $(AM_LDFLAGS) \
|
--mode=link $(FCLD) $(AM_FCFLAGS) $(FCFLAGS) $(AM_LDFLAGS) \
|
||||||
$(LDFLAGS) -o $@
|
$(LDFLAGS) -o $@
|
||||||
SOURCES = $(libgomp_plugin_nvptx_la_SOURCES) $(libgomp_la_SOURCES)
|
SOURCES = $(libgomp_plugin_hsa_la_SOURCES) \
|
||||||
|
$(libgomp_plugin_nvptx_la_SOURCES) $(libgomp_la_SOURCES)
|
||||||
MULTISRCTOP =
|
MULTISRCTOP =
|
||||||
MULTIBUILDTOP =
|
MULTIBUILDTOP =
|
||||||
MULTIDIRS =
|
MULTIDIRS =
|
||||||
|
|
@ -255,6 +268,8 @@ FC = @FC@
|
||||||
FCFLAGS = @FCFLAGS@
|
FCFLAGS = @FCFLAGS@
|
||||||
FGREP = @FGREP@
|
FGREP = @FGREP@
|
||||||
GREP = @GREP@
|
GREP = @GREP@
|
||||||
|
HSA_RUNTIME_INCLUDE = @HSA_RUNTIME_INCLUDE@
|
||||||
|
HSA_RUNTIME_LIB = @HSA_RUNTIME_LIB@
|
||||||
INSTALL = @INSTALL@
|
INSTALL = @INSTALL@
|
||||||
INSTALL_DATA = @INSTALL_DATA@
|
INSTALL_DATA = @INSTALL_DATA@
|
||||||
INSTALL_PROGRAM = @INSTALL_PROGRAM@
|
INSTALL_PROGRAM = @INSTALL_PROGRAM@
|
||||||
|
|
@ -299,6 +314,10 @@ PACKAGE_URL = @PACKAGE_URL@
|
||||||
PACKAGE_VERSION = @PACKAGE_VERSION@
|
PACKAGE_VERSION = @PACKAGE_VERSION@
|
||||||
PATH_SEPARATOR = @PATH_SEPARATOR@
|
PATH_SEPARATOR = @PATH_SEPARATOR@
|
||||||
PERL = @PERL@
|
PERL = @PERL@
|
||||||
|
PLUGIN_HSA = @PLUGIN_HSA@
|
||||||
|
PLUGIN_HSA_CPPFLAGS = @PLUGIN_HSA_CPPFLAGS@
|
||||||
|
PLUGIN_HSA_LDFLAGS = @PLUGIN_HSA_LDFLAGS@
|
||||||
|
PLUGIN_HSA_LIBS = @PLUGIN_HSA_LIBS@
|
||||||
PLUGIN_NVPTX = @PLUGIN_NVPTX@
|
PLUGIN_NVPTX = @PLUGIN_NVPTX@
|
||||||
PLUGIN_NVPTX_CPPFLAGS = @PLUGIN_NVPTX_CPPFLAGS@
|
PLUGIN_NVPTX_CPPFLAGS = @PLUGIN_NVPTX_CPPFLAGS@
|
||||||
PLUGIN_NVPTX_LDFLAGS = @PLUGIN_NVPTX_LDFLAGS@
|
PLUGIN_NVPTX_LDFLAGS = @PLUGIN_NVPTX_LDFLAGS@
|
||||||
|
|
@ -391,7 +410,7 @@ libsubincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include
|
||||||
AM_CPPFLAGS = $(addprefix -I, $(search_path))
|
AM_CPPFLAGS = $(addprefix -I, $(search_path))
|
||||||
AM_CFLAGS = $(XCFLAGS)
|
AM_CFLAGS = $(XCFLAGS)
|
||||||
AM_LDFLAGS = $(XLDFLAGS) $(SECTION_LDFLAGS) $(OPT_LDFLAGS)
|
AM_LDFLAGS = $(XLDFLAGS) $(SECTION_LDFLAGS) $(OPT_LDFLAGS)
|
||||||
toolexeclib_LTLIBRARIES = libgomp.la $(am__append_1)
|
toolexeclib_LTLIBRARIES = libgomp.la $(am__append_1) $(am__append_2)
|
||||||
nodist_toolexeclib_HEADERS = libgomp.spec
|
nodist_toolexeclib_HEADERS = libgomp.spec
|
||||||
|
|
||||||
# -Wc is only a libtool option.
|
# -Wc is only a libtool option.
|
||||||
|
|
@ -415,7 +434,7 @@ libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \
|
||||||
bar.c ptrlock.c time.c fortran.c affinity.c target.c \
|
bar.c ptrlock.c time.c fortran.c affinity.c target.c \
|
||||||
splay-tree.c libgomp-plugin.c oacc-parallel.c oacc-host.c \
|
splay-tree.c libgomp-plugin.c oacc-parallel.c oacc-host.c \
|
||||||
oacc-init.c oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c \
|
oacc-init.c oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c \
|
||||||
priority_queue.c $(am__append_2)
|
priority_queue.c $(am__append_3)
|
||||||
|
|
||||||
# Nvidia PTX OpenACC plugin.
|
# Nvidia PTX OpenACC plugin.
|
||||||
@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_version_info = -version-info $(libtool_VERSION)
|
@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_version_info = -version-info $(libtool_VERSION)
|
||||||
|
|
@ -426,6 +445,16 @@ libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \
|
||||||
@PLUGIN_NVPTX_TRUE@ $(lt_host_flags) $(PLUGIN_NVPTX_LDFLAGS)
|
@PLUGIN_NVPTX_TRUE@ $(lt_host_flags) $(PLUGIN_NVPTX_LDFLAGS)
|
||||||
@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LIBADD = libgomp.la $(PLUGIN_NVPTX_LIBS)
|
@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LIBADD = libgomp.la $(PLUGIN_NVPTX_LIBS)
|
||||||
@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LIBTOOLFLAGS = --tag=disable-static
|
@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LIBTOOLFLAGS = --tag=disable-static
|
||||||
|
|
||||||
|
# Heterogeneous System Architecture plugin
|
||||||
|
@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_version_info = -version-info $(libtool_VERSION)
|
||||||
|
@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_SOURCES = plugin/plugin-hsa.c
|
||||||
|
@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_CPPFLAGS = $(AM_CPPFLAGS) $(PLUGIN_HSA_CPPFLAGS)
|
||||||
|
@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_LDFLAGS = \
|
||||||
|
@PLUGIN_HSA_TRUE@ $(libgomp_plugin_hsa_version_info) \
|
||||||
|
@PLUGIN_HSA_TRUE@ $(lt_host_flags) $(PLUGIN_HSA_LDFLAGS)
|
||||||
|
@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_LIBADD = libgomp.la $(PLUGIN_HSA_LIBS)
|
||||||
|
@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_LIBTOOLFLAGS = --tag=disable-static
|
||||||
nodist_noinst_HEADERS = libgomp_f.h
|
nodist_noinst_HEADERS = libgomp_f.h
|
||||||
nodist_libsubinclude_HEADERS = omp.h openacc.h
|
nodist_libsubinclude_HEADERS = omp.h openacc.h
|
||||||
@USE_FORTRAN_TRUE@nodist_finclude_HEADERS = omp_lib.h omp_lib.f90 omp_lib.mod omp_lib_kinds.mod \
|
@USE_FORTRAN_TRUE@nodist_finclude_HEADERS = omp_lib.h omp_lib.f90 omp_lib.mod omp_lib_kinds.mod \
|
||||||
|
|
@ -553,6 +582,8 @@ clean-toolexeclibLTLIBRARIES:
|
||||||
echo "rm -f \"$${dir}/so_locations\""; \
|
echo "rm -f \"$${dir}/so_locations\""; \
|
||||||
rm -f "$${dir}/so_locations"; \
|
rm -f "$${dir}/so_locations"; \
|
||||||
done
|
done
|
||||||
|
libgomp-plugin-hsa.la: $(libgomp_plugin_hsa_la_OBJECTS) $(libgomp_plugin_hsa_la_DEPENDENCIES) $(EXTRA_libgomp_plugin_hsa_la_DEPENDENCIES)
|
||||||
|
$(libgomp_plugin_hsa_la_LINK) $(am_libgomp_plugin_hsa_la_rpath) $(libgomp_plugin_hsa_la_OBJECTS) $(libgomp_plugin_hsa_la_LIBADD) $(LIBS)
|
||||||
libgomp-plugin-nvptx.la: $(libgomp_plugin_nvptx_la_OBJECTS) $(libgomp_plugin_nvptx_la_DEPENDENCIES) $(EXTRA_libgomp_plugin_nvptx_la_DEPENDENCIES)
|
libgomp-plugin-nvptx.la: $(libgomp_plugin_nvptx_la_OBJECTS) $(libgomp_plugin_nvptx_la_DEPENDENCIES) $(EXTRA_libgomp_plugin_nvptx_la_DEPENDENCIES)
|
||||||
$(libgomp_plugin_nvptx_la_LINK) $(am_libgomp_plugin_nvptx_la_rpath) $(libgomp_plugin_nvptx_la_OBJECTS) $(libgomp_plugin_nvptx_la_LIBADD) $(LIBS)
|
$(libgomp_plugin_nvptx_la_LINK) $(am_libgomp_plugin_nvptx_la_rpath) $(libgomp_plugin_nvptx_la_OBJECTS) $(libgomp_plugin_nvptx_la_LIBADD) $(LIBS)
|
||||||
libgomp.la: $(libgomp_la_OBJECTS) $(libgomp_la_DEPENDENCIES) $(EXTRA_libgomp_la_DEPENDENCIES)
|
libgomp.la: $(libgomp_la_OBJECTS) $(libgomp_la_DEPENDENCIES) $(EXTRA_libgomp_la_DEPENDENCIES)
|
||||||
|
|
@ -575,6 +606,7 @@ distclean-compile:
|
||||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/iter.Plo@am__quote@
|
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/iter.Plo@am__quote@
|
||||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/iter_ull.Plo@am__quote@
|
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/iter_ull.Plo@am__quote@
|
||||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp-plugin.Plo@am__quote@
|
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp-plugin.Plo@am__quote@
|
||||||
|
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp_plugin_hsa_la-plugin-hsa.Plo@am__quote@
|
||||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Plo@am__quote@
|
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Plo@am__quote@
|
||||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/lock.Plo@am__quote@
|
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/lock.Plo@am__quote@
|
||||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/loop.Plo@am__quote@
|
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/loop.Plo@am__quote@
|
||||||
|
|
@ -623,6 +655,13 @@ distclean-compile:
|
||||||
@AMDEP_TRUE@@am__fastdepCC_FALSE@ DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) @AMDEPBACKSLASH@
|
@AMDEP_TRUE@@am__fastdepCC_FALSE@ DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) @AMDEPBACKSLASH@
|
||||||
@am__fastdepCC_FALSE@ $(LTCOMPILE) -c -o $@ $<
|
@am__fastdepCC_FALSE@ $(LTCOMPILE) -c -o $@ $<
|
||||||
|
|
||||||
|
libgomp_plugin_hsa_la-plugin-hsa.lo: plugin/plugin-hsa.c
|
||||||
|
@am__fastdepCC_TRUE@ $(LIBTOOL) --tag=CC $(libgomp_plugin_hsa_la_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=compile $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(libgomp_plugin_hsa_la_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -MT libgomp_plugin_hsa_la-plugin-hsa.lo -MD -MP -MF $(DEPDIR)/libgomp_plugin_hsa_la-plugin-hsa.Tpo -c -o libgomp_plugin_hsa_la-plugin-hsa.lo `test -f 'plugin/plugin-hsa.c' || echo '$(srcdir)/'`plugin/plugin-hsa.c
|
||||||
|
@am__fastdepCC_TRUE@ $(am__mv) $(DEPDIR)/libgomp_plugin_hsa_la-plugin-hsa.Tpo $(DEPDIR)/libgomp_plugin_hsa_la-plugin-hsa.Plo
|
||||||
|
@AMDEP_TRUE@@am__fastdepCC_FALSE@ source='plugin/plugin-hsa.c' object='libgomp_plugin_hsa_la-plugin-hsa.lo' libtool=yes @AMDEPBACKSLASH@
|
||||||
|
@AMDEP_TRUE@@am__fastdepCC_FALSE@ DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) @AMDEPBACKSLASH@
|
||||||
|
@am__fastdepCC_FALSE@ $(LIBTOOL) --tag=CC $(libgomp_plugin_hsa_la_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=compile $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(libgomp_plugin_hsa_la_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c -o libgomp_plugin_hsa_la-plugin-hsa.lo `test -f 'plugin/plugin-hsa.c' || echo '$(srcdir)/'`plugin/plugin-hsa.c
|
||||||
|
|
||||||
libgomp_plugin_nvptx_la-plugin-nvptx.lo: plugin/plugin-nvptx.c
|
libgomp_plugin_nvptx_la-plugin-nvptx.lo: plugin/plugin-nvptx.c
|
||||||
@am__fastdepCC_TRUE@ $(LIBTOOL) --tag=CC $(libgomp_plugin_nvptx_la_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=compile $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(libgomp_plugin_nvptx_la_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -MT libgomp_plugin_nvptx_la-plugin-nvptx.lo -MD -MP -MF $(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Tpo -c -o libgomp_plugin_nvptx_la-plugin-nvptx.lo `test -f 'plugin/plugin-nvptx.c' || echo '$(srcdir)/'`plugin/plugin-nvptx.c
|
@am__fastdepCC_TRUE@ $(LIBTOOL) --tag=CC $(libgomp_plugin_nvptx_la_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=compile $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(libgomp_plugin_nvptx_la_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -MT libgomp_plugin_nvptx_la-plugin-nvptx.lo -MD -MP -MF $(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Tpo -c -o libgomp_plugin_nvptx_la-plugin-nvptx.lo `test -f 'plugin/plugin-nvptx.c' || echo '$(srcdir)/'`plugin/plugin-nvptx.c
|
||||||
@am__fastdepCC_TRUE@ $(am__mv) $(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Tpo $(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Plo
|
@am__fastdepCC_TRUE@ $(am__mv) $(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Tpo $(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Plo
|
||||||
|
|
|
||||||
|
|
@ -60,6 +60,9 @@
|
||||||
/* Define to 1 if you have the `strtoull' function. */
|
/* Define to 1 if you have the `strtoull' function. */
|
||||||
#undef HAVE_STRTOULL
|
#undef HAVE_STRTOULL
|
||||||
|
|
||||||
|
/* Define to 1 if the system has the type `struct _Mutex_Control'. */
|
||||||
|
#undef HAVE_STRUCT__MUTEX_CONTROL
|
||||||
|
|
||||||
/* Define to 1 if the target runtime linker supports binding the same symbol
|
/* Define to 1 if the target runtime linker supports binding the same symbol
|
||||||
to different versions. */
|
to different versions. */
|
||||||
#undef HAVE_SYMVER_SYMBOL_RENAMING_RUNTIME_SUPPORT
|
#undef HAVE_SYMVER_SYMBOL_RENAMING_RUNTIME_SUPPORT
|
||||||
|
|
@ -119,6 +122,9 @@
|
||||||
/* Define to the version of this package. */
|
/* Define to the version of this package. */
|
||||||
#undef PACKAGE_VERSION
|
#undef PACKAGE_VERSION
|
||||||
|
|
||||||
|
/* Define to 1 if the HSA plugin is built, 0 if not. */
|
||||||
|
#undef PLUGIN_HSA
|
||||||
|
|
||||||
/* Define to 1 if the NVIDIA plugin is built, 0 if not. */
|
/* Define to 1 if the NVIDIA plugin is built, 0 if not. */
|
||||||
#undef PLUGIN_NVPTX
|
#undef PLUGIN_NVPTX
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -627,10 +627,18 @@ LIBGOMP_BUILD_VERSIONED_SHLIB_FALSE
|
||||||
LIBGOMP_BUILD_VERSIONED_SHLIB_TRUE
|
LIBGOMP_BUILD_VERSIONED_SHLIB_TRUE
|
||||||
OPT_LDFLAGS
|
OPT_LDFLAGS
|
||||||
SECTION_LDFLAGS
|
SECTION_LDFLAGS
|
||||||
|
PLUGIN_HSA_FALSE
|
||||||
|
PLUGIN_HSA_TRUE
|
||||||
PLUGIN_NVPTX_FALSE
|
PLUGIN_NVPTX_FALSE
|
||||||
PLUGIN_NVPTX_TRUE
|
PLUGIN_NVPTX_TRUE
|
||||||
offload_additional_lib_paths
|
offload_additional_lib_paths
|
||||||
offload_additional_options
|
offload_additional_options
|
||||||
|
PLUGIN_HSA_LIBS
|
||||||
|
PLUGIN_HSA_LDFLAGS
|
||||||
|
PLUGIN_HSA_CPPFLAGS
|
||||||
|
PLUGIN_HSA
|
||||||
|
HSA_RUNTIME_LIB
|
||||||
|
HSA_RUNTIME_INCLUDE
|
||||||
PLUGIN_NVPTX_LIBS
|
PLUGIN_NVPTX_LIBS
|
||||||
PLUGIN_NVPTX_LDFLAGS
|
PLUGIN_NVPTX_LDFLAGS
|
||||||
PLUGIN_NVPTX_CPPFLAGS
|
PLUGIN_NVPTX_CPPFLAGS
|
||||||
|
|
@ -782,6 +790,10 @@ enable_maintainer_mode
|
||||||
with_cuda_driver
|
with_cuda_driver
|
||||||
with_cuda_driver_include
|
with_cuda_driver_include
|
||||||
with_cuda_driver_lib
|
with_cuda_driver_lib
|
||||||
|
with_hsa_runtime
|
||||||
|
with_hsa_runtime_include
|
||||||
|
with_hsa_runtime_lib
|
||||||
|
with_hsa_kmt_lib
|
||||||
enable_linux_futex
|
enable_linux_futex
|
||||||
enable_tls
|
enable_tls
|
||||||
enable_symvers
|
enable_symvers
|
||||||
|
|
@ -1453,6 +1465,17 @@ Optional Packages:
|
||||||
--with-cuda-driver-lib=PATH
|
--with-cuda-driver-lib=PATH
|
||||||
specify directory for the installed CUDA driver
|
specify directory for the installed CUDA driver
|
||||||
library
|
library
|
||||||
|
--with-hsa-runtime=PATH specify prefix directory for installed HSA run-time
|
||||||
|
package. Equivalent to
|
||||||
|
--with-hsa-runtime-include=PATH/include plus
|
||||||
|
--with-hsa-runtime-lib=PATH/lib
|
||||||
|
--with-hsa-runtime-include=PATH
|
||||||
|
specify directory for installed HSA run-time include
|
||||||
|
files
|
||||||
|
--with-hsa-runtime-lib=PATH
|
||||||
|
specify directory for the installed HSA run-time
|
||||||
|
library
|
||||||
|
--with-hsa-kmt-lib=PATH specify directory for installed HSA KMT library.
|
||||||
|
|
||||||
Some influential environment variables:
|
Some influential environment variables:
|
||||||
CC C compiler command
|
CC C compiler command
|
||||||
|
|
@ -11121,7 +11144,7 @@ else
|
||||||
lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
|
lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
|
||||||
lt_status=$lt_dlunknown
|
lt_status=$lt_dlunknown
|
||||||
cat > conftest.$ac_ext <<_LT_EOF
|
cat > conftest.$ac_ext <<_LT_EOF
|
||||||
#line 11124 "configure"
|
#line 11147 "configure"
|
||||||
#include "confdefs.h"
|
#include "confdefs.h"
|
||||||
|
|
||||||
#if HAVE_DLFCN_H
|
#if HAVE_DLFCN_H
|
||||||
|
|
@ -11227,7 +11250,7 @@ else
|
||||||
lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
|
lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
|
||||||
lt_status=$lt_dlunknown
|
lt_status=$lt_dlunknown
|
||||||
cat > conftest.$ac_ext <<_LT_EOF
|
cat > conftest.$ac_ext <<_LT_EOF
|
||||||
#line 11230 "configure"
|
#line 11253 "configure"
|
||||||
#include "confdefs.h"
|
#include "confdefs.h"
|
||||||
|
|
||||||
#if HAVE_DLFCN_H
|
#if HAVE_DLFCN_H
|
||||||
|
|
@ -15090,7 +15113,7 @@ esac
|
||||||
|
|
||||||
# Plugins for offload execution, configure.ac fragment. -*- mode: autoconf -*-
|
# Plugins for offload execution, configure.ac fragment. -*- mode: autoconf -*-
|
||||||
#
|
#
|
||||||
# Copyright (C) 2014-2015 Free Software Foundation, Inc.
|
# Copyright (C) 2014-2016 Free Software Foundation, Inc.
|
||||||
#
|
#
|
||||||
# Contributed by Mentor Embedded.
|
# Contributed by Mentor Embedded.
|
||||||
#
|
#
|
||||||
|
|
@ -15225,6 +15248,72 @@ PLUGIN_NVPTX_LIBS=
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
# Look for HSA run-time, its includes and libraries
|
||||||
|
|
||||||
|
HSA_RUNTIME_INCLUDE=
|
||||||
|
HSA_RUNTIME_LIB=
|
||||||
|
|
||||||
|
|
||||||
|
HSA_RUNTIME_CPPFLAGS=
|
||||||
|
HSA_RUNTIME_LDFLAGS=
|
||||||
|
|
||||||
|
|
||||||
|
# Check whether --with-hsa-runtime was given.
|
||||||
|
if test "${with_hsa_runtime+set}" = set; then :
|
||||||
|
withval=$with_hsa_runtime;
|
||||||
|
fi
|
||||||
|
|
||||||
|
|
||||||
|
# Check whether --with-hsa-runtime-include was given.
|
||||||
|
if test "${with_hsa_runtime_include+set}" = set; then :
|
||||||
|
withval=$with_hsa_runtime_include;
|
||||||
|
fi
|
||||||
|
|
||||||
|
|
||||||
|
# Check whether --with-hsa-runtime-lib was given.
|
||||||
|
if test "${with_hsa_runtime_lib+set}" = set; then :
|
||||||
|
withval=$with_hsa_runtime_lib;
|
||||||
|
fi
|
||||||
|
|
||||||
|
if test "x$with_hsa_runtime" != x; then
|
||||||
|
HSA_RUNTIME_INCLUDE=$with_hsa_runtime/include
|
||||||
|
HSA_RUNTIME_LIB=$with_hsa_runtime/lib
|
||||||
|
fi
|
||||||
|
if test "x$with_hsa_runtime_include" != x; then
|
||||||
|
HSA_RUNTIME_INCLUDE=$with_hsa_runtime_include
|
||||||
|
fi
|
||||||
|
if test "x$with_hsa_runtime_lib" != x; then
|
||||||
|
HSA_RUNTIME_LIB=$with_hsa_runtime_lib
|
||||||
|
fi
|
||||||
|
if test "x$HSA_RUNTIME_INCLUDE" != x; then
|
||||||
|
HSA_RUNTIME_CPPFLAGS=-I$HSA_RUNTIME_INCLUDE
|
||||||
|
fi
|
||||||
|
if test "x$HSA_RUNTIME_LIB" != x; then
|
||||||
|
HSA_RUNTIME_LDFLAGS=-L$HSA_RUNTIME_LIB
|
||||||
|
fi
|
||||||
|
|
||||||
|
|
||||||
|
# Check whether --with-hsa-kmt-lib was given.
|
||||||
|
if test "${with_hsa_kmt_lib+set}" = set; then :
|
||||||
|
withval=$with_hsa_kmt_lib;
|
||||||
|
fi
|
||||||
|
|
||||||
|
if test "x$with_hsa_kmt_lib" != x; then
|
||||||
|
HSA_RUNTIME_LDFLAGS="$HSA_RUNTIME_LDFLAGS -L$with_hsa_kmt_lib"
|
||||||
|
HSA_RUNTIME_LIB=
|
||||||
|
fi
|
||||||
|
|
||||||
|
PLUGIN_HSA=0
|
||||||
|
PLUGIN_HSA_CPPFLAGS=
|
||||||
|
PLUGIN_HSA_LDFLAGS=
|
||||||
|
PLUGIN_HSA_LIBS=
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
# Get offload targets and path to install tree of offloading compiler.
|
# Get offload targets and path to install tree of offloading compiler.
|
||||||
offload_additional_options=
|
offload_additional_options=
|
||||||
offload_additional_lib_paths=
|
offload_additional_lib_paths=
|
||||||
|
|
@ -15277,6 +15366,60 @@ rm -f core conftest.err conftest.$ac_objext \
|
||||||
;;
|
;;
|
||||||
esac
|
esac
|
||||||
;;
|
;;
|
||||||
|
hsa*)
|
||||||
|
case "${target}" in
|
||||||
|
x86_64-*-*)
|
||||||
|
case " ${CC} ${CFLAGS} " in
|
||||||
|
*" -m32 "*)
|
||||||
|
PLUGIN_HSA=0
|
||||||
|
;;
|
||||||
|
*)
|
||||||
|
tgt_name=hsa
|
||||||
|
PLUGIN_HSA=$tgt
|
||||||
|
PLUGIN_HSA_CPPFLAGS=$HSA_RUNTIME_CPPFLAGS
|
||||||
|
PLUGIN_HSA_LDFLAGS=$HSA_RUNTIME_LDFLAGS
|
||||||
|
PLUGIN_HSA_LIBS="-lhsa-runtime64 -lhsakmt"
|
||||||
|
|
||||||
|
PLUGIN_HSA_save_CPPFLAGS=$CPPFLAGS
|
||||||
|
CPPFLAGS="$PLUGIN_HSA_CPPFLAGS $CPPFLAGS"
|
||||||
|
PLUGIN_HSA_save_LDFLAGS=$LDFLAGS
|
||||||
|
LDFLAGS="$PLUGIN_HSA_LDFLAGS $LDFLAGS"
|
||||||
|
PLUGIN_HSA_save_LIBS=$LIBS
|
||||||
|
LIBS="$PLUGIN_HSA_LIBS $LIBS"
|
||||||
|
|
||||||
|
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
|
||||||
|
/* end confdefs.h. */
|
||||||
|
#include "hsa.h"
|
||||||
|
int
|
||||||
|
main ()
|
||||||
|
{
|
||||||
|
hsa_status_t status = hsa_init ()
|
||||||
|
;
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
_ACEOF
|
||||||
|
if ac_fn_c_try_link "$LINENO"; then :
|
||||||
|
PLUGIN_HSA=1
|
||||||
|
fi
|
||||||
|
rm -f core conftest.err conftest.$ac_objext \
|
||||||
|
conftest$ac_exeext conftest.$ac_ext
|
||||||
|
CPPFLAGS=$PLUGIN_HSA_save_CPPFLAGS
|
||||||
|
LDFLAGS=$PLUGIN_HSA_save_LDFLAGS
|
||||||
|
LIBS=$PLUGIN_HSA_save_LIBS
|
||||||
|
case $PLUGIN_HSA in
|
||||||
|
hsa*)
|
||||||
|
PLUGIN_HSA=0
|
||||||
|
as_fn_error "HSA run-time package required for HSA support" "$LINENO" 5
|
||||||
|
;;
|
||||||
|
esac
|
||||||
|
;;
|
||||||
|
esac
|
||||||
|
;;
|
||||||
|
*-*-*)
|
||||||
|
PLUGIN_HSA=0
|
||||||
|
;;
|
||||||
|
esac
|
||||||
|
;;
|
||||||
*)
|
*)
|
||||||
as_fn_error "unknown offload target specified" "$LINENO" 5
|
as_fn_error "unknown offload target specified" "$LINENO" 5
|
||||||
;;
|
;;
|
||||||
|
|
@ -15313,6 +15456,19 @@ cat >>confdefs.h <<_ACEOF
|
||||||
#define PLUGIN_NVPTX $PLUGIN_NVPTX
|
#define PLUGIN_NVPTX $PLUGIN_NVPTX
|
||||||
_ACEOF
|
_ACEOF
|
||||||
|
|
||||||
|
if test $PLUGIN_HSA = 1; then
|
||||||
|
PLUGIN_HSA_TRUE=
|
||||||
|
PLUGIN_HSA_FALSE='#'
|
||||||
|
else
|
||||||
|
PLUGIN_HSA_TRUE='#'
|
||||||
|
PLUGIN_HSA_FALSE=
|
||||||
|
fi
|
||||||
|
|
||||||
|
|
||||||
|
cat >>confdefs.h <<_ACEOF
|
||||||
|
#define PLUGIN_HSA $PLUGIN_HSA
|
||||||
|
_ACEOF
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
# Check for functions needed.
|
# Check for functions needed.
|
||||||
|
|
@ -16712,6 +16868,10 @@ if test -z "${PLUGIN_NVPTX_TRUE}" && test -z "${PLUGIN_NVPTX_FALSE}"; then
|
||||||
as_fn_error "conditional \"PLUGIN_NVPTX\" was never defined.
|
as_fn_error "conditional \"PLUGIN_NVPTX\" was never defined.
|
||||||
Usually this means the macro was only invoked conditionally." "$LINENO" 5
|
Usually this means the macro was only invoked conditionally." "$LINENO" 5
|
||||||
fi
|
fi
|
||||||
|
if test -z "${PLUGIN_HSA_TRUE}" && test -z "${PLUGIN_HSA_FALSE}"; then
|
||||||
|
as_fn_error "conditional \"PLUGIN_HSA\" was never defined.
|
||||||
|
Usually this means the macro was only invoked conditionally." "$LINENO" 5
|
||||||
|
fi
|
||||||
if test -z "${LIBGOMP_BUILD_VERSIONED_SHLIB_TRUE}" && test -z "${LIBGOMP_BUILD_VERSIONED_SHLIB_FALSE}"; then
|
if test -z "${LIBGOMP_BUILD_VERSIONED_SHLIB_TRUE}" && test -z "${LIBGOMP_BUILD_VERSIONED_SHLIB_FALSE}"; then
|
||||||
as_fn_error "conditional \"LIBGOMP_BUILD_VERSIONED_SHLIB\" was never defined.
|
as_fn_error "conditional \"LIBGOMP_BUILD_VERSIONED_SHLIB\" was never defined.
|
||||||
Usually this means the macro was only invoked conditionally." "$LINENO" 5
|
Usually this means the macro was only invoked conditionally." "$LINENO" 5
|
||||||
|
|
|
||||||
|
|
@ -48,7 +48,8 @@ enum offload_target_type
|
||||||
OFFLOAD_TARGET_TYPE_HOST = 2,
|
OFFLOAD_TARGET_TYPE_HOST = 2,
|
||||||
/* OFFLOAD_TARGET_TYPE_HOST_NONSHM = 3 removed. */
|
/* OFFLOAD_TARGET_TYPE_HOST_NONSHM = 3 removed. */
|
||||||
OFFLOAD_TARGET_TYPE_NVIDIA_PTX = 5,
|
OFFLOAD_TARGET_TYPE_NVIDIA_PTX = 5,
|
||||||
OFFLOAD_TARGET_TYPE_INTEL_MIC = 6
|
OFFLOAD_TARGET_TYPE_INTEL_MIC = 6,
|
||||||
|
OFFLOAD_TARGET_TYPE_HSA = 7
|
||||||
};
|
};
|
||||||
|
|
||||||
/* Auxiliary struct, used for transferring pairs of addresses from plugin
|
/* Auxiliary struct, used for transferring pairs of addresses from plugin
|
||||||
|
|
|
||||||
|
|
@ -496,6 +496,10 @@ struct gomp_target_task
|
||||||
struct target_mem_desc *tgt;
|
struct target_mem_desc *tgt;
|
||||||
struct gomp_task *task;
|
struct gomp_task *task;
|
||||||
struct gomp_team *team;
|
struct gomp_team *team;
|
||||||
|
/* Copies of firstprivate mapped data for shared memory accelerators. */
|
||||||
|
void *firstprivate_copies;
|
||||||
|
/* Device-specific target arguments. */
|
||||||
|
void **args;
|
||||||
void *hostaddrs[];
|
void *hostaddrs[];
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
@ -750,7 +754,8 @@ extern void gomp_task_maybe_wait_for_dependencies (void **);
|
||||||
extern bool gomp_create_target_task (struct gomp_device_descr *,
|
extern bool gomp_create_target_task (struct gomp_device_descr *,
|
||||||
void (*) (void *), size_t, void **,
|
void (*) (void *), size_t, void **,
|
||||||
size_t *, unsigned short *, unsigned int,
|
size_t *, unsigned short *, unsigned int,
|
||||||
void **, enum gomp_target_task_state);
|
void **, void **,
|
||||||
|
enum gomp_target_task_state);
|
||||||
|
|
||||||
static void inline
|
static void inline
|
||||||
gomp_finish_task (struct gomp_task *task)
|
gomp_finish_task (struct gomp_task *task)
|
||||||
|
|
@ -937,8 +942,9 @@ struct gomp_device_descr
|
||||||
void *(*dev2host_func) (int, void *, const void *, size_t);
|
void *(*dev2host_func) (int, void *, const void *, size_t);
|
||||||
void *(*host2dev_func) (int, void *, const void *, size_t);
|
void *(*host2dev_func) (int, void *, const void *, size_t);
|
||||||
void *(*dev2dev_func) (int, void *, const void *, size_t);
|
void *(*dev2dev_func) (int, void *, const void *, size_t);
|
||||||
void (*run_func) (int, void *, void *);
|
bool (*can_run_func) (void *);
|
||||||
void (*async_run_func) (int, void *, void *, void *);
|
void (*run_func) (int, void *, void *, void **);
|
||||||
|
void (*async_run_func) (int, void *, void *, void **, void *);
|
||||||
|
|
||||||
/* Splay tree containing information about mapped memory regions. */
|
/* Splay tree containing information about mapped memory regions. */
|
||||||
struct splay_tree_s mem_map;
|
struct splay_tree_s mem_map;
|
||||||
|
|
|
||||||
|
|
@ -278,8 +278,7 @@ extern void GOMP_single_copy_end (void *);
|
||||||
extern void GOMP_target (int, void (*) (void *), const void *,
|
extern void GOMP_target (int, void (*) (void *), const void *,
|
||||||
size_t, void **, size_t *, unsigned char *);
|
size_t, void **, size_t *, unsigned char *);
|
||||||
extern void GOMP_target_ext (int, void (*) (void *), size_t, void **, size_t *,
|
extern void GOMP_target_ext (int, void (*) (void *), size_t, void **, size_t *,
|
||||||
unsigned short *, unsigned int, void **,
|
unsigned short *, unsigned int, void **, void **);
|
||||||
int, int);
|
|
||||||
extern void GOMP_target_data (int, const void *,
|
extern void GOMP_target_data (int, const void *,
|
||||||
size_t, void **, size_t *, unsigned char *);
|
size_t, void **, size_t *, unsigned char *);
|
||||||
extern void GOMP_target_data_ext (int, size_t, void **, size_t *,
|
extern void GOMP_target_data_ext (int, size_t, void **, size_t *,
|
||||||
|
|
|
||||||
|
|
@ -123,7 +123,8 @@ host_host2dev (int n __attribute__ ((unused)),
|
||||||
}
|
}
|
||||||
|
|
||||||
static void
|
static void
|
||||||
host_run (int n __attribute__ ((unused)), void *fn_ptr, void *vars)
|
host_run (int n __attribute__ ((unused)), void *fn_ptr, void *vars,
|
||||||
|
void **args __attribute__((unused)))
|
||||||
{
|
{
|
||||||
void (*fn)(void *) = (void (*)(void *)) fn_ptr;
|
void (*fn)(void *) = (void (*)(void *)) fn_ptr;
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -38,3 +38,16 @@ libgomp_plugin_nvptx_la_LDFLAGS += $(PLUGIN_NVPTX_LDFLAGS)
|
||||||
libgomp_plugin_nvptx_la_LIBADD = libgomp.la $(PLUGIN_NVPTX_LIBS)
|
libgomp_plugin_nvptx_la_LIBADD = libgomp.la $(PLUGIN_NVPTX_LIBS)
|
||||||
libgomp_plugin_nvptx_la_LIBTOOLFLAGS = --tag=disable-static
|
libgomp_plugin_nvptx_la_LIBTOOLFLAGS = --tag=disable-static
|
||||||
endif
|
endif
|
||||||
|
|
||||||
|
if PLUGIN_HSA
|
||||||
|
# Heterogeneous System Architecture plugin
|
||||||
|
libgomp_plugin_hsa_version_info = -version-info $(libtool_VERSION)
|
||||||
|
toolexeclib_LTLIBRARIES += libgomp-plugin-hsa.la
|
||||||
|
libgomp_plugin_hsa_la_SOURCES = plugin/plugin-hsa.c
|
||||||
|
libgomp_plugin_hsa_la_CPPFLAGS = $(AM_CPPFLAGS) $(PLUGIN_HSA_CPPFLAGS)
|
||||||
|
libgomp_plugin_hsa_la_LDFLAGS = $(libgomp_plugin_hsa_version_info) \
|
||||||
|
$(lt_host_flags)
|
||||||
|
libgomp_plugin_hsa_la_LDFLAGS += $(PLUGIN_HSA_LDFLAGS)
|
||||||
|
libgomp_plugin_hsa_la_LIBADD = libgomp.la $(PLUGIN_HSA_LIBS)
|
||||||
|
libgomp_plugin_hsa_la_LIBTOOLFLAGS = --tag=disable-static
|
||||||
|
endif
|
||||||
|
|
|
||||||
|
|
@ -81,6 +81,62 @@ AC_SUBST(PLUGIN_NVPTX_CPPFLAGS)
|
||||||
AC_SUBST(PLUGIN_NVPTX_LDFLAGS)
|
AC_SUBST(PLUGIN_NVPTX_LDFLAGS)
|
||||||
AC_SUBST(PLUGIN_NVPTX_LIBS)
|
AC_SUBST(PLUGIN_NVPTX_LIBS)
|
||||||
|
|
||||||
|
# Look for HSA run-time, its includes and libraries
|
||||||
|
|
||||||
|
HSA_RUNTIME_INCLUDE=
|
||||||
|
HSA_RUNTIME_LIB=
|
||||||
|
AC_SUBST(HSA_RUNTIME_INCLUDE)
|
||||||
|
AC_SUBST(HSA_RUNTIME_LIB)
|
||||||
|
HSA_RUNTIME_CPPFLAGS=
|
||||||
|
HSA_RUNTIME_LDFLAGS=
|
||||||
|
|
||||||
|
AC_ARG_WITH(hsa-runtime,
|
||||||
|
[AS_HELP_STRING([--with-hsa-runtime=PATH],
|
||||||
|
[specify prefix directory for installed HSA run-time package.
|
||||||
|
Equivalent to --with-hsa-runtime-include=PATH/include
|
||||||
|
plus --with-hsa-runtime-lib=PATH/lib])])
|
||||||
|
AC_ARG_WITH(hsa-runtime-include,
|
||||||
|
[AS_HELP_STRING([--with-hsa-runtime-include=PATH],
|
||||||
|
[specify directory for installed HSA run-time include files])])
|
||||||
|
AC_ARG_WITH(hsa-runtime-lib,
|
||||||
|
[AS_HELP_STRING([--with-hsa-runtime-lib=PATH],
|
||||||
|
[specify directory for the installed HSA run-time library])])
|
||||||
|
if test "x$with_hsa_runtime" != x; then
|
||||||
|
HSA_RUNTIME_INCLUDE=$with_hsa_runtime/include
|
||||||
|
HSA_RUNTIME_LIB=$with_hsa_runtime/lib
|
||||||
|
fi
|
||||||
|
if test "x$with_hsa_runtime_include" != x; then
|
||||||
|
HSA_RUNTIME_INCLUDE=$with_hsa_runtime_include
|
||||||
|
fi
|
||||||
|
if test "x$with_hsa_runtime_lib" != x; then
|
||||||
|
HSA_RUNTIME_LIB=$with_hsa_runtime_lib
|
||||||
|
fi
|
||||||
|
if test "x$HSA_RUNTIME_INCLUDE" != x; then
|
||||||
|
HSA_RUNTIME_CPPFLAGS=-I$HSA_RUNTIME_INCLUDE
|
||||||
|
fi
|
||||||
|
if test "x$HSA_RUNTIME_LIB" != x; then
|
||||||
|
HSA_RUNTIME_LDFLAGS=-L$HSA_RUNTIME_LIB
|
||||||
|
fi
|
||||||
|
|
||||||
|
AC_ARG_WITH(hsa-kmt-lib,
|
||||||
|
[AS_HELP_STRING([--with-hsa-kmt-lib=PATH],
|
||||||
|
[specify directory for installed HSA KMT library.])])
|
||||||
|
if test "x$with_hsa_kmt_lib" != x; then
|
||||||
|
HSA_RUNTIME_LDFLAGS="$HSA_RUNTIME_LDFLAGS -L$with_hsa_kmt_lib"
|
||||||
|
HSA_RUNTIME_LIB=
|
||||||
|
fi
|
||||||
|
|
||||||
|
PLUGIN_HSA=0
|
||||||
|
PLUGIN_HSA_CPPFLAGS=
|
||||||
|
PLUGIN_HSA_LDFLAGS=
|
||||||
|
PLUGIN_HSA_LIBS=
|
||||||
|
AC_SUBST(PLUGIN_HSA)
|
||||||
|
AC_SUBST(PLUGIN_HSA_CPPFLAGS)
|
||||||
|
AC_SUBST(PLUGIN_HSA_LDFLAGS)
|
||||||
|
AC_SUBST(PLUGIN_HSA_LIBS)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
# Get offload targets and path to install tree of offloading compiler.
|
# Get offload targets and path to install tree of offloading compiler.
|
||||||
offload_additional_options=
|
offload_additional_options=
|
||||||
offload_additional_lib_paths=
|
offload_additional_lib_paths=
|
||||||
|
|
@ -122,6 +178,49 @@ if test x"$enable_offload_targets" != x; then
|
||||||
;;
|
;;
|
||||||
esac
|
esac
|
||||||
;;
|
;;
|
||||||
|
hsa*)
|
||||||
|
case "${target}" in
|
||||||
|
x86_64-*-*)
|
||||||
|
case " ${CC} ${CFLAGS} " in
|
||||||
|
*" -m32 "*)
|
||||||
|
PLUGIN_HSA=0
|
||||||
|
;;
|
||||||
|
*)
|
||||||
|
tgt_name=hsa
|
||||||
|
PLUGIN_HSA=$tgt
|
||||||
|
PLUGIN_HSA_CPPFLAGS=$HSA_RUNTIME_CPPFLAGS
|
||||||
|
PLUGIN_HSA_LDFLAGS=$HSA_RUNTIME_LDFLAGS
|
||||||
|
PLUGIN_HSA_LIBS="-lhsa-runtime64 -lhsakmt"
|
||||||
|
|
||||||
|
PLUGIN_HSA_save_CPPFLAGS=$CPPFLAGS
|
||||||
|
CPPFLAGS="$PLUGIN_HSA_CPPFLAGS $CPPFLAGS"
|
||||||
|
PLUGIN_HSA_save_LDFLAGS=$LDFLAGS
|
||||||
|
LDFLAGS="$PLUGIN_HSA_LDFLAGS $LDFLAGS"
|
||||||
|
PLUGIN_HSA_save_LIBS=$LIBS
|
||||||
|
LIBS="$PLUGIN_HSA_LIBS $LIBS"
|
||||||
|
|
||||||
|
AC_LINK_IFELSE(
|
||||||
|
[AC_LANG_PROGRAM(
|
||||||
|
[#include "hsa.h"],
|
||||||
|
[hsa_status_t status = hsa_init ()])],
|
||||||
|
[PLUGIN_HSA=1])
|
||||||
|
CPPFLAGS=$PLUGIN_HSA_save_CPPFLAGS
|
||||||
|
LDFLAGS=$PLUGIN_HSA_save_LDFLAGS
|
||||||
|
LIBS=$PLUGIN_HSA_save_LIBS
|
||||||
|
case $PLUGIN_HSA in
|
||||||
|
hsa*)
|
||||||
|
PLUGIN_HSA=0
|
||||||
|
AC_MSG_ERROR([HSA run-time package required for HSA support])
|
||||||
|
;;
|
||||||
|
esac
|
||||||
|
;;
|
||||||
|
esac
|
||||||
|
;;
|
||||||
|
*-*-*)
|
||||||
|
PLUGIN_HSA=0
|
||||||
|
;;
|
||||||
|
esac
|
||||||
|
;;
|
||||||
*)
|
*)
|
||||||
AC_MSG_ERROR([unknown offload target specified])
|
AC_MSG_ERROR([unknown offload target specified])
|
||||||
;;
|
;;
|
||||||
|
|
@ -145,3 +244,6 @@ AC_DEFINE_UNQUOTED(OFFLOAD_TARGETS, "$offload_targets",
|
||||||
AM_CONDITIONAL([PLUGIN_NVPTX], [test $PLUGIN_NVPTX = 1])
|
AM_CONDITIONAL([PLUGIN_NVPTX], [test $PLUGIN_NVPTX = 1])
|
||||||
AC_DEFINE_UNQUOTED([PLUGIN_NVPTX], [$PLUGIN_NVPTX],
|
AC_DEFINE_UNQUOTED([PLUGIN_NVPTX], [$PLUGIN_NVPTX],
|
||||||
[Define to 1 if the NVIDIA plugin is built, 0 if not.])
|
[Define to 1 if the NVIDIA plugin is built, 0 if not.])
|
||||||
|
AM_CONDITIONAL([PLUGIN_HSA], [test $PLUGIN_HSA = 1])
|
||||||
|
AC_DEFINE_UNQUOTED([PLUGIN_HSA], [$PLUGIN_HSA],
|
||||||
|
[Define to 1 if the HSA plugin is built, 0 if not.])
|
||||||
|
|
|
||||||
File diff suppressed because it is too large
libgomp/target.c
|
|
@ -1329,6 +1329,49 @@ gomp_target_fallback (void (*fn) (void *), void **hostaddrs)
|
||||||
*thr = old_thr;
|
*thr = old_thr;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/* Calculate alignment and size requirements of a private copy of data shared
|
||||||
|
as GOMP_MAP_FIRSTPRIVATE and store them to TGT_ALIGN and TGT_SIZE. */
|
||||||
|
|
||||||
|
static inline void
|
||||||
|
calculate_firstprivate_requirements (size_t mapnum, size_t *sizes,
|
||||||
|
unsigned short *kinds, size_t *tgt_align,
|
||||||
|
size_t *tgt_size)
|
||||||
|
{
|
||||||
|
size_t i;
|
||||||
|
for (i = 0; i < mapnum; i++)
|
||||||
|
if ((kinds[i] & 0xff) == GOMP_MAP_FIRSTPRIVATE)
|
||||||
|
{
|
||||||
|
size_t align = (size_t) 1 << (kinds[i] >> 8);
|
||||||
|
if (*tgt_align < align)
|
||||||
|
*tgt_align = align;
|
||||||
|
*tgt_size = (*tgt_size + align - 1) & ~(align - 1);
|
||||||
|
*tgt_size += sizes[i];
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Copy data shared as GOMP_MAP_FIRSTPRIVATE to TGT. */
|
||||||
|
|
||||||
|
static inline void
|
||||||
|
copy_firstprivate_data (char *tgt, size_t mapnum, void **hostaddrs,
|
||||||
|
size_t *sizes, unsigned short *kinds, size_t tgt_align,
|
||||||
|
size_t tgt_size)
|
||||||
|
{
|
||||||
|
uintptr_t al = (uintptr_t) tgt & (tgt_align - 1);
|
||||||
|
if (al)
|
||||||
|
tgt += tgt_align - al;
|
||||||
|
tgt_size = 0;
|
||||||
|
size_t i;
|
||||||
|
for (i = 0; i < mapnum; i++)
|
||||||
|
if ((kinds[i] & 0xff) == GOMP_MAP_FIRSTPRIVATE)
|
||||||
|
{
|
||||||
|
size_t align = (size_t) 1 << (kinds[i] >> 8);
|
||||||
|
tgt_size = (tgt_size + align - 1) & ~(align - 1);
|
||||||
|
memcpy (tgt + tgt_size, hostaddrs[i], sizes[i]);
|
||||||
|
hostaddrs[i] = tgt + tgt_size;
|
||||||
|
tgt_size = tgt_size + sizes[i];
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
/* Host fallback with firstprivate map-type handling. */
|
/* Host fallback with firstprivate map-type handling. */
|
||||||
|
|
||||||
static void
|
static void
|
||||||
|
|
@ -1336,37 +1379,40 @@ gomp_target_fallback_firstprivate (void (*fn) (void *), size_t mapnum,
|
||||||
void **hostaddrs, size_t *sizes,
|
void **hostaddrs, size_t *sizes,
|
||||||
unsigned short *kinds)
|
unsigned short *kinds)
|
||||||
{
|
{
|
||||||
size_t i, tgt_align = 0, tgt_size = 0;
|
size_t tgt_align = 0, tgt_size = 0;
|
||||||
char *tgt = NULL;
|
calculate_firstprivate_requirements (mapnum, sizes, kinds, &tgt_align,
|
||||||
for (i = 0; i < mapnum; i++)
|
&tgt_size);
|
||||||
if ((kinds[i] & 0xff) == GOMP_MAP_FIRSTPRIVATE)
|
|
||||||
{
|
|
||||||
size_t align = (size_t) 1 << (kinds[i] >> 8);
|
|
||||||
if (tgt_align < align)
|
|
||||||
tgt_align = align;
|
|
||||||
tgt_size = (tgt_size + align - 1) & ~(align - 1);
|
|
||||||
tgt_size += sizes[i];
|
|
||||||
}
|
|
||||||
if (tgt_align)
|
if (tgt_align)
|
||||||
{
|
{
|
||||||
tgt = gomp_alloca (tgt_size + tgt_align - 1);
|
char *tgt = gomp_alloca (tgt_size + tgt_align - 1);
|
||||||
uintptr_t al = (uintptr_t) tgt & (tgt_align - 1);
|
copy_firstprivate_data (tgt, mapnum, hostaddrs, sizes, kinds, tgt_align,
|
||||||
if (al)
|
tgt_size);
|
||||||
tgt += tgt_align - al;
|
|
||||||
tgt_size = 0;
|
|
||||||
for (i = 0; i < mapnum; i++)
|
|
||||||
if ((kinds[i] & 0xff) == GOMP_MAP_FIRSTPRIVATE)
|
|
||||||
{
|
|
||||||
size_t align = (size_t) 1 << (kinds[i] >> 8);
|
|
||||||
tgt_size = (tgt_size + align - 1) & ~(align - 1);
|
|
||||||
memcpy (tgt + tgt_size, hostaddrs[i], sizes[i]);
|
|
||||||
hostaddrs[i] = tgt + tgt_size;
|
|
||||||
tgt_size = tgt_size + sizes[i];
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
gomp_target_fallback (fn, hostaddrs);
|
gomp_target_fallback (fn, hostaddrs);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/* Handle firstprivate map-type for shared memory devices and the host
|
||||||
|
fallback. Return the pointer to the firstprivate copies, which has to be freed
|
||||||
|
after use. */
|
||||||
|
|
||||||
|
static void *
|
||||||
|
gomp_target_unshare_firstprivate (size_t mapnum, void **hostaddrs,
|
||||||
|
size_t *sizes, unsigned short *kinds)
|
||||||
|
{
|
||||||
|
size_t tgt_align = 0, tgt_size = 0;
|
||||||
|
char *tgt = NULL;
|
||||||
|
|
||||||
|
calculate_firstprivate_requirements (mapnum, sizes, kinds, &tgt_align,
|
||||||
|
&tgt_size);
|
||||||
|
if (tgt_align)
|
||||||
|
{
|
||||||
|
tgt = gomp_malloc (tgt_size + tgt_align - 1);
|
||||||
|
copy_firstprivate_data (tgt, mapnum, hostaddrs, sizes, kinds, tgt_align,
|
||||||
|
tgt_size);
|
||||||
|
}
|
||||||
|
return tgt;
|
||||||
|
}
|
||||||
|
|
||||||
/* Helper function of GOMP_target{,_ext} routines. */
|
/* Helper function of GOMP_target{,_ext} routines. */
|
||||||
|
|
||||||
static void *
|
static void *
|
||||||
|
|
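The two helpers introduced in this hunk split the old inline loop into a sizing pass and a copying pass. The following is a minimal standalone sketch of that idea, not the libgomp code itself: `struct demo_item`, `demo_requirements` and `demo_unshare` are hypothetical names, and the alignment is stored in a plain field rather than in the high byte of `kinds[i]`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one firstprivate mapping.  The real code
   keeps log2 of the alignment in the high byte of kinds[i].  */
struct demo_item
{
  void *addr;          /* host address of the original object */
  size_t size;         /* object size in bytes */
  unsigned align_log2; /* log2 of the required alignment */
};

/* Pass 1: compute the strictest alignment and the packed size.  */
static void
demo_requirements (const struct demo_item *items, size_t n,
                   size_t *max_align, size_t *total)
{
  *max_align = 0;
  *total = 0;
  for (size_t i = 0; i < n; i++)
    {
      size_t align = (size_t) 1 << items[i].align_log2;
      if (*max_align < align)
        *max_align = align;
      /* Round the running offset up to a multiple of ALIGN.  */
      *total = (*total + align - 1) & ~(align - 1);
      *total += items[i].size;
    }
}

/* Pass 2: copy every item into a heap buffer and redirect its address
   to the copy.  Returns the pointer to pass to free, or NULL if there
   was nothing to copy.  */
static void *
demo_unshare (struct demo_item *items, size_t n)
{
  size_t max_align, total;
  demo_requirements (items, n, &max_align, &total);
  if (max_align == 0)
    return NULL;

  /* Over-allocate so the start can be rounded up to MAX_ALIGN.  */
  char *buf = malloc (total + max_align - 1);
  if (buf == NULL)
    abort ();
  char *base = buf;
  uintptr_t misalign = (uintptr_t) base & (max_align - 1);
  if (misalign)
    base += max_align - misalign;

  size_t off = 0;
  for (size_t i = 0; i < n; i++)
    {
      size_t align = (size_t) 1 << items[i].align_log2;
      off = (off + align - 1) & ~(align - 1);
      memcpy (base + off, items[i].addr, items[i].size);
      items[i].addr = base + off;
      off += items[i].size;
    }
  assert (off == total);
  return buf;
}
```

As in the patch, the caller keeps the unadjusted allocation pointer for the later `free`, while the item addresses point into the aligned region. The host fallback path can use a stack allocation instead because the copies only need to live for the duration of the call.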
@ -1390,7 +1436,12 @@ gomp_get_target_fn_addr (struct gomp_device_descr *devicep,
|
||||||
splay_tree_key tgt_fn = splay_tree_lookup (&devicep->mem_map, &k);
|
splay_tree_key tgt_fn = splay_tree_lookup (&devicep->mem_map, &k);
|
||||||
gomp_mutex_unlock (&devicep->lock);
|
gomp_mutex_unlock (&devicep->lock);
|
||||||
if (tgt_fn == NULL)
|
if (tgt_fn == NULL)
|
||||||
gomp_fatal ("Target function wasn't mapped");
|
{
|
||||||
|
if (devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
|
return NULL;
|
||||||
|
else
|
||||||
|
gomp_fatal ("Target function wasn't mapped");
|
||||||
|
}
|
||||||
|
|
||||||
return (void *) tgt_fn->tgt_offset;
|
return (void *) tgt_fn->tgt_offset;
|
||||||
}
|
}
|
||||||
|
|
@ -1416,13 +1467,16 @@ GOMP_target (int device, void (*fn) (void *), const void *unused,
|
||||||
void *fn_addr;
|
void *fn_addr;
|
||||||
if (devicep == NULL
|
if (devicep == NULL
|
||||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
/* All shared memory devices should use the GOMP_target_ext function. */
|
||||||
|
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM
|
||||||
|| !(fn_addr = gomp_get_target_fn_addr (devicep, fn)))
|
|| !(fn_addr = gomp_get_target_fn_addr (devicep, fn)))
|
||||||
return gomp_target_fallback (fn, hostaddrs);
|
return gomp_target_fallback (fn, hostaddrs);
|
||||||
|
|
||||||
struct target_mem_desc *tgt_vars
|
struct target_mem_desc *tgt_vars
|
||||||
= gomp_map_vars (devicep, mapnum, hostaddrs, NULL, sizes, kinds, false,
|
= gomp_map_vars (devicep, mapnum, hostaddrs, NULL, sizes, kinds, false,
|
||||||
GOMP_MAP_VARS_TARGET);
|
GOMP_MAP_VARS_TARGET);
|
||||||
devicep->run_func (devicep->target_id, fn_addr, (void *) tgt_vars->tgt_start);
|
devicep->run_func (devicep->target_id, fn_addr, (void *) tgt_vars->tgt_start,
|
||||||
|
NULL);
|
||||||
gomp_unmap_vars (tgt_vars, true);
|
gomp_unmap_vars (tgt_vars, true);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
@ -1430,6 +1484,15 @@ GOMP_target (int device, void (*fn) (void *), const void *unused,
|
||||||
and several arguments have been added:
|
and several arguments have been added:
|
||||||
FLAGS is a bitmask, see GOMP_TARGET_FLAG_* in gomp-constants.h.
|
FLAGS is a bitmask, see GOMP_TARGET_FLAG_* in gomp-constants.h.
|
||||||
DEPEND is an array of dependencies, see GOMP_task for details.
|
DEPEND is an array of dependencies, see GOMP_task for details.
|
||||||
|
|
||||||
|
ARGS is a pointer to an array consisting of a variable number of both
|
||||||
|
device-independent and device-specific arguments, which can take one or two
|
||||||
|
elements where the first specifies for which device it is intended, the type
|
||||||
|
and optionally also the value. If the value is not present in the first
|
||||||
|
one, the whole second element is the actual value. The last element of the
|
||||||
|
array is a single NULL. Among the device-independent ones can be, for example,
|
||||||
|
NUM_TEAMS and THREAD_LIMIT.
|
||||||
|
|
||||||
NUM_TEAMS is positive if GOMP_teams will be called in the body with
|
NUM_TEAMS is positive if GOMP_teams will be called in the body with
|
||||||
that value, or 1 if teams construct is not present, or 0, if
|
that value, or 1 if teams construct is not present, or 0, if
|
||||||
teams construct does not have num_teams clause and so the choice is
|
teams construct does not have num_teams clause and so the choice is
|
||||||
|
|
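The ARGS description in this hunk can be made concrete with a small decoder. This is only a sketch under stated assumptions: the `DEMO_*` bit layout below is invented for illustration (the patch text shown here does not give the actual encoding), and `demo_find_arg` is a hypothetical helper, not a libgomp function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout, chosen only for this illustration.  */
#define DEMO_ARG_DEVICE_MASK  ((uintptr_t) 0x7f)   /* 0 means all devices */
#define DEMO_ARG_SUBSEQUENT   ((uintptr_t) 1 << 7) /* value is next element */
#define DEMO_ARG_ID_MASK      ((uintptr_t) 0xff << 8)
#define DEMO_ARG_NUM_TEAMS    ((uintptr_t) 1 << 8)
#define DEMO_ARG_THREAD_LIMIT ((uintptr_t) 2 << 8)
#define DEMO_ARG_VALUE_SHIFT  16

/* Look up argument ID intended for DEVICE (or for all devices) in the
   NULL-terminated array ARGS.  Returns 1 and stores the value in *VALUE
   on success, 0 otherwise.  */
static int
demo_find_arg (void **args, uintptr_t device, uintptr_t id, intptr_t *value)
{
  if (args == NULL)
    return 0;
  while (*args != NULL)
    {
      uintptr_t head = (uintptr_t) *args++;
      intptr_t val;
      if (head & DEMO_ARG_SUBSEQUENT)
        /* Two-element form: the whole next element is the value.  */
        val = (intptr_t) *args++;
      else
        /* One-element form: the value is packed into the upper bits.  */
        val = (intptr_t) (head >> DEMO_ARG_VALUE_SHIFT);

      uintptr_t dev = head & DEMO_ARG_DEVICE_MASK;
      if ((head & DEMO_ARG_ID_MASK) == id && (dev == 0 || dev == device))
        {
          *value = val;
          return 1;
        }
    }
  return 0;
}
```

Because every header element carries a non-zero identifier in this layout, a header can never be mistaken for the terminating NULL, while a value element of zero is still safe since it is consumed together with its header.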
@ -1443,14 +1506,10 @@ GOMP_target (int device, void (*fn) (void *), const void *unused,
|
||||||
void
|
void
|
||||||
GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum,
|
GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum,
|
||||||
void **hostaddrs, size_t *sizes, unsigned short *kinds,
|
void **hostaddrs, size_t *sizes, unsigned short *kinds,
|
||||||
unsigned int flags, void **depend, int num_teams,
|
unsigned int flags, void **depend, void **args)
|
||||||
int thread_limit)
|
|
||||||
{
|
{
|
||||||
struct gomp_device_descr *devicep = resolve_device (device);
|
struct gomp_device_descr *devicep = resolve_device (device);
|
||||||
|
|
||||||
(void) num_teams;
|
|
||||||
(void) thread_limit;
|
|
||||||
|
|
||||||
if (flags & GOMP_TARGET_FLAG_NOWAIT)
|
if (flags & GOMP_TARGET_FLAG_NOWAIT)
|
||||||
{
|
{
|
||||||
struct gomp_thread *thr = gomp_thread ();
|
struct gomp_thread *thr = gomp_thread ();
|
||||||
|
|
@ -1487,7 +1546,7 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum,
|
||||||
&& !thr->task->final_task)
|
&& !thr->task->final_task)
|
||||||
{
|
{
|
||||||
gomp_create_target_task (devicep, fn, mapnum, hostaddrs,
|
gomp_create_target_task (devicep, fn, mapnum, hostaddrs,
|
||||||
sizes, kinds, flags, depend,
|
sizes, kinds, flags, depend, args,
|
||||||
GOMP_TARGET_TASK_BEFORE_MAP);
|
GOMP_TARGET_TASK_BEFORE_MAP);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
@ -1507,17 +1566,30 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum,
|
||||||
void *fn_addr;
|
void *fn_addr;
|
||||||
if (devicep == NULL
|
if (devicep == NULL
|
||||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|| !(fn_addr = gomp_get_target_fn_addr (devicep, fn)))
|
|| !(fn_addr = gomp_get_target_fn_addr (devicep, fn))
|
||||||
|
|| (devicep->can_run_func && !devicep->can_run_func (fn_addr)))
|
||||||
{
|
{
|
||||||
gomp_target_fallback_firstprivate (fn, mapnum, hostaddrs, sizes, kinds);
|
gomp_target_fallback_firstprivate (fn, mapnum, hostaddrs, sizes, kinds);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
struct target_mem_desc *tgt_vars
|
struct target_mem_desc *tgt_vars;
|
||||||
= gomp_map_vars (devicep, mapnum, hostaddrs, NULL, sizes, kinds, true,
|
void *fpc = NULL;
|
||||||
GOMP_MAP_VARS_TARGET);
|
if (devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
devicep->run_func (devicep->target_id, fn_addr, (void *) tgt_vars->tgt_start);
|
{
|
||||||
gomp_unmap_vars (tgt_vars, true);
|
fpc = gomp_target_unshare_firstprivate (mapnum, hostaddrs, sizes, kinds);
|
||||||
|
tgt_vars = NULL;
|
||||||
|
}
|
||||||
|
else
|
||||||
|
tgt_vars = gomp_map_vars (devicep, mapnum, hostaddrs, NULL, sizes, kinds,
|
||||||
|
true, GOMP_MAP_VARS_TARGET);
|
||||||
|
devicep->run_func (devicep->target_id, fn_addr,
|
||||||
|
tgt_vars ? (void *) tgt_vars->tgt_start : hostaddrs,
|
||||||
|
args);
|
||||||
|
if (tgt_vars)
|
||||||
|
gomp_unmap_vars (tgt_vars, true);
|
||||||
|
else
|
||||||
|
free (fpc);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Host fallback for GOMP_target_data{,_ext} routines. */
|
/* Host fallback for GOMP_target_data{,_ext} routines. */
|
||||||
|
|
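The control flow that this hunk gives GOMP_target_ext can be summarised as a three-way decision: host fallback, run on a shared-memory device with the host addresses, or run on a discrete device with mapped variables. The sketch below models only that decision. All `demo_*` names and the `DEMO_CAP_SHARED_MEM` bit are hypothetical, and the real function additionally handles firstprivate copies, mapping and unmapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_CAP_SHARED_MEM 1u /* hypothetical capability bit */

struct demo_device
{
  unsigned capabilities;
  bool (*can_run) (void *fn); /* optional hook, may be NULL */
  void (*run) (void *fn, void *vars, void **args);
};

enum demo_path
{
  DEMO_HOST_FALLBACK,
  DEMO_DEVICE_SHARED,
  DEMO_DEVICE_MAPPED
};

/* Records the VARS pointer the device was last started with.  */
static void *demo_last_vars;

static void
demo_run (void *fn, void *vars, void **args)
{
  (void) fn;
  (void) args;
  demo_last_vars = vars;
}

/* A can_run hook that rejects every kernel.  */
static bool
demo_never (void *fn)
{
  (void) fn;
  return false;
}

/* Decide where the region runs.  A missing device, an unmapped function
   or a veto from the optional can_run hook all lead to host fallback.  */
static enum demo_path
demo_dispatch (struct demo_device *dev, void *fn_addr, void **hostaddrs,
               void *mapped_vars, void **args)
{
  if (dev == NULL || fn_addr == NULL
      || (dev->can_run && !dev->can_run (fn_addr)))
    return DEMO_HOST_FALLBACK;

  if (dev->capabilities & DEMO_CAP_SHARED_MEM)
    {
      /* Shared memory: no remapping, pass host addresses directly.  */
      dev->run (fn_addr, hostaddrs, args);
      return DEMO_DEVICE_SHARED;
    }

  dev->run (fn_addr, mapped_vars, args);
  return DEMO_DEVICE_MAPPED;
}
```

Checking the hook pointer before calling it mirrors the way the patch loads `can_run` as an optional plugin symbol, so plugins that do not provide it keep their previous behaviour.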
@ -1547,7 +1619,8 @@ GOMP_target_data (int device, const void *unused, size_t mapnum,
|
||||||
struct gomp_device_descr *devicep = resolve_device (device);
|
struct gomp_device_descr *devicep = resolve_device (device);
|
||||||
|
|
||||||
if (devicep == NULL
|
if (devicep == NULL
|
||||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
|| (devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM))
|
||||||
return gomp_target_data_fallback ();
|
return gomp_target_data_fallback ();
|
||||||
|
|
||||||
struct target_mem_desc *tgt
|
struct target_mem_desc *tgt
|
||||||
|
|
@ -1565,7 +1638,8 @@ GOMP_target_data_ext (int device, size_t mapnum, void **hostaddrs,
|
||||||
struct gomp_device_descr *devicep = resolve_device (device);
|
struct gomp_device_descr *devicep = resolve_device (device);
|
||||||
|
|
||||||
if (devicep == NULL
|
if (devicep == NULL
|
||||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
return gomp_target_data_fallback ();
|
return gomp_target_data_fallback ();
|
||||||
|
|
||||||
struct target_mem_desc *tgt
|
struct target_mem_desc *tgt
|
||||||
|
|
@ -1595,7 +1669,8 @@ GOMP_target_update (int device, const void *unused, size_t mapnum,
|
||||||
struct gomp_device_descr *devicep = resolve_device (device);
|
struct gomp_device_descr *devicep = resolve_device (device);
|
||||||
|
|
||||||
if (devicep == NULL
|
if (devicep == NULL
|
||||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
return;
|
return;
|
||||||
|
|
||||||
gomp_update (devicep, mapnum, hostaddrs, sizes, kinds, false);
|
gomp_update (devicep, mapnum, hostaddrs, sizes, kinds, false);
|
||||||
|
|
@ -1626,7 +1701,7 @@ GOMP_target_update_ext (int device, size_t mapnum, void **hostaddrs,
|
||||||
if (gomp_create_target_task (devicep, (void (*) (void *)) NULL,
|
if (gomp_create_target_task (devicep, (void (*) (void *)) NULL,
|
||||||
mapnum, hostaddrs, sizes, kinds,
|
mapnum, hostaddrs, sizes, kinds,
|
||||||
flags | GOMP_TARGET_FLAG_UPDATE,
|
flags | GOMP_TARGET_FLAG_UPDATE,
|
||||||
depend, GOMP_TARGET_TASK_DATA))
|
depend, NULL, GOMP_TARGET_TASK_DATA))
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
|
|
@ -1646,7 +1721,8 @@ GOMP_target_update_ext (int device, size_t mapnum, void **hostaddrs,
|
||||||
}
|
}
|
||||||
|
|
||||||
if (devicep == NULL
|
if (devicep == NULL
|
||||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
return;
|
return;
|
||||||
|
|
||||||
struct gomp_thread *thr = gomp_thread ();
|
struct gomp_thread *thr = gomp_thread ();
|
||||||
|
|
@ -1756,7 +1832,7 @@ GOMP_target_enter_exit_data (int device, size_t mapnum, void **hostaddrs,
|
||||||
{
|
{
|
||||||
if (gomp_create_target_task (devicep, (void (*) (void *)) NULL,
|
if (gomp_create_target_task (devicep, (void (*) (void *)) NULL,
|
||||||
mapnum, hostaddrs, sizes, kinds,
|
mapnum, hostaddrs, sizes, kinds,
|
||||||
flags, depend,
|
flags, depend, NULL,
|
||||||
GOMP_TARGET_TASK_DATA))
|
GOMP_TARGET_TASK_DATA))
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
@ -1777,7 +1853,8 @@ GOMP_target_enter_exit_data (int device, size_t mapnum, void **hostaddrs,
|
||||||
}
|
}
|
||||||
|
|
||||||
if (devicep == NULL
|
if (devicep == NULL
|
||||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
return;
|
return;
|
||||||
|
|
||||||
struct gomp_thread *thr = gomp_thread ();
|
struct gomp_thread *thr = gomp_thread ();
|
||||||
|
|
@ -1815,7 +1892,8 @@ gomp_target_task_fn (void *data)
|
||||||
void *fn_addr;
|
void *fn_addr;
|
||||||
if (devicep == NULL
|
if (devicep == NULL
|
||||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|| !(fn_addr = gomp_get_target_fn_addr (devicep, ttask->fn)))
|
|| !(fn_addr = gomp_get_target_fn_addr (devicep, ttask->fn))
|
||||||
|
|| (devicep->can_run_func && !devicep->can_run_func (fn_addr)))
|
||||||
{
|
{
|
||||||
ttask->state = GOMP_TARGET_TASK_FALLBACK;
|
ttask->state = GOMP_TARGET_TASK_FALLBACK;
|
||||||
gomp_target_fallback_firstprivate (ttask->fn, ttask->mapnum,
|
gomp_target_fallback_firstprivate (ttask->fn, ttask->mapnum,
|
||||||
|
|
@ -1826,22 +1904,36 @@ gomp_target_task_fn (void *data)
|
||||||
|
|
||||||
if (ttask->state == GOMP_TARGET_TASK_FINISHED)
|
if (ttask->state == GOMP_TARGET_TASK_FINISHED)
|
||||||
{
|
{
|
||||||
gomp_unmap_vars (ttask->tgt, true);
|
if (ttask->tgt)
|
||||||
|
gomp_unmap_vars (ttask->tgt, true);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
|
|
||||||
ttask->tgt
|
void *actual_arguments;
|
||||||
= gomp_map_vars (devicep, ttask->mapnum, ttask->hostaddrs, NULL,
|
if (devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
ttask->sizes, ttask->kinds, true,
|
{
|
||||||
GOMP_MAP_VARS_TARGET);
|
ttask->tgt = NULL;
|
||||||
|
ttask->firstprivate_copies
|
||||||
|
= gomp_target_unshare_firstprivate (ttask->mapnum, ttask->hostaddrs,
|
||||||
|
ttask->sizes, ttask->kinds);
|
||||||
|
actual_arguments = ttask->hostaddrs;
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
ttask->tgt = gomp_map_vars (devicep, ttask->mapnum, ttask->hostaddrs,
|
||||||
|
NULL, ttask->sizes, ttask->kinds, true,
|
||||||
|
GOMP_MAP_VARS_TARGET);
|
||||||
|
actual_arguments = (void *) ttask->tgt->tgt_start;
|
||||||
|
}
|
||||||
ttask->state = GOMP_TARGET_TASK_READY_TO_RUN;
|
ttask->state = GOMP_TARGET_TASK_READY_TO_RUN;
|
||||||
|
|
||||||
devicep->async_run_func (devicep->target_id, fn_addr,
|
devicep->async_run_func (devicep->target_id, fn_addr, actual_arguments,
|
||||||
(void *) ttask->tgt->tgt_start, (void *) ttask);
|
ttask->args, (void *) ttask);
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
else if (devicep == NULL
|
else if (devicep == NULL
|
||||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
return false;
|
return false;
|
||||||
|
|
||||||
size_t i;
|
size_t i;
|
||||||
|
|
@ -1891,7 +1983,8 @@ omp_target_alloc (size_t size, int device_num)
|
||||||
if (devicep == NULL)
|
if (devicep == NULL)
|
||||||
return NULL;
|
return NULL;
|
||||||
|
|
||||||
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
return malloc (size);
|
return malloc (size);
|
||||||
|
|
||||||
gomp_mutex_lock (&devicep->lock);
|
gomp_mutex_lock (&devicep->lock);
|
||||||
|
|
@ -1919,7 +2012,8 @@ omp_target_free (void *device_ptr, int device_num)
|
||||||
if (devicep == NULL)
|
if (devicep == NULL)
|
||||||
return;
|
return;
|
||||||
|
|
||||||
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
{
|
{
|
||||||
free (device_ptr);
|
free (device_ptr);
|
||||||
return;
|
return;
|
||||||
|
|
@ -1946,7 +2040,8 @@ omp_target_is_present (void *ptr, int device_num)
|
||||||
if (devicep == NULL)
|
if (devicep == NULL)
|
||||||
return 0;
|
return 0;
|
||||||
|
|
||||||
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
return 1;
|
return 1;
|
||||||
|
|
||||||
gomp_mutex_lock (&devicep->lock);
|
gomp_mutex_lock (&devicep->lock);
|
||||||
|
|
@ -1976,7 +2071,8 @@ omp_target_memcpy (void *dst, void *src, size_t length, size_t dst_offset,
|
||||||
if (dst_devicep == NULL)
|
if (dst_devicep == NULL)
|
||||||
return EINVAL;
|
return EINVAL;
|
||||||
|
|
||||||
if (!(dst_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
if (!(dst_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
|| dst_devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
dst_devicep = NULL;
|
dst_devicep = NULL;
|
||||||
}
|
}
|
||||||
if (src_device_num != GOMP_DEVICE_HOST_FALLBACK)
|
if (src_device_num != GOMP_DEVICE_HOST_FALLBACK)
|
||||||
|
|
@ -1988,7 +2084,8 @@ omp_target_memcpy (void *dst, void *src, size_t length, size_t dst_offset,
|
||||||
if (src_devicep == NULL)
|
if (src_devicep == NULL)
|
||||||
return EINVAL;
|
return EINVAL;
|
||||||
|
|
||||||
if (!(src_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
if (!(src_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
|| src_devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
src_devicep = NULL;
|
src_devicep = NULL;
|
||||||
}
|
}
|
||||||
if (src_devicep == NULL && dst_devicep == NULL)
|
if (src_devicep == NULL && dst_devicep == NULL)
|
||||||
|
|
@ -2118,7 +2215,8 @@ omp_target_memcpy_rect (void *dst, void *src, size_t element_size,
|
||||||
if (dst_devicep == NULL)
|
if (dst_devicep == NULL)
|
||||||
return EINVAL;
|
return EINVAL;
|
||||||
|
|
||||||
if (!(dst_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
if (!(dst_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
|| dst_devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
dst_devicep = NULL;
|
dst_devicep = NULL;
|
||||||
}
|
}
|
||||||
if (src_device_num != GOMP_DEVICE_HOST_FALLBACK)
|
if (src_device_num != GOMP_DEVICE_HOST_FALLBACK)
|
||||||
|
|
@ -2130,7 +2228,8 @@ omp_target_memcpy_rect (void *dst, void *src, size_t element_size,
|
||||||
if (src_devicep == NULL)
|
if (src_devicep == NULL)
|
||||||
return EINVAL;
|
return EINVAL;
|
||||||
|
|
||||||
if (!(src_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
if (!(src_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
|| src_devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
src_devicep = NULL;
|
src_devicep = NULL;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
@ -2166,7 +2265,8 @@ omp_target_associate_ptr (void *host_ptr, void *device_ptr, size_t size,
|
||||||
if (devicep == NULL)
|
if (devicep == NULL)
|
||||||
return EINVAL;
|
return EINVAL;
|
||||||
|
|
||||||
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||||
|
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||||
return EINVAL;
|
return EINVAL;
|
||||||
|
|
||||||
gomp_mutex_lock (&devicep->lock);
|
gomp_mutex_lock (&devicep->lock);
|
@@ -2309,6 +2409,7 @@ gomp_load_plugin_for_device (struct gomp_device_descr *device,
     {
       DLSYM (run);
       DLSYM (async_run);
+      DLSYM_OPT (can_run, can_run);
       DLSYM (dev2dev);
     }
   if (device->capabilities & GOMP_OFFLOAD_CAP_OPENACC_200)
@@ -582,6 +582,7 @@ GOMP_PLUGIN_target_task_completion (void *data)
       return;
     }
   ttask->state = GOMP_TARGET_TASK_FINISHED;
+  free (ttask->firstprivate_copies);
   gomp_target_task_completion (team, task);
   gomp_mutex_unlock (&team->task_lock);
 }
@@ -594,7 +595,7 @@ bool
 gomp_create_target_task (struct gomp_device_descr *devicep,
                          void (*fn) (void *), size_t mapnum, void **hostaddrs,
                          size_t *sizes, unsigned short *kinds,
-                         unsigned int flags, void **depend,
+                         unsigned int flags, void **depend, void **args,
                          enum gomp_target_task_state state)
 {
   struct gomp_thread *thr = gomp_thread ();
@@ -654,6 +655,7 @@ gomp_create_target_task (struct gomp_device_descr *devicep,
   ttask->devicep = devicep;
   ttask->fn = fn;
   ttask->mapnum = mapnum;
+  ttask->args = args;
   memcpy (ttask->hostaddrs, hostaddrs, mapnum * sizeof (void *));
   ttask->sizes = (size_t *) &ttask->hostaddrs[mapnum];
   memcpy (ttask->sizes, sizes, mapnum * sizeof (size_t));
@@ -111,6 +111,8 @@ FC = @FC@
 FCFLAGS = @FCFLAGS@
 FGREP = @FGREP@
 GREP = @GREP@
+HSA_RUNTIME_INCLUDE = @HSA_RUNTIME_INCLUDE@
+HSA_RUNTIME_LIB = @HSA_RUNTIME_LIB@
 INSTALL = @INSTALL@
 INSTALL_DATA = @INSTALL_DATA@
 INSTALL_PROGRAM = @INSTALL_PROGRAM@
@@ -155,6 +157,10 @@ PACKAGE_URL = @PACKAGE_URL@
 PACKAGE_VERSION = @PACKAGE_VERSION@
 PATH_SEPARATOR = @PATH_SEPARATOR@
 PERL = @PERL@
+PLUGIN_HSA = @PLUGIN_HSA@
+PLUGIN_HSA_CPPFLAGS = @PLUGIN_HSA_CPPFLAGS@
+PLUGIN_HSA_LDFLAGS = @PLUGIN_HSA_LDFLAGS@
+PLUGIN_HSA_LIBS = @PLUGIN_HSA_LIBS@
 PLUGIN_NVPTX = @PLUGIN_NVPTX@
 PLUGIN_NVPTX_CPPFLAGS = @PLUGIN_NVPTX_CPPFLAGS@
 PLUGIN_NVPTX_LDFLAGS = @PLUGIN_NVPTX_LDFLAGS@
@@ -1,3 +1,8 @@
+2016-01-19 Martin Jambor <mjambor@suse.cz>
+
+	* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_async_run): New
+	unused parameter.
+	(GOMP_OFFLOAD_run): Likewise.
 2015-12-14 Ilya Verbin <ilya.verbin@intel.com>
 
 	* plugin/libgomp-plugin-intelmic.cpp (unregister_main_image): Remove.
@@ -528,7 +528,7 @@ GOMP_OFFLOAD_dev2dev (int device, void *dst_ptr, const void *src_ptr,
 
 extern "C" void
 GOMP_OFFLOAD_async_run (int device, void *tgt_fn, void *tgt_vars,
-                        void *async_data)
+                        void **, void *async_data)
 {
   TRACE ("(device = %d, tgt_fn = %p, tgt_vars = %p, async_data = %p)", device,
          tgt_fn, tgt_vars, async_data);
@@ -544,7 +544,7 @@ GOMP_OFFLOAD_async_run (int device, void *tgt_fn, void *tgt_vars,
 }
 
 extern "C" void
-GOMP_OFFLOAD_run (int device, void *tgt_fn, void *tgt_vars)
+GOMP_OFFLOAD_run (int device, void *tgt_fn, void *tgt_vars, void **)
 {
   TRACE ("(device = %d, tgt_fn = %p, tgt_vars = %p)", device, tgt_fn, tgt_vars);
 