Commit 14bb1e8c authored by Yonghong Song's avatar Yonghong Song Committed by Daniel Borkmann
Browse files

selftests/bpf: Fix flaky test btf_map_in_map/lookup_update



Recently, I frequently hit the following test failure:

  [root@arch-fb-vm1 bpf]# ./test_progs -n 33/1
  test_lookup_update:PASS:skel_open 0 nsec
  [...]
  test_lookup_update:PASS:sync_rcu 0 nsec
  test_lookup_update:FAIL:map1_leak inner_map1 leaked!
  #33/1    btf_map_in_map/lookup_update:FAIL
  #33      btf_map_in_map:FAIL

In the test, after map is closed and then after two rcu grace periods,
it is assumed that map_id is not available to user space.

But the above assumption cannot be guaranteed. After zero or one
or two rcu grace periods in different siturations, the actual
freeing-map-work is put into a workqueue. Later on, when the work
is dequeued, the map will be actually freed.
See bpf_map_put() in kernel/bpf/syscall.c.

By using workqueue, there is no ganrantee that map will be actually
freed after a couple of rcu grace periods. This patch removed
such map leak detection and then the test can pass consistently.

Signed-off-by: default avatarYonghong Song <yonghong.song@linux.dev>
Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240322061353.632136-1-yonghong.song@linux.dev
parent 770546ae
Loading
Loading
Loading
Loading
+1 −25
Original line number Diff line number Diff line
@@ -25,7 +25,7 @@ static void test_lookup_update(void)
	int map1_fd, map2_fd, map3_fd, map4_fd, map5_fd, map1_id, map2_id;
	int outer_arr_fd, outer_hash_fd, outer_arr_dyn_fd;
	struct test_btf_map_in_map *skel;
	int err, key = 0, val, i, fd;
	int err, key = 0, val, i;

	skel = test_btf_map_in_map__open_and_load();
	if (CHECK(!skel, "skel_open", "failed to open&load skeleton\n"))
@@ -102,30 +102,6 @@ static void test_lookup_update(void)
	CHECK(map1_id == 0, "map1_id", "failed to get ID 1\n");
	CHECK(map2_id == 0, "map2_id", "failed to get ID 2\n");

	test_btf_map_in_map__destroy(skel);
	skel = NULL;

	/* we need to either wait for or force synchronize_rcu(), before
	 * checking for "still exists" condition, otherwise map could still be
	 * resolvable by ID, causing false positives.
	 *
	 * Older kernels (5.8 and earlier) freed map only after two
	 * synchronize_rcu()s, so trigger two, to be entirely sure.
	 */
	CHECK(kern_sync_rcu(), "sync_rcu", "failed\n");
	CHECK(kern_sync_rcu(), "sync_rcu", "failed\n");

	fd = bpf_map_get_fd_by_id(map1_id);
	if (CHECK(fd >= 0, "map1_leak", "inner_map1 leaked!\n")) {
		close(fd);
		goto cleanup;
	}
	fd = bpf_map_get_fd_by_id(map2_id);
	if (CHECK(fd >= 0, "map2_leak", "inner_map2 leaked!\n")) {
		close(fd);
		goto cleanup;
	}

cleanup:
	test_btf_map_in_map__destroy(skel);
}