mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
synced 2026-04-23 05:56:14 -04:00
ecb6cc0fd8cd2d34b983e118aa61dd8c9b052d0d
19203 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
8f75ba28b8 |
selftests: ublk: kublk: lift queue initialization out of thread
Currently, each ublk server I/O handler thread initializes its own queue. However, as we move towards decoupled ublk_queues and ublk server threads, this model does not make sense anymore, as there will no longer be a concept of a thread having "its own" queue. So lift queue initialization out of the per-thread ublk_io_handler_fn and into a loop in ublk_start_daemon (which runs once for each device). There is a part of ublk_queue_init (ring initialization) which does actually need to happen on the thread that will use the ring; that is separated into a separate ublk_thread_init which is still called by each I/O handler thread. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-4-e9d3b119336a@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
9773709752 |
selftests: ublk: kublk: tie sqe allocation to io instead of queue
We currently have a helper ublk_queue_alloc_sqes which the ublk targets use to allocate SQEs for their own operations. However, as we move towards decoupled ublk_queues and ublk server threads, this helper does not make sense anymore. SQEs are allocated from rings, and we will have one ring per thread to avoid locking. Change the SQE allocation helper to ublk_io_alloc_sqes. Currently this still allocates SQEs from the io's queue's ring, but when we fully decouple threads and queues, it will allocate from the io's thread's ring instead. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-3-e9d3b119336a@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
bf098d7269 |
selftests: ublk: kublk: plumb q_id in io_uring user_data
Currently, when we process CQEs, we know which ublk_queue we are working on because we know which ring we are working on, and ublk_queues and rings are in 1:1 correspondence. However, as we decouple ublk_queues from ublk server threads, ublk_queues and rings will no longer be in 1:1 correspondence - each ublk server thread will have a ring, and each thread may issue commands against more than one ublk_queue. So in order to know which ublk_queue a CQE refers to, plumb that information in the associated SQE's user_data. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-2-e9d3b119336a@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
4cb6c8af85 |
selftests/filesystems: Fix build of anon_inode_test
The newly added anon_inode_test test fails to build due to attempting to
include a nonexisting overlayfs/wrapper.h:
anon_inode_test.c:10:10: fatal error: overlayfs/wrappers.h: No such file or directory
10 | #include "overlayfs/wrappers.h"
| ^~~~~~~~~~~~~~~~~~~~~~
This is due to
|
||
|
|
b78f1293f9 |
Merge tag 'trace-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt: - Have module addresses get updated in the persistent ring buffer The addresses of the modules from the previous boot are saved in the persistent ring buffer. If the same modules are loaded and an address is in the old buffer points to an address that was both saved in the persistent ring buffer and is loaded in memory, shift the address to point to the address that is loaded in memory in the trace event. - Print function names for irqs off and preempt off callsites When ignoring the print fmt of a trace event and just printing the fields directly, have the fields for preempt off and irqs off events still show the function name (via kallsyms) instead of just showing the raw address. - Clean ups of the histogram code The histogram functions saved over 800 bytes on the stack to process events as they come in. Instead, create per-cpu buffers that can hold this information and have a separate location for each context level (thread, softirq, IRQ and NMI). Also add some more comments to the code. - Add "common_comm" field for histograms Add "common_comm" that uses the current->comm as a field in an event histogram and acts like any of the other fields of the event. - Show "subops" in the enabled_functions file When the function graph infrastructure is used, a subsystem has a "subops" that it attaches its callback function to. Instead of the enabled_functions just showing a function calling the function that calls the subops functions, also show the subops functions that will get called for that function too. - Add "copy_trace_marker" option to instances There are cases where an instance is created for tooling to write into, but the old tooling has the top level instance hardcoded into the application. New tools want to consume the data from an instance and not the top level buffer. By adding a copy_trace_marker option, whenever the top instance trace_marker is written into, a copy of it is also written into the instance with this option set. This allows new tools to read what old tools are writing into the top buffer. If this option is cleared by the top instance, then what is written into the trace_marker is not written into the top instance. This is a way to redirect the trace_marker writes into another instance. - Have tracepoints created by DECLARE_TRACE() use trace_<name>_tp() If a tracepoint is created by DECLARE_TRACE() instead of TRACE_EVENT(), then it will not be exposed via tracefs. Currently there's no way to differentiate in the kernel the tracepoint functions between those that are exposed via tracefs or not. A calling convention has been made manually to append a "_tp" prefix for events created by DECLARE_TRACE(). Instead of doing this manually, force it so that all DECLARE_TRACE() events have this notation. - Use __string() for task->comm in some sched events Instead of hardcoding the comm to be TASK_COMM_LEN in some of the scheduler events use __string() which makes it dynamic. Note, if these events are parsed by user space it they may break, and the event may have to be converted back to the hardcoded size. - Have function graph "depth" be unsigned to the user Internally to the kernel, the "depth" field of the function graph event is signed due to -1 being used for end of boundary. What actually gets recorded in the event itself is zero or positive. Reflect this to user space by showing "depth" as unsigned int and be consistent across all events. - Allow an arbitrary long CPU string to osnoise_cpus_write() The filtering of which CPUs to write to can exceed 256 bytes. If a machine has 256 CPUs, and the filter is to filter every other CPU, the write would take a string larger than 256 bytes. Instead of using a fixed size buffer on the stack that is 256 bytes, allocate it to handle what is passed in. - Stop having ftrace check the per-cpu data "disabled" flag The "disabled" flag in the data structure passed to most ftrace functions is checked to know if tracing has been disabled or not. This flag was added back in 2008 before the ring buffer had its own way to disable tracing. The "disable" flag is now not always set when needed, and the ring buffer flag should be used in all locations where the disabled is needed. Since the "disable" flag is redundant and incorrect, stop using it. Fix up some locations that use the "disable" flag to use the ring buffer info. - Use a new tracer_tracing_disable/enable() instead of data->disable flag There's a few cases that set the data->disable flag to stop tracing, but this flag is not consistently used. It is also an on/off switch where if a function set it and calls another function that sets it, the called function may incorrectly enable it. Use a new trace_tracing_disable() and tracer_tracing_enable() that uses a counter and can be nested. These use the ring buffer flags which are always checked making the disabling more consistent. - Save the trace clock in the persistent ring buffer Save what clock was used for tracing in the persistent ring buffer and set it back to that clock after a reboot. - Remove unused reference to a per CPU data pointer in mmiotrace functions - Remove unused buffer_page field from trace_array_cpu structure - Remove more strncpy() instances - Other minor clean ups and fixes * tag 'trace-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (36 commits) tracing: Fix compilation warning on arm32 tracing: Record trace_clock and recover when reboot tracing/sched: Use __string() instead of fixed lengths for task->comm tracepoint: Have tracepoints created with DECLARE_TRACE() have _tp suffix tracing: Cleanup upper_empty() in pid_list tracing: Allow the top level trace_marker to write into another instances tracing: Add a helper function to handle the dereference arg in verifier tracing: Remove unnecessary "goto out" that simply returns ret is trigger code tracing: Fix error handling in event_trigger_parse() tracing: Rename event_trigger_alloc() to trigger_data_alloc() tracing: Replace deprecated strncpy() with strscpy() for stack_trace_filter_buf tracing: Remove unused buffer_page field from trace_array_cpu structure tracing: Use atomic_inc_return() for updating "disabled" counter in irqsoff tracer tracing: Convert the per CPU "disabled" counter to local from atomic tracing: branch: Use trace_tracing_is_on_cpu() instead of "disabled" field ring-buffer: Add ring_buffer_record_is_on_cpu() tracing: Do not use per CPU array_buffer.data->disabled for cpumask ftrace: Do not disabled function graph based on "disabled" field tracing: kdb: Use tracer_tracing_on/off() instead of setting per CPU disabled tracing: Use tracer_tracing_disable() instead of "disabled" field for ftrace_dump_one() ... |
||
|
|
43db111107 |
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"As far as x86 goes this pull request "only" includes TDX host support.
Quotes are appropriate because (at 6k lines and 100+ commits) it is
much bigger than the rest, which will come later this week and
consists mostly of bugfixes and selftests. s390 changes will also come
in the second batch.
ARM:
- Add large stage-2 mapping (THP) support for non-protected guests
when pKVM is enabled, clawing back some performance.
- Enable nested virtualisation support on systems that support it,
though it is disabled by default.
- Add UBSAN support to the standalone EL2 object used in nVHE/hVHE
and protected modes.
- Large rework of the way KVM tracks architecture features and links
them with the effects of control bits. While this has no functional
impact, it ensures correctness of emulation (the data is
automatically extracted from the published JSON files), and helps
dealing with the evolution of the architecture.
- Significant changes to the way pKVM tracks ownership of pages,
avoiding page table walks by storing the state in the hypervisor's
vmemmap. This in turn enables the THP support described above.
- New selftest checking the pKVM ownership transition rules
- Fixes for FEAT_MTE_ASYNC being accidentally advertised to guests
even if the host didn't have it.
- Fixes for the address translation emulation, which happened to be
rather buggy in some specific contexts.
- Fixes for the PMU emulation in NV contexts, decoupling PMCR_EL0.N
from the number of counters exposed to a guest and addressing a
number of issues in the process.
- Add a new selftest for the SVE host state being corrupted by a
guest.
- Keep HCR_EL2.xMO set at all times for systems running with the
kernel at EL2, ensuring that the window for interrupts is slightly
bigger, and avoiding a pretty bad erratum on the AmpereOne HW.
- Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers
from a pretty bad case of TLB corruption unless accesses to HCR_EL2
are heavily synchronised.
- Add a per-VM, per-ITS debugfs entry to dump the state of the ITS
tables in a human-friendly fashion.
- and the usual random cleanups.
LoongArch:
- Don't flush tlb if the host supports hardware page table walks.
- Add KVM selftests support.
RISC-V:
- Add vector registers to get-reg-list selftest
- VCPU reset related improvements
- Remove scounteren initialization from VCPU reset
- Support VCPU reset from userspace using set_mpstate() ioctl
x86:
- Initial support for TDX in KVM.
This finally makes it possible to use the TDX module to run
confidential guests on Intel processors. This is quite a large
series, including support for private page tables (managed by the
TDX module and mirrored in KVM for efficiency), forwarding some
TDVMCALLs to userspace, and handling several special VM exits from
the TDX module.
This has been in the works for literally years and it's not really
possible to describe everything here, so I'll defer to the various
merge commits up to and including commit
|
||
|
|
90b83efa67 |
Merge tag 'bpf-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
- Fix and improve BTF deduplication of identical BTF types (Alan
Maguire and Andrii Nakryiko)
- Support up to 12 arguments in BPF trampoline on arm64 (Xu Kuohai and
Alexis Lothoré)
- Support load-acquire and store-release instructions in BPF JIT on
riscv64 (Andrea Parri)
- Fix uninitialized values in BPF_{CORE,PROBE}_READ macros (Anton
Protopopov)
- Streamline allowed helpers across program types (Feng Yang)
- Support atomic update for hashtab of BPF maps (Hou Tao)
- Implement json output for BPF helpers (Ihor Solodrai)
- Several s390 JIT fixes (Ilya Leoshkevich)
- Various sockmap fixes (Jiayuan Chen)
- Support mmap of vmlinux BTF data (Lorenz Bauer)
- Support BPF rbtree traversal and list peeking (Martin KaFai Lau)
- Tests for sockmap/sockhash redirection (Michal Luczaj)
- Introduce kfuncs for memory reads into dynptrs (Mykyta Yatsenko)
- Add support for dma-buf iterators in BPF (T.J. Mercier)
- The verifier support for __bpf_trap() (Yonghong Song)
* tag 'bpf-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (135 commits)
bpf, arm64: Remove unused-but-set function and variable.
selftests/bpf: Add tests with stack ptr register in conditional jmp
bpf: Do not include stack ptr register in precision backtracking bookkeeping
selftests/bpf: enable many-args tests for arm64
bpf, arm64: Support up to 12 function arguments
bpf: Check rcu_read_lock_trace_held() in bpf_map_lookup_percpu_elem()
bpf: Avoid __bpf_prog_ret0_warn when jit fails
bpftool: Add support for custom BTF path in prog load/loadall
selftests/bpf: Add unit tests with __bpf_trap() kfunc
bpf: Warn with __bpf_trap() kfunc maybe due to uninitialized variable
bpf: Remove special_kfunc_set from verifier
selftests/bpf: Add test for open coded dmabuf_iter
selftests/bpf: Add test for dmabuf_iter
bpf: Add open coded dmabuf iterator
bpf: Add dmabuf iterator
dma-buf: Rename debugfs symbols
bpf: Fix error return value in bpf_copy_from_user_dynptr
libbpf: Use mmap to parse vmlinux BTF from sysfs
selftests: bpf: Add a test for mmapable vmlinux BTF
btf: Allow mmap of vmlinux btf
...
|
||
|
|
1b98f357da |
Merge tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core:
- Implement the Device Memory TCP transmit path, allowing zero-copy
data transmission on top of TCP from e.g. GPU memory to the wire.
- Move all the IPv6 routing tables management outside the RTNL scope,
under its own lock and RCU. The route control path is now 3x times
faster.
- Convert queue related netlink ops to instance lock, reducing again
the scope of the RTNL lock. This improves the control plane
scalability.
- Refactor the software crc32c implementation, removing unneeded
abstraction layers and improving significantly the related
micro-benchmarks.
- Optimize the GRO engine for UDP-tunneled traffic, for a 10%
performance improvement in related stream tests.
- Cover more per-CPU storage with local nested BH locking; this is a
prep work to remove the current per-CPU lock in local_bh_disable()
on PREMPT_RT.
- Introduce and use nlmsg_payload helper, combining buffer bounds
verification with accessing payload carried by netlink messages.
Netfilter:
- Rewrite the procfs conntrack table implementation, improving
considerably the dump performance. A lot of user-space tools still
use this interface.
- Implement support for wildcard netdevice in netdev basechain and
flowtables.
- Integrate conntrack information into nft trace infrastructure.
- Export set count and backend name to userspace, for better
introspection.
BPF:
- BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops
programs and can be controlled in similar way to traditional qdiscs
using the "tc qdisc" command.
- Refactor the UDP socket iterator, addressing long standing issues
WRT duplicate hits or missed sockets.
Protocols:
- Improve TCP receive buffer auto-tuning and increase the default
upper bound for the receive buffer; overall this improves the
single flow maximum thoughput on 200Gbs link by over 60%.
- Add AFS GSSAPI security class to AF_RXRPC; it provides transport
security for connections to the AFS fileserver and VL server.
- Improve TCP multipath routing, so that the sources address always
matches the nexthop device.
- Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS,
and thus preventing DoS caused by passing around problematic FDs.
- Retire DCCP socket. DCCP only receives updates for bugs, and major
distros disable it by default. Its removal allows for better
organisation of TCP fields to reduce the number of cache lines hit
in the fast path.
- Extend TCP drop-reason support to cover PAWS checks.
Driver API:
- Reorganize PTP ioctl flag support to require an explicit opt-in for
the drivers, avoiding the problem of drivers not rejecting new
unsupported flags.
- Converted several device drivers to timestamping APIs.
- Introduce per-PHY ethtool dump helpers, improving the support for
dump operations targeting PHYs.
Tests and tooling:
- Add support for classic netlink in user space C codegen, so that
ynl-c can now read, create and modify links, routes addresses and
qdisc layer configuration.
- Add ynl sub-types for binary attributes, allowing ynl-c to output
known struct instead of raw binary data, clarifying the classic
netlink output.
- Extend MPTCP selftests to improve the code-coverage.
- Add tests for XDP tail adjustment in AF_XDP.
New hardware / drivers:
- OpenVPN virtual driver: offload OpenVPN data channels processing to
the kernel-space, increasing the data transfer throughput WRT the
user-space implementation.
- Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC.
- Broadcom asp-v3.0 ethernet driver.
- AMD Renoir ethernet device.
- ReakTek MT9888 2.5G ethernet PHY driver.
- Aeonsemi 10G C45 PHYs driver.
Drivers:
- Ethernet high-speed NICs:
- nVidia/Mellanox (mlx5):
- refactor the steering table handling to significantly
reduce the amount of memory used
- add support for complex matches in H/W flow steering
- improve flow streeing error handling
- convert to netdev instance locking
- Intel (100G, ice, igb, ixgbe, idpf):
- ice: add switchdev support for LLDP traffic over VF
- ixgbe: add firmware manipulation and regions devlink support
- igb: introduce support for frame transmission premption
- igb: adds persistent NAPI configuration
- idpf: introduce RDMA support
- idpf: add initial PTP support
- Meta (fbnic):
- extend hardware stats coverage
- add devlink dev flash support
- Broadcom (bnxt):
- add support for RX-side device memory TCP
- Wangxun (txgbe):
- implement support for udp tunnel offload
- complete PTP and SRIOV support for AML 25G/10G devices
- Ethernet NICs embedded and virtual:
- Google (gve):
- add device memory TCP TX support
- Amazon (ena):
- support persistent per-NAPI config
- Airoha:
- add H/W support for L2 traffic offload
- add per flow stats for flow offloading
- RealTek (rtl8211): add support for WoL magic packet
- Synopsys (stmmac):
- dwmac-socfpga 1000BaseX support
- add Loongson-2K3000 support
- introduce support for hardware-accelerated VLAN stripping
- Broadcom (bcmgenet):
- expose more H/W stats
- Freescale (enetc, dpaa2-eth):
- enetc: add MAC filter, VLAN filter RSS and loopback support
- dpaa2-eth: convert to H/W timestamping APIs
- vxlan: convert FDB table to rhashtable, for better scalabilty
- veth: apply qdisc backpressure on full ring to reduce TX drops
- Ethernet switches:
- Microchip (kzZ88x3): add ETS scheduler support
- Ethernet PHYs:
- RealTek (rtl8211):
- add support for WoL magic packet
- add support for PHY LEDs
- CAN:
- Adds RZ/G3E CANFD support to the rcar_canfd driver.
- Preparatory work for CAN-XL support.
- Add self-tests framework with support for CAN physical interfaces.
- WiFi:
- mac80211:
- scan improvements with multi-link operation (MLO)
- Qualcomm (ath12k):
- enable AHB support for IPQ5332
- add monitor interface support to QCN9274
- add multi-link operation support to WCN7850
- add 802.11d scan offload support to WCN7850
- monitor mode for WCN7850, better 6 GHz regulatory
- Qualcomm (ath11k):
- restore hibernation support
- MediaTek (mt76):
- WiFi-7 improvements
- implement support for mt7990
- Intel (iwlwifi):
- enhanced multi-link single-radio (EMLSR) support on 5 GHz links
- rework device configuration
- RealTek (rtw88):
- improve throughput for RTL8814AU
- RealTek (rtw89):
- add multi-link operation support
- STA/P2P concurrency improvements
- support different SAR configs by antenna
- Bluetooth:
- introduce HCI Driver protocol
- btintel_pcie: do not generate coredump for diagnostic events
- btusb: add HCI Drv commands for configuring altsetting
- btusb: add RTL8851BE device 0x0bda:0xb850
- btusb: add new VID/PID 13d3/3584 for MT7922
- btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925
- btnxpuart: implement host-wakeup feature"
* tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1611 commits)
selftests/bpf: Fix bpf selftest build warning
selftests: netfilter: Fix skip of wildcard interface test
net: phy: mscc: Stop clearing the the UDPv4 checksum for L2 frames
net: openvswitch: Fix the dead loop of MPLS parse
calipso: Don't call calipso functions for AF_INET sk.
selftests/tc-testing: Add a test for HFSC eltree double add with reentrant enqueue behaviour on netem
net_sched: hfsc: Address reentrant enqueue adding class to eltree twice
octeontx2-pf: QOS: Refactor TC_HTB_LEAF_DEL_LAST callback
octeontx2-pf: QOS: Perform cache sync on send queue teardown
net: mana: Add support for Multi Vports on Bare metal
net: devmem: ncdevmem: remove unused variable
net: devmem: ksft: upgrade rx test to send 1K data
net: devmem: ksft: add 5 tuple FS support
net: devmem: ksft: add exit_wait to make rx test pass
net: devmem: ksft: add ipv4 support
net: devmem: preserve sockc_err
page_pool: fix ugly page_pool formatting
net: devmem: move list_add to net_devmem_bind_dmabuf.
selftests: netfilter: nft_queue.sh: include file transfer duration in log message
net: phy: mscc: Fix memory leak when using one step timestamping
...
|
||
|
|
47cf96fbe3 |
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"The headline feature is the re-enablement of support for Arm's
Scalable Matrix Extension (SME) thanks to a bumper crop of fixes
from Mark Rutland.
If matrices aren't your thing, then Ryan's page-table optimisation
work is much more interesting.
Summary:
ACPI, EFI and PSCI:
- Decouple Arm's "Software Delegated Exception Interface" (SDEI)
support from the ACPI GHES code so that it can be used by platforms
booted with device-tree
- Remove unnecessary per-CPU tracking of the FPSIMD state across EFI
runtime calls
- Fix a node refcount imbalance in the PSCI device-tree code
CPU Features:
- Ensure register sanitisation is applied to fields in ID_AA64MMFR4
- Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM
guests can reliably query the underlying CPU types from the VMM
- Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes
to our context-switching, signal handling and ptrace code
Entry code:
- Hook up TIF_NEED_RESCHED_LAZY so that CONFIG_PREEMPT_LAZY can be
selected
Memory management:
- Prevent BSS exports from being used by the early PI code
- Propagate level and stride information to the low-level TLB
invalidation routines when operating on hugetlb entries
- Use the page-table contiguous hint for vmap() mappings with
VM_ALLOW_HUGE_VMAP where possible
- Optimise vmalloc()/vmap() page-table updates to use "lazy MMU mode"
and hook this up on arm64 so that the trailing DSB (used to publish
the updates to the hardware walker) can be deferred until the end
of the mapping operation
- Extend mmap() randomisation for 52-bit virtual addresses (on par
with 48-bit addressing) and remove limited support for
randomisation of the linear map
Perf and PMUs:
- Add support for probing the CMN-S3 driver using ACPI
- Minor driver fixes to the CMN, Arm-NI and amlogic PMU drivers
Selftests:
- Fix FPSIMD and SME tests to align with the freshly re-enabled SME
support
- Fix default setting of the OUTPUT variable so that tests are
installed in the right location
vDSO:
- Replace raw counter access from inline assembly code with a call to
the the __arch_counter_get_cntvct() helper function
Miscellaneous:
- Add some missing header inclusions to the CCA headers
- Rework rendering of /proc/cpuinfo to follow the x86-approach and
avoid repeated buffer expansion (the user-visible format remains
identical)
- Remove redundant selection of CONFIG_CRC32
- Extend early error message when failing to map the device-tree
blob"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (83 commits)
arm64: cputype: Add cputype definition for HIP12
arm64: el2_setup.h: Make __init_el2_fgt labels consistent, again
perf/arm-cmn: Add CMN S3 ACPI binding
arm64/boot: Disallow BSS exports to startup code
arm64/boot: Move global CPU override variables out of BSS
arm64/boot: Move init_pgdir[] and init_idmap_pgdir[] into __pi_ namespace
perf/arm-cmn: Initialise cmn->cpu earlier
kselftest/arm64: Set default OUTPUT path when undefined
arm64: Update comment regarding values in __boot_cpu_mode
arm64: mm: Drop redundant check in pmd_trans_huge()
arm64/mm: Re-organise setting up FEAT_S1PIE registers PIRE0_EL1 and PIR_EL1
arm64/mm: Permit lazy_mmu_mode to be nested
arm64/mm: Disable barrier batching in interrupt contexts
arm64/cpuinfo: only show one cpu's info in c_show()
arm64/mm: Batch barriers when updating kernel mappings
mm/vmalloc: Enter lazy mmu mode while manipulating vmalloc ptes
arm64/mm: Support huge pte-mapped pages in vmap
mm/vmalloc: Gracefully unmap huge ptes
mm/vmalloc: Warn on improper use of vunmap_range()
arm64/mm: Hoist barriers out of set_ptes_anysz() loop
...
|
||
|
|
96d40793ab |
Merge tag 'seccomp-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp updates from Kees Cook: - selftest fixes for arm32 (Neill Kapron, Terry Tritton) - documentation typo fix (Sumanth Gavini) * tag 'seccomp-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: selftests: seccomp: Fix "performace" to "performance" selftests/seccomp: fix negative_ENOSYS tracer tests on arm32 selftests/seccomp: fix syscall_restart test for arm compat |
||
|
|
f6bd8faeb1 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in late fixes to prepare for the 6.16 net-next PR. No conflicts nor adjacent changes. Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
||
|
|
6da5f1b4b4 |
selftests: netfilter: Fix skip of wildcard interface test
The script is supposed to skip wildcard interface testing if unsupported
by the host's nft tool. The failing check caused script abort due to
'set -e' though. Fix this by running the potentially failing nft command
inside the if-conditional pipe.
Fixes:
|
||
|
|
2945ff733d |
selftests/tc-testing: Add a test for HFSC eltree double add with reentrant enqueue behaviour on netem
Reproduce the UAF scenario where netem is a child of HFSC and HFSC is configured to use the eltree. In such case, this TDC test would cause the HFSC class to be added to the eltree twice resulting in a UAF. Reviewed-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Link: https://patch.msgid.link/20250522181448.1439717-3-pctammela@mojatatu.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
||
|
|
feacb1774b |
Merge tag 'sched_ext-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext updates from Tejun Heo: - More in-kernel idle CPU selection improvements. Expand topology awareness coverage add scx_bpf_select_cpu_and() to allow more flexibility. The idle CPU selection kfuncs can now be called from unlocked contexts too. - A bunch of reorganization changes to lay the foundation for multiple hierarchical scheduler support. This isn't ready yet and the included changes don't make meaningful behavior differences. One notable change is replacing some static_key tests with dynamic tests as the test results may differ depending on the scheduler instance. This isn't expected to cause meaningful performance difference. - Other minor and doc updates. - There were multiple patches in for-6.15-fixes which conflicted with changes in for-6.16. for-6.15-fixes were pulled three times into for-6.16 to resolve the conflicts. * tag 'sched_ext-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (49 commits) sched_ext: Call ops.update_idle() after updating builtin idle bits sched_ext, docs: convert mentions of "CFS" to "fair-class scheduler" selftests/sched_ext: Update test enq_select_cpu_fails sched_ext: idle: Consolidate default idle CPU selection kfuncs selftests/sched_ext: Add test for scx_bpf_select_cpu_and() via test_run sched_ext: idle: Allow scx_bpf_select_cpu_and() from unlocked context sched_ext: idle: Validate locking correctness in scx_bpf_select_cpu_and() sched_ext: Make scx_kf_allowed_if_unlocked() available outside ext.c sched_ext, docs: add label sched_ext: Explain the temporary situation around scx_root dereferences sched_ext: Add @sch to SCX_CALL_OP*() sched_ext: Cleanup [__]scx_exit/error*() sched_ext: Add @sch to SCX_CALL_OP*() sched_ext: Clean up scx_root usages Documentation: scheduler: Changed lowercase acronyms to uppercase sched_ext: Avoid NULL scx_root deref in __scx_exit() sched_ext: Add RCU protection to scx_root in DSQ iterator sched_ext: Clean up SCX_EXIT_NONE handling in scx_disable_workfn() sched_ext: Move disable machinery into scx_sched sched_ext: Move event_stats_cpu into scx_sched ... |
||
|
|
3b66e6b3c0 |
Merge tag 'cgroup-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo: - cgroup rstat shared the tracking tree across all controllers with the rationale being that a cgroup which is using one resource is likely to be using other resources at the same time (ie. if something is allocating memory, it's probably consuming CPU cycles). However, this turned out to not scale very well especially with memcg using rstat for internal operations which made memcg stat read and flush patterns substantially different from other controllers. JP Kobryn split the rstat tree per controller. - cgroup BPF support was hooking into cgroup init/exit paths directly. Convert them to use a notifier chain instead so that other usages can be added easily. The two of the patches which implement this are mislabeled as belonging to sched_ext instead of cgroup. Sorry. - Relatively minor cpuset updates - Documentation updates * tag 'cgroup-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (23 commits) sched_ext: Convert cgroup BPF support to use cgroup_lifetime_notifier sched_ext: Introduce cgroup_lifetime_notifier cgroup: Minor reorganization of cgroup_create() cgroup, docs: cpu controller's interaction with various scheduling policies cgroup, docs: convert space indentation to tab indentation cgroup: avoid per-cpu allocation of size zero rstat cpu locks cgroup, docs: be specific about bandwidth control of rt processes cgroup: document the rstat per-cpu initialization cgroup: helper for checking rstat participation of css cgroup: use subsystem-specific rstat locks to avoid contention cgroup: use separate rstat trees for each subsystem cgroup: compare css to cgroup::self in helper for distingushing css cgroup: warn on rstat usage by early init subsystems cgroup/cpuset: drop useless cpumask_empty() in compute_effective_exclusive_cpumask() cgroup/rstat: Improve cgroup_rstat_push_children() documentation cgroup: fix goto ordering in cgroup_init() cgroup: fix pointer check in css_rstat_init() cgroup/cpuset: Add warnings to catch inconsistency in exclusive CPUs cgroup/cpuset: Fix obsolete comment in cpuset_css_offline() cgroup/cpuset: Always use cpu_active_mask ... |
||
|
|
f1975e4765 |
Merge tag 'sysctl-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl
Pull sysctl updates from Joel Granados: - Move kern_table members out of kernel/sysctl.c Moved a subset (tracing, panic, signal, stack_tracer and sparc) out of the kern_table array. The goal is for kern_table to only have sysctl elements. All this increases modularity by placing the ctl_tables closer to where they are used while reducing the chances of merge conflicts in kernel/sysctl.c. - Fixed sysctl unit test panic by relocating it to selftests * tag 'sysctl-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: sysctl: Close test ctl_headers with a for loop sysctl: call sysctl tests with a for loop sysctl: Add 0012 to test the u8 range check sysctl: move u8 register test to lib/test_sysctl.c sparc: mv sparc sysctls into their own file under arch/sparc/kernel stack_tracer: move sysctl registration to kernel/trace/trace_stack.c tracing: Move trace sysctls into trace.c signal: Move signal ctl tables into signal.c panic: Move panic ctl tables into panic.c |
||
|
|
affffcbb87 |
net: devmem: ncdevmem: remove unused variable
This variable is unused and can be removed. Signed-off-by: Mina Almasry <almasrymina@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250523230524.1107879-9-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
baa18bc535 |
net: devmem: ksft: upgrade rx test to send 1K data
The current test just sends "hello\nworld" and verifies that is the string received on the RX side. That is fine, but improve the test a bit by sending 1K data. The test should be improved further to send more data, but for now this should be a welcome improvement. The test will send a repeating pattern of 0x01, 0x02, ... 0x06. The ncdevmem `-v 7` flag will verify this pattern. ncdevmem will provide useful debugging info when the test fails, such as the frags received and verified fine, and which frag exactly failed, what was the expected byte pattern, and what is the actual byte pattern received. All this debug information will be useful when the test fails. Signed-off-by: Mina Almasry <almasrymina@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250523230524.1107879-8-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
243d47a5e1 |
net: devmem: ksft: add 5 tuple FS support
ncdevmem supports drivers that are limited to either 3-tuple or 5-tuple FS support, but the ksft is currently 3-tuple only. Support drivers that have 5-tuple FS supported by adding a ksft arg. Signed-off-by: Mina Almasry <almasrymina@google.com> fix 5-tuple fix 5-tuple Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250523230524.1107879-7-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
57605ae8e1 |
net: devmem: ksft: add exit_wait to make rx test pass
This exit_wait seems necessary to make the rx side test pass for me. I think this is just missed from the original test add patch. Add it now. Signed-off-by: Mina Almasry <almasrymina@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250523230524.1107879-6-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
12d31142e6 |
net: devmem: ksft: add ipv4 support
ncdevmem supports both ipv4 and ipv6, but the ksft is currently ipv6-only. Propagate the ipv4 support to the ksft, so that folks that are limited to these networks can also test. Signed-off-by: Mina Almasry <almasrymina@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250523230524.1107879-5-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
429d410bf9 |
selftests: netfilter: nft_queue.sh: include file transfer duration in log message
Paolo Abeni says: Recently the nipa CI infra went through some tuning, and the mentioned self-test now often fails. The failing test is the sctp+nfqueue one, where the file transfer takes too long and hits the timeout (1 minute). Because SCTP nfqueue tests had timeout related issues before (esp. on debug kernels) print the file transfer duration in the PASS/FAIL message. This would aallow us to see if there is/was an unexpected slowdown (CI keeps logs around) or 'creeping slowdown' where things got slower over time until 'fail point' was reached. Output of altered lines looks like this: PASS: tcp and nfqueue in forward chan (duration: 2s) PASS: tcp via loopback (duration: 2s) PASS: sctp and nfqueue in forward chain (duration: 42s) PASS: sctp and nfqueue in output chain with GSO (duration: 21s) Reported-by: Paolo Abeni <pabeni@redhat.com Closes: https://lore.kernel.org/netdev/584524ef-9fd7-4326-9f1b-693ca62c5692@redhat.com/ Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://patch.msgid.link/20250523121700.20011-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
d9d836bfa5 |
selftests: net: move wait_local_port_listen to lib.sh
The function wait_local_port_listen() is the only function defined in net_helper.sh. Since some tests source both lib.sh and net_helper.sh, we can simplify the setup by moving wait_local_port_listen() to lib.sh. With this change, net_helper.sh becomes redundant and can be removed. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250526014600.9128-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
aacc73ceeb |
Merge tag 'gpio-updates-for-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio updates from Bartosz Golaszewski:
"We have three new drivers, some refactoring in the GPIO core, lots of
various changes across many drivers, new configfs interface for the
virtual gpio-aggregator module and DT-bindings updates.
The treewide conversion of GPIO drivers to using the new value setter
callbacks is ongoing with another round of GPIO drivers updated. You
will also see these commits coming in from other subsystems as with
the relevant changes merged into mainline last cycle, I've started
converting GPIO providers located elsewhere than drivers/gpio/.
GPIO core:
- use more lock guards where applicable
- refactor GPIO ACPI code and shrink it in the process by 8%
- move GPIO ACPI quirks into a separate file
- remove unneeded #ifdef
- convert GPIO devres helpers to using devm_add_action() where
applicable which shrinks and simplifies the code
- refactor GPIO descriptor validation in GPIO consumer interfaces
- don't allow setting values on input lines in the GPIO core which
will take off the burden from GPIO drivers of checking this down
the line
- provide gpiod_is_equal() as a way of safely comparing two GPIO
descriptors (the only current user is in regulator core)
New drivers:
- add the GPIO module for the max77759 multifunction device
- add the GPIO driver for the VeriSilicon BLZP1600 GPIO controller
- add the GPIO driver for the Spacemit K1 SoC
Driver improvements:
- convert more drivers to using the new GPIO line value setter
callbacks
- convert more drivers to making the irq_chip immutable as is
recommended by the interrupt subsystem
- extend build testing coverage by enabling more modules to be built
with COMPILE_TEST=y
- extend the gpio-aggregator module with a configfs interface that
makes the setup easier for user-space than the existing
driver-level sysfs attributes and also adds more advanced
configuration features (such as referring to aggregated lines by
their original names or modifying their names as exposed by the
aggregated chip)
- add a missing mutex_destroy() in gpio-imx-scu
- add an OF polarity quirk for s5m8767
- allow building gpio-vf610 as a loadable module
- make gpio-mxc not hardcode its GPIO base number with GPIO SYSFS
interface disabled (another small step towards getting rid of the
global GPIO numberspace)
- add support for level-triggered interrupts to gpio-pca953x
- don't double-check the ngpios property in gpio-ds4520 as GPIO core
already does it
- don't double-check the number of GPIOs in gpio-imx-scu as GPIO core
already does it
- remove unused callbacks from gpio-max3191x
DT bindings:
- add device-tree bindings for max77759, spacemit,k1 and blzp1600
(new drivers added this cycle)
- document more properties for gpio-vf610 and gpio-tegra186
- document a new pca95xx variant
- fix style of examples in several GPIO DT-binding documents
Misc:
- TODO list updates"
* tag 'gpio-updates-for-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (123 commits)
gpio: timberdale: select GPIOLIB_IRQCHIP
gpio: lpc18xx: select GPIOLIB_IRQCHIP
gpio: grgpio: select GPIOLIB_IRQCHIP
gpio: bcm-kona: select GPIOLIB_IRQCHIP
dt-bindings: gpio: vf610: add ngpios and gpio-reserved-ranges
gpio: davinci: select GPIOLIB_IRQCHIP
gpiolib-acpi: Update file references in the Documentation and MAINTAINERS
gpiolib: acpi: Move quirks to a separate file
gpiolib: acpi: Add acpi_gpio_need_run_edge_events_on_boot() getter
gpiolib: acpi: Handle deferred list via new API
gpiolib: acpi: Make sure we fill struct acpi_gpio_info
gpiolib: acpi: Switch to use enum in acpi_gpio_in_ignore_list()
gpiolib: acpi: Use temporary variable for struct acpi_gpio_info
gpiolib: remove unneeded #ifdef
gpio: mpc8xxx: select GPIOLIB_IRQCHIP
gpio: pxa: select GPIOLIB_IRQCHIP
gpio: pxa: Make irq_chip immutable
gpio: timberdale: Make irq_chip immutable
gpio: xgene-sb: Make irq_chip immutable
gpio: davinci: Make irq_chip immutable
...
|
||
|
|
5ffb537e41 |
selftests/bpf: Add tests with stack ptr register in conditional jmp
Add two tests:
- one test has 'rX <op> r10' where rX is not r10, and
- another test has 'rX <op> rY' where rX and rY are not r10
but there is an early insn 'rX = r10'.
Without previous verifier change, both tests will fail.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250524041340.4046304-1-yonghong.song@linux.dev
|
||
|
|
015a99fa76 |
Merge tag 'nolibc-20250526-for-6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc
Pull nolibc updates from Thomas Weißschuh: - New supported architectures: m68k, SPARC (32 and 64 bit) - Compatibility with kselftest_harness.h - A more robust mechanism to include all of nolibc from each header - Split existing features into new headers to simplify adoption - Compatibility with UBSAN and it is used in the testsuite - Many small new features focussing on usage in kselftests * tag 'nolibc-20250526-for-6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc: (83 commits) selftests: harness: Stop using setjmp()/longjmp() selftests: harness: Add "variant" and "self" to test metadata selftests: harness: Add teardown callback to test metadata selftests: harness: Move teardown conditional into test metadata selftests: harness: Don't set setup_completed for fixtureless tests selftests: harness: Implement test timeouts through pidfd selftests: harness: Remove dependency on libatomic selftests: harness: Remove inline qualifier for wrappers selftests: harness: Mark functions without prototypes static selftests: harness: Ignore unused variant argument warning selftests: harness: Use C89 comment style selftests: harness: Add kselftest harness selftest selftests/nolibc: drop include guards around standard headers tools/nolibc: move NULL and offsetof() to sys/stddef.h tools/nolibc: move uname() and friends to sys/utsname.h tools/nolibc: move makedev() and friends to sys/sysmacros.h tools/nolibc: move getrlimit() and friends to sys/resource.h tools/nolibc: move reboot() to sys/reboot.h tools/nolibc: move prctl() to sys/prctl.h tools/nolibc: move mount() to sys/mount.h ... |
||
|
|
149ead9d7e |
selftests/bpf: enable many-args tests for arm64
Now that support for up to 12 args is enabled for tracing programs on ARM64, enable the existing tests for this feature on this architecture. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20250527-many_args_arm64-v3-2-3faf7bb8e4a2@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
|
|
92de53d247 |
selftests/bpf: Add unit tests with __bpf_trap() kfunc
Add some inline-asm tests and C tests where __bpf_trap() or __builtin_trap() is used in the code. The __builtin_trap() test is guarded with llvm21 ([1]) since otherwise the compilation failure will happen. [1] https://github.com/llvm/llvm-project/pull/131731 Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250523205331.1291734-1-yonghong.song@linux.dev Tested-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
|
|
7594dcb71f |
selftests/bpf: Add test for open coded dmabuf_iter
Use the same test buffers as the traditional iterator and a new BPF map to verify the test buffers can be found with the open coded dmabuf iterator. Signed-off-by: T.J. Mercier <tjmercier@google.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250522230429.941193-6-tjmercier@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
|
|
ae5d2c59ec |
selftests/bpf: Add test for dmabuf_iter
This test creates a udmabuf, and a dmabuf from the system dmabuf heap, and uses a BPF program that prints dmabuf metadata with the new dmabuf_iter to verify they can be found. Signed-off-by: T.J. Mercier <tjmercier@google.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250522230429.941193-5-tjmercier@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
|
|
4e02d4f973 |
Merge tag 'kvm-x86-svm-6.16' of https://github.com/kvm-x86/linux into HEAD
KVM SVM changes for 6.16: - Wait for target vCPU to acknowledge KVM_REQ_UPDATE_PROTECTED_GUEST_STATE to fix a race between AP destroy and VMRUN. - Decrypt and dump the VMSA in dump_vmcb() if debugging enabled for the VM. - Add support for ALLOWED_SEV_FEATURES. - Add #VMGEXIT to the set of handlers special cased for CONFIG_RETPOLINE=y. - Treat DEBUGCTL[5:2] as reserved to pave the way for virtualizing features that utilize those bits. - Don't account temporary allocations in sev_send_update_data(). - Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM, via Bus Lock Threshold. |
||
|
|
3e0797f6dd |
Merge tag 'kvm-x86-selftests-6.16' of https://github.com/kvm-x86/linux into HEAD
KVM selftests changes for 6.16: - Add support for SNP to the various SEV selftests. - Add a selftest to verify fastops instructions via forced emulation. - Add MGLRU support to the access tracking perf test. |
||
|
|
393d070135 |
vsock/test: Add test for an unexpectedly lingering close()
There was an issue with SO_LINGER: instead of blocking until all queued messages for the socket have been successfully sent (or the linger timeout has been reached), close() would block until packets were handled by the peer. Add a test to alert on close() lingering when it should not. Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/20250522-vsock-linger-v6-5-2ad00b0e447e@rbox.co Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
||
|
|
8b07b7e5c2 |
vsock/test: Introduce enable_so_linger() helper
Add a helper function that sets SO_LINGER. Adapt the caller. Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/20250522-vsock-linger-v6-4-2ad00b0e447e@rbox.co Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
||
|
|
e78e0596c7 |
vsock/test: Introduce vsock_wait_sent() helper
Distill the virtio_vsock_sock::bytes_unsent checking loop (ioctl SIOCOUTQ) and move it to utils. Tweak the comment. Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/20250522-vsock-linger-v6-3-2ad00b0e447e@rbox.co Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
||
|
|
ca8bf8f383 |
wireguard: selftests: specify -std=gnu17 for bash
GCC 15 defaults to C23, which bash can't compile under, so specify gnu17 explicitly. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Link: https://patch.msgid.link/20250521212707.1767879-6-Jason@zx2c4.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
||
|
|
ba3d7b93db |
wireguard: allowedips: add WGALLOWEDIP_F_REMOVE_ME flag
The current netlink API for WireGuard does not directly support removal of allowed ips from a peer. A user can remove an allowed ip from a peer in one of two ways: 1. By using the WGPEER_F_REPLACE_ALLOWEDIPS flag and providing a new list of allowed ips which omits the allowed ip that is to be removed. 2. By reassigning an allowed ip to a "dummy" peer then removing that peer with WGPEER_F_REMOVE_ME. With the first approach, the driver completely rebuilds the allowed ip list for a peer. If my current configuration is such that a peer has allowed ips 192.168.0.2 and 192.168.0.3 and I want to remove 192.168.0.2 the actual transition looks like this. [192.168.0.2, 192.168.0.3] <-- Initial state [] <-- Step 1: Allowed ips removed for peer [192.168.0.3] <-- Step 2: Allowed ips added back for peer This is true even if the allowed ip list is small and the update does not need to be batched into multiple WG_CMD_SET_DEVICE requests, as the removal and subsequent addition of ips is non-atomic within a single request. Consequently, wg_allowedips_lookup_dst and wg_allowedips_lookup_src may return NULL while reconfiguring a peer even for packets bound for ips a user did not intend to remove leading to unintended interruptions in connectivity. This presents in userspace as failed calls to sendto and sendmsg for UDP sockets. In my case, I ran netperf while repeatedly reconfiguring the allowed ips for a peer with wg. /usr/local/bin/netperf -H 10.102.73.72 -l 10m -t UDP_STREAM -- -R 1 -m 1024 send_data: data send error: No route to host (errno 113) netperf: send_omni: send_data failed: No route to host While this may not be of particular concern for environments where peers and allowed ips are mostly static, systems like Cilium manage peers and allowed ips in a dynamic environment where peers (i.e. Kubernetes nodes) and allowed ips (i.e. pods running on those nodes) can frequently change making WGPEER_F_REPLACE_ALLOWEDIPS problematic. The second approach avoids any possible connectivity interruptions but is hacky and less direct, requiring the creation of a temporary peer just to dispose of an allowed ip. Introduce a new flag called WGALLOWEDIP_F_REMOVE_ME which in the same way that WGPEER_F_REMOVE_ME allows a user to remove a single peer from a WireGuard device's configuration allows a user to remove an ip from a peer's set of allowed ips. This enables incremental updates to a device's configuration without any connectivity blips or messy workarounds. A corresponding patch for wg extends the existing `wg set` interface to leverage this feature. $ wg set wg0 peer <PUBKEY> allowed-ips +192.168.88.0/24,-192.168.0.1/32 When '+' or '-' is prepended to any ip in the list, wg clears WGPEER_F_REPLACE_ALLOWEDIPS and sets the WGALLOWEDIP_F_REMOVE_ME flag on any ip prefixed with '-'. Signed-off-by: Jordan Rife <jordan@jrife.io> [Jason: minor style nits, fixes to selftest, bump of wireguard-tools version] Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Link: https://patch.msgid.link/20250521212707.1767879-5-Jason@zx2c4.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
||
|
|
e74e9ee2c8 |
wireguard: selftests: cleanup CONFIG_UBSAN_SANITIZE_ALL
Commit
|
||
|
|
785cdec46e |
Merge tag 'x86-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core x86 updates from Ingo Molnar:
"Boot code changes:
- A large series of changes to reorganize the x86 boot code into a
better isolated and easier to maintain base of PIC early startup
code in arch/x86/boot/startup/, by Ard Biesheuvel.
Motivation & background:
| Since commit
|
|
|
||
|
|
ddddf9d64f |
Merge tag 'perf-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events updates from Ingo Molnar:
"Core & generic-arch updates:
- Add support for dynamic constraints and propagate it to the Intel
driver (Kan Liang)
- Fix & enhance driver-specific throttling support (Kan Liang)
- Record sample last_period before updating on the x86 and PowerPC
platforms (Mark Barnett)
- Make perf_pmu_unregister() usable (Peter Zijlstra)
- Unify perf_event_free_task() / perf_event_exit_task_context()
(Peter Zijlstra)
- Simplify perf_event_release_kernel() and perf_event_free_task()
(Peter Zijlstra)
- Allocate non-contiguous AUX pages by default (Yabin Cui)
Uprobes updates:
- Add support to emulate NOP instructions (Jiri Olsa)
- selftests/bpf: Add 5-byte NOP uprobe trigger benchmark (Jiri Olsa)
x86 Intel PMU enhancements:
- Support Intel Auto Counter Reload [ACR] (Kan Liang)
- Add PMU support for Clearwater Forest (Dapeng Mi)
- Arch-PEBS preparatory changes: (Dapeng Mi)
- Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
- Decouple BTS initialization from PEBS initialization
- Introduce pairs of PEBS static calls
x86 AMD PMU enhancements:
- Use hrtimer for handling overflows in the AMD uncore driver
(Sandipan Das)
- Prevent UMC counters from saturating (Sandipan Das)
Fixes and cleanups:
- Fix put_ctx() ordering (Frederic Weisbecker)
- Fix irq work dereferencing garbage (Frederic Weisbecker)
- Misc fixes and cleanups (Changbin Du, Frederic Weisbecker, Ian
Rogers, Ingo Molnar, Kan Liang, Peter Zijlstra, Qing Wang, Sandipan
Das, Thorsten Blum)"
* tag 'perf-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
perf/headers: Clean up <linux/perf_event.h> a bit
perf/uapi: Clean up <uapi/linux/perf_event.h> a bit
perf/uapi: Fix PERF_RECORD_SAMPLE comments in <uapi/linux/perf_event.h>
mips/perf: Remove driver-specific throttle support
xtensa/perf: Remove driver-specific throttle support
sparc/perf: Remove driver-specific throttle support
loongarch/perf: Remove driver-specific throttle support
csky/perf: Remove driver-specific throttle support
arc/perf: Remove driver-specific throttle support
alpha/perf: Remove driver-specific throttle support
perf/apple_m1: Remove driver-specific throttle support
perf/arm: Remove driver-specific throttle support
s390/perf: Remove driver-specific throttle support
powerpc/perf: Remove driver-specific throttle support
perf/x86/zhaoxin: Remove driver-specific throttle support
perf/x86/amd: Remove driver-specific throttle support
perf/x86/intel: Remove driver-specific throttle support
perf: Only dump the throttle log for the leader
perf: Fix the throttle logic for a group
perf/core: Add the is_event_in_freq_mode() helper to simplify the code
...
|
||
|
|
b3570b00dc |
Merge tag 'locking-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
"Futexes:
- Add support for task local hash maps (Sebastian Andrzej Siewior,
Peter Zijlstra)
- Implement the FUTEX2_NUMA ABI, which feature extends the futex
interface to be NUMA-aware. On NUMA-aware futexes a second u32 word
containing the NUMA node is added to after the u32 futex value word
(Peter Zijlstra)
- Implement the FUTEX2_MPOL ABI, which feature extends the futex
interface to be mempolicy-aware as well, to further refine futex
node mappings and lookups (Peter Zijlstra)
Locking primitives:
- Misc cleanups (Andy Shevchenko, Borislav Petkov, Colin Ian King,
Ingo Molnar, Nam Cao, Peter Zijlstra)
Lockdep:
- Prevent abuse of lockdep subclasses (Waiman Long)
- Add number of dynamic keys to /proc/lockdep_stats (Waiman Long)
Plus misc cleanups and fixes"
* tag 'locking-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
selftests/futex: Fix spelling mistake "unitiliazed" -> "uninitialized"
futex: Correct the kernedoc return value for futex_wait_setup().
tools headers: Synchronize prctl.h ABI header
futex: Use RCU_INIT_POINTER() in futex_mm_init().
selftests/futex: Use TAP output in futex_numa_mpol
selftests/futex: Use TAP output in futex_priv_hash
futex: Fix kernel-doc comments
futex: Relax the rcu_assign_pointer() assignment of mm->futex_phash in futex_mm_init()
futex: Fix outdated comment in struct restart_block
locking/lockdep: Add number of dynamic keys to /proc/lockdep_stats
locking/lockdep: Prevent abuse of lockdep subclass
locking/lockdep: Move hlock_equal() to the respective #ifdeffery
futex,selftests: Add another FUTEX2_NUMA selftest
selftests/futex: Add futex_numa_mpol
selftests/futex: Add futex_priv_hash
selftests/futex: Build without headers nonsense
tools/perf: Allow to select the number of hash buckets
tools headers: Synchronize prctl.h ABI header
futex: Implement FUTEX2_MPOL
futex: Implement FUTEX2_NUMA
...
|
||
|
|
ba45037098 |
Merge tag 'linux_kselftest-kunit-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit updates from Shuah Khan: - Enable qemu_config for riscv32, sparc 64-bit, PowerPC 32-bit BE and 64-bit LE - Enable CONFIG_SPARC32 to clearly differentiate between sparc 32-bit and 64-bit configurations - Enable CONFIG_CPU_BIG_ENDIAN to clearly differentiate between powerpc LE and BE configurations - Add feature to list available architectures to kunit tool - Fixes to bugs and changes to documentation * tag 'linux_kselftest-kunit-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: Fix wrong parameter to kunit_deactivate_static_stub() kunit: tool: add test counts to JSON output Documentation: kunit: improve example on testing static functions kunit: executor: Remove const from kunit_filter_suites() allocation type kunit: qemu_configs: Disable faulting tests on 32-bit SPARC kunit: qemu_configs: Add 64-bit SPARC configuration kunit: qemu_configs: sparc: Explicitly enable CONFIG_SPARC32=y kunit: qemu_configs: Add PowerPC 32-bit BE and 64-bit LE kunit: qemu_configs: powerpc: Explicitly enable CONFIG_CPU_BIG_ENDIAN=y kunit: tool: Implement listing of available architectures kunit: qemu_configs: Add riscv32 config kunit: configs: Enable CONFIG_INIT_STACK_ALL_PATTERN in all_tests |
||
|
|
2d2435e1c8 |
Merge tag 'linux_kselftest-next-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest updates from Shuah Khan: "Fixes: - cpufreq test to not double suspend in rtcwake case - compile error in pid_namespace test - run_kselftest.sh to use readlink if realpath is not available - cpufreq basic read and update testcases - ftrace to add poll to a gen_file so test can find it at run-time - spelling errors in perf_events test" * tag 'linux_kselftest-next-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/run_kselftest.sh: Use readlink if realpath is not available selftests/timens: timerfd: Use correct clockid type in tclock_gettime() selftests/timens: Make run_tests() functions static selftests/timens: Print TAP headers selftests: pid_namespace: add missing sys/mount.h include in pid_max.c kselftest: cpufreq: Get rid of double suspend in rtcwake case selftests/cpufreq: Fix cpufreq basic read and update testcases selftests/ftrace: Convert poll to a gen_file selftests/perf_events: Fix spelling mistake "sycnhronize" -> "synchronize" |
||
|
|
07046958f6 |
Merge tag 'next.2025.05.17a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
Pull RCU updates from Joel Fernandes:
- Removed swake_up_one_online() workaround
- Reverted an incorrect rcuog wake-up fix from offline softirq
- Rust RCU Guard methods marked as inline
- Updated MAINTAINERS with Joel’s and Zqiang's new email address
- Replaced magic constant in rcu_seq_done_exact() with named constant
- Added warning mechanism to validate rcu_seq_done_exact()
- Switched SRCU polling API to use rcu_seq_done_exact()
- Commented on redundant delta check in rcu_seq_done_exact()
- Made ->gpwrap tests in rcutorture more frequent
- Fixed reuse of ARM64 images in rcutorture
- rcutorture improved to check Kconfig and reader conflict handling
- Extracted logic from rcu_torture_one_read() for clarity
- Updated LWN RCU API documentation links
- Enabled --do-rt in torture.sh for CONFIG_PREEMPT_RT
- Added tests for SRCU up/down reader primitives
- Added comments and delays checks in rcutorture
- Deprecated srcu_read_lock_lite() and srcu_read_unlock_lite() via checkpatch
- Added --do-normal and --do-no-normal to torture.sh
- Added RCU Rust binding tests to torture.sh
- Reduced CPU overcommit and removed MAXSMP/CPUMASK_OFFSTACK in TREE01
- Replaced kmalloc() with kcalloc() in rcuscale
- Refined listRCU example code for stale data elimination
- Fixed hardirq count bug for x86 in cpu_stall_cputime
- Added safety checks in rcu/nocb for offloaded rdp access
- Other miscellaneous changes
* tag 'next.2025.05.17a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (27 commits)
rcutorture: Fix issue with re-using old images on ARM64
rcutorture: Remove MAXSMP and CPUMASK_OFFSTACK from TREE01
rcutorture: Reduce TREE01 CPU overcommit
torture: Check for "Call trace:" as well as "Call Trace:"
rcutorture: Perform more frequent testing of ->gpwrap
torture: Add testing of RCU's Rust bindings to torture.sh
torture: Add --do-{,no-}normal to torture.sh
checkpatch: Deprecate srcu_read_lock_lite() and srcu_read_unlock_lite()
rcutorture: Comment invocations of tick_dep_set_task()
rcu/nocb: Add Safe checks for access offloaded rdp
rcuscale: using kcalloc() to relpace kmalloc()
doc/RCU/listRCU: refine example code for eliminating stale data
doc: Update LWN RCU API links in whatisRCU.rst
Revert "rcu/nocb: Fix rcuog wake-up from offline softirq"
rust: sync: rcu: Mark Guard methods as inline
rcu/cpu_stall_cputime: fix the hardirq count for x86 architecture
rcu: Remove swake_up_one_online() bandaid
MAINTAINERS: Update Zqiang's email address
rcutorture: Make torture.sh --do-rt use CONFIG_PREEMPT_RT
srcu: Use rcu_seq_done_exact() for polling API
...
|
||
|
|
14418ddcc2 |
Merge tag 'v6.16-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu: "API: - Fix memcpy_sglist to handle partially overlapping SG lists - Use memcpy_sglist to replace null skcipher - Rename CRYPTO_TESTS to CRYPTO_BENCHMARK - Flip CRYPTO_MANAGER_DISABLE_TEST into CRYPTO_SELFTESTS - Hide CRYPTO_MANAGER - Add delayed freeing of driver crypto_alg structures Compression: - Allocate large buffers on first use instead of initialisation in scomp - Drop destination linearisation buffer in scomp - Move scomp stream allocation into acomp - Add acomp scatter-gather walker - Remove request chaining - Add optional async request allocation Hashing: - Remove request chaining - Add optional async request allocation - Move partial block handling into API - Add ahash support to hmac - Fix shash documentation to disallow usage in hard IRQs Algorithms: - Remove unnecessary SIMD fallback code on x86 and arm/arm64 - Drop avx10_256 xts(aes)/ctr(aes) on x86 - Improve avx-512 optimisations for xts(aes) - Move chacha arch implementations into lib/crypto - Move poly1305 into lib/crypto and drop unused Crypto API algorithm - Disable powerpc/poly1305 as it has no SIMD fallback - Move sha256 arch implementations into lib/crypto - Convert deflate to acomp - Set block size correctly in cbcmac Drivers: - Do not use sg_dma_len before mapping in sun8i-ss - Fix warm-reboot failure by making shutdown do more work in qat - Add locking in zynqmp-sha - Remove cavium/zip - Add support for PCI device 0x17D8 to ccp - Add qat_6xxx support in qat - Add support for RK3576 in rockchip-rng - Add support for i.MX8QM in caam Others: - Fix irq_fpu_usable/kernel_fpu_begin inconsistency during CPU bring-up - Add new SEV/SNP platform shutdown API in ccp" * tag 'v6.16-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (382 commits) x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining crypto: qat - add missing header inclusion crypto: api - Redo lookup on EEXIST Revert "crypto: testmgr - Add hash export format testing" crypto: marvell/cesa - Do not chain submitted requests crypto: powerpc/poly1305 - add depends on BROKEN for now Revert "crypto: powerpc/poly1305 - Add SIMD fallback" crypto: ccp - Add missing tee info reg for teev2 crypto: ccp - Add missing bootloader info reg for pspv5 crypto: sun8i-ce - move fallback ahash_request to the end of the struct crypto: octeontx2 - Use dynamic allocated memory region for lmtst crypto: octeontx2 - Initialize cptlfs device info once crypto: xts - Only add ecb if it is not already there crypto: lrw - Only add ecb if it is not already there crypto: testmgr - Add hash export format testing crypto: testmgr - Use ahash for generic tfm crypto: hmac - Add ahash support crypto: testmgr - Ignore EEXIST on shash allocation crypto: algapi - Add driver template support to crypto_inst_setname crypto: shash - Set reqsize in shash_alg ... |
||
|
|
1f7c9d52b1 |
Merge tag 'kvm-riscv-6.16-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv changes for 6.16 - Add vector registers to get-reg-list selftest - VCPU reset related improvements - Remove scounteren initialization from VCPU reset - Support VCPU reset from userspace using set_mpstate() ioctl |
||
|
|
4d526b02df |
Merge tag 'kvmarm-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.16
* New features:
- Add large stage-2 mapping support for non-protected pKVM guests,
clawing back some performance.
- Add UBSAN support to the standalone EL2 object used in nVHE/hVHE and
protected modes.
- Enable nested virtualisation support on systems that support it
(yes, it has been a long time coming), though it is disabled by
default.
* Improvements, fixes and cleanups:
- Large rework of the way KVM tracks architecture features and links
them with the effects of control bits. This ensures correctness of
emulation (the data is automatically extracted from the published
JSON files), and helps dealing with the evolution of the
architecture.
- Significant changes to the way pKVM tracks ownership of pages,
avoiding page table walks by storing the state in the hypervisor's
vmemmap. This in turn enables the THP support described above.
- New selftest checking the pKVM ownership transition rules
- Fixes for FEAT_MTE_ASYNC being accidentally advertised to guests
even if the host didn't have it.
- Fixes for the address translation emulation, which happened to be
rather buggy in some specific contexts.
- Fixes for the PMU emulation in NV contexts, decoupling PMCR_EL0.N
from the number of counters exposed to a guest and addressing a
number of issues in the process.
- Add a new selftest for the SVE host state being corrupted by a
guest.
- Keep HCR_EL2.xMO set at all times for systems running with the
kernel at EL2, ensuring that the window for interrupts is slightly
bigger, and avoiding a pretty bad erratum on the AmpereOne HW.
- Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers
from a pretty bad case of TLB corruption unless accesses to HCR_EL2
are heavily synchronised.
- Add a per-VM, per-ITS debugfs entry to dump the state of the ITS
tables in a human-friendly fashion.
- and the usual random cleanups.
|
||
|
|
6f59de9bc0 |
Merge tag 'for-6.16/block-20250523' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- ublk updates:
- Add support for updating the size of a ublk instance
- Zero-copy improvements
- Auto-registering of buffers for zero-copy
- Series simplifying and improving GET_DATA and request lookup
- Series adding quiesce support
- Lots of selftests additions
- Various cleanups
- NVMe updates via Christoph:
- add per-node DMA pools and use them for PRP/SGL allocations
(Caleb Sander Mateos, Keith Busch)
- nvme-fcloop refcounting fixes (Daniel Wagner)
- support delayed removal of the multipath node and optionally
support the multipath node for private namespaces (Nilay Shroff)
- support shared CQs in the PCI endpoint target code (Wilfred
Mallawa)
- support admin-queue only authentication (Hannes Reinecke)
- use the crc32c library instead of the crypto API (Eric Biggers)
- misc cleanups (Christoph Hellwig, Marcelo Moreira, Hannes
Reinecke, Leon Romanovsky, Gustavo A. R. Silva)
- MD updates via Yu:
- Fix that normal IO can be starved by sync IO, found by mkfs on
newly created large raid5, with some clean up patches for bdev
inflight counters
- Clean up brd, getting rid of atomic kmaps and bvec poking
- Add loop driver specifically for zoned IO testing
- Eliminate blk-rq-qos calls with a static key, if not enabled
- Improve hctx locking for when a plug has IO for multiple queues
pending
- Remove block layer bouncing support, which in turn means we can
remove the per-node bounce stat as well
- Improve blk-throttle support
- Improve delay support for blk-throttle
- Improve brd discard support
- Unify IO scheduler switching. This should also fix a bunch of lockdep
warnings we've been seeing, after enabling lockdep support for queue
freezing/unfreezeing
- Add support for block write streams via FDP (flexible data placement)
on NVMe
- Add a bunch of block helpers, facilitating the removal of a bunch of
duplicated boilerplate code
- Remove obsolete BLK_MQ pci and virtio Kconfig options
- Add atomic/untorn write support to blktrace
- Various little cleanups and fixes
* tag 'for-6.16/block-20250523' of git://git.kernel.dk/linux: (186 commits)
selftests: ublk: add test for UBLK_F_QUIESCE
ublk: add feature UBLK_F_QUIESCE
selftests: ublk: add test case for UBLK_U_CMD_UPDATE_SIZE
traceevent/block: Add REQ_ATOMIC flag to block trace events
ublk: run auto buf unregisgering in same io_ring_ctx with registering
io_uring: add helper io_uring_cmd_ctx_handle()
ublk: remove io argument from ublk_auto_buf_reg_fallback()
ublk: handle ublk_set_auto_buf_reg() failure correctly in ublk_fetch()
selftests: ublk: add test for covering UBLK_AUTO_BUF_REG_FALLBACK
selftests: ublk: support UBLK_F_AUTO_BUF_REG
ublk: support UBLK_AUTO_BUF_REG_FALLBACK
ublk: register buffer to local io_uring with provided buf index via UBLK_F_AUTO_BUF_REG
ublk: prepare for supporting to register request buffer automatically
ublk: convert to refcount_t
selftests: ublk: make IO & device removal test more stressful
nvme: rename nvme_mpath_shutdown_disk to nvme_mpath_remove_disk
nvme: introduce multipath_always_on module param
nvme-multipath: introduce delayed removal of the multipath head node
nvme-pci: derive and better document max segments limits
nvme-pci: use struct_size for allocation struct nvme_dev
...
|
||
|
|
3e406741b1 |
Merge tag 'vfs-6.16-rc1.selftests' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs selftests updates from Christian Brauner: "This contains various cleanups, fixes, and extensions for out filesystem selftests" * tag 'vfs-6.16-rc1.selftests' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: selftests/fs/mount-notify: add a test variant running inside userns selftests/filesystems: create setup_userns() helper selftests/filesystems: create get_unique_mnt_id() helper selftests/fs/mount-notify: build with tools include dir selftests/mount_settattr: remove duplicate syscall definitions selftests/pidfd: move syscall definitions into wrappers.h selftests/fs/statmount: build with tools include dir selftests/filesystems: move wrapper.h out of overlayfs subdir selftests/mount_settattr: ensure that ext4 filesystem can be created selftests/mount_settattr: add missing STATX_MNT_ID_UNIQUE define selftests/mount_settattr: don't define sys_open_tree() twice |
||
|
|
c5bfc48d54 |
Merge tag 'vfs-6.16-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull coredump updates from Christian Brauner:
"This adds support for sending coredumps over an AF_UNIX socket. It
also makes (implicit) use of the new SO_PEERPIDFD ability to hand out
pidfds for reaped peer tasks
The new coredump socket will allow userspace to not have to rely on
usermode helpers for processing coredumps and provides a saf way to
handle them instead of relying on super privileged coredumping helpers
This will also be significantly more lightweight since the kernel
doens't have to do a fork()+exec() for each crashing process to spawn
a usermodehelper. Instead the kernel just connects to the AF_UNIX
socket and userspace can process it concurrently however it sees fit.
Support for userspace is incoming starting with systemd-coredump
There's more work coming in that direction next cycle. The rest below
goes into some details and background
Coredumping currently supports two modes:
(1) Dumping directly into a file somewhere on the filesystem.
(2) Dumping into a pipe connected to a usermode helper process
spawned as a child of the system_unbound_wq or kthreadd
For simplicity I'm mostly ignoring (1). There's probably still some
users of (1) out there but processing coredumps in this way can be
considered adventurous especially in the face of set*id binaries
The most common option should be (2) by now. It works by allowing
userspace to put a string into /proc/sys/kernel/core_pattern like:
|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
The "|" at the beginning indicates to the kernel that a pipe must be
used. The path following the pipe indicator is a path to a binary that
will be spawned as a usermode helper process. Any additional
parameters pass information about the task that is generating the
coredump to the binary that processes the coredump
In the example the core_pattern shown causes the kernel to spawn
systemd-coredump as a usermode helper. There's various conceptual
consequences of this (non-exhaustive list):
- systemd-coredump is spawned with file descriptor number 0 (stdin)
connected to the read-end of the pipe. All other file descriptors
are closed. That specifically includes 1 (stdout) and 2 (stderr).
This has already caused bugs because userspace assumed that this
cannot happen (Whether or not this is a sane assumption is
irrelevant)
- systemd-coredump will be spawned as a child of system_unbound_wq.
So it is not a child of any userspace process and specifically not
a child of PID 1. It cannot be waited upon and is in a weird hybrid
upcall which are difficult for userspace to control correctly
- systemd-coredump is spawned with full kernel privileges. This
necessitates all kinds of weird privilege dropping excercises in
userspace to make this safe
- A new usermode helper has to be spawned for each crashing process
This adds a new mode:
(3) Dumping into an AF_UNIX socket
Userspace can set /proc/sys/kernel/core_pattern to:
@/path/to/coredump.socket
The "@" at the beginning indicates to the kernel that an AF_UNIX
coredump socket will be used to process coredumps
The coredump socket must be located in the initial mount namespace.
When a task coredumps it opens a client socket in the initial network
namespace and connects to the coredump socket:
- The coredump server uses SO_PEERPIDFD to get a stable handle on the
connected crashing task. The retrieved pidfd will provide a stable
reference even if the crashing task gets SIGKILLed while generating
the coredump. That is a huge attack vector right now
- By setting core_pipe_limit non-zero userspace can guarantee that
the crashing task cannot be reaped behind it's back and thus
process all necessary information in /proc/<pid>. The SO_PEERPIDFD
can be used to detect whether /proc/<pid> still refers to the same
process
The core_pipe_limit isn't used to rate-limit connections to the
socket. This can simply be done via AF_UNIX socket directly
- The pidfd for the crashing task will contain information how the
task coredumps. The PIDFD_GET_INFO ioctl gained a new flag
PIDFD_INFO_COREDUMP which can be used to retreive the coredump
information
If the coredump gets a new coredump client connection the kernel
guarantees that PIDFD_INFO_COREDUMP information is available.
Currently the following information is provided in the new
@coredump_mask extension to struct pidfd_info:
* PIDFD_COREDUMPED is raised if the task did actually coredump
* PIDFD_COREDUMP_SKIP is raised if the task skipped coredumping
(e.g., undumpable)
* PIDFD_COREDUMP_USER is raised if this is a regular coredump and
doesn't need special care by the coredump server
* PIDFD_COREDUMP_ROOT is raised if the generated coredump should
be treated as sensitive and the coredump server should restrict
access to the generated coredump to sufficiently privileged
users"
* tag 'vfs-6.16-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
mips, net: ensure that SOCK_COREDUMP is defined
selftests/coredump: add tests for AF_UNIX coredumps
selftests/pidfd: add PIDFD_INFO_COREDUMP infrastructure
coredump: validate socket name as it is written
coredump: show supported coredump modes
pidfs, coredump: add PIDFD_INFO_COREDUMP
coredump: add coredump socket
coredump: reflow dump helpers a little
coredump: massage do_coredump()
coredump: massage format_corename()
|