Compare commits

...

440 Commits

Author SHA1 Message Date
Jakub Kicinski
1caa871bb0 Merge branch 'net-stmmac-fix-tegra234-mgbe-clock'
Jon Hunter says:

====================
net: stmmac: Fix Tegra234 MGBE clock

The name of the PTP ref clock for the Tegra234 MGBE ethernet controller
does not match the generic name in the stmmac platform driver. Despite
this basic ethernet is functional on the Tegra234 platforms that use
this driver and as far as I know, we have not tested PTP support with
this driver. Hence, the risk of breaking any functionality is low.

The previous attempt to fix this in the stmmac platform driver, by
supporting the Tegra234 PTP clock name, was rejected [0]. The preference
from the netdev maintainers is to fix this in the DT binding for
Tegra234.

This series fixes this by correcting the device-tree binding to align
with the generic name for the PTP clock. I understand that this is
breaking the ABI for this device, which we should never do, but this
is a last resort for getting this fixed. I am open to any better ideas
to fix this. Please note that we still maintain backward compatibility
in the driver to allow older device-trees to work, but we don't
advertise this via the binding, because I did not see any value in doing
so.
====================

Link: https://patch.msgid.link/20260401102941.17466-1-jonathanh@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 16:02:31 -07:00
Jon Hunter
fb22b1fc5b dt-bindings: net: Fix Tegra234 MGBE PTP clock
The PTP clock for the Tegra234 MGBE device is incorrectly named
'ptp-ref' and should be 'ptp_ref'. This is causing the following
warning to be observed on Tegra234 platforms that use this device:

 ERR KERN tegra-mgbe 6800000.ethernet eth0: Invalid PTP clock rate
 WARNING KERN tegra-mgbe 6800000.ethernet eth0: PTP init failed

Although this constitutes an ABI breakage in the binding for this
device, PTP support has clearly never worked and so fix this now
so we can correct the device-tree for this device. Note that the
MGBE driver still supports the legacy 'ptp-ref' clock name and so
older/existing device-trees will still work, but given that this
is not the correct name, there is no point to advertise this in the
binding.

Fixes: 189c2e5c76 ("dt-bindings: net: Add Tegra234 MGBE")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260401102941.17466-3-jonathanh@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 16:02:30 -07:00
Jon Hunter
1345e9f4e3 net: stmmac: Fix PTP ref clock for Tegra234
Since commit 030ce919e1 ("net: stmmac: make sure that ptp_rate is not
0 before configuring timestamping") was added the following error is
observed on Tegra234:

 ERR KERN tegra-mgbe 6800000.ethernet eth0: Invalid PTP clock rate
 WARNING KERN tegra-mgbe 6800000.ethernet eth0: PTP init failed

It turns out that the Tegra234 device-tree binding defines the PTP ref
clock name as 'ptp-ref' and not 'ptp_ref' and the above commit now
exposes this and that the PTP clock is not configured correctly.

In order to update device-tree to use the correct 'ptp_ref' name, update
the Tegra MGBE driver to use 'ptp_ref' by default and fallback to using
'ptp-ref' if this clock name is present.

Fixes: d8ca113724 ("net: stmmac: tegra: Add MGBE support")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260401102941.17466-2-jonathanh@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 16:02:21 -07:00
Pengpeng Hou
5c14a19d5b nfc: s3fwrn5: allocate rx skb before consuming bytes
s3fwrn82_uart_read() reports the number of accepted bytes to the serdev
core. The current code consumes bytes into recv_skb and may already
deliver a complete frame before allocating a fresh receive buffer.

If that alloc_skb() fails, the callback returns 0 even though it has
already consumed bytes, and it leaves recv_skb as NULL for the next
receive callback. That breaks the receive_buf() accounting contract and
can also lead to a NULL dereference on the next skb_put_u8().

Allocate the receive skb lazily before consuming the next byte instead.
If allocation fails, return the number of bytes already accepted.

Fixes: 3f52c2cb7e ("nfc: s3fwrn5: Support a UART interface")
Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Link: https://patch.msgid.link/20260402042148.65236-1-pengpeng@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:57:46 -07:00
Chris J Arges
77facb3522 net: increase IP_TUNNEL_RECURSION_LIMIT to 5
In configurations with multiple tunnel layers and MPLS lwtunnel routing, a
single tunnel hop can increment the counter beyond this limit. This causes
packets to be dropped with the "Dead loop on virtual device" message even
when a routing loop doesn't exist.

Increase IP_TUNNEL_RECURSION_LIMIT from 4 to 5 to handle this use-case.

Fixes: 6f1a9140ec ("net: add xmit recursion limit to tunnel xmit functions")
Link: https://lore.kernel.org/netdev/88deb91b-ef1b-403c-8eeb-0f971f27e34f@redhat.com/
Signed-off-by: Chris J Arges <carges@cloudflare.com>
Link: https://patch.msgid.link/20260402222401.3408368-1-carges@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:52:10 -07:00
Yiqi Sun
fde29fd934 ipv4: icmp: fix null-ptr-deref in icmp_build_probe()
ipv6_stub->ipv6_dev_find() may return ERR_PTR(-EAFNOSUPPORT) when the
IPv6 stack is not active (CONFIG_IPV6=m and not loaded), and passing
this error pointer to dev_hold() will cause a kernel crash with
null-ptr-deref.

Instead, silently discard the request. RFC 8335 does not appear to
define a specific response for the case where an IPv6 interface
identifier is syntactically valid but the implementation cannot perform
the lookup at runtime, and silently dropping the request may safer than
misreporting "No Such Interface".

Fixes: d329ea5bd8 ("icmp: add response to RFC 8335 PROBE messages")
Signed-off-by: Yiqi Sun <sunyiqixm@gmail.com>
Link: https://patch.msgid.link/20260402070419.2291578-1-sunyiqixm@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:46:17 -07:00
Fernando Fernandez Mancera
14cf0cd353 ipv4: nexthop: allocate skb dynamically in rtm_get_nexthop()
When querying a nexthop object via RTM_GETNEXTHOP, the kernel currently
allocates a fixed-size skb using NLMSG_GOODSIZE. While sufficient for
single nexthops and small Equal-Cost Multi-Path groups, this fixed
allocation fails for large nexthop groups like 512 nexthops.

This results in the following warning splat:

 WARNING: net/ipv4/nexthop.c:3395 at rtm_get_nexthop+0x176/0x1c0, CPU#20: rep/4608
 [...]
 RIP: 0010:rtm_get_nexthop (net/ipv4/nexthop.c:3395)
 [...]
 Call Trace:
  <TASK>
  rtnetlink_rcv_msg (net/core/rtnetlink.c:6989)
  netlink_rcv_skb (net/netlink/af_netlink.c:2550)
  netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344)
  netlink_sendmsg (net/netlink/af_netlink.c:1894)
  ____sys_sendmsg (net/socket.c:721 net/socket.c:736 net/socket.c:2585)
  ___sys_sendmsg (net/socket.c:2641)
  __sys_sendmsg (net/socket.c:2671)
  do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
  </TASK>

Fix this by allocating the size dynamically using nh_nlmsg_size() and
using nlmsg_new(), this is consistent with nexthop_notify() behavior. In
addition, adjust nh_nlmsg_size_grp() so it calculates the size needed
based on flags passed. While at it, also add the size of NHA_FDB for
nexthop group size calculation as it was missing too.

This cannot be reproduced via iproute2 as the group size is currently
limited and the command fails as follows:

addattr_l ERROR: message exceeded bound of 1048

Fixes: 430a049190 ("nexthop: Add support for nexthop groups")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Closes: https://lore.kernel.org/netdev/CAL_bE8Li2h4KO+AQFXW4S6Yb_u5X4oSKnkywW+LPFjuErhqELA@mail.gmail.com/
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260402072613.25262-2-fmancera@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:34:27 -07:00
Fernando Fernandez Mancera
06aaf04ca8 ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group dump
Currently NHA_HW_STATS_ENABLE is included twice everytime a dump of
nexthop group is performed with NHA_OP_FLAG_DUMP_STATS. As all the stats
querying were moved to nla_put_nh_group_stats(), leave only that
instance of the attribute querying.

Fixes: 5072ae00ae ("net: nexthop: Expose nexthop group HW stats to user space")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260402072613.25262-1-fmancera@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:34:27 -07:00
Pengpeng Hou
b76254c55d net: qualcomm: qca_uart: report the consumed byte on RX skb allocation failure
qca_tty_receive() consumes each input byte before checking whether a
completed frame needs a fresh receive skb. When the current byte completes
a frame, the driver delivers that frame and then allocates a new skb for
the next one.

If that allocation fails, the current code returns i even though data[i]
has already been consumed and may already have completed the delivered
frame. Since serdev interprets the return value as the number of accepted
bytes, this under-reports progress by one byte and can replay the final
byte of the completed frame into a fresh parser state on the next call.

Return i + 1 in that failure path so the accepted-byte count matches the
actual receive-state progress.

Fixes: dfc768fbe6 ("net: qualcomm: add QCA7000 UART driver")
Cc: stable@vger.kernel.org
Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Reviewed-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260402071207.4036-1-pengpeng@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:32:56 -07:00
Oleh Konko
48a5fe3877 tipc: fix bc_ackers underflow on duplicate GRP_ACK_MSG
The GRP_ACK_MSG handler in tipc_group_proto_rcv() currently decrements
bc_ackers on every inbound group ACK, even when the same member has
already acknowledged the current broadcast round.

Because bc_ackers is a u16, a duplicate ACK received after the last
legitimate ACK wraps the counter to 65535. Once wrapped,
tipc_group_bc_cong() keeps reporting congestion and later group
broadcasts on the affected socket stay blocked until the group is
recreated.

Fix this by ignoring duplicate or stale ACKs before touching bc_acked or
bc_ackers. This makes repeated GRP_ACK_MSG handling idempotent and
prevents the underflow path.

Fixes: 2f487712b8 ("tipc: guarantee that group broadcast doesn't bypass group unicast")
Cc: stable@vger.kernel.org
Signed-off-by: Oleh Konko <security@1seal.org>
Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/41a4833f368641218e444fdcff822039.security@1seal.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:31:17 -07:00
Nikolaos Gkarlis
7b735ef812 rtnetlink: add missing netlink_ns_capable() check for peer netns
rtnl_newlink() lacks a CAP_NET_ADMIN capability check on the peer
network namespace when creating paired devices (veth, vxcan,
netkit). This allows an unprivileged user with a user namespace
to create interfaces in arbitrary network namespaces, including
init_net.

Add a netlink_ns_capable() check for CAP_NET_ADMIN in the peer
namespace before allowing device creation to proceed.

Fixes: 81adee47df ("net: Support specifying the network namespace upon device creation.")
Signed-off-by: Nikolaos Gkarlis <nickgarlis@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260402181432.4126920-1-nickgarlis@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:07:18 -07:00
Zijing Yin
1979645e18 bridge: guard local VLAN-0 FDB helpers against NULL vlan group
When CONFIG_BRIDGE_VLAN_FILTERING is not set, br_vlan_group() and
nbp_vlan_group() return NULL (br_private.h stub definitions). The
BR_BOOLOPT_FDB_LOCAL_VLAN_0 toggle code is compiled unconditionally and
reaches br_fdb_delete_locals_per_vlan_port() and
br_fdb_insert_locals_per_vlan_port(), where the NULL vlan group pointer
is dereferenced via list_for_each_entry(v, &vg->vlan_list, vlist).

The observed crash is in the delete path, triggered when creating a
bridge with IFLA_BR_MULTI_BOOLOPT containing BR_BOOLOPT_FDB_LOCAL_VLAN_0
via RTM_NEWLINK. The insert helper has the same bug pattern.

  Oops: general protection fault, probably for non-canonical address 0xdffffc0000000056: 0000 [#1] KASAN NOPTI
  KASAN: null-ptr-deref in range [0x00000000000002b0-0x00000000000002b7]
  RIP: 0010:br_fdb_delete_locals_per_vlan+0x2b9/0x310
  Call Trace:
   br_fdb_toggle_local_vlan_0+0x452/0x4c0
   br_toggle_fdb_local_vlan_0+0x31/0x80 net/bridge/br.c:276
   br_boolopt_toggle net/bridge/br.c:313
   br_boolopt_multi_toggle net/bridge/br.c:364
   br_changelink net/bridge/br_netlink.c:1542
   br_dev_newlink net/bridge/br_netlink.c:1575

Add NULL checks for the vlan group pointer in both helpers, returning
early when there are no VLANs to iterate. This matches the existing
pattern used by other bridge FDB functions such as br_fdb_add() and
br_fdb_delete().

Fixes: 21446c06b4 ("net: bridge: Introduce UAPI for BR_BOOLOPT_FDB_LOCAL_VLAN_0")
Signed-off-by: Zijing Yin <yzjaurora@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20260402140153.3925663-1-yzjaurora@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 14:45:51 -07:00
Eric Dumazet
4e65a8b8da ipv6: ioam: fix potential NULL dereferences in __ioam6_fill_trace_data()
We need to check __in6_dev_get() for possible NULL value, as
suggested by Yiming Qian.

Also add skb_dst_dev_rcu() instead of skb_dst_dev(),
and two missing READ_ONCE().

Note that @dev can't be NULL.

Fixes: 9ee11f0fff ("ipv6: ioam: Data plane support for Pre-allocated Trace")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
Link: https://patch.msgid.link/20260402101732.1188059-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 14:44:43 -07:00
Lorenzo Bianconi
285fa6b1e0 net: airoha: Fix memory leak in airoha_qdma_rx_process()
If an error occurs on the subsequents buffers belonging to the
non-linear part of the skb (e.g. due to an error in the payload length
reported by the NIC or if we consumed all the available fragments for
the skb), the page_pool fragment will not be linked to the skb so it will
not return to the pool in the airoha_qdma_rx_process() error path. Fix the
memory leak partially reverting commit 'd6d2b0e1538d ("net: airoha: Fix
page recycling in airoha_qdma_rx_process()")' and always running
page_pool_put_full_page routine in the airoha_qdma_rx_process() error
path.

Fixes: d6d2b0e153 ("net: airoha: Fix page recycling in airoha_qdma_rx_process()")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260402-airoha_qdma_rx_process-mem-leak-fix-v1-1-b5706f402d3c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 14:42:51 -07:00
Eric Dumazet
b120e4432f net: lapbether: handle NETDEV_PRE_TYPE_CHANGE
lapbeth_data_transmit() expects the underlying device type
to be ARPHRD_ETHER.

Returning NOTIFY_BAD from lapbeth_device_event() makes sure
bonding driver can not break this expectation.

Fixes: 872254dd6b ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER")
Reported-by: syzbot+d8c285748fa7292580a9@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/69cd22a1.050a0220.70c3a.0002.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Martin Schiller <ms@dev.tdt.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260402103519.1201565-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 14:40:36 -07:00
Arnd Bergmann
e16a0d3677 net: fec: make FIXED_PHY dependency unconditional
When CONFIG_FIXED_PHY is in a loadable module, the fec driver cannot be
built-in any more:

x86_64-linux-ld: vmlinux.o: in function `fec_enet_mii_probe':
fec_main.c:(.text+0xc4f367): undefined reference to `fixed_phy_unregister'
x86_64-linux-ld: vmlinux.o: in function `fec_enet_close':
fec_main.c:(.text+0xc59591): undefined reference to `fixed_phy_unregister'
x86_64-linux-ld: vmlinux.o: in function `fec_enet_mii_probe.cold':

Select the fixed phy support on all targets to make this build
correctly, not just on coldfire.

Notat that Essentially the stub helpers in include/linux/phy_fixed.h
cannot be used correctly because of this build time dependency,
and we could just remove them to hit the build failure more often
when a driver uses them without the 'select FIXED_PHY'.

Fixes: dc86b621e1 ("net: fec: register a fixed phy using fixed_phy_register_100fd if needed")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260402141048.2713445-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 14:39:37 -07:00
Ruide Cao
c842743d07 net: sched: act_csum: validate nested VLAN headers
tcf_csum_act() walks nested VLAN headers directly from skb->data when an
skb still carries in-payload VLAN tags. The current code reads
vlan->h_vlan_encapsulated_proto and then pulls VLAN_HLEN bytes without
first ensuring that the full VLAN header is present in the linear area.

If only part of an inner VLAN header is linearized, accessing
h_vlan_encapsulated_proto reads past the linear area, and the following
skb_pull(VLAN_HLEN) may violate skb invariants.

Fix this by requiring pskb_may_pull(skb, VLAN_HLEN) before accessing and
pulling each nested VLAN header. If the header still is not fully
available, drop the packet through the existing error path.

Fixes: 2ecba2d1e4 ("net: sched: act_csum: Fix csum calc for tagged packets")
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Co-developed-by: Yuan Tan <yuantan098@gmail.com>
Signed-off-by: Yuan Tan <yuantan098@gmail.com>
Suggested-by: Xin Liu <bird@lzu.edu.cn>
Tested-by: Ren Wei <enjou1224z@gmail.com>
Signed-off-by: Ruide Cao <caoruide123@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/22df2fcb49f410203eafa5d97963dd36089f4ecf.1774892775.git.caoruide123@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 14:34:56 -07:00
Tyllis Xu
51f4e090b9 net: stmmac: fix integer underflow in chain mode
The jumbo_frm() chain-mode implementation unconditionally computes

    len = nopaged_len - bmax;

where nopaged_len = skb_headlen(skb) (linear bytes only) and bmax is
BUF_SIZE_8KiB or BUF_SIZE_2KiB.  However, the caller stmmac_xmit()
decides to invoke jumbo_frm() based on skb->len (total length including
page fragments):

    is_jumbo = stmmac_is_jumbo_frm(priv, skb->len, enh_desc);

When a packet has a small linear portion (nopaged_len <= bmax) but a
large total length due to page fragments (skb->len > bmax), the
subtraction wraps as an unsigned integer, producing a huge len value
(~0xFFFFxxxx).  This causes the while (len != 0) loop to execute
hundreds of thousands of iterations, passing skb->data + bmax * i
pointers far beyond the skb buffer to dma_map_single().  On IOMMU-less
SoCs (the typical deployment for stmmac), this maps arbitrary kernel
memory to the DMA engine, constituting a kernel memory disclosure and
potential memory corruption from hardware.

Fix this by introducing a buf_len local variable clamped to
min(nopaged_len, bmax).  Computing len = nopaged_len - buf_len is then
always safe: it is zero when the linear portion fits within a single
descriptor, causing the while (len != 0) loop to be skipped naturally,
and the fragment loop in stmmac_xmit() handles page fragments afterward.

Fixes: 286a837217 ("stmmac: add CHAINED descriptor mode support (V4)")
Cc: stable@vger.kernel.org
Signed-off-by: Tyllis Xu <LivelyCarpet87@gmail.com>
Link: https://patch.msgid.link/20260401044708.1386919-1-LivelyCarpet87@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02 18:28:10 -07:00
David Carlier
6dede39676 net: altera-tse: fix skb leak on DMA mapping error in tse_start_xmit()
When dma_map_single() fails in tse_start_xmit(), the function returns
NETDEV_TX_OK without freeing the skb. Since NETDEV_TX_OK tells the
stack the packet was consumed, the skb is never freed, leaking memory
on every DMA mapping failure.

Add dev_kfree_skb_any() before returning to properly free the skb.

Fixes: bbd2190ce9 ("Altera TSE: Add main and header file for Altera Ethernet Driver")
Cc: stable@vger.kernel.org
Signed-off-by: David Carlier <devnexen@gmail.com>
Link: https://patch.msgid.link/20260401211218.279185-1-devnexen@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02 18:25:23 -07:00
Allison Henderson
e9c9f084cd MAINTAINERS: Update email for Allison Henderson
Switch active email address to kernel.org alias

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260402005833.38376-1-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02 18:06:03 -07:00
Qingfang Deng
2fd68c7ea2 MAINTAINERS: orphan PPP over Ethernet driver
We haven't seen activities from Michal Ostrowski for quite a long time.
The last commit from him is fb64bb560e ("PPPoE: Fix flush/close
races."), which was in 2009. Email to mostrows@earthlink.net also
bounces.

Signed-off-by: Qingfang Deng <dqfext@gmail.com>
Link: https://patch.msgid.link/20260401022842.15082-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02 18:03:47 -07:00
Linus Torvalds
f8f5627a8a Merge tag 'net-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "With fixes from wireless, bluetooth and netfilter included we're back
  to each PR carrying 30%+ more fixes than in previous era.

  The good news is that so far none of the "extra" fixes are themselves
  causing real regressions. Not sure how much comfort that is.

  Current release - fix to a fix:

   - netdevsim: fix build if SKB_EXTENSIONS=n

   - eth: stmmac: skip VLAN restore when VLAN hash ops are missing

  Previous releases - regressions:

   - wifi: iwlwifi: mvm: don't send a 6E related command when
     not supported

  Previous releases - always broken:

   - some info leak fixes

   - add missing clearing of skb->cb[] on ICMP paths from tunnels

   - ipv6:
      - flowlabel: defer exclusive option free until RCU teardown
      - avoid overflows in ip6_datagram_send_ctl()

   - mpls: add seqcount to protect platform_labels from OOB access

   - bridge: improve safety of parsing ND options

   - bluetooth: fix leaks, overflows and races in hci_sync

   - netfilter: add more input validation, some to address bugs directly
     some to prevent exploits from cooking up broken configurations

   - wifi:
      - ath: avoid poor performance due to stopping the wrong
        aggregation session
      - virt_wifi: remove SET_NETDEV_DEV to avoid use-after-free

   - eth:
      - fec: fix the PTP periodic output sysfs interface
      - enetc: safely reinitialize TX BD ring when it has unsent frames"

* tag 'net-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (95 commits)
  eth: fbnic: Increase FBNIC_QUEUE_SIZE_MIN to 64
  ipv6: avoid overflows in ip6_datagram_send_ctl()
  net: hsr: fix VLAN add unwind on slave errors
  net: hsr: serialize seq_blocks merge across nodes
  vsock: initialize child_ns_mode_locked in vsock_net_init()
  selftests/tc-testing: add tests for cls_fw and cls_flow on shared blocks
  net/sched: cls_flow: fix NULL pointer dereference on shared blocks
  net/sched: cls_fw: fix NULL pointer dereference on shared blocks
  net/x25: Fix overflow when accumulating packets
  net/x25: Fix potential double free of skb
  bnxt_en: Restore default stat ctxs for ULP when resource is available
  bnxt_en: Don't assume XDP is never enabled in bnxt_init_dflt_ring_mode()
  bnxt_en: Refactor some basic ring setup and adjustment logic
  net/mlx5: Fix switchdev mode rollback in case of failure
  net/mlx5: Avoid "No data available" when FW version queries fail
  net/mlx5: lag: Check for LAG device before creating debugfs
  net: macb: properly unregister fixed rate clocks
  net: macb: fix clk handling on PCI glue driver removal
  virtio_net: clamp rss_max_key_size to NETDEV_RSS_KEY_LEN
  net/sched: sch_netem: fix out-of-bounds access in packet corruption
  ...
2026-04-02 09:57:06 -07:00
Linus Torvalds
4c2c526b5a Merge tag 'iommu-fixes-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu fixes from Joerg Roedel:

 - IOMMU-PT related compile breakage in for AMD driver

 - IOTLB flushing behavior when unmapped region is larger than requested
   due to page-sizes

 - Fix IOTLB flush behavior with empty gathers

* tag 'iommu-fixes-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
  iommupt/amdv1: mark amdv1pt_install_leaf_entry as __always_inline
  iommupt: Fix short gather if the unmap goes into a large mapping
  iommu: Do not call drivers for empty gathers
2026-04-02 09:53:16 -07:00
Linus Torvalds
2ec9074b28 Merge tag 'sound-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "People have been so busy for hunting and we're still getting more
  changes than wished for, but it doesn't look too scary; almost all
  changes are device-specific small fixes.

  I guess it's rather a casual bump, and no more Easter eggs are left
  for 7.0 (hopefully)...

   - Fixes for the recent regression on ctxfi driver

   - Fix missing INIT_LIST_HEAD() for ASoC card_aux_list

   - Usual HD- and USB-audio, and ASoC AMD quirk updates

   - ASoC fixes for AMD and Intel"

* tag 'sound-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (24 commits)
  ASoC: amd: ps: Fix missing leading zeros in subsystem_device SSID log
  ALSA: usb-audio: Exclude Scarlett 2i2 1st Gen (8016) from SKIP_IFACE_SETUP
  ALSA: hda/realtek: add quirk for Acer Swift SFG14-73
  ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14IMH9
  ASoC: Intel: boards: fix unmet dependency on PINCTRL
  ASoC: Intel: ehl_rt5660: Use the correct rtd->dev device in hw_params
  ALSA: ctxfi: Don't enumerate SPDIF1 at DAIO initialization
  ALSA: hda/realtek: Add quirk for Lenovo Yoga Slim 7 14AKP10
  ALSA: hda/realtek: add quirk for HP Laptop 15-fc0xxx
  ASoC: ep93xx: Fix unchecked clk_prepare_enable() and add rollback on failure
  ASoC: soc-core: call missing INIT_LIST_HEAD() for card_aux_list
  ALSA: hda/realtek: Add quirk for Samsung Book2 Pro 360 (NP950QED)
  ASoC: amd: yc: Add DMI entry for HP Laptop 15-fc0xxx
  ASoC: amd: yc: Add DMI quirk for ASUS Vivobook Pro 16X OLED M7601RM
  ALSA: hda/realtek: Add quirk for ASUS ROG Strix SCAR 15
  ALSA: usb-audio: Exclude Scarlett Solo 1st Gen from SKIP_IFACE_SETUP
  ALSA: caiaq: fix stack out-of-bounds read in init_card
  ALSA: ctxfi: Check the error for index mapping
  ALSA: ctxfi: Fix missing SPDIFI1 index handling
  ALSA: hda/realtek: add quirk for HP Victus 15-fb0xxx
  ...
2026-04-02 09:41:21 -07:00
Linus Torvalds
2064d7784e Merge tag 'auxdisplay-v7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay
Pull auxdisplay fixes from Andy Shevchenko:

 - Fix NULL dereference in linedisp_release()

 - Fix ht16k33 DT bindings to avoid warnings

 - Handle errors in I²C transfers in lcd2s driver

* tag 'auxdisplay-v7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay:
  auxdisplay: line-display: fix NULL dereference in linedisp_release
  auxdisplay: lcd2s: add error handling for i2c transfers
  dt-bindings: auxdisplay: ht16k33: Use unevaluatedProperties to fix common property warning
2026-04-02 09:34:22 -07:00
Dimitri Daskalakis
ec7067e661 eth: fbnic: Increase FBNIC_QUEUE_SIZE_MIN to 64
On systems with 64K pages, RX queues will be wedged if users set the
descriptor count to the current minimum (16). Fbnic fragments large
pages into 4K chunks, and scales down the ring size accordingly. With
64K pages and 16 descriptors, the ring size mask is 0 and will never
be filled.

32 descriptors is another special case that wedges the RX rings.
Internally, the rings track pages for the head/tail pointers, not page
fragments. So with 32 descriptors, there's only 1 usable page as one
ring slot is kept empty to disambiguate between an empty/full ring.
As a result, the head pointer never advances and the HW stalls after
consuming 16 page fragments.

Fixes: 0cb4c0a137 ("eth: fbnic: Implement Rx queue alloc/start/stop/free")
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
Link: https://patch.msgid.link/20260401162848.2335350-1-dimitri.daskalakis1@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02 08:38:34 -07:00
Eric Dumazet
4e45337556 ipv6: avoid overflows in ip6_datagram_send_ctl()
Yiming Qian reported :
<quote>
 I believe I found a locally triggerable kernel bug in the IPv6 sendmsg
 ancillary-data path that can panic the kernel via `skb_under_panic()`
 (local DoS).

 The core issue is a mismatch between:

 - a 16-bit length accumulator (`struct ipv6_txoptions::opt_flen`, type
 `__u16`) and
 - a pointer to the *last* provided destination-options header (`opt->dst1opt`)

 when multiple `IPV6_DSTOPTS` control messages (cmsgs) are provided.

 - `include/net/ipv6.h`:
   - `struct ipv6_txoptions::opt_flen` is `__u16` (wrap possible).
 (lines 291-307, especially 298)
 - `net/ipv6/datagram.c:ip6_datagram_send_ctl()`:
   - Accepts repeated `IPV6_DSTOPTS` and accumulates into `opt_flen`
 without rejecting duplicates. (lines 909-933)
 - `net/ipv6/ip6_output.c:__ip6_append_data()`:
   - Uses `opt->opt_flen + opt->opt_nflen` to compute header
 sizes/headroom decisions. (lines 1448-1466, especially 1463-1465)
 - `net/ipv6/ip6_output.c:__ip6_make_skb()`:
   - Calls `ipv6_push_frag_opts()` if `opt->opt_flen` is non-zero.
 (lines 1930-1934)
 - `net/ipv6/exthdrs.c:ipv6_push_frag_opts()` / `ipv6_push_exthdr()`:
   - Push size comes from `ipv6_optlen(opt->dst1opt)` (based on the
 pointed-to header). (lines 1179-1185 and 1206-1211)

 1. `opt_flen` is a 16-bit accumulator:

 - `include/net/ipv6.h:298` defines `__u16 opt_flen; /* after fragment hdr */`.

 2. `ip6_datagram_send_ctl()` accepts *repeated* `IPV6_DSTOPTS` cmsgs
 and increments `opt_flen` each time:

 - In `net/ipv6/datagram.c:909-933`, for `IPV6_DSTOPTS`:
   - It computes `len = ((hdr->hdrlen + 1) << 3);`
   - It checks `CAP_NET_RAW` using `ns_capable(net->user_ns,
 CAP_NET_RAW)`. (line 922)
   - Then it does:
     - `opt->opt_flen += len;` (line 927)
     - `opt->dst1opt = hdr;` (line 928)

 There is no duplicate rejection here (unlike the legacy
 `IPV6_2292DSTOPTS` path which rejects duplicates at
 `net/ipv6/datagram.c:901-904`).

 If enough large `IPV6_DSTOPTS` cmsgs are provided, `opt_flen` wraps
 while `dst1opt` still points to a large (2048-byte)
 destination-options header.

 In the attached PoC (`poc.c`):

 - 32 cmsgs with `hdrlen=255` => `len = (255+1)*8 = 2048`
 - 1 cmsg with `hdrlen=0` => `len = 8`
 - Total increment: `32*2048 + 8 = 65544`, so `(__u16)opt_flen == 8`
 - The last cmsg is 2048 bytes, so `dst1opt` points to a 2048-byte header.

 3. The transmit path sizes headers using the wrapped `opt_flen`:

- In `net/ipv6/ip6_output.c:1463-1465`:
  - `headersize = sizeof(struct ipv6hdr) + (opt ? opt->opt_flen +
 opt->opt_nflen : 0) + ...;`

 With wrapped `opt_flen`, `headersize`/headroom decisions underestimate
 what will be pushed later.

 4. When building the final skb, the actual push length comes from
 `dst1opt` and is not limited by wrapped `opt_flen`:

 - In `net/ipv6/ip6_output.c:1930-1934`:
   - `if (opt->opt_flen) proto = ipv6_push_frag_opts(skb, opt, proto);`
 - In `net/ipv6/exthdrs.c:1206-1211`, `ipv6_push_frag_opts()` pushes
 `dst1opt` via `ipv6_push_exthdr()`.
 - In `net/ipv6/exthdrs.c:1179-1184`, `ipv6_push_exthdr()` does:
   - `skb_push(skb, ipv6_optlen(opt));`
   - `memcpy(h, opt, ipv6_optlen(opt));`

 With insufficient headroom, `skb_push()` underflows and triggers
 `skb_under_panic()` -> `BUG()`:

 - `net/core/skbuff.c:2669-2675` (`skb_push()` calls `skb_under_panic()`)
 - `net/core/skbuff.c:207-214` (`skb_panic()` ends in `BUG()`)

 - The `IPV6_DSTOPTS` cmsg path requires `CAP_NET_RAW` in the target
 netns user namespace (`ns_capable(net->user_ns, CAP_NET_RAW)`).
 - Root (or any task with `CAP_NET_RAW`) can trigger this without user
 namespaces.
 - An unprivileged `uid=1000` user can trigger this if unprivileged
 user namespaces are enabled and it can create a userns+netns to obtain
 namespaced `CAP_NET_RAW` (the attached PoC does this).

 - Local denial of service: kernel BUG/panic (system crash).
 - Reproducible with a small userspace PoC.
</quote>

This patch does not reject duplicated options, as this might break
some user applications.

Instead, it makes sure to adjust opt_flen and opt_nflen to correctly
reflect the size of the current option headers, preventing the overflows
and the potential for panics.

This applies to IPV6_DSTOPTS, IPV6_HOPOPTS, and IPV6_RTHDR.

Specifically:

When a new IPV6_DSTOPTS is processed, the length of the old opt->dst1opt
is subtracted from opt->opt_flen before adding the new length.

When a new IPV6_HOPOPTS is processed, the length of the old opt->dst0opt
is subtracted from opt->opt_nflen.

When a new Routing Header (IPV6_RTHDR or IPV6_2292RTHDR) is processed,
the length of the old opt->srcrt is subtracted from opt->opt_nflen.

In the special case within IPV6_2292RTHDR handling where dst1opt is moved
to dst0opt, the length of the old opt->dst0opt is subtracted from
opt->opt_nflen before the new one is added.

Fixes: 333fad5364 ("[IPV6]: Support several new sockopt / ancillary data in Advanced API (RFC3542).")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Closes: https://lore.kernel.org/netdev/CAL_bE8JNzawgr5OX5m+3jnQDHry2XxhQT5=jThW1zDPtUikRYA@mail.gmail.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260401154721.3740056-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02 08:25:22 -07:00
Jakub Kicinski
be193568be Merge branch 'net-hsr-fixes-for-prp-duplication-and-vlan-unwind'
Luka Gejak says:

====================
net: hsr: fixes for PRP duplication and VLAN unwind

This series addresses two logic bugs in the HSR/PRP implementation
identified during a protocol audit. These are targeted for the 'net'
tree as they fix potential memory corruption and state inconsistency.

The primary change resolves a race condition in the node merging path by
implementing address-based lock ordering. This ensures that concurrent
mutations of sequence blocks do not lead to state corruption or
deadlocks.

An additional fix corrects asymmetric VLAN error unwinding by
implementing a centralized unwind path on slave errors.
====================

Link: https://patch.msgid.link/20260401092243.52121-1-luka.gejak@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02 08:23:55 -07:00
Luka Gejak
2e3514e63b net: hsr: fix VLAN add unwind on slave errors
When vlan_vid_add() fails for a secondary slave, the error path calls
vlan_vid_del() on the failing port instead of the peer slave that had
already succeeded. This results in asymmetric VLAN state across the HSR
pair.

Fix this by switching to a centralized unwind path that removes the VID
from any slave device that was already programmed.

Fixes: 1a8a63a530 ("net: hsr: Add VLAN CTAG filter support")
Signed-off-by: Luka Gejak <luka.gejak@linux.dev>
Link: https://patch.msgid.link/20260401092243.52121-3-luka.gejak@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02 08:23:49 -07:00
Luka Gejak
f5df2990c3 net: hsr: serialize seq_blocks merge across nodes
During node merging, hsr_handle_sup_frame() walks node_curr->seq_blocks
to update node_real without holding node_curr->seq_out_lock. This
allows concurrent mutations from duplicate registration paths, risking
inconsistent state or XArray/bitmap corruption.

Fix this by locking both nodes' seq_out_lock during the merge.
To prevent ABBA deadlocks, locks are acquired in order of memory
address.

Reviewed-by: Felix Maurer <fmaurer@redhat.com>
Fixes: 415e636751 ("hsr: Implement more robust duplicate discard for PRP")
Signed-off-by: Luka Gejak <luka.gejak@linux.dev>
Link: https://patch.msgid.link/20260401092243.52121-2-luka.gejak@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02 08:23:49 -07:00
Stefano Garzarella
b18c833888 vsock: initialize child_ns_mode_locked in vsock_net_init()
The `child_ns_mode_locked` field lives in `struct net`, which persists
across vsock module reloads. When the module is unloaded and reloaded,
`vsock_net_init()` resets `mode` and `child_ns_mode` back to their
default values, but does not reset `child_ns_mode_locked`.

The stale lock from the previous module load causes subsequent writes
to `child_ns_mode` to silently fail: `vsock_net_set_child_mode()` sees
the old lock, skips updating the actual value, and returns success
when the requested mode matches the stale lock. The sysctl handler
reports no error, but `child_ns_mode` remains unchanged.

Steps to reproduce:
    $ modprobe vsock
    $ echo local > /proc/sys/net/vsock/child_ns_mode
    $ cat /proc/sys/net/vsock/child_ns_mode
    local
    $ modprobe -r vsock
    $ modprobe vsock
    $ echo local > /proc/sys/net/vsock/child_ns_mode
    $ cat /proc/sys/net/vsock/child_ns_mode
    global    <--- expected "local"

Fix this by initializing `child_ns_mode_locked` to 0 (unlocked) in
`vsock_net_init()`, so the write-once mechanism works correctly after
module reload.

Fixes: 102eab95f0 ("vsock: lock down child_ns_mode as write-once")
Reported-by: Jin Liu <jinl@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Bobby Eshleman <bobbyeshleman@meta.com>
Link: https://patch.msgid.link/20260401092153.28462-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02 08:18:56 -07:00
Xiang Mei
70f73562d2 selftests/tc-testing: add tests for cls_fw and cls_flow on shared blocks
Regression tests for the shared-block NULL derefs fixed in the previous
two patches:

  - fw: attempt to attach an empty fw filter to a shared block and
    verify the configuration is rejected with EINVAL.
  - flow: create a flow filter on a shared block without a baseclass
    and verify the configuration is rejected with EINVAL.

Signed-off-by: Xiang Mei <xmei5@asu.edu>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20260331050217.504278-3-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 15:08:42 +02:00
Xiang Mei
1a280dd4bd net/sched: cls_flow: fix NULL pointer dereference on shared blocks
flow_change() calls tcf_block_q() and dereferences q->handle to derive
a default baseclass.  Shared blocks leave block->q NULL, causing a NULL
deref when a flow filter without a fully qualified baseclass is created
on a shared block.

Check tcf_block_shared() before accessing block->q and return -EINVAL
for shared blocks.  This avoids the null-deref shown below:

=======================================================================
KASAN: null-ptr-deref in range [0x0000000000000038-0x000000000000003f]
RIP: 0010:flow_change (net/sched/cls_flow.c:508)
Call Trace:
 tc_new_tfilter (net/sched/cls_api.c:2432)
 rtnetlink_rcv_msg (net/core/rtnetlink.c:6980)
 [...]
=======================================================================

Fixes: 1abf272022 ("net: sched: tcindex, fw, flow: use tcf_block_q helper to get struct Qdisc")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260331050217.504278-2-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 15:08:42 +02:00
Xiang Mei
faeea8bbf6 net/sched: cls_fw: fix NULL pointer dereference on shared blocks
The old-method path in fw_classify() calls tcf_block_q() and
dereferences q->handle.  Shared blocks leave block->q NULL, causing a
NULL deref when an empty cls_fw filter is attached to a shared block
and a packet with a nonzero major skb mark is classified.

Reject the configuration in fw_change() when the old method (no
TCA_OPTIONS) is used on a shared block, since fw_classify()'s
old-method path needs block->q which is NULL for shared blocks.

The fixed null-ptr-deref calling stack:
 KASAN: null-ptr-deref in range [0x0000000000000038-0x000000000000003f]
 RIP: 0010:fw_classify (net/sched/cls_fw.c:81)
 Call Trace:
  tcf_classify (./include/net/tc_wrapper.h:197 net/sched/cls_api.c:1764 net/sched/cls_api.c:1860)
  tc_run (net/core/dev.c:4401)
  __dev_queue_xmit (net/core/dev.c:4535 net/core/dev.c:4790)

Fixes: 1abf272022 ("net: sched: tcindex, fw, flow: use tcf_block_q helper to get struct Qdisc")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260331050217.504278-1-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 15:08:41 +02:00
Paolo Abeni
a80a014f83 Merge branch 'net-x25-fix-overflow-and-double-free'
Martin Schiller says:

====================
net/x25: Fix overflow and double free

This patch set includes 2 fixes:

The first removes a potential double free of received skb
The second fixes an overflow when accumulating packets with the more-bit
set.

Signed-off-by: Martin Schiller <ms@dev.tdt.de>
====================

Link: https://patch.msgid.link/20260331-x25_fraglen-v4-0-3e69f18464b4@dev.tdt.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 13:36:10 +02:00
Martin Schiller
a1822cb524 net/x25: Fix overflow when accumulating packets
Add a check to ensure that `x25_sock.fraglen` does not overflow.

The `fraglen` also needs to be resetted when purging `fragment_queue` in
`x25_clear_queues()`.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Suggested-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Martin Schiller <ms@dev.tdt.de>
Link: https://patch.msgid.link/20260331-x25_fraglen-v4-2-3e69f18464b4@dev.tdt.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 13:36:08 +02:00
Martin Schiller
d10a26aa4d net/x25: Fix potential double free of skb
When alloc_skb fails in x25_queue_rx_frame it calls kfree_skb(skb) at
line 48 and returns 1 (error).
This error propagates back through the call chain:

x25_queue_rx_frame returns 1
    |
    v
x25_state3_machine receives the return value 1 and takes the else
branch at line 278, setting queued=0 and returning 0
    |
    v
x25_process_rx_frame returns queued=0
    |
    v
x25_backlog_rcv at line 452 sees queued=0 and calls kfree_skb(skb)
again

This would free the same skb twice. Looking at x25_backlog_rcv:

net/x25/x25_in.c:x25_backlog_rcv() {
    ...
    queued = x25_process_rx_frame(sk, skb);
    ...
    if (!queued)
        kfree_skb(skb);
}

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Martin Schiller <ms@dev.tdt.de>
Link: https://patch.msgid.link/20260331-x25_fraglen-v4-1-3e69f18464b4@dev.tdt.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 13:36:08 +02:00
Takashi Iwai
b477ab8893 Merge tag 'asoc-fix-v7.0-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v7.0

Another smallish batch of fixes and quirks, these days it's AMD that is
getting all the DMI entries added.  We've got one core fix for a missing
list initialisation with auxiliary devices, otherwise it's all fairly
small things.
2026-04-02 09:08:03 +02:00
Jakub Kicinski
9351edf65c Merge branch 'bnxt_en-bug-fixes'
Michael Chan says:

====================
bnxt_en: Bug fixes

The first patch is a refactor patch needed by the second patch to
fix XDP ring initialization during FW reset.  The third patch
fixes an issue related to stats context reservation for RoCE.
====================

Link: https://patch.msgid.link/20260331065138.948205-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 20:13:00 -07:00
Pavan Chebbi
071dbfa304 bnxt_en: Restore default stat ctxs for ULP when resource is available
During resource reservation, if the L2 driver does not have enough
MSIX vectors to provide to the RoCE driver, it sets the stat ctxs for
ULP also to 0 so that we don't have to reserve it unnecessarily.

However, subsequently the user may reduce L2 rings thereby freeing up
some resources that the L2 driver can now earmark for RoCE. In this
case, the driver should restore the default ULP stat ctxs to make
sure that all RoCE resources are ready for use.

The RoCE driver may fail to initialize in this scenario without this
fix.

Fixes: d630624ebd ("bnxt_en: Utilize ulp client resources if RoCE is not registered")
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260331065138.948205-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 20:12:56 -07:00
Michael Chan
e4bf81dcad bnxt_en: Don't assume XDP is never enabled in bnxt_init_dflt_ring_mode()
The original code made the assumption that when we set up the initial
default ring mode, we must be just loading the driver and XDP cannot
be enabled yet.  This is not true when the FW goes through a resource
or capability change.  Resource reservations will be cancelled and
reinitialized with XDP already enabled.  devlink reload with XDP enabled
will also have the same issue.  This scenario will cause the ring
arithmetic to be all wrong in the bnxt_init_dflt_ring_mode() path
causing failure:

bnxt_en 0000:a1:00.0 ens2f0np0: bnxt_setup_int_mode err: ffffffea
bnxt_en 0000:a1:00.0 ens2f0np0: bnxt_request_irq err: ffffffea
bnxt_en 0000:a1:00.0 ens2f0np0: nic open fail (rc: ffffffea)

Fix it by properly accounting for XDP in the bnxt_init_dflt_ring_mode()
path by using the refactored helper functions in the previous patch.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Fixes: ec5d31e3c1 ("bnxt_en: Handle firmware reset status during IF_UP.")
Fixes: 228ea8c187 ("bnxt_en: implement devlink dev reload driver_reinit")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260331065138.948205-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 20:12:56 -07:00
Michael Chan
ceee35e567 bnxt_en: Refactor some basic ring setup and adjustment logic
Refactor out the basic code that trims the default rings, sets up and
adjusts XDP TX rings and CP rings.  There is no change in behavior.
This is to prepare for the next bug fix patch.

Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260331065138.948205-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 20:12:56 -07:00
Jakub Kicinski
a59dc0f871 Merge branch 'mlx5-misc-fixes-2026-03-30'
Tariq Toukan says:

====================
mlx5 misc fixes 2026-03-30

This patchset provides misc bug fixes from the team to the mlx5
core driver.
====================

Link: https://patch.msgid.link/20260330194015.53585-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 20:10:46 -07:00
Saeed Mahameed
403186400a net/mlx5: Fix switchdev mode rollback in case of failure
If for some internal reason switchdev mode fails, we rollback to legacy
mode, before this patch, rollback will unregister the uplink netdev and
leave it unregistered causing the below kernel bug.

To fix this, we need to avoid netdev unregister by setting the proper
rollback flag 'MLX5_PRIV_FLAGS_SWITCH_LEGACY' to indicate legacy mode.

devlink (431) used greatest stack depth: 11048 bytes left
mlx5_core 0000:00:03.0: E-Switch: Disable: mode(LEGACY), nvfs(0), \
	necvfs(0), active vports(0)
mlx5_core 0000:00:03.0: E-Switch: Supported tc chains and prios offload
mlx5_core 0000:00:03.0: Loading uplink representor for vport 65535
mlx5_core 0000:00:03.0: mlx5_cmd_out_err:816:(pid 456): \
	QUERY_HCA_CAP(0x100) op_mod(0x0) failed, \
	status bad parameter(0x3), syndrome (0x3a3846), err(-22)
mlx5_core 0000:00:03.0 enp0s3np0 (unregistered): Unloading uplink \
	representor for vport 65535
 ------------[ cut here ]------------
kernel BUG at net/core/dev.c:12070!
Oops: invalid opcode: 0000 [#1] SMP NOPTI
CPU: 2 UID: 0 PID: 456 Comm: devlink Not tainted 6.16.0-rc3+ \
	#9 PREEMPT(voluntary)
RIP: 0010:unregister_netdevice_many_notify+0x123/0xae0
...
Call Trace:
[   90.923094]  unregister_netdevice_queue+0xad/0xf0
[   90.923323]  unregister_netdev+0x1c/0x40
[   90.923522]  mlx5e_vport_rep_unload+0x61/0xc6
[   90.923736]  esw_offloads_enable+0x8e6/0x920
[   90.923947]  mlx5_eswitch_enable_locked+0x349/0x430
[   90.924182]  ? is_mp_supported+0x57/0xb0
[   90.924376]  mlx5_devlink_eswitch_mode_set+0x167/0x350
[   90.924628]  devlink_nl_eswitch_set_doit+0x6f/0xf0
[   90.924862]  genl_family_rcv_msg_doit+0xe8/0x140
[   90.925088]  genl_rcv_msg+0x18b/0x290
[   90.925269]  ? __pfx_devlink_nl_pre_doit+0x10/0x10
[   90.925506]  ? __pfx_devlink_nl_eswitch_set_doit+0x10/0x10
[   90.925766]  ? __pfx_devlink_nl_post_doit+0x10/0x10
[   90.926001]  ? __pfx_genl_rcv_msg+0x10/0x10
[   90.926206]  netlink_rcv_skb+0x52/0x100
[   90.926393]  genl_rcv+0x28/0x40
[   90.926557]  netlink_unicast+0x27d/0x3d0
[   90.926749]  netlink_sendmsg+0x1f7/0x430
[   90.926942]  __sys_sendto+0x213/0x220
[   90.927127]  ? __sys_recvmsg+0x6a/0xd0
[   90.927312]  __x64_sys_sendto+0x24/0x30
[   90.927504]  do_syscall_64+0x50/0x1c0
[   90.927687]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   90.927929] RIP: 0033:0x7f7d0363e047

Fixes: 2a4f56fbcc ("net/mlx5e: Keep netdev when leave switchdev for devlink set legacy only")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260330194015.53585-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 20:10:41 -07:00
Saeed Mahameed
10dc35f6a4 net/mlx5: Avoid "No data available" when FW version queries fail
Avoid printing the misleading "kernel answers: No data available" devlink
output when querying firmware or pending firmware version fails
(e.g. MLX5 fw state errors / flash failures).

FW can fail on loading the pending flash image and get its version due
to various reasons, examples:

mlxfw: Firmware flash failed: key not applicable, err (7)
mlx5_fw_image_pending: can't read pending fw version while fw state is 1

and the resulting:
$ devlink dev info
kernel answers: No data available

Instead, just report 0 or 0xfff.. versions in case of failure to indicate
a problem, and let other information be shown.

after the fix:
$ devlink dev info
pci/0000:00:06.0:
  driver mlx5_core
  serial_number xxx...
  board.serial_number MT2225300179
  versions:
      fixed:
        fw.psid MT_0000000436
      running:
        fw.version 22.41.0188
        fw 22.41.0188
      stored:
        fw.version 255.255.65535
        fw 255.255.65535

Fixes: 9c86b07e30 ("net/mlx5: Added fw version query command")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260330194015.53585-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 20:10:41 -07:00
Shay Drory
bf16bca665 net/mlx5: lag: Check for LAG device before creating debugfs
__mlx5_lag_dev_add_mdev() may return 0 (success) even when an error
occurs that is handled gracefully. Consequently, the initialization
flow proceeds to call mlx5_ldev_add_debugfs() even when there is no
valid LAG context.

mlx5_ldev_add_debugfs() blindly created the debugfs directory and
attributes. This exposed interfaces (like the members file) that rely on
a valid ldev pointer, leading to potential NULL pointer dereferences if
accessed when ldev is NULL.

Add a check to verify that mlx5_lag_dev(dev) returns a valid pointer
before attempting to create the debugfs entries.

Fixes: 7f46a0b732 ("net/mlx5: Lag, add debugfs to query hardware lag state")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260330194015.53585-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 20:10:40 -07:00
Fedor Pchelkin
f0f367a4f4 net: macb: properly unregister fixed rate clocks
The additional resources allocated with clk_register_fixed_rate() need
to be released with clk_unregister_fixed_rate(), otherwise they are lost.

Fixes: 83a77e9ec4 ("net: macb: Added PCI wrapper for Platform Driver.")
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Link: https://patch.msgid.link/20260330184542.626619-2-pchelkin@ispras.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 19:57:20 -07:00
Fedor Pchelkin
ce8fe5287b net: macb: fix clk handling on PCI glue driver removal
platform_device_unregister() may still want to use the registered clks
during runtime resume callback.

Note that there is a commit d82d5303c4 ("net: macb: fix use after free
on rmmod") that addressed the similar problem of clk vs platform device
unregistration but just moved the bug to another place.

Save the pointers to clks into local variables for reuse after platform
device is unregistered.

BUG: KASAN: use-after-free in clk_prepare+0x5a/0x60
Read of size 8 at addr ffff888104f85e00 by task modprobe/597

CPU: 2 PID: 597 Comm: modprobe Not tainted 6.1.164+ #114
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.1-0-g3208b098f51a-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x8d/0xba
 print_report+0x17f/0x496
 kasan_report+0xd9/0x180
 clk_prepare+0x5a/0x60
 macb_runtime_resume+0x13d/0x410 [macb]
 pm_generic_runtime_resume+0x97/0xd0
 __rpm_callback+0xc8/0x4d0
 rpm_callback+0xf6/0x230
 rpm_resume+0xeeb/0x1a70
 __pm_runtime_resume+0xb4/0x170
 bus_remove_device+0x2e3/0x4b0
 device_del+0x5b3/0xdc0
 platform_device_del+0x4e/0x280
 platform_device_unregister+0x11/0x50
 pci_device_remove+0xae/0x210
 device_remove+0xcb/0x180
 device_release_driver_internal+0x529/0x770
 driver_detach+0xd4/0x1a0
 bus_remove_driver+0x135/0x260
 driver_unregister+0x72/0xb0
 pci_unregister_driver+0x26/0x220
 __do_sys_delete_module+0x32e/0x550
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
 </TASK>

Allocated by task 519:
 kasan_save_stack+0x2c/0x50
 kasan_set_track+0x21/0x30
 __kasan_kmalloc+0x8e/0x90
 __clk_register+0x458/0x2890
 clk_hw_register+0x1a/0x60
 __clk_hw_register_fixed_rate+0x255/0x410
 clk_register_fixed_rate+0x3c/0xa0
 macb_probe+0x1d8/0x42e [macb_pci]
 local_pci_probe+0xd7/0x190
 pci_device_probe+0x252/0x600
 really_probe+0x255/0x7f0
 __driver_probe_device+0x1ee/0x330
 driver_probe_device+0x4c/0x1f0
 __driver_attach+0x1df/0x4e0
 bus_for_each_dev+0x15d/0x1f0
 bus_add_driver+0x486/0x5e0
 driver_register+0x23a/0x3d0
 do_one_initcall+0xfd/0x4d0
 do_init_module+0x18b/0x5a0
 load_module+0x5663/0x7950
 __do_sys_finit_module+0x101/0x180
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Freed by task 597:
 kasan_save_stack+0x2c/0x50
 kasan_set_track+0x21/0x30
 kasan_save_free_info+0x2a/0x50
 __kasan_slab_free+0x106/0x180
 __kmem_cache_free+0xbc/0x320
 clk_unregister+0x6de/0x8d0
 macb_remove+0x73/0xc0 [macb_pci]
 pci_device_remove+0xae/0x210
 device_remove+0xcb/0x180
 device_release_driver_internal+0x529/0x770
 driver_detach+0xd4/0x1a0
 bus_remove_driver+0x135/0x260
 driver_unregister+0x72/0xb0
 pci_unregister_driver+0x26/0x220
 __do_sys_delete_module+0x32e/0x550
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Fixes: d82d5303c4 ("net: macb: fix use after free on rmmod")
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Link: https://patch.msgid.link/20260330184542.626619-1-pchelkin@ispras.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 19:57:16 -07:00
Srujana Challa
b4e5f04c58 virtio_net: clamp rss_max_key_size to NETDEV_RSS_KEY_LEN
rss_max_key_size in the virtio spec is the maximum key size supported by
the device, not a mandatory size the driver must use. Also the value 40
is a spec minimum, not a spec maximum.

The current code rejects RSS and can fail probe when the device reports a
larger rss_max_key_size than the driver buffer limit. Instead, clamp the
effective key length to min(device rss_max_key_size, NETDEV_RSS_KEY_LEN)
and keep RSS enabled.

This keeps probe working on devices that advertise larger maximum key sizes
while respecting the netdev RSS key buffer size limit.

Fixes: 3f7d9c1964 ("virtio_net: Add hash_key_length check")
Cc: stable@vger.kernel.org
Signed-off-by: Srujana Challa <schalla@marvell.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20260326142344.1171317-1-schalla@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 19:47:44 -07:00
Yucheng Lu
d64cb81dcb net/sched: sch_netem: fix out-of-bounds access in packet corruption
In netem_enqueue(), the packet corruption logic uses
get_random_u32_below(skb_headlen(skb)) to select an index for
modifying skb->data. When an AF_PACKET TX_RING sends fully non-linear
packets over an IPIP tunnel, skb_headlen(skb) evaluates to 0.

Passing 0 to get_random_u32_below() takes the variable-ceil slow path
which returns an unconstrained 32-bit random integer. Using this
unconstrained value as an offset into skb->data results in an
out-of-bounds memory access.

Fix this by verifying skb_headlen(skb) is non-zero before attempting
to corrupt the linear data area. Fully non-linear packets will silently
bypass the corruption logic.

Fixes: c865e5d99e ("[PKT_SCHED] netem: packet corruption option")
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Signed-off-by: Yuan Tan <tanyuan98@outlook.com>
Signed-off-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Yuhang Zheng <z1652074432@gmail.com>
Signed-off-by: Yucheng Lu <kanolyc@gmail.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Link: https://patch.msgid.link/45435c0935df877853a81e6d06205ac738ec65fa.1774941614.git.kanolyc@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 19:24:20 -07:00
Jakub Kicinski
aba53ccf05 Merge tag 'nf-26-04-01' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net. Note that most
of the bugs fixed here are >5 years old.  The large PR is not due to an
increase in regressions.

1) Flowtable hardware offload support in IPv6 can lead to out-of-bounds
   when populating the rule action array when combined with double-tagged
   vlan. Bump the maximum number of actions from 16 to 24 and check that
   such limit is never reached, otherwise bail out.  This bugs stems from
   the original flowtable hardware offload support.

2) nfnetlink_log does not include the netlink header size of the trailing
   NLMSG_DONE message when calculating the skb size. From Florian Westphal.

3) Reject names in xt_cgroup and xt_rateest extensions which are not
   nul-terminated. Also from Florian.

4) Use nla_strcmp in ipset lookup by set name, since IPSET_ATTR_NAME and
   IPSET_ATTR_NAMEREF are of NLA_STRING type. From Florian Westphal.

5) When unregistering conntrack helpers, pass the helper that is going
   away so the expectation cleanup is done accordingly, otherwise UaF is
   possible when accessing expectation that refer to the helper that is
   gone. From Qi Tang.

6) Zero expectation NAT fields to address leaking kernel memory through
   the expectation netlink dump when unset. Also from Qi Tang.

7) Use the master conntrack helper when creating expectations via
   ctnetlink, ignore the suggested helper through CTA_EXPECT_HELP_NAME.
   This allows to address a possible read of kernel memory off the
   expectation object boundary.

8) Fix incorrect release of the hash bucket logic in ipset when the
   bucket is empty, leading to shrinking the hash bucket to size 0
   which deals to out-of-bound write in next element additions.
   From Yifan Wu.

9) Allow the use of x_tables extensions that explicitly declare
   NFPROTO_ARP support only. This is to avoid an incorrect hook number
   validation due to non-overlapping arp and inet hook number
   definitions.

10) Reject immediate NF_QUEUE verdict in nf_tables. The userspace
    nft tool always uses the nft_queue expression for queueing.
    This ensures this verdict cannot be used for the arp family,
    which does supported this.

* tag 'nf-26-04-01' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: reject immediate NF_QUEUE verdict
  netfilter: x_tables: restrict xt_check_match/xt_check_target extensions for NFPROTO_ARP
  netfilter: ipset: drop logically empty buckets in mtype_del
  netfilter: ctnetlink: ignore explicit helper on new expectations
  netfilter: ctnetlink: zero expect NAT fields when CTA_EXPECT_NAT absent
  netfilter: nf_conntrack_helper: pass helper to expect cleanup
  netfilter: ipset: use nla_strcmp for IPSET_ATTR_NAME attr
  netfilter: x_tables: ensure names are nul-terminated
  netfilter: nfnetlink_log: account for netlink header size
  netfilter: flowtable: strictly check for maximum number of actions
====================

Link: https://patch.msgid.link/20260401103646.1015423-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 19:19:36 -07:00
Jakub Kicinski
6d6be7070e Merge tag 'for-net-2026-04-01' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - hci_sync: Fix UAF in le_read_features_complete
 - hci_sync: call destroy in hci_cmd_sync_run if immediate
 - hci_sync: hci_cmd_sync_queue_once() return -EEXIST if exists
 - hci_sync: fix leaks when hci_cmd_sync_queue_once fails
 - hci_sync: fix stack buffer overflow in hci_le_big_create_sync
 - hci_conn: fix potential UAF in set_cig_params_sync
 - hci_event: fix potential UAF in hci_le_remote_conn_param_req_evt
 - hci_event: move wake reason storage into validated event handlers
 - SMP: force responder MITM requirements before building the pairing response
 - SMP: derive legacy responder STK authentication from MITM state
 - MGMT: validate LTK enc_size on load
 - MGMT: validate mesh send advertising payload length
 - SCO: fix race conditions in sco_sock_connect()
 - hci_h4: Fix race during initialization

* tag 'for-net-2026-04-01' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: hci_sync: fix stack buffer overflow in hci_le_big_create_sync
  Bluetooth: SMP: derive legacy responder STK authentication from MITM state
  Bluetooth: SMP: force responder MITM requirements before building the pairing response
  Bluetooth: MGMT: validate mesh send advertising payload length
  Bluetooth: hci_event: fix potential UAF in hci_le_remote_conn_param_req_evt
  Bluetooth: hci_conn: fix potential UAF in set_cig_params_sync
  Bluetooth: MGMT: validate LTK enc_size on load
  Bluetooth: hci_h4: Fix race during initialization
  Bluetooth: hci_sync: Fix UAF in le_read_features_complete
  Bluetooth: hci_sync: fix leaks when hci_cmd_sync_queue_once fails
  Bluetooth: hci_sync: hci_cmd_sync_queue_once() return -EEXIST if exists
  Bluetooth: hci_event: move wake reason storage into validated event handlers
  Bluetooth: SCO: fix race conditions in sco_sock_connect()
  Bluetooth: hci_sync: call destroy in hci_cmd_sync_run if immediate
====================

Link: https://patch.msgid.link/20260401205834.2189162-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 19:12:23 -07:00
Weiming Shi
a54ecccfae rds: ib: reject FRMR registration before IB connection is established
rds_ib_get_mr() extracts the rds_ib_connection from conn->c_transport_data
and passes it to rds_ib_reg_frmr() for FRWR memory registration. On a
fresh outgoing connection, ic is allocated in rds_ib_conn_alloc() with
i_cm_id = NULL because the connection worker has not yet called
rds_ib_conn_path_connect() to create the rdma_cm_id. When sendmsg() with
RDS_CMSG_RDMA_MAP is called on such a connection, the sendmsg path parses
the control message before any connection establishment, allowing
rds_ib_post_reg_frmr() to dereference ic->i_cm_id->qp and crash the
kernel.

The existing guard in rds_ib_reg_frmr() only checks for !ic (added in
commit 9e630bcb77), which does not catch this case since ic is allocated
early and is always non-NULL once the connection object exists.

 KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
 RIP: 0010:rds_ib_post_reg_frmr+0x50e/0x920
 Call Trace:
  rds_ib_post_reg_frmr (net/rds/ib_frmr.c:167)
  rds_ib_map_frmr (net/rds/ib_frmr.c:252)
  rds_ib_reg_frmr (net/rds/ib_frmr.c:430)
  rds_ib_get_mr (net/rds/ib_rdma.c:615)
  __rds_rdma_map (net/rds/rdma.c:295)
  rds_cmsg_rdma_map (net/rds/rdma.c:860)
  rds_sendmsg (net/rds/send.c:1363)
  ____sys_sendmsg
  do_syscall_64

Add a check in rds_ib_get_mr() that verifies ic, i_cm_id, and qp are all
non-NULL before proceeding with FRMR registration, mirroring the guard
already present in rds_ib_post_inv(). Return -ENODEV when the connection
is not ready, which the existing error handling in rds_cmsg_send() converts
to -EAGAIN for userspace retry and triggers rds_conn_connect_if_down() to
start the connection worker.

Fixes: 1659185fb4 ("RDS: IB: Support Fastreg MR (FRMR) memory registration mode")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260330163237.2752440-2-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 17:52:40 -07:00
Hangbin Liu
ffb5a4843c ipv6: fix data race in fib6_metric_set() using cmpxchg
fib6_metric_set() may be called concurrently from softirq context without
holding the FIB table lock. A typical path is:

  ndisc_router_discovery()
    spin_unlock_bh(&table->tb6_lock)        <- lock released
    fib6_metric_set(rt, RTAX_HOPLIMIT, ...) <- lockless call

When two CPUs process Router Advertisement packets for the same router
simultaneously, they can both arrive at fib6_metric_set() with the same
fib6_info pointer whose fib6_metrics still points to dst_default_metrics.

  if (f6i->fib6_metrics == &dst_default_metrics) {   /* both CPUs: true */
      struct dst_metrics *p = kzalloc_obj(*p, GFP_ATOMIC);
      refcount_set(&p->refcnt, 1);
      f6i->fib6_metrics = p;   /* CPU1 overwrites CPU0's p -> p0 leaked */
  }

The dst_metrics allocated by the losing CPU has refcnt=1 but no pointer
to it anywhere in memory, producing a kmemleak report:

  unreferenced object 0xff1100025aca1400 (size 96):
    comm "softirq", pid 0, jiffies 4299271239
    backtrace:
      kmalloc_trace+0x28a/0x380
      fib6_metric_set+0xcd/0x180
      ndisc_router_discovery+0x12dc/0x24b0
      icmpv6_rcv+0xc16/0x1360

Fix this by:
 - Set val for p->metrics before published via cmpxchg() so the metrics
   value is ready before the pointer becomes visible to other CPUs.
 - Replace the plain pointer store with cmpxchg() and free the allocation
   safely when competition failed.
 - Add READ_ONCE()/WRITE_ONCE() for metrics[] setting in the non-default
   metrics path to prevent compiler-based data races.

Fixes: d4ead6b34b ("net/ipv6: move metrics from dst to rt6_info")
Reported-by: Fei Liu <feliu@redhat.com>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260331-b4-fib6_metric_set-kmemleak-v3-1-88d27f4d8825@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 17:44:35 -07:00
hkbinbin
bc39a09473 Bluetooth: hci_sync: fix stack buffer overflow in hci_le_big_create_sync
hci_le_big_create_sync() uses DEFINE_FLEX to allocate a
struct hci_cp_le_big_create_sync on the stack with room for 0x11 (17)
BIS entries.  However, conn->num_bis can hold up to HCI_MAX_ISO_BIS (31)
entries — validated against ISO_MAX_NUM_BIS (0x1f) in the caller
hci_conn_big_create_sync().  When conn->num_bis is between 18 and 31,
the memcpy that copies conn->bis into cp->bis writes up to 14 bytes
past the stack buffer, corrupting adjacent stack memory.

This is trivially reproducible: binding an ISO socket with
bc_num_bis = ISO_MAX_NUM_BIS (31) and calling listen() will
eventually trigger hci_le_big_create_sync() from the HCI command
sync worker, causing a KASAN-detectable stack-out-of-bounds write:

  BUG: KASAN: stack-out-of-bounds in hci_le_big_create_sync+0x256/0x3b0
  Write of size 31 at addr ffffc90000487b48 by task kworker/u9:0/71

Fix this by changing the DEFINE_FLEX count from the incorrect 0x11 to
HCI_MAX_ISO_BIS, which matches the maximum number of BIS entries that
conn->bis can actually carry.

Fixes: 42ecf19471 ("Bluetooth: ISO: Do not emit LE BIG Create Sync if previous is pending")
Cc: stable@vger.kernel.org
Signed-off-by: hkbinbin <hkbinbinbin@gmail.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-04-01 16:48:28 -04:00
Oleh Konko
20756fec2f Bluetooth: SMP: derive legacy responder STK authentication from MITM state
The legacy responder path in smp_random() currently labels the stored
STK as authenticated whenever pending_sec_level is BT_SECURITY_HIGH.
That reflects what the local service requested, not what the pairing
flow actually achieved.

For Just Works/Confirm legacy pairing, SMP_FLAG_MITM_AUTH stays clear
and the resulting STK should remain unauthenticated even if the local
side requested HIGH security. Use the established MITM state when
storing the responder STK so the key metadata matches the pairing result.

This also keeps the legacy path aligned with the Secure Connections code,
which already treats JUST_WORKS/JUST_CFM as unauthenticated.

Fixes: fff3490f47 ("Bluetooth: Fix setting correct authentication information for SMP STK")
Cc: stable@vger.kernel.org
Signed-off-by: Oleh Konko <security@1seal.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-04-01 16:48:06 -04:00
Oleh Konko
d05111bfe3 Bluetooth: SMP: force responder MITM requirements before building the pairing response
smp_cmd_pairing_req() currently builds the pairing response from the
initiator auth_req before enforcing the local BT_SECURITY_HIGH
requirement. If the initiator omits SMP_AUTH_MITM, the response can
also omit it even though the local side still requires MITM.

tk_request() then sees an auth value without SMP_AUTH_MITM and may
select JUST_CFM, making method selection inconsistent with the pairing
policy the responder already enforces.

When the local side requires HIGH security, first verify that MITM can
be achieved from the IO capabilities and then force SMP_AUTH_MITM in the
response in both rsp.auth_req and auth. This keeps the responder auth bits
and later method selection aligned.

Fixes: 2b64d153a0 ("Bluetooth: Add MITM mechanism to LE-SMP")
Cc: stable@vger.kernel.org
Suggested-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
Signed-off-by: Oleh Konko <security@1seal.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-04-01 16:47:43 -04:00
Keenan Dong
bda93eec78 Bluetooth: MGMT: validate mesh send advertising payload length
mesh_send() currently bounds MGMT_OP_MESH_SEND by total command
length, but it never verifies that the bytes supplied for the
flexible adv_data[] array actually match the embedded adv_data_len
field. MGMT_MESH_SEND_SIZE only covers the fixed header, so a
truncated command can still pass the existing 20..50 byte range
check and later drive the async mesh send path past the end of the
queued command buffer.

Keep rejecting zero-length and oversized advertising payloads, but
validate adv_data_len explicitly and require the command length to
exactly match the flexible array size before queueing the request.

Fixes: b338d91703 ("Bluetooth: Implement support for Mesh")
Reported-by: Keenan Dong <keenanat2000@gmail.com>
Signed-off-by: Keenan Dong <keenanat2000@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-04-01 16:47:19 -04:00
Pauli Virtanen
b255531b27 Bluetooth: hci_event: fix potential UAF in hci_le_remote_conn_param_req_evt
hci_conn lookup and field access must be covered by hdev lock in
hci_le_remote_conn_param_req_evt, otherwise it's possible it is freed
concurrently.

Extend the hci_dev_lock critical section to cover all conn usage.

Fixes: 95118dd4ed ("Bluetooth: hci_event: Use of a function table to handle LE subevents")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-04-01 16:46:55 -04:00
Pauli Virtanen
a2639a7f0f Bluetooth: hci_conn: fix potential UAF in set_cig_params_sync
hci_conn lookup and field access must be covered by hdev lock in
set_cig_params_sync, otherwise it's possible it is freed concurrently.

Take hdev lock to prevent hci_conn from being deleted or modified
concurrently.  Just RCU lock is not suitable here, as we also want to
avoid "tearing" in the configuration.

Fixes: a091289218 ("Bluetooth: hci_conn: Fix hci_le_set_cig_params")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-04-01 16:46:33 -04:00
Keenan Dong
b8dbe9648d Bluetooth: MGMT: validate LTK enc_size on load
Load Long Term Keys stores the user-provided enc_size and later uses
it to size fixed-size stack operations when replying to LE LTK
requests. An enc_size larger than the 16-byte key buffer can therefore
overflow the reply stack buffer.

Reject oversized enc_size values while validating the management LTK
record so invalid keys never reach the stored key state.

Fixes: 346af67b8d ("Bluetooth: Add MGMT handlers for dealing with SMP LTK's")
Reported-by: Keenan Dong <keenanat2000@gmail.com>
Signed-off-by: Keenan Dong <keenanat2000@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-04-01 16:46:09 -04:00
Jonathan Rissanen
0ffac654e9 Bluetooth: hci_h4: Fix race during initialization
Commit 5df5dafc17 ("Bluetooth: hci_uart: Fix another race during
initialization") fixed a race for hci commands sent during initialization.
However, there is still a race that happens if an hci event from one of
these commands is received before HCI_UART_REGISTERED has been set at
the end of hci_uart_register_dev(). The event will be ignored which
causes the command to fail with a timeout in the log:

"Bluetooth: hci0: command 0x1003 tx timeout"

This is because the hci event receive path (hci_uart_tty_receive ->
h4_recv) requires HCI_UART_REGISTERED to be set in h4_recv(), while the
hci command transmit path (hci_uart_send_frame -> h4_enqueue) only
requires HCI_UART_PROTO_INIT to be set in hci_uart_send_frame().

The check for HCI_UART_REGISTERED was originally added in commit
c257820291 ("Bluetooth: Fix H4 crash from incoming UART packets")
to fix a crash caused by hu->hdev being null dereferenced. That can no
longer happen: once HCI_UART_PROTO_INIT is set in hci_uart_register_dev()
all pointers (hu, hu->priv and hu->hdev) are valid, and
hci_uart_tty_receive() already calls h4_recv() on HCI_UART_PROTO_INIT
or HCI_UART_PROTO_READY.

Remove the check for HCI_UART_REGISTERED in h4_recv() to fix the race
condition.

Fixes: 5df5dafc17 ("Bluetooth: hci_uart: Fix another race during initialization")
Signed-off-by: Jonathan Rissanen <jonathan.rissanen@axis.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-04-01 16:45:47 -04:00
Luiz Augusto von Dentz
035c25007c Bluetooth: hci_sync: Fix UAF in le_read_features_complete
This fixes the following backtrace caused by hci_conn being freed
before le_read_features_complete but after
hci_le_read_remote_features_sync so hci_conn_del -> hci_cmd_sync_dequeue
is not able to prevent it:

==================================================================
BUG: KASAN: slab-use-after-free in instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
BUG: KASAN: slab-use-after-free in atomic_dec_and_test include/linux/atomic/atomic-instrumented.h:1383 [inline]
BUG: KASAN: slab-use-after-free in hci_conn_drop include/net/bluetooth/hci_core.h:1688 [inline]
BUG: KASAN: slab-use-after-free in le_read_features_complete+0x5b/0x340 net/bluetooth/hci_sync.c:7344
Write of size 4 at addr ffff8880796b0010 by task kworker/u9:0/52

CPU: 0 UID: 0 PID: 52 Comm: kworker/u9:0 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
Workqueue: hci0 hci_cmd_sync_work
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:378 [inline]
 print_report+0xcd/0x630 mm/kasan/report.c:482
 kasan_report+0xe0/0x110 mm/kasan/report.c:595
 check_region_inline mm/kasan/generic.c:194 [inline]
 kasan_check_range+0x100/0x1b0 mm/kasan/generic.c:200
 instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
 atomic_dec_and_test include/linux/atomic/atomic-instrumented.h:1383 [inline]
 hci_conn_drop include/net/bluetooth/hci_core.h:1688 [inline]
 le_read_features_complete+0x5b/0x340 net/bluetooth/hci_sync.c:7344
 hci_cmd_sync_work+0x1ff/0x430 net/bluetooth/hci_sync.c:334
 process_one_work+0x9ba/0x1b20 kernel/workqueue.c:3257
 process_scheduled_works kernel/workqueue.c:3340 [inline]
 worker_thread+0x6c8/0xf10 kernel/workqueue.c:3421
 kthread+0x3c5/0x780 kernel/kthread.c:463
 ret_from_fork+0x983/0xb10 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
 </TASK>

Allocated by task 5932:
 kasan_save_stack+0x33/0x60 mm/kasan/common.c:56
 kasan_save_track+0x14/0x30 mm/kasan/common.c:77
 poison_kmalloc_redzone mm/kasan/common.c:400 [inline]
 __kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:417
 kmalloc_noprof include/linux/slab.h:957 [inline]
 kzalloc_noprof include/linux/slab.h:1094 [inline]
 __hci_conn_add+0xf8/0x1c70 net/bluetooth/hci_conn.c:963
 hci_conn_add_unset+0x76/0x100 net/bluetooth/hci_conn.c:1084
 le_conn_complete_evt+0x639/0x1f20 net/bluetooth/hci_event.c:5714
 hci_le_enh_conn_complete_evt+0x23d/0x380 net/bluetooth/hci_event.c:5861
 hci_le_meta_evt+0x357/0x5e0 net/bluetooth/hci_event.c:7408
 hci_event_func net/bluetooth/hci_event.c:7716 [inline]
 hci_event_packet+0x685/0x11c0 net/bluetooth/hci_event.c:7773
 hci_rx_work+0x2c9/0xeb0 net/bluetooth/hci_core.c:4076
 process_one_work+0x9ba/0x1b20 kernel/workqueue.c:3257
 process_scheduled_works kernel/workqueue.c:3340 [inline]
 worker_thread+0x6c8/0xf10 kernel/workqueue.c:3421
 kthread+0x3c5/0x780 kernel/kthread.c:463
 ret_from_fork+0x983/0xb10 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246

Freed by task 5932:
 kasan_save_stack+0x33/0x60 mm/kasan/common.c:56
 kasan_save_track+0x14/0x30 mm/kasan/common.c:77
 __kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:587
 kasan_save_free_info mm/kasan/kasan.h:406 [inline]
 poison_slab_object mm/kasan/common.c:252 [inline]
 __kasan_slab_free+0x5f/0x80 mm/kasan/common.c:284
 kasan_slab_free include/linux/kasan.h:234 [inline]
 slab_free_hook mm/slub.c:2540 [inline]
 slab_free mm/slub.c:6663 [inline]
 kfree+0x2f8/0x6e0 mm/slub.c:6871
 device_release+0xa4/0x240 drivers/base/core.c:2565
 kobject_cleanup lib/kobject.c:689 [inline]
 kobject_release lib/kobject.c:720 [inline]
 kref_put include/linux/kref.h:65 [inline]
 kobject_put+0x1e7/0x590 lib/kobject.c:737
 put_device drivers/base/core.c:3797 [inline]
 device_unregister+0x2f/0xc0 drivers/base/core.c:3920
 hci_conn_del_sysfs+0xb4/0x180 net/bluetooth/hci_sysfs.c:79
 hci_conn_cleanup net/bluetooth/hci_conn.c:173 [inline]
 hci_conn_del+0x657/0x1180 net/bluetooth/hci_conn.c:1234
 hci_disconn_complete_evt+0x410/0xa00 net/bluetooth/hci_event.c:3451
 hci_event_func net/bluetooth/hci_event.c:7719 [inline]
 hci_event_packet+0xa10/0x11c0 net/bluetooth/hci_event.c:7773
 hci_rx_work+0x2c9/0xeb0 net/bluetooth/hci_core.c:4076
 process_one_work+0x9ba/0x1b20 kernel/workqueue.c:3257
 process_scheduled_works kernel/workqueue.c:3340 [inline]
 worker_thread+0x6c8/0xf10 kernel/workqueue.c:3421
 kthread+0x3c5/0x780 kernel/kthread.c:463
 ret_from_fork+0x983/0xb10 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246

The buggy address belongs to the object at ffff8880796b0000
 which belongs to the cache kmalloc-8k of size 8192
The buggy address is located 16 bytes inside of
 freed 8192-byte region [ffff8880796b0000, ffff8880796b2000)

The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x796b0
head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
anon flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff)
page_type: f5(slab)
raw: 00fff00000000040 ffff88813ff27280 0000000000000000 0000000000000001
raw: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000
head: 00fff00000000040 ffff88813ff27280 0000000000000000 0000000000000001
head: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000
head: 00fff00000000003 ffffea0001e5ac01 00000000ffffffff 00000000ffffffff
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd2040(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 5657, tgid 5657 (dhcpcd-run-hook), ts 79819636908, free_ts 79814310558
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x1af/0x220 mm/page_alloc.c:1845
 prep_new_page mm/page_alloc.c:1853 [inline]
 get_page_from_freelist+0xd0b/0x31a0 mm/page_alloc.c:3879
 __alloc_frozen_pages_noprof+0x25f/0x2440 mm/page_alloc.c:5183
 alloc_pages_mpol+0x1fb/0x550 mm/mempolicy.c:2416
 alloc_slab_page mm/slub.c:3075 [inline]
 allocate_slab mm/slub.c:3248 [inline]
 new_slab+0x2c3/0x430 mm/slub.c:3302
 ___slab_alloc+0xe18/0x1c90 mm/slub.c:4651
 __slab_alloc.constprop.0+0x63/0x110 mm/slub.c:4774
 __slab_alloc_node mm/slub.c:4850 [inline]
 slab_alloc_node mm/slub.c:5246 [inline]
 __kmalloc_cache_noprof+0x477/0x800 mm/slub.c:5766
 kmalloc_noprof include/linux/slab.h:957 [inline]
 kzalloc_noprof include/linux/slab.h:1094 [inline]
 tomoyo_print_bprm security/tomoyo/audit.c:26 [inline]
 tomoyo_init_log+0xc8a/0x2140 security/tomoyo/audit.c:264
 tomoyo_supervisor+0x302/0x13b0 security/tomoyo/common.c:2198
 tomoyo_audit_env_log security/tomoyo/environ.c:36 [inline]
 tomoyo_env_perm+0x191/0x200 security/tomoyo/environ.c:63
 tomoyo_environ security/tomoyo/domain.c:672 [inline]
 tomoyo_find_next_domain+0xec1/0x20b0 security/tomoyo/domain.c:888
 tomoyo_bprm_check_security security/tomoyo/tomoyo.c:102 [inline]
 tomoyo_bprm_check_security+0x12d/0x1d0 security/tomoyo/tomoyo.c:92
 security_bprm_check+0x1b9/0x1e0 security/security.c:794
 search_binary_handler fs/exec.c:1659 [inline]
 exec_binprm fs/exec.c:1701 [inline]
 bprm_execve fs/exec.c:1753 [inline]
 bprm_execve+0x81e/0x1620 fs/exec.c:1729
 do_execveat_common.isra.0+0x4a5/0x610 fs/exec.c:1859
page last free pid 5657 tgid 5657 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 free_pages_prepare mm/page_alloc.c:1394 [inline]
 __free_frozen_pages+0x7df/0x1160 mm/page_alloc.c:2901
 discard_slab mm/slub.c:3346 [inline]
 __put_partials+0x130/0x170 mm/slub.c:3886
 qlink_free mm/kasan/quarantine.c:163 [inline]
 qlist_free_all+0x4c/0xf0 mm/kasan/quarantine.c:179
 kasan_quarantine_reduce+0x195/0x1e0 mm/kasan/quarantine.c:286
 __kasan_slab_alloc+0x69/0x90 mm/kasan/common.c:352
 kasan_slab_alloc include/linux/kasan.h:252 [inline]
 slab_post_alloc_hook mm/slub.c:4948 [inline]
 slab_alloc_node mm/slub.c:5258 [inline]
 __kmalloc_cache_noprof+0x274/0x800 mm/slub.c:5766
 kmalloc_noprof include/linux/slab.h:957 [inline]
 tomoyo_print_header security/tomoyo/audit.c:156 [inline]
 tomoyo_init_log+0x197/0x2140 security/tomoyo/audit.c:255
 tomoyo_supervisor+0x302/0x13b0 security/tomoyo/common.c:2198
 tomoyo_audit_env_log security/tomoyo/environ.c:36 [inline]
 tomoyo_env_perm+0x191/0x200 security/tomoyo/environ.c:63
 tomoyo_environ security/tomoyo/domain.c:672 [inline]
 tomoyo_find_next_domain+0xec1/0x20b0 security/tomoyo/domain.c:888
 tomoyo_bprm_check_security security/tomoyo/tomoyo.c:102 [inline]
 tomoyo_bprm_check_security+0x12d/0x1d0 security/tomoyo/tomoyo.c:92
 security_bprm_check+0x1b9/0x1e0 security/security.c:794
 search_binary_handler fs/exec.c:1659 [inline]
 exec_binprm fs/exec.c:1701 [inline]
 bprm_execve fs/exec.c:1753 [inline]
 bprm_execve+0x81e/0x1620 fs/exec.c:1729
 do_execveat_common.isra.0+0x4a5/0x610 fs/exec.c:1859
 do_execve fs/exec.c:1933 [inline]
 __do_sys_execve fs/exec.c:2009 [inline]
 __se_sys_execve fs/exec.c:2004 [inline]
 __x64_sys_execve+0x8e/0xb0 fs/exec.c:2004
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94

Memory state around the buggy address:
 ffff8880796aff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff8880796aff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8880796b0000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                         ^
 ffff8880796b0080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880796b0100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

Fixes: a106e50be7 ("Bluetooth: HCI: Add support for LL Extended Feature Set")
Reported-by: syzbot+87badbb9094e008e0685@syzkaller.appspotmail.com
Tested-by: syzbot+87badbb9094e008e0685@syzkaller.appspotmail.com
Closes: https://syzbot.org/bug?extid=87badbb9094e008e0685
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Pauli Virtanen <pav@iki.fi>
2026-04-01 16:45:24 -04:00
Pauli Virtanen
aca377208e Bluetooth: hci_sync: fix leaks when hci_cmd_sync_queue_once fails
When hci_cmd_sync_queue_once() returns with error, the destroy callback
will not be called.

Fix leaking references / memory on these failures.

Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-04-01 16:45:00 -04:00
Pauli Virtanen
2969554bcf Bluetooth: hci_sync: hci_cmd_sync_queue_once() return -EEXIST if exists
hci_cmd_sync_queue_once() needs to indicate whether a queue item was
added, so caller can know if callbacks are called, so it can avoid
leaking resources.

Change the function to return -EEXIST if queue item already exists.

Modify all callsites to handle that.

Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-04-01 16:44:38 -04:00
Oleh Konko
2b2bf47cd7 Bluetooth: hci_event: move wake reason storage into validated event handlers
hci_store_wake_reason() is called from hci_event_packet() immediately
after stripping the HCI event header but before hci_event_func()
enforces the per-event minimum payload length from hci_ev_table.
This means a short HCI event frame can reach bacpy() before any bounds
check runs.

Rather than duplicating skb parsing and per-event length checks inside
hci_store_wake_reason(), move wake-address storage into the individual
event handlers after their existing event-length validation has
succeeded. Convert hci_store_wake_reason() into a small helper that only
stores an already-validated bdaddr while the caller holds hci_dev_lock().
Use the same helper after hci_event_func() with a NULL address to
preserve the existing unexpected-wake fallback semantics when no
validated event handler records a wake address.

Annotate the helper with __must_hold(&hdev->lock) and add
lockdep_assert_held(&hdev->lock) so future call paths keep the lock
contract explicit.

Call the helper from hci_conn_request_evt(), hci_conn_complete_evt(),
hci_sync_conn_complete_evt(), le_conn_complete_evt(),
hci_le_adv_report_evt(), hci_le_ext_adv_report_evt(),
hci_le_direct_adv_report_evt(), hci_le_pa_sync_established_evt(), and
hci_le_past_received_evt().

Fixes: 2f20216c1d ("Bluetooth: Emit controller suspend and resume events")
Cc: stable@vger.kernel.org
Signed-off-by: Oleh Konko <security@1seal.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-04-01 16:44:15 -04:00
Cen Zhang
8a5b0135d4 Bluetooth: SCO: fix race conditions in sco_sock_connect()
sco_sock_connect() checks sk_state and sk_type without holding
the socket lock. Two concurrent connect() syscalls on the same
socket can both pass the check and enter sco_connect(), leading
to use-after-free.

The buggy scenario involves three participants and was confirmed
with additional logging instrumentation:

  Thread A (connect):    HCI disconnect:      Thread B (connect):

  sco_sock_connect(sk)                        sco_sock_connect(sk)
  sk_state==BT_OPEN                           sk_state==BT_OPEN
  (pass, no lock)                             (pass, no lock)
  sco_connect(sk):                            sco_connect(sk):
    hci_dev_lock                                hci_dev_lock
    hci_connect_sco                               <- blocked
      -> hcon1
    sco_conn_add->conn1
    lock_sock(sk)
    sco_chan_add:
      conn1->sk = sk
      sk->conn = conn1
    sk_state=BT_CONNECT
    release_sock
    hci_dev_unlock
                           hci_dev_lock
                           sco_conn_del:
                             lock_sock(sk)
                             sco_chan_del:
                               sk->conn=NULL
                               conn1->sk=NULL
                               sk_state=
                                 BT_CLOSED
                               SOCK_ZAPPED
                             release_sock
                           hci_dev_unlock
                                                  (unblocked)
                                                  hci_connect_sco
                                                    -> hcon2
                                                  sco_conn_add
                                                    -> conn2
                                                  lock_sock(sk)
                                                  sco_chan_add:
                                                    sk->conn=conn2
                                                  sk_state=
                                                    BT_CONNECT
                                                  // zombie sk!
                                                  release_sock
                                                  hci_dev_unlock

Thread B revives a BT_CLOSED + SOCK_ZAPPED socket back to
BT_CONNECT. Subsequent cleanup triggers double sock_put() and
use-after-free. Meanwhile conn1 is leaked as it was orphaned
when sco_conn_del() cleared the association.

Fix this by:
- Moving lock_sock() before the sk_state/sk_type checks in
  sco_sock_connect() to serialize concurrent connect attempts
- Fixing the sk_type != SOCK_SEQPACKET check to actually
  return the error instead of just assigning it
- Adding a state re-check in sco_connect() after lock_sock()
  to catch state changes during the window between the locks
- Adding sco_pi(sk)->conn check in sco_chan_add() to prevent
  double-attach of a socket to multiple connections
- Adding hci_conn_drop() on sco_chan_add failure to prevent
  HCI connection leaks

Fixes: 9a8ec9e8eb ("Bluetooth: SCO: Fix possible circular locking dependency on sco_connect_cfm")
Signed-off-by: Cen Zhang <zzzccc427@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-04-01 16:43:53 -04:00
Pauli Virtanen
a834a0b66e Bluetooth: hci_sync: call destroy in hci_cmd_sync_run if immediate
hci_cmd_sync_run() may run the work immediately if called from existing
sync work (otherwise it queues a new sync work). In this case it fails
to call the destroy() function.

On immediate run, make it behave same way as if item was queued
successfully: call destroy, and return 0.

The only callsite is hci_abort_conn() via hci_cmd_sync_run_once(), and
this changes its return value. However, its return value is not used
except as the return value for hci_disconnect(), and nothing uses the
return value of hci_disconnect(). Hence there should be no behavior
change anywhere.

Fixes: c898f6d7b0 ("Bluetooth: hci_sync: Introduce hci_cmd_sync_run/hci_cmd_sync_run_once")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-04-01 16:39:43 -04:00
Simon Trimmer
e74c38ef6f ASoC: amd: ps: Fix missing leading zeros in subsystem_device SSID log
Ensure that subsystem_device is printed with leading zeros when combined
with subsystem_vendor to form the SSID. Without this, devices with upper
bits unset may appear to have an incorrect SSID in the debug output.

Signed-off-by: Simon Trimmer <simont@opensource.cirrus.com>
Link: https://patch.msgid.link/20260331131916.145546-1-simont@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-04-01 13:08:47 +01:00
Pablo Neira Ayuso
da107398cb netfilter: nf_tables: reject immediate NF_QUEUE verdict
nft_queue is always used from userspace nftables to deliver the NF_QUEUE
verdict. Immediately emitting an NF_QUEUE verdict is never used by the
userspace nft tools, so reject immediate NF_QUEUE verdicts.

The arp family does not provide queue support, but such an immediate
verdict is still reachable. Globally reject NF_QUEUE immediate verdicts
to address this issue.

Fixes: f342de4e2f ("netfilter: nf_tables: reject QUEUE/DROP verdict parameters")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-04-01 11:55:30 +02:00
Pablo Neira Ayuso
3d5d488f11 netfilter: x_tables: restrict xt_check_match/xt_check_target extensions for NFPROTO_ARP
Weiming Shi says:

xt_match and xt_target structs registered with NFPROTO_UNSPEC can be
loaded by any protocol family through nft_compat. When such a
match/target sets .hooks to restrict which hooks it may run on, the
bitmask uses NF_INET_* constants. This is only correct for families
whose hook layout matches NF_INET_*: IPv4, IPv6, INET, and bridge
all share the same five hooks (PRE_ROUTING ... POST_ROUTING).

ARP only has three hooks (IN=0, OUT=1, FORWARD=2) with different
semantics. Because NF_ARP_OUT == 1 == NF_INET_LOCAL_IN, the .hooks
validation silently passes for the wrong reasons, allowing matches to
run on ARP chains where the hook assumptions (e.g. state->in being
set on input hooks) do not hold. This leads to NULL pointer
dereferences; xt_devgroup is one concrete example:

 Oops: general protection fault, probably for non-canonical address 0xdffffc0000000044: 0000 [#1] SMP KASAN NOPTI
 KASAN: null-ptr-deref in range [0x0000000000000220-0x0000000000000227]
 RIP: 0010:devgroup_mt+0xff/0x350
 Call Trace:
  <TASK>
  nft_match_eval (net/netfilter/nft_compat.c:407)
  nft_do_chain (net/netfilter/nf_tables_core.c:285)
  nft_do_chain_arp (net/netfilter/nft_chain_filter.c:61)
  nf_hook_slow (net/netfilter/core.c:623)
  arp_xmit (net/ipv4/arp.c:666)
  </TASK>
 Kernel panic - not syncing: Fatal exception in interrupt

Fix it by restricting arptables to NFPROTO_ARP extensions only.
Note that arptables-legacy only supports:

- arpt_CLASSIFY
- arpt_mangle
- arpt_MARK

that provide explicit NFPROTO_ARP match/target declarations.

Fixes: 9291747f11 ("netfilter: xtables: add device group match")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-04-01 11:55:29 +02:00
Yifan Wu
9862ef9ab0 netfilter: ipset: drop logically empty buckets in mtype_del
mtype_del() counts empty slots below n->pos in k, but it only drops the
bucket when both n->pos and k are zero. This misses buckets whose live
entries have all been removed while n->pos still points past deleted slots.

Treat a bucket as empty when all positions below n->pos are unused and
release it directly instead of shrinking it further.

Fixes: 8af1c6fbd9 ("netfilter: ipset: Fix forceadd evaluation path")
Cc: stable@vger.kernel.org
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <dstsmallbird@foxmail.com>
Signed-off-by: Yifan Wu <yifanwucs@gmail.com>
Co-developed-by: Yuan Tan <yuantan098@gmail.com>
Signed-off-by: Yuan Tan <yuantan098@gmail.com>
Reviewed-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-04-01 11:55:29 +02:00
Pablo Neira Ayuso
917b61fa20 netfilter: ctnetlink: ignore explicit helper on new expectations
Use the existing master conntrack helper, anything else is not really
supported and it just makes validation more complicated, so just ignore
what helper userspace suggests for this expectation.

This was uncovered when validating CTA_EXPECT_CLASS via different helper
provided by userspace than the existing master conntrack helper:

  BUG: KASAN: slab-out-of-bounds in nf_ct_expect_related_report+0x2479/0x27c0
  Read of size 4 at addr ffff8880043fe408 by task poc/102
  Call Trace:
   nf_ct_expect_related_report+0x2479/0x27c0
   ctnetlink_create_expect+0x22b/0x3b0
   ctnetlink_new_expect+0x4bd/0x5c0
   nfnetlink_rcv_msg+0x67a/0x950
   netlink_rcv_skb+0x120/0x350

Allowing to read kernel memory bytes off the expectation boundary.

CTA_EXPECT_HELP_NAME is still used to offer the helper name to userspace
via netlink dump.

Fixes: bd07793705 ("netfilter: nfnetlink_queue: allow to attach expectations to conntracks")
Reported-by: Qi Tang <tpluszz77@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-04-01 11:55:29 +02:00
Qi Tang
35177c6877 netfilter: ctnetlink: zero expect NAT fields when CTA_EXPECT_NAT absent
ctnetlink_alloc_expect() allocates expectations from a non-zeroing
slab cache via nf_ct_expect_alloc().  When CTA_EXPECT_NAT is not
present in the netlink message, saved_addr and saved_proto are
never initialized.  Stale data from a previous slab occupant can
then be dumped to userspace by ctnetlink_exp_dump_expect(), which
checks these fields to decide whether to emit CTA_EXPECT_NAT.

The safe sibling nf_ct_expect_init(), used by the packet path,
explicitly zeroes these fields.

Zero saved_addr, saved_proto and dir in the else branch, guarded
by IS_ENABLED(CONFIG_NF_NAT) since these fields only exist when
NAT is enabled.

Confirmed by priming the expect slab with NAT-bearing expectations,
freeing them, creating a new expectation without CTA_EXPECT_NAT,
and observing that the ctnetlink dump emits a spurious
CTA_EXPECT_NAT containing stale data from the prior allocation.

Fixes: 076a0ca026 ("netfilter: ctnetlink: add NAT support for expectations")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Qi Tang <tpluszz77@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-04-01 11:55:29 +02:00
Qi Tang
a242a9ae58 netfilter: nf_conntrack_helper: pass helper to expect cleanup
nf_conntrack_helper_unregister() calls nf_ct_expect_iterate_destroy()
to remove expectations belonging to the helper being unregistered.
However, it passes NULL instead of the helper pointer as the data
argument, so expect_iter_me() never matches any expectation and all
of them survive the cleanup.

After unregister returns, nfnl_cthelper_del() frees the helper
object immediately.  Subsequent expectation dumps or packet-driven
init_conntrack() calls then dereference the freed exp->helper,
causing a use-after-free.

Pass the actual helper pointer so expectations referencing it are
properly destroyed before the helper object is freed.

  BUG: KASAN: slab-use-after-free in string+0x38f/0x430
  Read of size 1 at addr ffff888003b14d20 by task poc/103
  Call Trace:
   string+0x38f/0x430
   vsnprintf+0x3cc/0x1170
   seq_printf+0x17a/0x240
   exp_seq_show+0x2e5/0x560
   seq_read_iter+0x419/0x1280
   proc_reg_read+0x1ac/0x270
   vfs_read+0x179/0x930
   ksys_read+0xef/0x1c0
  Freed by task 103:
  The buggy address is located 32 bytes inside of
   freed 192-byte region [ffff888003b14d00, ffff888003b14dc0)

Fixes: ac7b848390 ("netfilter: expect: add and use nf_ct_expect_iterate helpers")
Signed-off-by: Qi Tang <tpluszz77@gmail.com>
Reviewed-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-04-01 11:55:29 +02:00
Florian Westphal
b7e8590987 netfilter: ipset: use nla_strcmp for IPSET_ATTR_NAME attr
IPSET_ATTR_NAME and IPSET_ATTR_NAMEREF are of NLA_STRING type, they
cannot be treated like a c-string.

They either have to be switched to NLA_NUL_STRING, or the compare
operations need to use the nla functions.

Fixes: f830837f0e ("netfilter: ipset: list:set set type support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-04-01 11:55:29 +02:00
Florian Westphal
a958a4f90d netfilter: x_tables: ensure names are nul-terminated
Reject names that lack a \0 character before feeding them
to functions that expect c-strings.

Fixes tag is the most recent commit that needs this change.

Fixes: c38c4597e4 ("netfilter: implement xt_cgroup cgroup2 path match")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-04-01 11:55:29 +02:00
Florian Westphal
6d52a4a052 netfilter: nfnetlink_log: account for netlink header size
This is a followup to an old bug fix: NLMSG_DONE needs to account
for the netlink header size, not just the attribute size.

This can result in a WARN splat + drop of the netlink message,
but other than this there are no ill effects.

Fixes: 9dfa1dfe4d ("netfilter: nf_log: account for size of NLMSG_DONE attribute")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-04-01 11:55:29 +02:00
Pablo Neira Ayuso
76522fcdbc netfilter: flowtable: strictly check for maximum number of actions
The maximum number of flowtable hardware offload actions in IPv6 is:

* ethernet mangling (4 payload actions, 2 for each ethernet address)
* SNAT (4 payload actions)
* DNAT (4 payload actions)
* Double VLAN (4 vlan actions, 2 for popping vlan, and 2 for pushing)
  for QinQ.
* Redirect (1 action)

Which makes 17, while the maximum is 16. But act_ct supports for tunnels
actions too. Note that payload action operates at 32-bit word level, so
mangling an IPv6 address takes 4 payload actions.

Update flow_action_entry_next() calls to check for the maximum number of
supported actions.

While at it, rise the maximum number of actions per flow from 16 to 24
so this works fine with IPv6 setups.

Fixes: c29f74e0df ("netfilter: nf_flow_table: hardware offload support")
Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-04-01 11:50:14 +02:00
Geoffrey D. Bennett
a0dafdbd10 ALSA: usb-audio: Exclude Scarlett 2i2 1st Gen (8016) from SKIP_IFACE_SETUP
Same issue as the other 1st Gen Scarletts: QUIRK_FLAG_SKIP_IFACE_SETUP
causes distorted audio on this revision of the Scarlett 2i2 1st Gen
(1235:8016).

Fixes: 38c322068a ("ALSA: usb-audio: Add QUIRK_FLAG_SKIP_IFACE_SETUP")
Reported-by: lukas-reineke [https://github.com/geoffreybennett/linux-fcp/issues/54]
Signed-off-by: Geoffrey D. Bennett <g@b4.vu>
Link: https://patch.msgid.link/acytr8aEUba4VXmZ@m.b4.vu
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-04-01 09:43:21 +02:00
Linus Torvalds
9147566d80 Merge tag 'sched_ext-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:

 - Fix SCX_KICK_WAIT deadlock where multiple CPUs waiting for each other
   in hardirq context form a cycle. Move the wait to a balance callback
   which can drop the rq lock and process IPIs.

 - Fix inconsistent NUMA node lookup in scx_select_cpu_dfl() where
   the waker_node used cpu_to_node() while prev_cpu used
   scx_cpu_node_if_enabled(), leading to undefined behavior when
   per-node idle tracking is disabled.

* tag 'sched_ext-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  selftests/sched_ext: Add cyclic SCX_KICK_WAIT stress test
  sched_ext: Fix SCX_KICK_WAIT deadlock by deferring wait to balance callback
  sched_ext: Fix inconsistent NUMA node lookup in scx_select_cpu_dfl()
2026-03-31 14:23:12 -07:00
Linus Torvalds
0958d657b4 Merge tag 'wq-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:

 - Fix false positive stall reports on weakly ordered architectures
   where the lockless worklist/timestamp check in the watchdog can
   observe stale values due to memory reordering.

   Recheck under pool->lock to confirm.

* tag 'wq-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Better describe stall check
  workqueue: Fix false positive stall reports
2026-03-31 14:20:39 -07:00
Linus Torvalds
53d85a2056 Merge tag 'cgroup-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:

 - Fix cgroup rmdir racing with dying tasks.

   Deferred task cgroup unlink introduced a window where cgroup.procs
   is empty but the cgroup is still populated, causing rmdir to fail
   with -EBUSY and selftest failures.

   Make rmdir wait for dying tasks to fully leave and fix selftests to
   not depend on synchronous populated updates.

 - Fix cpuset v1 task migration failure from empty cpusets under strict
   security policies.

   When CPU hotplug removes the last CPU from a v1 cpuset, tasks must be
   migrated to an ancestor without a security_task_setscheduler() check
   that would block the migration.

* tag 'cgroup-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup/cpuset: Skip security check for hotplug induced v1 task migration
  cgroup/cpuset: Simplify setsched decision check in task iteration loop of cpuset_can_attach()
  cgroup: Fix cgroup_drain_dying() testing the wrong condition
  selftests/cgroup: Don't require synchronous populated update on task exit
  cgroup: Wait for dying tasks to leave on rmdir
2026-03-31 13:59:51 -07:00
Waiman Long
089f3fcd69 cgroup/cpuset: Skip security check for hotplug induced v1 task migration
When a CPU hot removal causes a v1 cpuset to lose all its CPUs, the
cpuset hotplug handler will schedule a work function to migrate tasks
in that cpuset with no CPU to its ancestor to enable those tasks to
continue running.

If a strict security policy is in place, however, the task migration
may fail when security_task_setscheduler() call in cpuset_can_attach()
returns a -EACCES error. That will mean that those tasks will have
no CPU to run on. The system administrators will have to explicitly
intervene to either add CPUs to that cpuset or move the tasks elsewhere
if they are aware of it.

This problem was found by a reported test failure in the LTP's
cpuset_hotplug_test.sh. Fix this problem by treating this special case as
an exception to skip the setsched security check in cpuset_can_attach()
when a v1 cpuset with tasks have no CPU left.

With that patch applied, the cpuset_hotplug_test.sh test can be run
successfully without failure.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-31 09:14:13 -10:00
Waiman Long
bbe5ab8191 cgroup/cpuset: Simplify setsched decision check in task iteration loop of cpuset_can_attach()
Centralize the check required to run security_task_setscheduler() in
the task iteration loop of cpuset_can_attach() outside of the loop as
it has no dependency on the characteristics of the tasks themselves.

There is no functional change.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-31 09:14:13 -10:00
Linus Torvalds
dbf00d8d23 Merge tag 'fs_for_v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull udf fix from Jan Kara:
 "Fix for a race in UDF that can lead to memory corruption"

* tag 'fs_for_v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Fix race between file type conversion and writeback
  mpage: Provide variant of mpage_writepages() with own optional folio handler
2026-03-31 10:28:08 -07:00
Zhang Heng
dd9b99b822 ALSA: hda/realtek: add quirk for Acer Swift SFG14-73
fix mute/micmute LEDs and headset microphone for Acer Swift SFG14-73.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=220279
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/20260331094614.186063-1-zhangheng@kylinos.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-31 16:21:15 +02:00
Alexander Savenko
217d5bc9f9 ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14IMH9
The Lenovo Yoga Pro 7 14IMH9 (DMI: 83E2) shares PCI SSID 17aa:3847
with the Legion 7 16ACHG6, but has a different codec subsystem ID
(17aa:38cf). The existing SND_PCI_QUIRK for 17aa:3847 applies
ALC287_FIXUP_LEGION_16ACHG6, which attempts to initialize an external
I2C amplifier (CLSA0100) that is not present on the Yoga Pro 7 14IMH9.

As a result, pin 0x17 (bass speakers) is connected to DAC 0x06 which
has no volume control, making hardware volume adjustment completely
non-functional. Audio is either silent or at maximum volume regardless
of the slider position.

Add a HDA_CODEC_QUIRK entry using the codec subsystem ID (17aa:38cf)
to correctly identify the Yoga Pro 7 14IMH9 and apply
ALC287_FIXUP_YOGA9_14IMH9_BASS_SPK_PIN, which redirects pin 0x17 to
DAC 0x02 and restores proper volume control. The existing Legion entry
is preserved unchanged.

This follows the same pattern used for 17aa:386e, where Legion Y9000X
and Yoga Pro 7 14ARP8 share a PCI SSID but are distinguished via
HDA_CODEC_QUIRK.

Link: https://github.com/nomad4tech/lenovo-yoga-pro-7-linux
Tested-by: Alexander Savenko <alex.sav4387@gmail.com>
Signed-off-by: Alexander Savenko <alex.sav4387@gmail.com>
Link: https://patch.msgid.link/20260331082929.44890-1-alex.sav4387@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-31 16:21:00 +02:00
Julian Braha
e920c36f20 ASoC: Intel: boards: fix unmet dependency on PINCTRL
This reverts commit c073f07576 ("ASoC: Intel: sof_sdw: select PINCTRL_CS42L43 and SPI_CS42L43")

Currently, SND_SOC_INTEL_SOUNDWIRE_SOF_MACH selects PINCTRL_CS42L43
without also selecting or depending on PINCTRL, despite PINCTRL_CS42L43
depending on PINCTRL.

See the following Kbuild warning:

WARNING: unmet direct dependencies detected for PINCTRL_CS42L43
  Depends on [n]: PINCTRL [=n] && MFD_CS42L43 [=m]
  Selected by [m]:
  - SND_SOC_INTEL_SOUNDWIRE_SOF_MACH [=m] && SOUND [=y] && SND [=m] && SND_SOC [=m] && SND_SOC_INTEL_MACH [=y] && (SND_SOC_SOF_INTEL_COMMON [=m] || !SND_SOC_SOF_INTEL_COMMON [=m]) && SND_SOC_SOF_INTEL_SOUNDWIRE [=m] && I2C [=y] && SPI_MASTER [=y] && ACPI [=y] && (MFD_INTEL_LPSS [=n] || COMPILE_TEST [=y]) && (SND_SOC_INTEL_USER_FRIENDLY_LONG_NAMES [=n] || COMPILE_TEST [=y]) && SOUNDWIRE [=m]

In response to v1 of this patch [1], Arnd pointed out that there is
no compile-time dependency sof_sdw and the PINCTRL_CS42L43 driver.
After testing, I can confirm that the kernel compiled with
SND_SOC_INTEL_SOUNDWIRE_SOF_MACH enabled and PINCTRL_CS42L43 disabled.

This unmet dependency was detected by kconfirm, a static analysis
tool for Kconfig.

Link: https://lore.kernel.org/all/b8aecc71-1fed-4f52-9f6c-263fbe56d493@app.fastmail.com/ [1]
Fixes: c073f07576 ("ASoC: Intel: sof_sdw: select PINCTRL_CS42L43 and SPI_CS42L43")
Signed-off-by: Julian Braha <julianbraha@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20260325001522.1727678-1-julianbraha@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-31 14:05:17 +01:00
Sachin Mokashi
51e3eb3d07 ASoC: Intel: ehl_rt5660: Use the correct rtd->dev device in hw_params
In rt5660_hw_params(), the error path for snd_soc_dai_set_sysclk()
correctly uses rtd->dev as the logging device, but the error path for
snd_soc_dai_set_pll() uses codec_dai->dev instead.

These two devices are distinct:
- rtd->dev is the platform device of the PCM runtime (the Intel HDA/SSP
  controller, e.g. 0000:00:1f.3), which owns the machine driver callback.
- codec_dai->dev is the I2C device of the rt5660 codec itself
  (i2c-10EC5660:00).

Since hw_params is a machine driver operation and both calls are made
within the same function from the machine driver's context, all error
messages should be attributed to rtd->dev. Using codec_dai->dev for one
of them would suggest the error originates inside the codec driver,
which is misleading.

Align the PLL error log with the sysclk one to use rtd->dev, matching
the convention used by all other Intel board drivers in this directory.

Signed-off-by: Sachin Mokashi <sachin.mokashi@intel.com>
Link: https://patch.msgid.link/20260327131439.1330373-1-sachin.mokashi@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-31 12:37:13 +01:00
Takashi Iwai
75dc1980cf ALSA: ctxfi: Don't enumerate SPDIF1 at DAIO initialization
The recent refactoring of xfi driver changed the assignment of
atc->daios[] at atc_get_resources(); now it loops over all enum
DAIOTYP entries while it looped formerly only a part of them.
The problem is that the last entry, SPDIF1, is a special type that
is used only for hw20k1 CTSB073X model (as a replacement of SPDIFIO),
and there is no corresponding definition for hw20k2.  Due to the lack
of the info, it caused a kernel crash on hw20k2, which was already
worked around by the commit b045ab3dff ("ALSA: ctxfi: Fix missing
SPDIFI1 index handling").

This patch addresses the root cause of the regression above properly,
simply by skipping the incorrect SPDIF1 type in the parser loop.

For making the change clearer, the code is slightly arranged, too.

Fixes: a2dbaeb5c6 ("ALSA: ctxfi: Refactor resource alloc for sparse mappings")
Cc: <stable@vger.kernel.org>
Link: https://bugzilla.suse.com/show_bug.cgi?id=1259925
Link: https://patch.msgid.link/20260331081227.216134-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-31 10:13:10 +02:00
songxiebing
e6c8882022 ALSA: hda/realtek: Add quirk for Lenovo Yoga Slim 7 14AKP10
The Pin Complex 0x17 (bass/woofer speakers) is incorrectly reported as
unconnected in the BIOS (pin default 0x411111f0 = N/A). This causes the
kernel to configure speaker_outs=0, meaning only the tweeters (pin 0x14)
are used. The result is very low, tinny audio with no bass.

The existing quirk ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN (already present
in patch_realtek.c for SSID 0x17aa3801) fixes the issue completely.

Reported-by: Garcicasti <andresgarciacastilla@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=221298
Signed-off-by: songxiebing <songxiebing@kylinos.cn>
Link: https://patch.msgid.link/20260331033650.285601-1-songxiebing@kylinos.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-31 08:49:58 +02:00
Zhang Heng
7204607223 ALSA: hda/realtek: add quirk for HP Laptop 15-fc0xxx
For the HP Laptop 15-fc0xxx with ALC236, the built-in mic 0x12 was
not set up, making it unusable; after adding it, it now works properly.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=221233
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/20260331013536.13778-1-zhangheng@kylinos.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-31 08:49:49 +02:00
Linus Torvalds
d0c3bcd5b8 Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fix from Eric Biggers:
 "Fix missing zeroization of the ChaCha state"

* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crypto: chacha: Zeroize permuted_state before it leaves scope
2026-03-30 13:40:48 -07:00
Linus Torvalds
f1b24d8bdd Merge tag 'trace-rtla-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull rtla build fix from Steven Rostedt:

 - Fix build failure when libbpf does not exist

   RTLA supports building without BPF libraries, but a recent change
   added a libbpf.h include outside of the BPF protection which caused
   build failures when libbpf was not installed.

* tag 'trace-rtla-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  rtla: Fix build without libbpf header
2026-03-30 13:12:00 -07:00
Jihed Chaibi
622363757b ASoC: ep93xx: Fix unchecked clk_prepare_enable() and add rollback on failure
ep93xx_i2s_enable() calls clk_prepare_enable() on three clocks in
sequence (mclk, sclk, lrclk) without checking the return value of any
of them. If an intermediate enable fails, the clocks that were already
enabled are never rolled back, leaking them until the next disable cycle
— which may never come if the stream never started cleanly.

Change ep93xx_i2s_enable() from void to int. Add error checking after
each clk_prepare_enable() call and unwind already-enabled clocks on
failure. Propagate the error through ep93xx_i2s_startup() and
ep93xx_i2s_resume(), both of which already return int.

Signed-off-by: Jihed Chaibi <jihed.chaibi.dev@gmail.com>
Fixes: f4ff6b56bc ("ASoC: cirrus: i2s: Prepare clock before using it")
Link: https://patch.msgid.link/20260324210909.45494-1-jihed.chaibi.dev@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-30 19:38:22 +01:00
Tejun Heo
090d34f0f0 selftests/sched_ext: Add cyclic SCX_KICK_WAIT stress test
Add a test that creates a 3-CPU kick_wait cycle (A->B->C->A). A BPF
scheduler kicks the next CPU in the ring with SCX_KICK_WAIT on every
enqueue while userspace workers generate continuous scheduling churn via
sched_yield(). Without the preceding fix, this hangs the machine within seconds.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
2026-03-30 08:37:55 -10:00
Tejun Heo
415cb193bb sched_ext: Fix SCX_KICK_WAIT deadlock by deferring wait to balance callback
SCX_KICK_WAIT busy-waits in kick_cpus_irq_workfn() using
smp_cond_load_acquire() until the target CPU's kick_sync advances. Because
the irq_work runs in hardirq context, the waiting CPU cannot reschedule and
its own kick_sync never advances. If multiple CPUs form a wait cycle, all
CPUs deadlock.

Replace the busy-wait in kick_cpus_irq_workfn() with resched_curr() to
force the CPU through do_pick_task_scx(), which queues a balance callback
to perform the wait. The balance callback drops the rq lock and enables
IRQs following the sched_core_balance() pattern, so the CPU can process
IPIs while waiting. The local CPU's kick_sync is advanced on entry to
do_pick_task_scx() and continuously during the wait, ensuring any CPU that
starts waiting for us sees the advancement and cannot form cyclic
dependencies.

Fixes: 90e55164da ("sched_ext: Implement SCX_KICK_WAIT")
Cc: stable@vger.kernel.org # v6.12+
Reported-by: Christian Loehle <christian.loehle@arm.com>
Link: https://lore.kernel.org/r/20260316100249.1651641-1-christian.loehle@arm.com
Signed-off-by: Tejun Heo <tj@kernel.org>
Tested-by: Christian Loehle <christian.loehle@arm.com>
2026-03-30 08:37:27 -10:00
Kuninori Morimoto
b9eff9732c ASoC: soc-core: call missing INIT_LIST_HEAD() for card_aux_list
Component has "card_aux_list" which is added/deled in bind/unbind aux dev
function (A), and used in for_each_card_auxs() loop (B).

	static void soc_unbind_aux_dev(...)
	{
		...
		for_each_card_auxs_safe(...) {
			...
(A)			list_del(&component->card_aux_list);
		}			     ^^^^^^^^^^^^^
	}

	static int soc_bind_aux_dev(...)
	{
		...
		for_each_card_pre_auxs(...) {
			...
(A)			list_add(&component->card_aux_list, ...);
		}			     ^^^^^^^^^^^^^
		...
	}

	#define for_each_card_auxs(card, component)	\
(B)		list_for_each_entry(component, ..., card_aux_list)
						    ^^^^^^^^^^^^^

But it has been used without calling INIT_LIST_HEAD().

	> git grep card_aux_list sound/soc
	sound/soc/soc-core.c:           list_del(&component->card_aux_list);
	sound/soc/soc-core.c:           list_add(&component->card_aux_list, ...);

call missing INIT_LIST_HEAD() for it.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/87341mxa8l.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-30 18:24:56 +01:00
Tomas Glozar
2e8b1a1d12 rtla: Fix build without libbpf header
rtla supports building without libbpf. However, BPF actions
patchset [1] adds an include of bpf/libbpf.h into timerlat_bpf.h,
which breaks build on systems that don't have libbpf headers
installed.

This is a leftover from a draft version of the patchset where
timerlat_bpf_set_action() (which takes a struct bpf_program * argument)
was defined in the header. timerlat_bpf.c already includes bpf/libbpf.h
via timerlat.skel.h when libbpf is present.

Remove the redundant include to fix build on systems without libbpf
headers.

[1] https://lore.kernel.org/linux-trace-kernel/20251126144205.331954-1-tglozar@redhat.com/T/

Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Crystal Wood <crwood@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Link: https://patch.msgid.link/20260330091207.16184-1-tglozar@redhat.com
Reported-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Closes: https://lore.kernel.org/linux-trace-kernel/20260329122202.65a8b575@robin/
Fixes: 8cd0f08ac7 ("rtla/timerlat: Support tail call from BPF program")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-30 12:44:48 -04:00
Takashi Iwai
ea31be8a2c ALSA: hda/realtek: Add quirk for Samsung Book2 Pro 360 (NP950QED)
There is another Book2 Pro model (NP950QED) that seems equipped with
the same speaker module as the non-360 model, which requires
ALC298_FIXUP_SAMSUNG_AMP_V2_2_AMPS quirk.

Reported-by: Throw <zakkabj@gmail.com>
Link: https://patch.msgid.link/20260330162249.147665-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-30 18:24:16 +02:00
Gilson Marquato Júnior
8ec017cf31 ASoC: amd: yc: Add DMI entry for HP Laptop 15-fc0xxx
The HP Laptop 15-fc0xxx (subsystem ID 0x103c8dc9) has an internal
DMIC connected to the AMD ACP6x audio coprocessor. Add a DMI quirk
entry so the internal microphone is properly detected on this model.

Tested on HP Laptop 15-fc0237ns with Fedora 43 (kernel 6.19.9).

Signed-off-by: Gilson Marquato Júnior <gilsonmandalogo@hotmail.com>
Link: https://patch.msgid.link/20260330-hp-15-fc0xxx-dmic-v2-v1-1-6dd6f53a1917@hotmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-30 15:16:28 +01:00
Zhang Heng
27c2996984 ASoC: amd: yc: Add DMI quirk for ASUS Vivobook Pro 16X OLED M7601RM
Add a DMI quirk for the ASUS Vivobook Pro 16X OLED M7601RM fixing the
issue where the internal microphone was not detected.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=221288
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/20260330095133.81786-1-zhangheng@kylinos.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-30 14:46:09 +01:00
Zhang Heng
f1af71d568 ALSA: hda/realtek: Add quirk for ASUS ROG Strix SCAR 15
ASUS ROG Strix SCAR 15, like the Strix G15, requires the
ALC285_FIXUP_ASUS_G533Z_PINS quirk to work properly.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=221247
Cc: <stable@vger.kernel.org>
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/20260330075334.50962-2-zhangheng@kylinos.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-30 13:45:14 +02:00
Dag Smedberg
f025ac8c69 ALSA: usb-audio: Exclude Scarlett Solo 1st Gen from SKIP_IFACE_SETUP
Same issue that the Scarlett 2i2 1st Gen had:
QUIRK_FLAG_SKIP_IFACE_SETUP causes distorted audio on the
Scarlett Solo 1st Gen (1235:801c).

Fixes: 38c322068a ("ALSA: usb-audio: Add QUIRK_FLAG_SKIP_IFACE_SETUP")
Reported-by: Dag Smedberg <dag@dsmedberg.se>
Tested-by: Dag Smedberg <dag@dsmedberg.se>
Signed-off-by: Dag Smedberg <dag@dsmedberg.se>
Link: https://patch.msgid.link/20260329170420.4122-1-dag@dsmedberg.se
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-30 09:33:29 +02:00
Berk Cem Goksel
45424e871a ALSA: caiaq: fix stack out-of-bounds read in init_card
The loop creates a whitespace-stripped copy of the card shortname
where `len < sizeof(card->id)` is used for the bounds check. Since
sizeof(card->id) is 16 and the local id buffer is also 16 bytes,
writing 16 non-space characters fills the entire buffer,
overwriting the terminating nullbyte.

When this non-null-terminated string is later passed to
snd_card_set_id() -> copy_valid_id_string(), the function scans
forward with `while (*nid && ...)` and reads past the end of the
stack buffer, reading the contents of the stack.

A USB device with a product name containing many non-ASCII, non-space
characters (e.g. multibyte UTF-8) will reliably trigger this as follows:

  BUG: KASAN: stack-out-of-bounds in copy_valid_id_string
       sound/core/init.c:696 [inline]
  BUG: KASAN: stack-out-of-bounds in snd_card_set_id_no_lock+0x698/0x74c
       sound/core/init.c:718

The off-by-one has been present since commit bafeee5b1f ("ALSA:
snd_usb_caiaq: give better shortname") from June 2009 (v2.6.31-rc1),
which first introduced this whitespace-stripping loop. The original
code never accounted for the null terminator when bounding the copy.

Fix this by changing the loop bound to `sizeof(card->id) - 1`,
ensuring at least one byte remains as the null terminator.

Fixes: bafeee5b1f ("ALSA: snd_usb_caiaq: give better shortname")
Cc: stable@vger.kernel.org
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Reported-by: Berk Cem Goksel <berkcgoksel@gmail.com>
Signed-off-by: Berk Cem Goksel <berkcgoksel@gmail.com>
Link: https://patch.msgid.link/20260329133825.581585-1-berkcgoksel@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-30 09:31:35 +02:00
Linus Torvalds
7aaa8047ea Linux 7.0-rc6 2026-03-29 15:40:00 -07:00
Linus Torvalds
d1384f70b2 Merge tag 'vfs-7.0-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:

 - Fix netfs_limit_iter() hitting BUG() when an ITER_KVEC iterator
   reaches it via core dump writes to 9P filesystems. Add ITER_KVEC
   handling following the same pattern as the existing ITER_BVEC code.

 - Fix a NULL pointer dereference in the netfs unbuffered write retry
   path when the filesystem (e.g., 9P) doesn't set the prepare_write
   operation.

 - Clear I_DIRTY_TIME in sync_lazytime for filesystems implementing
  ->sync_lazytime. Without this the flag stays set and may cause
   additional unnecessary calls during inode deactivation.

 - Increase tmpfs size in mount_setattr selftests. A recent commit
   bumped the ext4 image size to 2 GB but didn't adjust the tmpfs
   backing store, so mkfs.ext4 fails with ENOSPC writing metadata.

 - Fix an invalid folio access in iomap when i_blkbits matches the folio
   size but differs from the I/O granularity. The cur_folio pointer
   would not get invalidated and iomap_read_end() would still be called
   on it despite the IO helper owning it.

 - Fix hash_name() docstring.

 - Fix read abandonment during netfs retry where the subreq variable
   used for abandonment could be uninitialized on the first pass or
   point to a deleted subrequest on later passes.

 - Don't block sync for filesystems with no data integrity guarantees.
   Add a SB_I_NO_DATA_INTEGRITY superblock flag replacing the per-inode
   AS_NO_DATA_INTEGRITY mapping flag so sync kicks off writeback but
   doesn't wait for flusher threads. This fixes a suspend-to-RAM hang on
   fuse-overlayfs where the flusher thread blocks when the fuse daemon
   is frozen.

 - Fix a lockdep splat in iomap when reads fail. iomap_read_end_io()
   invokes fserror_report() which calls igrab() taking i_lock in hardirq
   context while i_lock is normally held with interrupts enabled. Kick
   failed read handling to a workqueue.

 - Remove the redundant netfs_io_stream::front member and use
   stream->subrequests.next instead, fixing a potential issue in the
   direct write code path.

* tag 'vfs-7.0-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  netfs: Fix the handling of stream->front by removing it
  iomap: fix lockdep complaint when reads fail
  writeback: don't block sync for filesystems with no data integrity guarantees
  netfs: Fix read abandonment during retry
  vfs: fix docstring of hash_name()
  iomap: fix invalid folio access when i_blkbits differs from I/O granularity
  selftests/mount_setattr: increase tmpfs size for idmapped mount tests
  fs: clear I_DIRTY_TIME in sync_lazytime
  netfs: Fix NULL pointer dereference in netfs_unbuffered_write() on retry
  netfs: Fix kernel BUG in netfs_limit_iter() for ITER_KVEC iterators
2026-03-29 15:24:28 -07:00
Linus Torvalds
fc9eae25ec Merge tag 'phy-fixes-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy fixes from Vinod Koul:

 - Qualcomm PCS table fix for ufs phy

 - TI device node reference fix

 - Common prop kconfig fix

 - lynx CDR lock workaround for lanes disabled

 - usb disconnect function fix of k1 driver

* tag 'phy-fixes-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
  phy: qcom: qmp-ufs: Fix SM8650 PCS table for Gear 4
  phy: ti: j721e-wiz: Fix device node reference leak in wiz_get_lane_phy_types()
  phy: k1-usb: add disconnect function support
  phy: lynx-28g: skip CDR lock workaround for lanes disabled in the device tree
  phy: make PHY_COMMON_PROPS Kconfig symbol conditionally user-selectable
2026-03-29 12:48:52 -07:00
Linus Torvalds
a516c618a6 Merge tag 'dmaengine-fix-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine fixes from Vinod Koul:
 "A bunch of driver fixes with idxd ones being the biggest:

   - Xilinx regmap init error handling, dma_device directions, residue
     calculation, and reset related timeout fixes

   - Renesas CHCTRL updates and driver list fixes

   - DW HDMA cycle bits and MSI data programming fix

   - IDXD pile of fixes for memeory leak and FLR fixes"

* tag 'dmaengine-fix-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (21 commits)
  dmaengine: xilinx_dma: Fix reset related timeout with two-channel AXIDMA
  dmaengine: xilinx: xilinx_dma: Fix unmasked residue subtraction
  dmaengine: xilinx: xilinx_dma: Fix residue calculation for cyclic DMA
  dmaengine: xilinx: xilinx_dma: Fix dma_device directions
  dmaengine: sh: rz-dmac: Move CHCTRL updates under spinlock
  dmaengine: sh: rz-dmac: Protect the driver specific lists
  dmaengine: idxd: fix possible wrong descriptor completion in llist_abort_desc()
  dmaengine: xilinx: xdma: Fix regmap init error handling
  dmaengine: dw-edma: Fix multiple times setting of the CYCLE_STATE and CYCLE_BIT bits for HDMA.
  dmaengine: idxd: Fix leaking event log memory
  dmaengine: idxd: Fix freeing the allocated ida too late
  dmaengine: idxd: Fix memory leak when a wq is reset
  dmaengine: idxd: Fix not releasing workqueue on .release()
  dmaengine: idxd: Wait for submitted operations on .device_synchronize()
  dmaengine: idxd: Flush all pending descriptors
  dmaengine: idxd: Flush kernel workqueues on Function Level Reset
  dmaengine: idxd: Fix possible invalid memory access after FLR
  dmaengine: idxd: Fix crash when the event log is disabled
  dmaengine: idxd: Fix lockdep warnings when calling idxd_device_config()
  dmaengine: dw-edma: fix MSI data programming for multi-IRQ case
  ...
2026-03-29 12:42:31 -07:00
Linus Torvalds
32ee88daf7 Merge tag 'i2c-for-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:

 - designware: fix resume-probe race causing NULL-deref in amdisp

 - imx: fix timeout on repeated reads and extra clock at end

 - MAINTAINERS: drop outdated I2C website

* tag 'i2c-for-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  MAINTAINERS: drop outdated I2C website
  i2c: designware: amdisp: Fix resume-probe race condition issue
  i2c: imx: ensure no clock is generated after last read
  i2c: imx: fix i2c issue when reading multiple messages
2026-03-29 12:27:13 -07:00
Linus Torvalds
ac354b5cb0 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "s390:

   - Lots of small and not-so-small fixes for the newly rewritten gmap,
     mostly affecting the handling of nested guests.

  x86:

   - Fix an issue with shadow paging, which causes KVM to install an
     MMIO PTE in the shadow page tables without first zapping a non-MMIO
     SPTE if KVM didn't see the write that modified the shadowed guest
     PTE.

     While commit a54aa15c6b ("KVM: x86/mmu: Handle MMIO SPTEs
     directly in mmu_set_spte()") was right about it being impossible to
     miss such a write if it was coming from the guest, it failed to
     account for writes to guest memory that are outside the scope of
     KVM: if userspace modifies the guest PTE, and then the guest hits a
     relevant page fault, KVM will get confused"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86/mmu: Only WARN in direct MMUs when overwriting shadow-present SPTE
  KVM: x86/mmu: Drop/zap existing present SPTE even when creating an MMIO SPTE
  KVM: s390: Fix KVM_S390_VCPU_FAULT ioctl
  KVM: s390: vsie: Fix guest page tables protection
  KVM: s390: vsie: Fix unshadowing while shadowing
  KVM: s390: vsie: Fix refcount overflow for shadow gmaps
  KVM: s390: vsie: Fix nested guest memory shadowing
  KVM: s390: Correctly handle guest mappings without struct page
  KVM: s390: Fix gmap_link()
  KVM: s390: vsie: Fix check for pre-existing shadow mapping
  KVM: s390: Remove non-atomic dat_crstep_xchg()
  KVM: s390: vsie: Fix dat_split_ste()
2026-03-29 11:58:47 -07:00
Linus Torvalds
b8a3bc8567 Merge tag 'for-linus-7.0a-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
 "A single fix for a very rare bug introduced in rc5"

* tag 'for-linus-7.0a-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/privcmd: unregister xenstore notifier on module exit
2026-03-29 11:51:37 -07:00
Linus Torvalds
f242ac4a09 Merge tag 'x86-urgent-2026-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:

 - Fix an early boot crash in AMD SEV-SNP guests, caused by incorrect
   FSGSBASE init ordering (Nikunj A Dadhania)

 - Remove X86_CR4_FRED from the CR4 pinned bits mask, to fix a race
   window during the bootup of SEV-{ES,SNP} or TDX guests, which can
   crash them if they trigger exceptions in that window (Borislav
   Petkov)

 - Fix early boot failures on SEV-ES/SNP guests, due to incorrect early
   GHCB access (Nikunj A Dadhania)

 - Add clarifying comment to the CRn pinning logic, to avoid future
   confusion & bugs (Peter Zijlstra)

* tag 'x86-urgent-2026-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Add comment clarifying CRn pinning
  x86/fred: Fix early boot failures on SEV-ES/SNP guests
  x86/cpu: Remove X86_CR4_FRED from the CR4 pinned bits mask
  x86/cpu: Enable FSGSBASE early in cpu_init_exception_handling()
2026-03-29 10:04:37 -07:00
Linus Torvalds
47e3f23f0e Merge tag 'timers-urgent-2026-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
 "Fix an argument order bug in the alarm timer forwarding logic, which
  may cause missed expirations or incorrect overrun accounting"

* tag 'timers-urgent-2026-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  alarmtimer: Fix argument order in alarm_timer_forward()
2026-03-29 10:02:38 -07:00
Linus Torvalds
f087b0bad4 Merge tag 'locking-urgent-2026-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull futex fixes from Ingo Molnar:

 - Tighten up the sys_futex_requeue() ABI a bit, to disallow dissimilar
   futex flags and potential UaF access (Peter Zijlstra)

 - Fix UaF between futex_key_to_node_opt() and vma_replace_policy()
   (Hao-Yu Yang)

 - Clear stale exiting pointer in futex_lock_pi() retry path, which
   triggered a warning (and potential misbehavior) in stress-testing
   (Davidlohr Bueso)

* tag 'locking-urgent-2026-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  futex: Clear stale exiting pointer in futex_lock_pi() retry path
  futex: Fix UaF between futex_key_to_node_opt() and vma_replace_policy()
  futex: Require sys_futex_requeue() to have identical flags
2026-03-29 09:59:46 -07:00
Linus Torvalds
21047b17b3 Merge tag 'irq-urgent-2026-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:

 - Fix TX completion signaling bug in the Qualcomm MPM irqchip driver

 - Fix probe error handling in the Renesas RZ/V2H(P) irqchip driver

* tag 'irq-urgent-2026-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/renesas-rzv2h: Fix error path in rzv2h_icu_probe_common()
  irqchip/qcom-mpm: Add missing mailbox TX done acknowledgment
2026-03-29 09:53:01 -07:00
Linus Torvalds
a3d97d1d3f Merge tag 'ovl-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs
Pull overlayfs fixes from Amir Goldstein:

 - Fix regression in 'xino' feature detection

   I clumsily introduced this regression myself when working on another
   subsystem (fsnotify). Both the regression and the fix have almost no
   visible impact on users except for some kmsg prints.

 - Fix to performance regression in v6.12.

   This regression was reported by Google COS developers.

   It is not uncommon these days for the year-old mature LTS to get
   adopted by distros and get exposed to many new workloads. We made a
   sub-smart move of making a behavior change in v6.12 which could
   impact performance, without making it opt-in. Fixing this mistake
   retroactively, to be picked by LTS.

* tag 'ovl-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
  ovl: make fsync after metadata copy-up opt-in mount option
  ovl: fix wrong detection of 32bit inode numbers
2026-03-29 09:34:50 -07:00
Linus Torvalds
241d4ca15d Merge tag 'ext4_for_linus-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:

 - Update the MAINTAINERS file to add reviewers for the ext4 file system

 - Add a test issue an ext4 warning (not a WARN_ON) if there are still
   dirty pages attached to an evicted inode.

 - Fix a number of Syzkaller issues

 - Fix memory leaks on error paths

 - Replace some BUG and WARN with EFSCORRUPTED reporting

 - Fix a potential crash when disabling discard via remount followed by
   an immediate unmount. (Found by Sashiko)

 - Fix a corner case which could lead to allocating blocks for an
   indirect-mapped inode block numbers > 2**32

 - Fix a race when reallocating a freed inode that could result in a
   deadlock

 - Fix a user-after-free in update_super_work when racing with umount

 - Fix build issues when trying to build ext4's kunit tests as a module

 - Fix a bug where ext4_split_extent_zeroout() could fail to pass back
   an error from ext4_ext_dirty()

 - Avoid allocating blocks from a corrupted block group in
   ext4_mb_find_by_goal()

 - Fix a percpu_counters list corruption BUG triggered by an ext4
   extents kunit

 - Fix a potetial crash caused by the fast commit flush path potentially
   accessing the jinode structure before it is fully initialized

 - Fix fsync(2) in no-journal mode to make sure the dirtied inode is
   write to storage

 - Fix a bug when in no-journal mode, when ext4 tries to avoid using
   recently deleted inodes, if lazy itable initialization is enabled,
   can lead to an unitialized inode getting skipped and triggering an
   e2fsck complaint

 - Fix journal credit calculation when setting an xattr when both the
   encryption and ea_inode feeatures are enabled

 - Fix corner cases which could result in stale xarray tags after
   writeback

 - Fix generic/475 failures caused by ENOSPC errors while creating a
   symlink when the system crashes resulting to a file system
   inconsistency when replaying the fast commit journal

* tag 'ext4_for_linus-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (27 commits)
  ext4: always drain queued discard work in ext4_mb_release()
  ext4: handle wraparound when searching for blocks for indirect mapped blocks
  ext4: skip split extent recovery on corruption
  ext4: fix iloc.bh leak in ext4_fc_replay_inode() error paths
  ext4: fix deadlock on inode reallocation
  ext4: fix use-after-free in update_super_work when racing with umount
  ext4: fix the might_sleep() warnings in kvfree()
  ext4: reject mount if bigalloc with s_first_data_block != 0
  ext4: fix extents-test.c is not compiled when EXT4_KUNIT_TESTS=M
  ext4: fix mballoc-test.c is not compiled when EXT4_KUNIT_TESTS=M
  ext4: introduce EXPORT_SYMBOL_FOR_EXT4_TEST() helper
  jbd2: gracefully abort on checkpointing state corruptions
  ext4: avoid infinite loops caused by residual data
  ext4: validate p_idx bounds in ext4_ext_correct_indexes
  ext4: test if inode's all dirty pages are submitted to disk
  ext4: minor fix for ext4_split_extent_zeroout()
  ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal()
  ext4: kunit: extents-test: lix percpu_counters list corruption
  ext4: publish jinode after initialization
  ext4: replace BUG_ON with proper error handling in ext4_read_inline_folio
  ...
2026-03-29 09:30:06 -07:00
Takashi Iwai
277c6960d4 ALSA: ctxfi: Check the error for index mapping
The ctxfi driver blindly assumed a proper value returned from
daio_device_index(), but it's not always true.  Add a proper error
check to deal with the error from the function.

Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/87cy149n6k.wl-tiwai@suse.de
Link: https://patch.msgid.link/20260329091240.420194-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-29 11:13:53 +02:00
Takashi Iwai
b045ab3dff ALSA: ctxfi: Fix missing SPDIFI1 index handling
SPDIF1 DAIO type isn't properly handled in daio_device_index() for
hw20k2, and it returned -EINVAL, which ended up with the out-of-bounds
array access.  Follow the hw20k1 pattern and return the proper index
for this type, too.

Reported-and-tested-by: Karsten Hohmeier <linux@hohmatik.de>
Closes: https://lore.kernel.org/20260315155004.15633-1-linux@hohmatik.de
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20260329091240.420194-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-29 11:13:42 +02:00
Linus Torvalds
b51ad67773 Merge tag 'for-7.0-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
 "A few more fixes. There's one that stands out in size as it fixes an
  edge case in fsync.

   - fix issue on fsync where file with zero size appears as a non-zero
     after log replay

   - in zlib compression, handle a crash when data alignment causes
     folio reference issues

   - fix possible crash with enabled tracepoints on a overlayfs mount

   - handle device stats update error

   - on zoned filesystems, fix kobject leak on sub-block groups

   - fix super block offset in an error message in validation"

* tag 'for-7.0-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix lost error when running device stats on multiple devices fs
  btrfs: tracepoints: get correct superblock from dentry in event btrfs_sync_file()
  btrfs: zlib: handle page aligned compressed size correctly
  btrfs: fix leak of kobject name for sub-group space_info
  btrfs: fix zero size inode with non-zero size after log replay
  btrfs: fix super block offset in error message in btrfs_validate_super()
2026-03-28 15:23:03 -07:00
Linus Torvalds
0bcb517f0a Merge tag 'mm-hotfixes-stable-2026-03-28-10-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
 "10 hotfixes.  8 are cc:stable.  9 are for MM.

  There's a 3-patch series of DAMON fixes from Josh Law and SeongJae
  Park. The rest are singletons - please see the changelogs for details"

* tag 'mm-hotfixes-stable-2026-03-28-10-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/mseal: update VMA end correctly on merge
  bug: avoid format attribute warning for clang as well
  mm/pagewalk: fix race between concurrent split and refault
  mm/memory: fix PMD/PUD checks in follow_pfnmap_start()
  mm/damon/sysfs: check contexts->nr in repeat_call_fn
  mm/damon/sysfs: check contexts->nr before accessing contexts_arr[0]
  mm/damon/sysfs: fix param_ctx leak on damon_sysfs_new_test_ctx() failure
  mm/swap: fix swap cache memcg accounting
  MAINTAINERS, mailmap: update email address for Harry Yoo
  mm/huge_memory: fix folio isn't locked in softleaf_to_folio()
2026-03-28 14:19:55 -07:00
Wolfram Sang
b0faf733fc MAINTAINERS: drop outdated I2C website
As stated on the website: "This wiki has been archived and the content
is no longer updated." No need to reference it.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2026-03-28 20:31:33 +01:00
Linus Torvalds
cbfffcca2b Merge tag 'trace-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:

 - Fix potential deadlock in osnoise and hotplug

   The interface_lock can be called by a osnoise thread and the CPU
   shutdown logic of osnoise can wait for this thread to finish. But
   cpus_read_lock() can also be taken while holding the interface_lock.
   This produces a circular lock dependency and can cause a deadlock.

   Swap the ordering of cpus_read_lock() and the interface_lock to have
   interface_lock taken within the cpus_read_lock() context to prevent
   this circular dependency.

 - Fix freeing of event triggers in early boot up

   If the same trigger is added on the kernel command line, the second
   one will fail to be applied and the trigger created will be freed.
   This calls into the deferred logic and creates a kernel thread to do
   the freeing. But the command line logic is called before kernel
   threads can be created and this leads to a NULL pointer dereference.

   Delay freeing event triggers until late init.

* tag 'trace-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Drain deferred trigger frees if kthread creation fails
  tracing: Fix potential deadlock in cpu hotplug with osnoise
2026-03-28 09:59:09 -07:00
Linus Torvalds
e522b75c44 Merge tag 's390-7.0-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:

 - Add array_index_nospec() to syscall dispatch table lookup to prevent
   limited speculative out-of-bounds access with user-controlled syscall
   number

 - Mark array_index_mask_nospec() __always_inline since GCC may emit an
   out-of-line call instead of the inline data dependency sequence the
   mitigation relies on

 - Clear r12 on kernel entry to prevent potential speculative use of
   user value in system_call, ext/io/mcck interrupt handlers

* tag 's390-7.0-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/entry: Scrub r12 register on kernel entry
  s390/syscalls: Add spectre boundary for syscall dispatch table
  s390/barrier: Make array_index_mask_nospec() __always_inline
2026-03-28 09:50:11 -07:00
Davidlohr Bueso
210d36d892 futex: Clear stale exiting pointer in futex_lock_pi() retry path
Fuzzying/stressing futexes triggered:

    WARNING: kernel/futex/core.c:825 at wait_for_owner_exiting+0x7a/0x80, CPU#11: futex_lock_pi_s/524

When futex_lock_pi_atomic() sees the owner is exiting, it returns -EBUSY
and stores a refcounted task pointer in 'exiting'.

After wait_for_owner_exiting() consumes that reference, the local pointer
is never reset to nil. Upon a retry, if futex_lock_pi_atomic() returns a
different error, the bogus pointer is passed to wait_for_owner_exiting().

  CPU0			     CPU1		       CPU2
  futex_lock_pi(uaddr)
  // acquires the PI futex
  exit()
    futex_cleanup_begin()
      futex_state = EXITING;
			     futex_lock_pi(uaddr)
			       futex_lock_pi_atomic()
				 attach_to_pi_owner()
				   // observes EXITING
				   *exiting = owner;  // takes ref
				   return -EBUSY
			       wait_for_owner_exiting(-EBUSY, owner)
				 put_task_struct();   // drops ref
			       // exiting still points to owner
			       goto retry;
			       futex_lock_pi_atomic()
				 lock_pi_update_atomic()
				   cmpxchg(uaddr)
					*uaddr ^= WAITERS // whatever
				   // value changed
				 return -EAGAIN;
			       wait_for_owner_exiting(-EAGAIN, exiting) // stale
				 WARN_ON_ONCE(exiting)

Fix this by resetting upon retry, essentially aligning it with requeue_pi.

Fixes: 3ef240eaff ("futex: Prevent exit livelock")
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260326001759.4129680-1-dave@stgolabs.net
2026-03-28 13:54:02 +01:00
Wesley Atwell
250ab25391 tracing: Drain deferred trigger frees if kthread creation fails
Boot-time trigger registration can fail before the trigger-data cleanup
kthread exists. Deferring those frees until late init is fine, but the
post-boot fallback must still drain the deferred list if kthread
creation never succeeds.

Otherwise, boot-deferred nodes can accumulate on
trigger_data_free_list, later frees fall back to synchronously freeing
only the current object, and the older queued entries are leaked
forever.

To trigger this, add the following to the kernel command line:

  trace_event=sched_switch trace_trigger=sched_switch.traceon,sched_switch.traceon

The second traceon trigger will fail and be freed. This triggers a NULL
pointer dereference and crashes the kernel.

Keep the deferred boot-time behavior, but when kthread creation fails,
drain the whole queued list synchronously. Do the same in the late-init
drain path so queued entries are not stranded there either.

Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260324221326.1395799-3-atwellwea@gmail.com
Fixes: 61d445af0a ("tracing: Add bulk garbage collection of freeing event_trigger_data")
Signed-off-by: Wesley Atwell <atwellwea@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-28 08:32:44 -04:00
Sourav Nayak
1fbf85dbf0 ALSA: hda/realtek: add quirk for HP Victus 15-fb0xxx
This adds a mute led quirck for HP Victus 15-fb0xxx (103c:8a3d) model

- As it used 0x8(full bright)/0x7f(little dim) for mute led on and other
  values as 0ff (0x0, 0x4, ...)

- So, use ALC245_FIXUP_HP_MUTE_LED_V2_COEFBIT insted for safer approach

Cc: <stable@vger.kernel.org>
Signed-off-by: Sourav Nayak <nonameblank007@gmail.com>
Link: https://patch.msgid.link/20260327142805.17139-1-nonameblank007@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-28 10:54:19 +01:00
Stuart Hayhurst
696b0a9bd2 ALSA: hda/intel: Add MSI X870E Tomahawk to denylist by DMI ID
This motherboard uses USB audio instead, causing this driver to complain
about "no codecs found!".
Add it to the denylist to silence the warning.

The first attempt only matched on the PCI device, but this caused issues
for some laptops, so DMI match against the board as well.

Signed-off-by: Stuart Hayhurst <stuart.a.hayhurst@gmail.com>
Link: https://patch.msgid.link/20260327155737.21818-2-stuart.a.hayhurst@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-28 10:32:53 +01:00
Dustin L. Howett
bac1e57adf ALSA: hda/realtek: add quirk for Framework F111:000F
Similar to commit 7b509910b3 ("ALSA hda/realtek: Add quirk for
Framework F111:000C") and previous quirks for Framework systems with
Realtek codecs.

000F is another new platform with an ALC285 which needs the same quirk.

Signed-off-by: Dustin L. Howett <dustin@howett.net>
Link: https://patch.msgid.link/20260327-framework-alsa-000f-v1-1-74013aba1c00@howett.net
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-28 10:31:30 +01:00
Phil Willoughby
bc5b4e5ae1 ALSA: usb-audio: Fix quirk flags for NeuralDSP Quad Cortex
The NeuralDSP Quad Cortex does not support DSD playback. We need
this product-specific entry with zero quirks because otherwise it
falls through to the vendor-specific entry which marks it as
supporting DSD playback.

Cc: Yue Wang <yuleopen@gmail.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Signed-off-by: Phil Willoughby <willerz@gmail.com>
Link: https://patch.msgid.link/20260328080921.3310-1-willerz@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-28 09:58:44 +01:00
Lorenzo Stoakes (Oracle)
2697dd8ae7 mm/mseal: update VMA end correctly on merge
Previously we stored the end of the current VMA in curr_end, and then upon
iterating to the next VMA updated curr_start to curr_end to advance to the
next VMA.

However, this doesn't take into account the fact that a VMA might be
updated due to a merge by vma_modify_flags(), which can result in curr_end
being stale and thus, upon setting curr_start to curr_end, ending up with
an incorrect curr_start on the next iteration.

Resolve the issue by setting curr_end to vma->vm_end unconditionally to
ensure this value remains updated should this occur.

While we're here, eliminate this entire class of bug by simply setting
const curr_[start/end] to be clamped to the input range and VMAs, which
also happens to simplify the logic.

Link: https://lkml.kernel.org/r/20260327173104.322405-1-ljs@kernel.org
Fixes: 6c2da14ae1 ("mm/mseal: rework mseal apply logic")
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reported-by: Antonius <antonius@bluedragonsec.com>
Closes: https://lore.kernel.org/linux-mm/CAK8a0jwWGj9-SgFk0yKFh7i8jMkwKm5b0ao9=kmXWjO54veX2g@mail.gmail.com/
Suggested-by: David Hildenbrand (ARM) <david@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jeff Xu <jeffxu@chromium.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 20:48:38 -07:00
Arnd Bergmann
2598ab9d63 bug: avoid format attribute warning for clang as well
Like gcc, clang-22 now also warns about a function that it incorrectly
identifies as a printf-style format:

lib/bug.c:190:22: error: diagnostic behavior may be improved by adding the 'format(printf, 1, 0)' attribute to the declaration of '__warn_printf' [-Werror,-Wmissing-format-attribute]
  179 | static void __warn_printf(const char *fmt, struct pt_regs *regs)
      | __attribute__((format(printf, 1, 0)))
  180 | {
  181 |         if (!fmt)
  182 |                 return;
  183 |
  184 | #ifdef HAVE_ARCH_BUG_FORMAT_ARGS
  185 |         if (regs) {
  186 |                 struct arch_va_list _args;
  187 |                 va_list *args = __warn_args(&_args, regs);
  188 |
  189 |                 if (args) {
  190 |                         vprintk(fmt, *args);
      |                                           ^

Revert the change that added a gcc-specific workaround, and instead add
the generic annotation that avoid the warning.

Link: https://lkml.kernel.org/r/20260323205534.1284284-1-arnd@kernel.org
Fixes: d36067d6ea ("bug: Hush suggest-attribute=format for __warn_printf()")
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Suggested-by: Brendan Jackman <jackmanb@google.com>
Link: https://lore.kernel.org/all/20251208141618.2805983-1-andriy.shevchenko@linux.intel.com/T/#u
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 20:48:38 -07:00
Max Boone
3b89863c3f mm/pagewalk: fix race between concurrent split and refault
The splitting of a PUD entry in walk_pud_range() can race with a
concurrent thread refaulting the PUD leaf entry causing it to try walking
a PMD range that has disappeared.

An example and reproduction of this is to try reading numa_maps of a
process while VFIO-PCI is setting up DMA (specifically the
vfio_pin_pages_remote call) on a large BAR for that process.

This will trigger a kernel BUG:
vfio-pci 0000:03:00.0: enabling device (0000 -> 0002)
BUG: unable to handle page fault for address: ffffa23980000000
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
...
RIP: 0010:walk_pgd_range+0x3b5/0x7a0
Code: 8d 43 ff 48 89 44 24 28 4d 89 ce 4d 8d a7 00 00 20 00 48 8b 4c 24
28 49 81 e4 00 00 e0 ff 49 8d 44 24 ff 48 39 c8 4c 0f 43 e3 <49> f7 06
   9f ff ff ff 75 3b 48 8b 44 24 20 48 8b 40 28 48 85 c0 74
RSP: 0018:ffffac23e1ecf808 EFLAGS: 00010287
RAX: 00007f44c01fffff RBX: 00007f4500000000 RCX: 00007f44ffffffff
RDX: 0000000000000000 RSI: 000ffffffffff000 RDI: ffffffff93378fe0
RBP: ffffac23e1ecf918 R08: 0000000000000004 R09: ffffa23980000000
R10: 0000000000000020 R11: 0000000000000004 R12: 00007f44c0200000
R13: 00007f44c0000000 R14: ffffa23980000000 R15: 00007f44c0000000
FS:  00007fe884739580(0000) GS:ffff9b7d7a9c0000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffa23980000000 CR3: 000000c0650e2005 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
 <TASK>
 __walk_page_range+0x195/0x1b0
 walk_page_vma+0x62/0xc0
 show_numa_map+0x12b/0x3b0
 seq_read_iter+0x297/0x440
 seq_read+0x11d/0x140
 vfs_read+0xc2/0x340
 ksys_read+0x5f/0xe0
 do_syscall_64+0x68/0x130
 ? get_page_from_freelist+0x5c2/0x17e0
 ? mas_store_prealloc+0x17e/0x360
 ? vma_set_page_prot+0x4c/0xa0
 ? __alloc_pages_noprof+0x14e/0x2d0
 ? __mod_memcg_lruvec_state+0x8d/0x140
 ? __lruvec_stat_mod_folio+0x76/0xb0
 ? __folio_mod_stat+0x26/0x80
 ? do_anonymous_page+0x705/0x900
 ? __handle_mm_fault+0xa8d/0x1000
 ? __count_memcg_events+0x53/0xf0
 ? handle_mm_fault+0xa5/0x360
 ? do_user_addr_fault+0x342/0x640
 ? arch_exit_to_user_mode_prepare.constprop.0+0x16/0xa0
 ? irqentry_exit_to_user_mode+0x24/0x100
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fe88464f47e
Code: c0 e9 b6 fe ff ff 50 48 8d 3d be 07 0b 00 e8 69 01 02 00 66 0f 1f
84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00
   f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
RSP: 002b:00007ffe6cd9a9b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007fe88464f47e
RDX: 0000000000020000 RSI: 00007fe884543000 RDI: 0000000000000003
RBP: 00007fe884543000 R08: 00007fe884542010 R09: 0000000000000000
R10: fffffffffffffbc5 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
 </TASK>

Fix this by validating the PUD entry in walk_pmd_range() using a stable
snapshot (pudp_get()).  If the PUD is not present or is a leaf, retry the
walk via ACTION_AGAIN instead of descending further.  This mirrors the
retry logic in walk_pte_range(), which lets walk_pmd_range() retry if the
PTE is not being got by pte_offset_map_lock().

Link: https://lkml.kernel.org/r/20260325-pagewalk-check-pmd-refault-v2-1-707bff33bc60@akamai.com
Fixes: f9e54c3a2f ("vfio/pci: implement huge_fault support")
Co-developed-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Max Boone <mboone@akamai.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 20:48:38 -07:00
David Hildenbrand (Arm)
ffef67b93a mm/memory: fix PMD/PUD checks in follow_pfnmap_start()
follow_pfnmap_start() suffers from two problems:

(1) We are not re-fetching the pmd/pud after taking the PTL

Therefore, we are not properly stabilizing what the lock actually
protects.  If there is concurrent zapping, we would indicate to the
caller that we found an entry, however, that entry might already have
been invalidated, or contain a different PFN after taking the lock.

Properly use pmdp_get() / pudp_get() after taking the lock.

(2) pmd_leaf() / pud_leaf() are not well defined on non-present entries

pmd_leaf()/pud_leaf() could wrongly trigger on non-present entries.

There is no real guarantee that pmd_leaf()/pud_leaf() returns something
reasonable on non-present entries.  Most architectures indeed either
perform a present check or make it work by smart use of flags.

However, for example loongarch checks the _PAGE_HUGE flag in pmd_leaf(),
and always sets the _PAGE_HUGE flag in __swp_entry_to_pmd().  Whereby
pmd_trans_huge() explicitly checks pmd_present(), pmd_leaf() does not do
that.

Let's check pmd_present()/pud_present() before assuming "the is a present
PMD leaf" when spotting pmd_leaf()/pud_leaf(), like other page table
handling code that traverses user page tables does.

Given that non-present PMD entries are likely rare in VM_IO|VM_PFNMAP, (1)
is likely more relevant than (2).  It is questionable how often (1) would
actually trigger, but let's CC stable to be sure.

This was found by code inspection.

Link: https://lkml.kernel.org/r/20260323-follow_pfnmap_fix-v1-1-5b0ec10872b3@kernel.org
Fixes: 6da8e9634b ("mm: new follow_pfnmap API")
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 20:48:38 -07:00
Josh Law
6557004a8b mm/damon/sysfs: check contexts->nr in repeat_call_fn
damon_sysfs_repeat_call_fn() calls damon_sysfs_upd_tuned_intervals(),
damon_sysfs_upd_schemes_stats(), and
damon_sysfs_upd_schemes_effective_quotas() without checking contexts->nr. 
If nr_contexts is set to 0 via sysfs while DAMON is running, these
functions dereference contexts_arr[0] and cause a NULL pointer
dereference.  Add the missing check.

For example, the issue can be reproduced using DAMON sysfs interface and
DAMON user-space tool (damo) [1] like below.

    $ sudo damo start --refresh_interval 1s
    $ echo 0 | sudo tee \
            /sys/kernel/mm/damon/admin/kdamonds/0/contexts/nr_contexts

Link: https://patch.msgid.link/20260320163559.178101-3-objecting@objecting.org
Link: https://lkml.kernel.org/r/20260321175427.86000-4-sj@kernel.org
Link: https://github.com/damonitor/damo [1]
Fixes: d809a7c64b ("mm/damon/sysfs: implement refresh_ms file internal work")
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[6.17+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 20:48:38 -07:00
Josh Law
1bfe9fb5ed mm/damon/sysfs: check contexts->nr before accessing contexts_arr[0]
Multiple sysfs command paths dereference contexts_arr[0] without first
verifying that kdamond->contexts->nr == 1.  A user can set nr_contexts to
0 via sysfs while DAMON is running, causing NULL pointer dereferences.

In more detail, the issue can be triggered by privileged users like
below.

First, start DAMON and make contexts directory empty
(kdamond->contexts->nr == 0).

    # damo start
    # cd /sys/kernel/mm/damon/admin/kdamonds/0
    # echo 0 > contexts/nr_contexts

Then, each of below commands will cause the NULL pointer dereference.

    # echo update_schemes_stats > state
    # echo update_schemes_tried_regions > state
    # echo update_schemes_tried_bytes > state
    # echo update_schemes_effective_quotas > state
    # echo update_tuned_intervals > state

Guard all commands (except OFF) at the entry point of
damon_sysfs_handle_cmd().

Link: https://lkml.kernel.org/r/20260321175427.86000-3-sj@kernel.org
Fixes: 0ac32b8aff ("mm/damon/sysfs: support DAMOS stats")
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[5.18+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 20:48:38 -07:00
Josh Law
7fe000eb32 mm/damon/sysfs: fix param_ctx leak on damon_sysfs_new_test_ctx() failure
Patch series "mm/damon/sysfs: fix memory leak and NULL dereference
issues", v4.

DAMON_SYSFS can leak memory under allocation failure, and do NULL pointer
dereference when a privileged user make wrong sequences of control.  Fix
those.


This patch (of 3):

When damon_sysfs_new_test_ctx() fails in damon_sysfs_commit_input(),
param_ctx is leaked because the early return skips the cleanup at the out
label.  Destroy param_ctx before returning.

Link: https://lkml.kernel.org/r/20260321175427.86000-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20260321175427.86000-2-sj@kernel.org
Fixes: f0c5118ebb ("mm/damon/sysfs: catch commit test ctx alloc failure")
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[6.18+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 20:48:37 -07:00
Alexandre Ghiti
9e0d0ddfbc mm/swap: fix swap cache memcg accounting
The swap readahead path was recently refactored and while doing this, the
order between the charging of the folio in the memcg and the addition of
the folio in the swap cache was inverted.

Since the accounting of the folio is done while adding the folio to the
swap cache and the folio is not charged in the memcg yet, the accounting
is then done at the node level, which is wrong.

Fix this by charging the folio in the memcg before adding it to the swap cache.

Link: https://lkml.kernel.org/r/20260320050601.1833108-1-alex@ghiti.fr
Fixes: 2732acda82 ("mm, swap: use swap cache as the swap in synchronize layer")
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Acked-by: Kairui Song <kasong@tencent.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Nhat Pham <nphamcs@gmail.com>
Acked-by: Chris Li <chrisl@kernel.org>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 20:48:37 -07:00
Harry Yoo (Oracle)
26d3dca201 MAINTAINERS, mailmap: update email address for Harry Yoo
Update my email address to harry@kernel.org.

Link: https://lkml.kernel.org/r/20260320125925.2259998-1-harry@kernel.org
Signed-off-by: Harry Yoo (Oracle) <harry@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 20:48:37 -07:00
Jinjiang Tu
4c5e7f0fcd mm/huge_memory: fix folio isn't locked in softleaf_to_folio()
On arm64 server, we found folio that get from migration entry isn't locked
in softleaf_to_folio().  This issue triggers when mTHP splitting and
zap_nonpresent_ptes() races, and the root cause is lack of memory barrier
in softleaf_to_folio().  The race is as follows:

	CPU0                                             CPU1

deferred_split_scan()                              zap_nonpresent_ptes()
  lock folio
  split_folio()
    unmap_folio()
      change ptes to migration entries
    __split_folio_to_order()                         softleaf_to_folio()
      set flags(including PG_locked) for tail pages    folio = pfn_folio(softleaf_to_pfn(entry))
      smp_wmb()                                        VM_WARN_ON_ONCE(!folio_test_locked(folio))
      prep_compound_page() for tail pages

In __split_folio_to_order(), smp_wmb() guarantees page flags of tail pages
are visible before the tail page becomes non-compound.  smp_wmb() should
be paired with smp_rmb() in softleaf_to_folio(), which is missed.  As a
result, if zap_nonpresent_ptes() accesses migration entry that stores tail
pfn, softleaf_to_folio() may see the updated compound_head of tail page
before page->flags.

This issue will trigger VM_WARN_ON_ONCE() in pfn_swap_entry_folio()
because of the race between folio split and zap_nonpresent_ptes()
leading to a folio incorrectly undergoing modification without a folio
lock being held.

This is a BUG_ON() before commit 93976a2034 ("mm: eliminate further
swapops predicates"), which in merged in v6.19-rc1.

To fix it, add missing smp_rmb() if the softleaf entry is migration entry
in softleaf_to_folio() and softleaf_to_page().

[tujinjiang@huawei.com: update function name and comments]
  Link: https://lkml.kernel.org/r/20260321075214.3305564-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20260319012541.4158561-1-tujinjiang@huawei.com
Fixes: e9b61f1985 ("thp: reintroduce split_huge_page()")
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Barry Song <baohua@kernel.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 20:48:37 -07:00
Theodore Ts'o
9ee29d20aa ext4: always drain queued discard work in ext4_mb_release()
While reviewing recent ext4 patch[1], Sashiko raised the following
concern[2]:

> If the filesystem is initially mounted with the discard option,
> deleting files will populate sbi->s_discard_list and queue
> s_discard_work. If it is then remounted with nodiscard, the
> EXT4_MOUNT_DISCARD flag is cleared, but the pending s_discard_work is
> neither cancelled nor flushed.

[1] https://lore.kernel.org/r/20260319094545.19291-1-qiang.zhang@linux.dev/
[2] https://sashiko.dev/#/patchset/20260319094545.19291-1-qiang.zhang%40linux.dev

The concern was valid, but it had nothing to do with the patch[1].
One of the problems with Sashiko in its current (early) form is that
it will detect pre-existing issues and report it as a problem with the
patch that it is reviewing.

In practice, it would be hard to hit deliberately (unless you are a
malicious syzkaller fuzzer), since it would involve mounting the file
system with -o discard, and then deleting a large number of files,
remounting the file system with -o nodiscard, and then immediately
unmounting the file system before the queued discard work has a change
to drain on its own.

Fix it because it's a real bug, and to avoid Sashiko from raising this
concern when analyzing future patches to mballoc.c.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fixes: 55cdd0af2b ("ext4: get discard out of jbd2 commit kthread contex")
Cc: stable@kernel.org
2026-03-27 23:39:10 -04:00
Theodore Ts'o
bb81702370 ext4: handle wraparound when searching for blocks for indirect mapped blocks
Commit 4865c768b5 ("ext4: always allocate blocks only from groups
inode can use") restricts what blocks will be allocated for indirect
block based files to block numbers that fit within 32-bit block
numbers.

However, when using a review bot running on the latest Gemini LLM to
check this commit when backporting into an LTS based kernel, it raised
this concern:

   If ac->ac_g_ex.fe_group is >= ngroups (for instance, if the goal
   group was populated via stream allocation from s_mb_last_groups),
   then start will be >= ngroups.

   Does this allow allocating blocks beyond the 32-bit limit for
   indirect block mapped files? The commit message mentions that
   ext4_mb_scan_groups_linear() takes care to not select unsupported
   groups. However, its loop uses group = *start, and the very first
   iteration will call ext4_mb_scan_group() with this unsupported
   group because next_linear_group() is only called at the end of the
   iteration.

After reviewing the code paths involved and considering the LLM
review, I determined that this can happen when there is a file system
where some files/directories are extent-mapped and others are
indirect-block mapped.  To address this, add a safety clamp in
ext4_mb_scan_groups().

Fixes: 4865c768b5 ("ext4: always allocate blocks only from groups inode can use")
Cc: Jan Kara <jack@suse.cz>
Reviewed-by: Baokun Li <libaokun@linux.alibaba.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://patch.msgid.link/20260326045834.1175822-1-tytso@mit.edu
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:39:02 -04:00
hongao
3ceda17325 ext4: skip split extent recovery on corruption
ext4_split_extent_at() retries after ext4_ext_insert_extent() fails by
refinding the original extent and restoring its length. That recovery is
only safe for transient resource failures such as -ENOSPC, -EDQUOT, and
-ENOMEM.

When ext4_ext_insert_extent() fails because the extent tree is already
corrupted, ext4_find_extent() can return a leaf path without p_ext.
ext4_split_extent_at() then dereferences path[depth].p_ext while trying to
fix up the original extent length, causing a NULL pointer dereference while
handling a pre-existing filesystem corruption.

Do not enter the recovery path for corruption errors, and validate p_ext
after refinding the extent before touching it. This keeps the recovery path
limited to cases it can actually repair and turns the syzbot-triggered crash
into a proper corruption report.

Fixes: 716b9c23b8 ("ext4: refactor split and convert extents")
Reported-by: syzbot+1ffa5d865557e51cb604@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1ffa5d865557e51cb604
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: hongao <hongao@uniontech.com>
Link: https://patch.msgid.link/EF77870F23FF9C90+20260324015815.35248-1-hongao@uniontech.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:38:52 -04:00
Baokun Li
ec0a7500d8 ext4: fix iloc.bh leak in ext4_fc_replay_inode() error paths
During code review, Joseph found that ext4_fc_replay_inode() calls
ext4_get_fc_inode_loc() to get the inode location, which holds a
reference to iloc.bh that must be released via brelse().

However, several error paths jump to the 'out' label without
releasing iloc.bh:

 - ext4_handle_dirty_metadata() failure
 - sync_dirty_buffer() failure
 - ext4_mark_inode_used() failure
 - ext4_iget() failure

Fix this by introducing an 'out_brelse' label placed just before
the existing 'out' label to ensure iloc.bh is always released.

Additionally, make ext4_fc_replay_inode() propagate errors
properly instead of always returning 0.

Reported-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Fixes: 8016e29f43 ("ext4: fast commit recovery path")
Signed-off-by: Baokun Li <libaokun@linux.alibaba.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260323060836.3452660-1-libaokun@linux.alibaba.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:38:01 -04:00
Jan Kara
0c90eed1b9 ext4: fix deadlock on inode reallocation
Currently there is a race in ext4 when reallocating freed inode
resulting in a deadlock:

Task1					Task2
ext4_evict_inode()
  handle = ext4_journal_start();
  ...
  if (IS_SYNC(inode))
    handle->h_sync = 1;
  ext4_free_inode()
					ext4_new_inode()
					  handle = ext4_journal_start()
					  finds the bit in inode bitmap
					    already clear
					  insert_inode_locked()
					    waits for inode to be
					      removed from the hash.
  ext4_journal_stop(handle)
    jbd2_journal_stop(handle)
      jbd2_log_wait_commit(journal, tid);
        - deadlocks waiting for transaction handle Task2 holds

Fix the problem by removing inode from the hash already in
ext4_clear_inode() by which time all IO for the inode is done so reuse
is already fine but we are still before possibly blocking on transaction
commit.

Reported-by: "Lai, Yi" <yi1.lai@linux.intel.com>
Link: https://lore.kernel.org/all/abNvb2PcrKj1FBeC@ly-workstation
Fixes: 88ec797c46 ("fs: make insert_inode_locked() wait for inode destruction")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260320090428.24899-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:37:48 -04:00
Jiayuan Chen
d15e4b0a41 ext4: fix use-after-free in update_super_work when racing with umount
Commit b98535d091 ("ext4: fix bug_on in start_this_handle during umount
filesystem") moved ext4_unregister_sysfs() before flushing s_sb_upd_work
to prevent new error work from being queued via /proc/fs/ext4/xx/mb_groups
reads during unmount. However, this introduced a use-after-free because
update_super_work calls ext4_notify_error_sysfs() -> sysfs_notify() which
accesses the kobject's kernfs_node after it has been freed by kobject_del()
in ext4_unregister_sysfs():

  update_super_work                ext4_put_super
  -----------------                --------------
                                   ext4_unregister_sysfs(sb)
                                     kobject_del(&sbi->s_kobj)
                                       __kobject_del()
                                         sysfs_remove_dir()
                                           kobj->sd = NULL
                                         sysfs_put(sd)
                                           kernfs_put()  // RCU free
  ext4_notify_error_sysfs(sbi)
    sysfs_notify(&sbi->s_kobj)
      kn = kobj->sd              // stale pointer
      kernfs_get(kn)             // UAF on freed kernfs_node
                                   ext4_journal_destroy()
                                     flush_work(&sbi->s_sb_upd_work)

Instead of reordering the teardown sequence, fix this by making
ext4_notify_error_sysfs() detect that sysfs has already been torn down
by checking s_kobj.state_in_sysfs, and skipping the sysfs_notify() call
in that case. A dedicated mutex (s_error_notify_mutex) serializes
ext4_notify_error_sysfs() against kobject_del() in ext4_unregister_sysfs()
to prevent TOCTOU races where the kobject could be deleted between the
state_in_sysfs check and the sysfs_notify() call.

Fixes: b98535d091 ("ext4: fix bug_on in start_this_handle during umount filesystem")
Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260319120336.157873-1-jiayuan.chen@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:37:39 -04:00
Zqiang
496bb99b7e ext4: fix the might_sleep() warnings in kvfree()
Use the kvfree() in the RCU read critical section can trigger
the following warnings:

EXT4-fs (vdb): unmounting filesystem cd983e5b-3c83-4f5a-a136-17b00eb9d018.

WARNING: suspicious RCU usage

./include/linux/rcupdate.h:409 Illegal context switch in RCU read-side critical section!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1

Call Trace:
 <TASK>
 dump_stack_lvl+0xbb/0xd0
 dump_stack+0x14/0x20
 lockdep_rcu_suspicious+0x15a/0x1b0
 __might_resched+0x375/0x4d0
 ? put_object.part.0+0x2c/0x50
 __might_sleep+0x108/0x160
 vfree+0x58/0x910
 ? ext4_group_desc_free+0x27/0x270
 kvfree+0x23/0x40
 ext4_group_desc_free+0x111/0x270
 ext4_put_super+0x3c8/0xd40
 generic_shutdown_super+0x14c/0x4a0
 ? __pfx_shrinker_free+0x10/0x10
 kill_block_super+0x40/0x90
 ext4_kill_sb+0x6d/0xb0
 deactivate_locked_super+0xb4/0x180
 deactivate_super+0x7e/0xa0
 cleanup_mnt+0x296/0x3e0
 __cleanup_mnt+0x16/0x20
 task_work_run+0x157/0x250
 ? __pfx_task_work_run+0x10/0x10
 ? exit_to_user_mode_loop+0x6a/0x550
 exit_to_user_mode_loop+0x102/0x550
 do_syscall_64+0x44a/0x500
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
 </TASK>

BUG: sleeping function called from invalid context at mm/vmalloc.c:3441
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 556, name: umount
preempt_count: 1, expected: 0
CPU: 3 UID: 0 PID: 556 Comm: umount
Call Trace:
 <TASK>
 dump_stack_lvl+0xbb/0xd0
 dump_stack+0x14/0x20
 __might_resched+0x275/0x4d0
 ? put_object.part.0+0x2c/0x50
 __might_sleep+0x108/0x160
 vfree+0x58/0x910
 ? ext4_group_desc_free+0x27/0x270
 kvfree+0x23/0x40
 ext4_group_desc_free+0x111/0x270
 ext4_put_super+0x3c8/0xd40
 generic_shutdown_super+0x14c/0x4a0
 ? __pfx_shrinker_free+0x10/0x10
 kill_block_super+0x40/0x90
 ext4_kill_sb+0x6d/0xb0
 deactivate_locked_super+0xb4/0x180
 deactivate_super+0x7e/0xa0
 cleanup_mnt+0x296/0x3e0
 __cleanup_mnt+0x16/0x20
 task_work_run+0x157/0x250
 ? __pfx_task_work_run+0x10/0x10
 ? exit_to_user_mode_loop+0x6a/0x550
 exit_to_user_mode_loop+0x102/0x550
 do_syscall_64+0x44a/0x500
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

The above scenarios occur in initialization failures and teardown
paths, there are no parallel operations on the resources released
by kvfree(), this commit therefore remove rcu_read_lock/unlock() and
use rcu_access_pointer() instead of rcu_dereference() operations.

Fixes: 7c990728b9 ("ext4: fix potential race between s_flex_groups online resizing and access")
Fixes: df3da4ea5a ("ext4: fix potential race between s_group_info online resizing and access")
Signed-off-by: Zqiang <qiang.zhang@linux.dev>
Reviewed-by: Baokun Li <libaokun@linux.alibaba.com>
Link: https://patch.msgid.link/20260319094545.19291-1-qiang.zhang@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:36:20 -04:00
Helen Koike
3822743dc2 ext4: reject mount if bigalloc with s_first_data_block != 0
bigalloc with s_first_data_block != 0 is not supported, reject mounting
it.

Signed-off-by: Helen Koike <koike@igalia.com>
Suggested-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: syzbot+b73703b873a33d8eb8f6@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b73703b873a33d8eb8f6
Link: https://patch.msgid.link/20260317142325.135074-1-koike@igalia.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:36:11 -04:00
Ye Bin
9e1b14320b ext4: fix extents-test.c is not compiled when EXT4_KUNIT_TESTS=M
Now, only EXT4_KUNIT_TESTS=Y testcase will be compiled in 'extents.c'.
To solve this issue, the ext4 test code needs to be decoupled. The
'extents-test' module is compiled into 'ext4-test' module.

Fixes: cb1e0c1d1f ("ext4: kunit tests for extent splitting and conversion")
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260314075258.1317579-4-yebin@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2026-03-27 23:36:06 -04:00
Ye Bin
519b76ac0b ext4: fix mballoc-test.c is not compiled when EXT4_KUNIT_TESTS=M
Now, only EXT4_KUNIT_TESTS=Y testcase will be compiled in 'mballoc.c'.
To solve this issue, the ext4 test code needs to be decoupled. The ext4
test module is compiled into a separate module.

Reported-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Closes: https://patchwork.kernel.org/project/cifs-client/patch/20260118091313.1988168-2-chenxiaosong.chenxiaosong@linux.dev/
Fixes: 7c9fa399a3 ("ext4: add first unit test for ext4_mb_new_blocks_simple in mballoc")
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260314075258.1317579-3-yebin@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2026-03-27 23:36:02 -04:00
Ye Bin
49504a5125 ext4: introduce EXPORT_SYMBOL_FOR_EXT4_TEST() helper
Introduce EXPORT_SYMBOL_FOR_EXT4_TEST() helper for kuint test.

Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260314075258.1317579-2-yebin@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2026-03-27 23:34:19 -04:00
Milos Nikic
bac3190a8e jbd2: gracefully abort on checkpointing state corruptions
This patch targets two internal state machine invariants in checkpoint.c
residing inside functions that natively return integer error codes.

- In jbd2_cleanup_journal_tail(): A blocknr of 0 indicates a severely
corrupted journal superblock. Replaced the J_ASSERT with a WARN_ON_ONCE
and a graceful journal abort, returning -EFSCORRUPTED.

- In jbd2_log_do_checkpoint(): Replaced the J_ASSERT_BH checking for
an unexpected buffer_jwrite state. If the warning triggers, we
explicitly drop the just-taken get_bh() reference and call __flush_batch()
to safely clean up any previously queued buffers in the j_chkpt_bhs array,
preventing a memory leak before returning -EFSCORRUPTED.

Signed-off-by: Milos Nikic <nikic.milos@gmail.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Baokun Li <libaokun@linux.alibaba.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260311041548.159424-1-nikic.milos@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:34:09 -04:00
Edward Adam Davis
5422fe71d2 ext4: avoid infinite loops caused by residual data
On the mkdir/mknod path, when mapping logical blocks to physical blocks,
if inserting a new extent into the extent tree fails (in this example,
because the file system disabled the huge file feature when marking the
inode as dirty), ext4_ext_map_blocks() only calls ext4_free_blocks() to
reclaim the physical block without deleting the corresponding data in
the extent tree. This causes subsequent mkdir operations to reference
the previously reclaimed physical block number again, even though this
physical block is already being used by the xattr block. Therefore, a
situation arises where both the directory and xattr are using the same
buffer head block in memory simultaneously.

The above causes ext4_xattr_block_set() to enter an infinite loop about
"inserted" and cannot release the inode lock, ultimately leading to the
143s blocking problem mentioned in [1].

If the metadata is corrupted, then trying to remove some extent space
can do even more harm. Also in case EXT4_GET_BLOCKS_DELALLOC_RESERVE
was passed, remove space wrongly update quota information.
Jan Kara suggests distinguishing between two cases:

1) The error is ENOSPC or EDQUOT - in this case the filesystem is fully
consistent and we must maintain its consistency including all the
accounting. However these errors can happen only early before we've
inserted the extent into the extent tree. So current code works correctly
for this case.

2) Some other error - this means metadata is corrupted. We should strive to
do as few modifications as possible to limit damage. So I'd just skip
freeing of allocated blocks.

[1]
INFO: task syz.0.17:5995 blocked for more than 143 seconds.
Call Trace:
 inode_lock_nested include/linux/fs.h:1073 [inline]
 __start_dirop fs/namei.c:2923 [inline]
 start_dirop fs/namei.c:2934 [inline]

Reported-by: syzbot+512459401510e2a9a39f@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1659aaaaa8d9d11265d7
Tested-by: syzbot+1659aaaaa8d9d11265d7@syzkaller.appspotmail.com
Reported-by: syzbot+1659aaaaa8d9d11265d7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=512459401510e2a9a39f
Tested-by: syzbot+1659aaaaa8d9d11265d7@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: syzbot+512459401510e2a9a39f@syzkaller.appspotmail.com
Link: https://patch.msgid.link/tencent_43696283A68450B761D76866C6F360E36705@qq.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:33:56 -04:00
Tejas Bharambe
2acb5c12eb ext4: validate p_idx bounds in ext4_ext_correct_indexes
ext4_ext_correct_indexes() walks up the extent tree correcting
index entries when the first extent in a leaf is modified. Before
accessing path[k].p_idx->ei_block, there is no validation that
p_idx falls within the valid range of index entries for that
level.

If the on-disk extent header contains a corrupted or crafted
eh_entries value, p_idx can point past the end of the allocated
buffer, causing a slab-out-of-bounds read.

Fix this by validating path[k].p_idx against EXT_LAST_INDEX() at
both access sites: before the while loop and inside it. Return
-EFSCORRUPTED if the index pointer is out of range, consistent
with how other bounds violations are handled in the ext4 extent
tree code.

Reported-by: syzbot+04c4e65cab786a2e5b7e@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=04c4e65cab786a2e5b7e
Signed-off-by: Tejas Bharambe <tejas.bharambe@outlook.com>
Link: https://patch.msgid.link/JH0PR06MB66326016F9B6AD24097D232B897CA@JH0PR06MB6632.apcprd06.prod.outlook.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:33:46 -04:00
Ye Bin
73bf12adbe ext4: test if inode's all dirty pages are submitted to disk
The commit aa373cf550 ("writeback: stop background/kupdate works from
livelocking other works") introduced an issue where unmounting a filesystem
in a multi-logical-partition scenario could lead to batch file data loss.
This problem was not fixed until the commit d92109891f ("fs/writeback:
bail out if there is no more inodes for IO and queued once"). It took
considerable time to identify the root cause. Additionally, in actual
production environments, we frequently encountered file data loss after
normal system reboots. Therefore, we are adding a check in the inode
release flow to verify whether all dirty pages have been flushed to disk,
in order to determine whether the data loss is caused by a logic issue in
the filesystem code.

Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260303012242.3206465-1-yebin@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:33:34 -04:00
Ojaswin Mujoo
c4a48e9eee ext4: minor fix for ext4_split_extent_zeroout()
We missed storing the error which triggerd smatch warning:

	fs/ext4/extents.c:3369 ext4_split_extent_zeroout()
	warn: duplicate zero check 'err' (previous on line 3363)

fs/ext4/extents.c
    3361
    3362         err = ext4_ext_get_access(handle, inode, path + depth);
    3363         if (err)
    3364                 return err;
    3365
    3366         ext4_ext_mark_initialized(ex);
    3367
    3368         ext4_ext_dirty(handle, inode, path + depth);
--> 3369         if (err)
    3370                 return err;
    3371
    3372         return 0;
    3373 }

Fix it by correctly storing the err value from ext4_ext_dirty().

Link: https://lore.kernel.org/all/aYXvVgPnKltX79KE@stanley.mountain/
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: a985e07c26 ("ext4: refactor zeroout path and handle all cases")
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Baokun Li <libaokun@linux.alibaba.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://patch.msgid.link/20260302143811.605174-1-ojaswin@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:33:22 -04:00
Ye Bin
46066e3a06 ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal()
There's issue as follows:
...
EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost

EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost

EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost

EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost

EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2243 at logical offset 0 with max blocks 1 with error 117
EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost

EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2239 at logical offset 0 with max blocks 1 with error 117
EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost

EXT4-fs (mmcblk0p1): error count since last fsck: 1
EXT4-fs (mmcblk0p1): initial error at time 1765597433: ext4_mb_generate_buddy:760
EXT4-fs (mmcblk0p1): last error at time 1765597433: ext4_mb_generate_buddy:760
...

According to the log analysis, blocks are always requested from the
corrupted block group. This may happen as follows:
ext4_mb_find_by_goal
  ext4_mb_load_buddy
   ext4_mb_load_buddy_gfp
     ext4_mb_init_cache
      ext4_read_block_bitmap_nowait
      ext4_wait_block_bitmap
       ext4_validate_block_bitmap
        if (!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
         return -EFSCORRUPTED; // There's no logs.
 if (err)
  return err;  // Will return error
ext4_lock_group(ac->ac_sb, group);
  if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info))) // Unreachable
   goto out;

After commit 9008a58e5d ("ext4: make the bitmap read routines return
real error codes") merged, Commit 163a203ddb ("ext4: mark block group
as corrupt on block bitmap error") is no real solution for allocating
blocks from corrupted block groups. This is because if
'EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)' is true, then
'ext4_mb_load_buddy()' may return an error. This means that the block
allocation will fail.
Therefore, check block group if corrupted when ext4_mb_load_buddy()
returns error.

Fixes: 163a203ddb ("ext4: mark block group as corrupt on block bitmap error")
Fixes: 9008a58e5d ("ext4: make the bitmap read routines return real error codes")
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260302134619.3145520-1-yebin@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:33:08 -04:00
Ritesh Harjani (IBM)
afe376d2c1 ext4: kunit: extents-test: lix percpu_counters list corruption
commit 82f80e2e3b ("ext4: add extent status cache support to kunit tests"),
added ext4_es_register_shrinker() in extents_kunit_init() function but
failed to add the unregister shrinker routine in extents_kunit_exit().

This could cause the following percpu_counters list corruption bug.

         ok 1 split unwrit extent to 2 extents and convert 1st half writ
  slab kmalloc-4k start c0000002007ff000 pointer offset 1448 size 4096
 list_add corruption. next->prev should be prev (c000000004bc9e60), but was 0000000000000000. (next=c0000002007ff5a8).
 ------------[ cut here ]------------
 kernel BUG at lib/list_debug.c:29!
cpu 0x2: Vector: 700 (Program Check) at [c000000241927a30]
    pc: c000000000f26ed0: __list_add_valid_or_report+0x120/0x164
    lr: c000000000f26ecc: __list_add_valid_or_report+0x11c/0x164
    sp: c000000241927cd0
   msr: 800000000282b033
  current = 0xc000000241215200
  paca    = 0xc0000003fffff300   irqmask: 0x03   irq_happened: 0x09
    pid   = 258, comm = kunit_try_catch
kernel BUG at lib/list_debug.c:29!
enter ? for help
 __percpu_counter_init_many+0x148/0x184
 ext4_es_register_shrinker+0x74/0x23c
 extents_kunit_init+0x100/0x308
 kunit_try_run_case+0x78/0x1f8
 kunit_generic_run_threadfn_adapter+0x40/0x70
 kthread+0x190/0x1a0
 start_kernel_thread+0x14/0x18
2:mon>

This happens because:

extents_kunit_init(test N):
  ext4_es_register_shrinker(sbi)
    percpu_counters_init() x 4; // this adds 4 list nodes to global percpu_counters list
      list_add(&fbc->list, &percpu_counters);
    shrinker_register();

extents_kunit_exit(test N):
  kfree(sbi);			// frees sbi w/o removing those 4 list nodes.
  				// So, those list node now becomes dangling pointers

extents_kunit_init(test N+1):
  kzalloc_obj(ext4_sb_info)	// allocator returns same page, but zeroed.
  ext4_es_register_shrinker(sbi)
    percpu_counters_init()
      list_add(&fbc->list, &percpu_counters);
        __list_add_valid(new, prev, next);
	next->prev != prev 		// list corruption bug detected, since next->prev = NULL

Fixes: 82f80e2e3b ("ext4: add extent status cache support to kunit tests")
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Link: https://patch.msgid.link/5bb9041471dab8ce870c191c19cbe4df57473be8.1772381213.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:32:23 -04:00
Li Chen
1aec30021e ext4: publish jinode after initialization
ext4_inode_attach_jinode() publishes ei->jinode to concurrent users.
It used to set ei->jinode before jbd2_journal_init_jbd_inode(),
allowing a reader to observe a non-NULL jinode with i_vfs_inode
still unset.

The fast commit flush path can then pass this jinode to
jbd2_wait_inode_data(), which dereferences i_vfs_inode->i_mapping and
may crash.

Below is the crash I observe:
```
BUG: unable to handle page fault for address: 000000010beb47f4
PGD 110e51067 P4D 110e51067 PUD 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 1 UID: 0 PID: 4850 Comm: fc_fsync_bench_ Not tainted 6.18.0-00764-g795a690c06a5 #1 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-2-2 04/01/2014
RIP: 0010:xas_find_marked+0x3d/0x2e0
Code: e0 03 48 83 f8 02 0f 84 f0 01 00 00 48 8b 47 08 48 89 c3 48 39 c6 0f 82 fd 01 00 00 48 85 c9 74 3d 48 83 f9 03 77 63 4c 8b 0f <49> 8b 71 08 48 c7 47 18 00 00 00 00 48 89 f1 83 e1 03 48 83 f9 02
RSP: 0018:ffffbbee806e7bf0 EFLAGS: 00010246
RAX: 000000000010beb4 RBX: 000000000010beb4 RCX: 0000000000000003
RDX: 0000000000000001 RSI: 0000002000300000 RDI: ffffbbee806e7c10
RBP: 0000000000000001 R08: 0000002000300000 R09: 000000010beb47ec
R10: ffff9ea494590090 R11: 0000000000000000 R12: 0000002000300000
R13: ffffbbee806e7c90 R14: ffff9ea494513788 R15: ffffbbee806e7c88
FS: 00007fc2f9e3e6c0(0000) GS:ffff9ea6b1444000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000010beb47f4 CR3: 0000000119ac5000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
<TASK>
filemap_get_folios_tag+0x87/0x2a0
__filemap_fdatawait_range+0x5f/0xd0
? srso_alias_return_thunk+0x5/0xfbef5
? __schedule+0x3e7/0x10c0
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
? srso_alias_return_thunk+0x5/0xfbef5
? cap_safe_nice+0x37/0x70
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
? srso_alias_return_thunk+0x5/0xfbef5
filemap_fdatawait_range_keep_errors+0x12/0x40
ext4_fc_commit+0x697/0x8b0
? ext4_file_write_iter+0x64b/0x950
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
? srso_alias_return_thunk+0x5/0xfbef5
? vfs_write+0x356/0x480
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
ext4_sync_file+0xf7/0x370
do_fsync+0x3b/0x80
? syscall_trace_enter+0x108/0x1d0
__x64_sys_fdatasync+0x16/0x20
do_syscall_64+0x62/0x2c0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
...
```

Fix this by initializing the jbd2_inode first.
Use smp_wmb() and WRITE_ONCE() to publish ei->jinode after
initialization. Readers use READ_ONCE() to fetch the pointer.

Fixes: a361293f5f ("jbd2: Fix oops in jbd2_journal_file_inode()")
Cc: stable@vger.kernel.org
Signed-off-by: Li Chen <me@linux.beauty>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260225082617.147957-1-me@linux.beauty
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:32:07 -04:00
Yuto Ohnuki
356227096e ext4: replace BUG_ON with proper error handling in ext4_read_inline_folio
Replace BUG_ON() with proper error handling when inline data size
exceeds PAGE_SIZE. This prevents kernel panic and allows the system to
continue running while properly reporting the filesystem corruption.

The error is logged via ext4_error_inode(), the buffer head is released
to prevent memory leak, and -EFSCORRUPTED is returned to indicate
filesystem corruption.

Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com>
Link: https://patch.msgid.link/20260223123345.14838-2-ytohnuki@amazon.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:31:52 -04:00
Jan Kara
1308255bbf ext4: fix fsync(2) for nojournal mode
When inode metadata is changed, we sometimes just call
ext4_mark_inode_dirty() to track modified metadata. This copies inode
metadata into block buffer which is enough when we are journalling
metadata. However when we are running in nojournal mode we currently
fail to write the dirtied inode buffer during fsync(2) because the inode
is not marked as dirty. Use explicit ext4_write_inode() call to make
sure the inode table buffer is written to the disk. This is a band aid
solution but proper solution requires a much larger rewrite including
changes in metadata bh tracking infrastructure.

Reported-by: Free Ekanayaka <free.ekanayaka@gmail.com>
Link: https://lore.kernel.org/all/87il8nhxdm.fsf@x1.mail-host-address-is-not-set/
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20260216164848.3074-4-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:31:43 -04:00
Jan Kara
bd060afa7c ext4: make recently_deleted() properly work with lazy itable initialization
recently_deleted() checks whether inode has been used in the near past.
However this can give false positive result when inode table is not
initialized yet and we are in fact comparing to random garbage (or stale
itable block of a filesystem before mkfs). Ultimately this results in
uninitialized inodes being skipped during inode allocation and possibly
they are never initialized and thus e2fsck complains.  Verify if the
inode has been initialized before checking for dtime.

Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20260216164848.3074-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:30:37 -04:00
Simon Weber
b1d682f199 ext4: fix journal credit check when setting fscrypt context
Fix an issue arising when ext4 features has_journal, ea_inode, and encrypt
are activated simultaneously, leading to ENOSPC when creating an encrypted
file.

Fix by passing XATTR_CREATE flag to xattr_set_handle function if a handle
is specified, i.e., when the function is called in the control flow of
creating a new inode. This aligns the number of jbd2 credits set_handle
checks for with the number allocated for creating a new inode.

ext4_set_context must not be called with a non-null handle (fs_data) if
fscrypt context xattr is not guaranteed to not exist yet. The only other
usage of this function currently is when handling the ioctl
FS_IOC_SET_ENCRYPTION_POLICY, which calls it with fs_data=NULL.

Fixes: c1a5d5f6ab ("ext4: improve journal credit handling in set xattr paths")

Co-developed-by: Anthony Durrer <anthonydev@fastmail.com>
Signed-off-by: Anthony Durrer <anthonydev@fastmail.com>
Signed-off-by: Simon Weber <simon.weber.39@gmail.com>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Link: https://patch.msgid.link/20260207100148.724275-4-simon.weber.39@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:30:25 -04:00
Deepanshu Kartikey
ed9356a30e ext4: convert inline data to extents when truncate exceeds inline size
Add a check in ext4_setattr() to convert files from inline data storage
to extent-based storage when truncate() grows the file size beyond the
inline capacity. This prevents the filesystem from entering an
inconsistent state where the inline data flag is set but the file size
exceeds what can be stored inline.

Without this fix, the following sequence causes a kernel BUG_ON():

1. Mount filesystem with inode that has inline flag set and small size
2. truncate(file, 50MB) - grows size but inline flag remains set
3. sendfile() attempts to write data
4. ext4_write_inline_data() hits BUG_ON(write_size > inline_capacity)

The crash occurs because ext4_write_inline_data() expects inline storage
to accommodate the write, but the actual inline capacity (~60 bytes for
i_block + ~96 bytes for xattrs) is far smaller than the file size and
write request.

The fix checks if the new size from setattr exceeds the inode's actual
inline capacity (EXT4_I(inode)->i_inline_size) and converts the file to
extent-based storage before proceeding with the size change.

This addresses the root cause by ensuring the inline data flag and file
size remain consistent during truncate operations.

Reported-by: syzbot+7de5fe447862fc37576f@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7de5fe447862fc37576f
Tested-by: syzbot+7de5fe447862fc37576f@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Link: https://patch.msgid.link/20260207043607.1175976-1-kartikey406@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:29:54 -04:00
Jan Kara
f4a2b42e78 ext4: fix stale xarray tags after writeback
There are cases where ext4_bio_write_page() gets called for a page which
has no buffers to submit. This happens e.g. when the part of the file is
actually a hole, when we cannot allocate blocks due to being called from
jbd2, or in data=journal mode when checkpointing writes the buffers
earlier. In these cases we just return from ext4_bio_write_page()
however if the page didn't need redirtying, we will leave stale DIRTY
and/or TOWRITE tags in xarray because those get cleared only in
__folio_start_writeback(). As a result we can leave these tags set in
mappings even after a final sync on filesystem that's getting remounted
read-only or that's being frozen. Various assertions can then get upset
when writeback is started on such filesystems (Gerald reported assertion
in ext4_journal_check_start() firing).

Fix the problem by cycling the page through writeback state even if we
decide nothing needs to be written for it so that xarray tags get
properly updated. This is slightly silly (we could update the xarray
tags directly) but I don't think a special helper messing with xarray
tags is really worth it in this relatively rare corner case.

Reported-by: Gerald Yang <gerald.yang@canonical.com>
Link: https://lore.kernel.org/all/20260128074515.2028982-1-gerald.yang@canonical.com
Fixes: dff4ac75ee ("ext4: move keep_towrite handling to ext4_bio_write_page()")
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260205092223.21287-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:29:39 -04:00
Zhang Yi
84e21e3fb8 ext4: do not check fast symlink during orphan recovery
Commit '5f920d5d6083 ("ext4: verify fast symlink length")' causes the
generic/475 test to fail during orphan cleanup of zero-length symlinks.

  generic/475  84s ... _check_generic_filesystem: filesystem on /dev/vde is inconsistent

The fsck reports are provided below:

  Deleted inode 9686 has zero dtime.
  Deleted inode 158230 has zero dtime.
  ...
  Inode bitmap differences:  -9686 -158230
  Orphan file (inode 12) block 13 is not clean.
  Failed to initialize orphan file.

In ext4_symlink(), a newly created symlink can be added to the orphan
list due to ENOSPC. Its data has not been initialized, and its size is
zero. Therefore, we need to disregard the length check of the symbolic
link when cleaning up orphan inodes. Instead, we should ensure that the
nlink count is zero.

Fixes: 5f920d5d60 ("ext4: verify fast symlink length")
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260131091156.1733648-1-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2026-03-27 23:14:25 -04:00
Theodore Ts'o
82c4f23d27 Update MAINTAINERS file to add reviewers for ext4
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
2026-03-27 23:14:20 -04:00
Linus Torvalds
be762d8b6d Merge tag 'hwmon-for-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:

 - PMBus driver fixes:
     - Add mutex protection for regulator operations
     - Fix reading from "write-only" attributes
     - Mark lowest/average/highest/rated attributes as read-only
     - isl68137: Add mutex protection for AVS enable sysfs attributes
     - ina233:  Fix error handling and sign extension when reading shunt voltage

 - adm1177: Fix sysfs ABI violation and current unit conversion

 - peci: Fix off-by-one in cputemp_is_visible(), and crit_hyst returning
   delta instead of absolute temperature

* tag 'hwmon-for-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (pmbus/core) Protect regulator operations with mutex
  hwmon: (pmbus) Introduce the concept of "write-only" attributes
  hwmon: (pmbus) Mark lowest/average/highest/rated attributes as read-only
  hwmon: (adm1177) fix sysfs ABI violation and current unit conversion
  hwmon: (peci/cputemp) Fix off-by-one in cputemp_is_visible()
  hwmon: (peci/cputemp) Fix crit_hyst returning delta instead of absolute temperature
  hwmon: (pmbus/isl68137) Add mutex protection for AVS enable sysfs attributes
  hwmon: (pmbus/ina233) Fix error handling and sign extension in shunt voltage read
2026-03-27 20:02:34 -07:00
Linus Torvalds
afb54c1404 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Driver (and enclosure) only fixes. Most are obvious. The big change is
  in the tcm_loop driver to add command draining to error handling (the
  lack of which was causing hangs with the potential for double use
  crashes)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: target: file: Use kzalloc_flex for aio_cmd
  scsi: scsi_transport_sas: Fix the maximum channel scanning issue
  scsi: target: tcm_loop: Drain commands in target_reset handler
  scsi: ibmvfc: Fix OOB access in ibmvfc_discover_targets_done()
  scsi: ses: Handle positive SCSI error from ses_recv_diag()
2026-03-27 19:58:22 -07:00
Linus Torvalds
26df51adf3 Merge tag 'drm-fixes-2026-03-28-1' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Weekly fixes, still a bit busy, but the usual suspects amdgpu and
  i915/xe have a bunch of small fixes, and otherwise it's just a few
  minor driver fixes.

  loognsoon:
   - update MAINTAINERS

  shmem:
   - fault handler fix

  syncobj:
   - fix GFP flags

  amdgpu:
   - DSC fix
   - Module parameter parsing fix
   - PASID reuse fix
   - drm_edid leak fix
   - SMU 13.x fixes
   - SMU 14.x fix
   - Fence fix in amdgpu_amdkfd_submit_ib()
   - LVDS fixes
   - GPU page fault fix for non-4K pages

  amdkfd:
   - Ordering fix in kfd_ioctl_create_process()

  i915/display:
   - DP tunnel error handling fix
   - Spurious GMBUS timeout fix
   - Unlink NV12 planes earlier
   - Order OP vs. timeout correctly in __wait_for()

  xe:
   - Fix UAF in SRIOV migration restore
   - Updates to HW W/a
   - VMBind remap fix

  ivpu:
   - poweroff fix

  mediatek:
   - fix register ordering"

* tag 'drm-fixes-2026-03-28-1' of https://gitlab.freedesktop.org/drm/kernel: (25 commits)
  MAINTAINERS: Update GPU driver maintainer information
  drm/xe: always keep track of remap prev/next
  drm/syncobj: Fix xa_alloc allocation flags
  drm/amd/display: Fix DCE LVDS handling
  drm/amdgpu: Handle GPU page faults correctly on non-4K page systems
  drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v14
  drm/amdkfd: Fix NULL pointer check order in kfd_ioctl_create_process
  drm/amd/display: check if ext_caps is valid in BL setup
  drm/amdgpu: Fix fence put before wait in amdgpu_amdkfd_submit_ib
  drm/xe: Implement recent spec updates to Wa_16025250150
  accel/ivpu: Add disable clock relinquish workaround for NVL-A0
  drm/i915/dp_tunnel: Fix error handling when clearing stream BW in atomic state
  drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v13
  drm/amd/pm: Return -EOPNOTSUPP for unsupported OD_MCLK on smu_v13_0_6
  drm/amd/pm: Skip redundant UCLK restore in smu_v13_0_6
  drm/amd/display: Fix drm_edid leak in amdgpu_dm
  drm/amdgpu: prevent immediate PASID reuse case
  drm/amdgpu: fix strsep() corrupting lockup_timeout on multi-GPU (v3)
  drm/amd/display: Do not skip unrelated mode changes in DSC validation
  drm/xe/pf: Fix use-after-free in migration restore
  ...
2026-03-27 17:21:37 -07:00
Vasily Gorbik
0738d395aa s390/entry: Scrub r12 register on kernel entry
Before commit f33f2d4c7c ("s390/bp: remove TIF_ISOLATE_BP"),
all entry handlers loaded r12 with the current task pointer
(lg %r12,__LC_CURRENT) for use by the BPENTER/BPEXIT macros. That
commit removed TIF_ISOLATE_BP, dropping both the branch prediction
macros and the r12 load, but did not add r12 to the register clearing
sequence.

Add the missing xgr %r12,%r12 to make the register scrub consistent
across all entry points.

Fixes: f33f2d4c7c ("s390/bp: remove TIF_ISOLATE_BP")
Cc: stable@kernel.org
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2026-03-28 00:43:39 +01:00
Greg Kroah-Hartman
48b8814e25 s390/syscalls: Add spectre boundary for syscall dispatch table
The s390 syscall number is directly controlled by userspace, but does
not have an array_index_nospec() boundary to prevent access past the
syscall function pointer tables.

Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Fixes: 56e62a7370 ("s390: convert to generic entry")
Cc: stable@kernel.org
Assisted-by: gkh_clanker_2000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/2026032404-sterling-swoosh-43e6@gregkh
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2026-03-28 00:43:39 +01:00
Vasily Gorbik
c5c0a268b3 s390/barrier: Make array_index_mask_nospec() __always_inline
Mark array_index_mask_nospec() as __always_inline to guarantee the
mitigation is emitted inline regardless of compiler inlining decisions.

Fixes: e2dd833389 ("s390: add optimized array_index_mask_nospec")
Cc: stable@kernel.org
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2026-03-28 00:43:24 +01:00
Linus Torvalds
335c9017e3 Merge tag 'spi-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "There are two core fixes here. One is from Johan dealing with an issue
  introduced by a devm_ API usage update causing things to be freed
  earlier than they had earlier when we fail to register a device,
  another from Danilo avoids unlocked acccess to data by converting to
  use a driver core API.

  We also have a few relatively minor driver specific fixes"

* tag 'spi-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spi-fsl-lpspi: fix teardown order issue (UAF)
  spi: fix use-after-free on managed registration failure
  spi: use generic driver_override infrastructure
  spi: meson-spicc: Fix double-put in remove path
  spi: sn-f-ospi: Use devm_mutex_init() to simplify code
  spi: sn-f-ospi: Fix resource leak in f_ospi_probe()
2026-03-27 16:38:55 -07:00
Linus Torvalds
cd0bbd5a66 Merge tag 'regulator-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
 "A fix from Alice for the rust bindings, they didn't handle the stub
  implementation of the C API used when CONFIG_REGULATOR is disabled
  leading to undefined behaviour"

* tag 'regulator-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  rust: regulator: do not assume that regulator_get() returns non-null
2026-03-27 16:36:23 -07:00
Linus Torvalds
30052002e6 Merge tag 'regmap-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
 "A fix from Andy Shevchenko for an issue with caching of page selector
  registers which are located inside the page they are switching"

* tag 'regmap-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: Synchronize cache for the page selector
2026-03-27 16:34:25 -07:00
Linus Torvalds
dd09eb4433 Merge tag 'tsm-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm
Pull tsm fix from Dan Williams:

 - Fix a VMM controlled buffer length used to emit TDX attestation
   reports

* tag 'tsm-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm:
  virt: tdx-guest: Fix handling of host controlled 'quote' buffer length
2026-03-27 16:19:51 -07:00
Linus Torvalds
faf44e54f6 Merge tag 'vfio-v7.0-rc6' of https://github.com/awilliam/linux-vfio
Pull VFIO fix from Alex Williamson:

 - Fix double-free and reference count underflow if dma-buf file
   allocation fails (Alex Williamson)

* tag 'vfio-v7.0-rc6' of https://github.com/awilliam/linux-vfio:
  vfio/pci: Fix double free in dma-buf feature
2026-03-27 15:59:30 -07:00
Linus Torvalds
56bea42415 Merge tag 'efi-fixes-for-v7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fix from Ard Biesheuvel:
 "Fix a potential buffer overrun issue introduced by the previous fix
  for EFI boot services region reservations on x86"

* tag 'efi-fixes-for-v7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  x86/efi: efi_unmap_boot_services: fix calculation of ranges_to_free size
2026-03-27 15:55:25 -07:00
Linus Torvalds
a361474ba3 Merge tag 'loongarch-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
 "Fix missing NULL checks for kstrdup(), workaround LS2K/LS7A GPU
  DMA hang bug, emit GNU_EH_FRAME for vDSO correctly, and fix some
  KVM-related bugs"

* tag 'loongarch-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: KVM: Fix base address calculation in kvm_eiointc_regs_access()
  LoongArch: KVM: Handle the case that EIOINTC's coremap is empty
  LoongArch: KVM: Make kvm_get_vcpu_by_cpuid() more robust
  LoongArch: vDSO: Emit GNU_EH_FRAME correctly
  LoongArch: Workaround LS2K/LS7A GPU DMA hang bug
  LoongArch: Fix missing NULL checks for kstrdup()
2026-03-27 15:39:41 -07:00
Linus Torvalds
196ef74abd Merge tag 'io_uring-7.0-20260327' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
 "Just two small fixes, both fixing regressions added in the fdinfo code
  in 6.19 with the SQE mixed size support"

* tag 'io_uring-7.0-20260327' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/fdinfo: fix OOB read in SQE_MIXED wrap check
  io_uring/fdinfo: fix SQE_MIXED SQE displaying
2026-03-27 15:35:38 -07:00
Dave Airlie
5ba61d8a25 Merge tag 'mediatek-drm-fixes-20260323' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes
Mediatek DRM Fixes - 20260323

1. dsi: Store driver data before invoking mipi_dsi_host_register

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patch.msgid.link/20260323160135.39609-1-chunkuang.hu@kernel.org
2026-03-28 08:05:36 +10:00
Sean Christopherson
df83746075 KVM: x86/mmu: Only WARN in direct MMUs when overwriting shadow-present SPTE
Adjust KVM's sanity check against overwriting a shadow-present SPTE with a
another SPTE with a different target PFN to only apply to direct MMUs,
i.e. only to MMUs without shadowed gPTEs.  While it's impossible for KVM
to overwrite a shadow-present SPTE in response to a guest write, writes
from outside the scope of KVM, e.g. from host userspace, aren't detected
by KVM's write tracking and so can break KVM's shadow paging rules.

  ------------[ cut here ]------------
  pfn != spte_to_pfn(*sptep)
  WARNING: arch/x86/kvm/mmu/mmu.c:3069 at mmu_set_spte+0x1e4/0x440 [kvm], CPU#0: vmx_ept_stale_r/872
  Modules linked in: kvm_intel kvm irqbypass
  CPU: 0 UID: 1000 PID: 872 Comm: vmx_ept_stale_r Not tainted 7.0.0-rc2-eafebd2d2ab0-sink-vm #319 PREEMPT
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:mmu_set_spte+0x1e4/0x440 [kvm]
  Call Trace:
   <TASK>
   ept_page_fault+0x535/0x7f0 [kvm]
   kvm_mmu_do_page_fault+0xee/0x1f0 [kvm]
   kvm_mmu_page_fault+0x8d/0x620 [kvm]
   vmx_handle_exit+0x18c/0x5a0 [kvm_intel]
   kvm_arch_vcpu_ioctl_run+0xc55/0x1c20 [kvm]
   kvm_vcpu_ioctl+0x2d5/0x980 [kvm]
   __x64_sys_ioctl+0x8a/0xd0
   do_syscall_64+0xb5/0x730
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
   </TASK>
  ---[ end trace 0000000000000000 ]---

Fixes: 11d4517511 ("KVM: x86/mmu: Warn if PFN changes on shadow-present SPTE in shadow MMU")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-27 22:33:33 +01:00
Sean Christopherson
aad885e774 KVM: x86/mmu: Drop/zap existing present SPTE even when creating an MMIO SPTE
When installing an emulated MMIO SPTE, do so *after* dropping/zapping the
existing SPTE (if it's shadow-present).  While commit a54aa15c6b was
right about it being impossible to convert a shadow-present SPTE to an
MMIO SPTE due to a _guest_ write, it failed to account for writes to guest
memory that are outside the scope of KVM.

E.g. if host userspace modifies a shadowed gPTE to switch from a memslot
to emulted MMIO and then the guest hits a relevant page fault, KVM will
install the MMIO SPTE without first zapping the shadow-present SPTE.

  ------------[ cut here ]------------
  is_shadow_present_pte(*sptep)
  WARNING: arch/x86/kvm/mmu/mmu.c:484 at mark_mmio_spte+0xb2/0xc0 [kvm], CPU#0: vmx_ept_stale_r/4292
  Modules linked in: kvm_intel kvm irqbypass
  CPU: 0 UID: 1000 PID: 4292 Comm: vmx_ept_stale_r Not tainted 7.0.0-rc2-eafebd2d2ab0-sink-vm #319 PREEMPT
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:mark_mmio_spte+0xb2/0xc0 [kvm]
  Call Trace:
   <TASK>
   mmu_set_spte+0x237/0x440 [kvm]
   ept_page_fault+0x535/0x7f0 [kvm]
   kvm_mmu_do_page_fault+0xee/0x1f0 [kvm]
   kvm_mmu_page_fault+0x8d/0x620 [kvm]
   vmx_handle_exit+0x18c/0x5a0 [kvm_intel]
   kvm_arch_vcpu_ioctl_run+0xc55/0x1c20 [kvm]
   kvm_vcpu_ioctl+0x2d5/0x980 [kvm]
   __x64_sys_ioctl+0x8a/0xd0
   do_syscall_64+0xb5/0x730
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
  RIP: 0033:0x47fa3f
   </TASK>
  ---[ end trace 0000000000000000 ]---

Reported-by: Alexander Bulekov <bkov@amazon.com>
Debugged-by: Alexander Bulekov <bkov@amazon.com>
Suggested-by: Fred Griffoul <fgriffo@amazon.co.uk>
Fixes: a54aa15c6b ("KVM: x86/mmu: Handle MMIO SPTEs directly in mmu_set_spte()")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-27 22:33:33 +01:00
Paolo Bonzini
6c6ba54895 Merge tag 'kvm-s390-master-7.0-2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: More memory management fixes

Lots of small and not-so-small fixes for the newly rewritten gmap,
mostly affecting the handling of nested guests.
2026-03-27 22:30:44 +01:00
Eric Biggers
e5046823f8 lib/crypto: chacha: Zeroize permuted_state before it leaves scope
Since the ChaCha permutation is invertible, the local variable
'permuted_state' is sufficient to compute the original 'state', and thus
the key, even after the permutation has been done.

While the kernel is quite inconsistent about zeroizing secrets on the
stack (and some prominent userspace crypto libraries don't bother at all
since it's not guaranteed to work anyway), the kernel does try to do it
as a best practice, especially in cases involving the RNG.

Thus, explicitly zeroize 'permuted_state' before it goes out of scope.

Fixes: c08d0e6473 ("crypto: chacha20 - Add a generic ChaCha20 stream cipher implementation")
Cc: stable@vger.kernel.org
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260326032920.39408-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-27 13:35:35 -07:00
Linus Torvalds
7df48e3631 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:

 - Quite a few irdma bug fixes, several user triggerable

 - Fix a 0 SMAC header in ionic

 - Tolerate FW errors for RAAS in bng_re

 - Don't UAF in efa when printing error events

 - Better handle pool exhaustion in the new bvec paths

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/irdma: Harden depth calculation functions
  RDMA/irdma: Return EINVAL for invalid arp index error
  RDMA/irdma: Fix deadlock during netdev reset with active connections
  RDMA/irdma: Remove reset check from irdma_modify_qp_to_err()
  RDMA/irdma: Clean up unnecessary dereference of event->cm_node
  RDMA/irdma: Remove a NOP wait_event() in irdma_modify_qp_roce()
  RDMA/irdma: Update ibqp state to error if QP is already in error state
  RDMA/irdma: Initialize free_qp completion before using it
  RDMA/efa: Fix possible deadlock
  RDMA/rw: Fix MR pool exhaustion in bvec RDMA READ path
  RDMA/rw: Fall back to direct SGE on MR pool exhaustion
  RDMA/efa: Fix use of completion ctx after free
  RDMA/bng_re: Fix silent failure in HWRM version query
  RDMA/ionic: Preserve and set Ethernet source MAC after ib_ud_header_init()
  RDMA/irdma: Fix double free related to rereg_user_mr
2026-03-27 13:30:04 -07:00
Linus Torvalds
8af4fad545 Merge tag 'pci-v7.0-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:

 - Remove power-off from pwrctrl drivers since this is now done directly
   by the PCI controller drivers (Chen-Yu Tsai)

 - Fix pwrctrl device node leak (Felix Gu)

 - Document a TLP header decoder for AER log messages (Lukas Wunner)

* tag 'pci-v7.0-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  Documentation: PCI: Document PCIe TLP Header decoder for AER messages
  PCI/pwrctrl: Fix pci_pwrctrl_is_required() device node leak
  PCI/pwrctrl: Do not power off on pwrctrl device removal
2026-03-27 13:25:58 -07:00
Linus Torvalds
83ce1c753f Merge tag 'sound-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "This became slightly big partly due to my time off in the last week.
  But all changes are about device-specific fixes, so it should be
  safely applicable.

  ASoC:
   - Fix double free in sma1307
   - Fix uninitialized variables in simple-card-utils/imx-card
   - Address clock leaks and error propagation in ADAU1372
   - Add DMI quirks and ACP/SDW support for ASUS
   - Fix Intel CATPT DMA mask
   - Fix SOF topology parsing
   - Fix DT bindings for RK3576 SPDIF, STM32 SAI and WCD934x

  HD-audio:
   - Quirks for Lenovo, ASUS, and various HP models, as well as
     a speaker pop fix on Star Labs StarFighter
   - Revert MSI X870E Tomahawk denylist again

  USB-Audio:
   - Fix distorted audio on Focusrite Scarlett 2i2/2i4 1st Gen
   - Add iface reset quirk for AB17X
   - Update Qualcomm USB audio Kconfig dependencies and license

  Misc:
   - Fix minor compile warnings for firewire and asihpi drivers"

* tag 'sound-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (35 commits)
  Revert "ALSA: hda/intel: Add MSI X870E Tomahawk to denylist"
  ALSA: usb-audio: Add iface reset and delay quirk for AB17X USB Audio
  ALSA: hda/realtek: add HP Laptop 15-fd0xxx mute LED quirk
  ALSA: usb-audio: Exclude Scarlett 2i4 1st Gen from SKIP_IFACE_SETUP
  ALSA: hda/realtek: Add mute LED quirk for HP Pavilion 15-eg0xxx
  ALSA: hda/realtek - Fixed Speaker Mute LED for HP EliteBoard G1a platform
  ASoC: SOF: ipc4-topology: Allow bytes controls without initial payload
  ASoC: adau1372: Fix clock leak on PLL lock failure
  ASoC: adau1372: Fix unchecked clk_prepare_enable() return value
  ASoC: SDCA: fix finding wrong entity
  ASoC: SDCA: remove the max count of initialization table
  ASoC: codecs: wcd934x: fix typo in dt parsing
  ASoC: dt-bindings: stm32: Fix incorrect compatible string in stm32h7-sai match
  ASoC: Intel: catpt: Fix the device initialization
  ASoC: amd: acp: add ASUS HN7306EA quirk for legacy SDW machine
  ASoC: SOF: topology: reject invalid vendor array size in token parser
  ASoC: tas2781: Add null check for calibration data
  ALSA: asihpi: avoid write overflow check warning
  ASoC: fsl: imx-card: initialize playback_only and capture_only
  ASoC: simple-card-utils: Check value of is_playback_only and is_capture_only
  ...
2026-03-27 13:16:40 -07:00
Linus Torvalds
f44c65111e Merge tag 'media/v7.0-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:

 - uvcvideo may cause OOPS when out of memory

 - remove a deadlock in the ccs driver

* tag 'media/v7.0-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: ccs: Avoid deadlock in ccs_init_state()
  media: uvcvideo: Fix bug in error path of uvc_alloc_urb_buffers
2026-03-27 13:10:49 -07:00
Linus Torvalds
0b8bf3b64e Merge tag 'sysctl-7.00-fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl
Pull sysctl fix from Joel Granados:
 "Fix uninitialized variable error when writing to a sysctl bitmap

  Removed the possibility of returning an unjustified -EINVAL when
  writing to a sysctl bitmap"

* tag 'sysctl-7.00-fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  sysctl: fix uninitialized variable in proc_do_large_bitmap
2026-03-27 13:04:34 -07:00
Linus Torvalds
3577cfd738 Merge tag 'xfs-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
 "This includes a few important bug fixes, and some code refactoring
  that was necessary for one of the fixes"

* tag 'xfs-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: remove file_path tracepoint data
  xfs: don't irele after failing to iget in xfs_attri_recover_work
  xfs: remove redundant validation in xlog_recover_attri_commit_pass2
  xfs: fix ri_total validation in xlog_recover_attri_commit_pass2
  xfs: close crash window in attr dabtree inactivation
  xfs: factor out xfs_attr3_leaf_init
  xfs: factor out xfs_attr3_node_entry_remove
  xfs: only assert new size for datafork during truncate extents
  xfs: annotate struct xfs_attr_list_context with __counted_by_ptr
  xfs: cleanup buftarg handling in XFS_IOC_VERIFY_MEDIA
  xfs: scrub: unlock dquot before early return in quota scrub
  xfs: refactor xfsaild_push loop into helper
  xfs: save ailp before dropping the AIL lock in push callbacks
  xfs: avoid dereferencing log items after push callbacks
  xfs: stop reclaim before pushing AIL during unmount
2026-03-27 12:22:45 -07:00
Luo Haiyang
1f98857322 tracing: Fix potential deadlock in cpu hotplug with osnoise
The following sequence may leads deadlock in cpu hotplug:

    task1        task2        task3
    -----        -----        -----

 mutex_lock(&interface_lock)

            [CPU GOING OFFLINE]

            cpus_write_lock();
            osnoise_cpu_die();
              kthread_stop(task3);
                wait_for_completion();

                      osnoise_sleep();
                        mutex_lock(&interface_lock);

 cpus_read_lock();

 [DEAD LOCK]

Fix by swap the order of cpus_read_lock() and mutex_lock(&interface_lock).

Cc: stable@vger.kernel.org
Cc: <mathieu.desnoyers@efficios.com>
Cc: <zhang.run@zte.com.cn>
Cc: <yang.tao172@zte.com.cn>
Cc: <ran.xiaokai@zte.com.cn>
Fixes: bce29ac9ce ("trace: Add osnoise tracer")
Link: https://patch.msgid.link/20260326141953414bVSj33dAYktqp9Oiyizq8@zte.com.cn
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Luo Haiyang <luo.haiyang@zte.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-27 15:18:06 -04:00
Linus Torvalds
34892992d0 Merge tag 'v7.0-rc5-ksmbd-srv-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:

 - Fix out of bounds write

 - Fix for better calculating max output buffers

 - Fix memory leaks in SMB2/SMB3 lock

 - Fix use after free

 - Multichannel fix

* tag 'v7.0-rc5-ksmbd-srv-fixes' of git://git.samba.org/ksmbd:
  ksmbd: fix potencial OOB in get_file_all_info() for compound requests
  ksmbd: replace hardcoded hdr2_len with offsetof() in smb2_calc_max_out_buf_len()
  ksmbd: fix memory leaks and NULL deref in smb2_lock()
  ksmbd: fix use-after-free and NULL deref in smb_grant_oplock()
  ksmbd: do not expire session on binding failure
2026-03-27 12:03:39 -07:00
Jan Kara
102e57d56f udf: Fix race between file type conversion and writeback
udf_setsize() can race with udf_writepages() as follows:

udf_setsize()			udf_writepages()
				  if (iinfo->i_alloc_type ==
						ICBTAG_FLAG_AD_IN_ICB)
  err = udf_expand_file_adinicb(inode);
  err = udf_extend_file(inode, newsize);
				    udf_adinicb_writepages()
				      memcpy_from_file_folio() - crash
					because inode size is too big.

Fix the problem by checking the file type under folio lock in
udf_handle_page_wb() handler called from __mpage_writepages() which
properly serializes with udf_expand_file_adinicb().

Reported-by: Jianzhou Zhao <luckd0g@163.com>
Link: https://lore.kernel.org/all/f622c01.67ac.19cdbdd777d.Coremail.luckd0g@163.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260326140635.15895-4-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
2026-03-27 17:01:40 +01:00
Jan Kara
fffca572f9 mpage: Provide variant of mpage_writepages() with own optional folio handler
Some filesystems need to treat some folios specially (for example for
inodes with inline data). Doing the handling in their .writepages method
in a race-free manner results in duplicating some of the writeback
internals. So provide generalized version of mpage_writepages() that
allows filesystem to provide a handler called for each folio which can
handle the folio in a special way.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260326140635.15895-3-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
2026-03-27 17:01:36 +01:00
Zhang Heng
73ff3916d8 ALSA: hda/realtek: change quirk for HP OmniBook 7 Laptop 16-bh0xxx
HP OmniBook 7 Laptop 16-bh0xxx has the same PCI subsystem ID 0x103c8e60,
and the ALC245 on it needs this quirk to control the mute LED.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=221214
Cc: <stable@vger.kernel.org>
Tested-by: Artem S. Tashkinov <aros@gmx.com>
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/20260327101215.481108-1-zhangheng@kylinos.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-27 16:27:22 +01:00
Wolfram Sang
4c10830fda Merge tag 'i2c-host-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
i2c-fixes for v7.0-rc6

designware: fix resume-probe race causing NULL-deref in amdisp
imx: fix timeout on repeated reads and extra clock at end
2026-03-27 16:20:24 +01:00
Pratap Nirujogi
e2f1ada8e0 i2c: designware: amdisp: Fix resume-probe race condition issue
Identified resume-probe race condition in kernel v7.0 with the commit
38fa29b01a ("i2c: designware: Combine the init functions"),but this
issue existed from the beginning though not detected.

The amdisp i2c device requires ISP to be in power-on state for probe
to succeed. To meet this requirement, this device is added to genpd
to control ISP power using runtime PM. The pm_runtime_get_sync() called
before i2c_dw_probe() triggers PM resume, which powers on ISP and also
invokes the amdisp i2c runtime resume before the probe completes resulting
in this race condition and a NULL dereferencing issue in v7.0

Fix this race condition by using the genpd APIs directly during probe:
  - Call dev_pm_genpd_resume() to Power ON ISP before probe
  - Call dev_pm_genpd_suspend() to Power OFF ISP after probe
  - Set the device to suspended state with pm_runtime_set_suspended()
  - Enable runtime PM only after the device is fully initialized

Fixes: d6263c468a ("i2c: amd-isp: Add ISP i2c-designware driver")
Co-developed-by: Bin Du <bin.du@amd.com>
Signed-off-by: Bin Du <bin.du@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Cc: <stable@vger.kernel.org> # v6.16+
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260320201302.3490570-1-pratap.nirujogi@amd.com
2026-03-27 13:51:21 +01:00
Stefan Eichenberger
13101db735 i2c: imx: ensure no clock is generated after last read
When reading from the I2DR register, right after releasing the bus by
clearing MSTA and MTX, the I2C controller might still generate an
additional clock cycle which can cause devices to misbehave. Ensure to
only read from I2DR after the bus is not busy anymore. Because this
requires polling, the read of the last byte is moved outside of the
interrupt handler.

An example for such a failing transfer is this:
i2ctransfer -y -a 0 w1@0x00 0x02 r1
Error: Sending messages failed: Connection timed out
It does not happen with every device because not all devices react to
the additional clock cycle.

Fixes: 5f5c2d4579 ("i2c: imx: prevent rescheduling in non dma mode")
Cc: stable@vger.kernel.org # v6.13+
Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260218150940.131354-3-eichest@gmail.com
2026-03-27 13:51:21 +01:00
Stefan Eichenberger
f88e2e748a i2c: imx: fix i2c issue when reading multiple messages
When reading multiple messages, meaning a repeated start is required,
polling the bus busy bit must be avoided. This must only be done for
the last message. Otherwise, the driver will timeout.

Here an example of such a sequence that fails with an error:
i2ctransfer -y -a 0 w1@0x00 0x02 r1 w1@0x00 0x02 r1
Error: Sending messages failed: Connection timed out

Fixes: 5f5c2d4579 ("i2c: imx: prevent rescheduling in non dma mode")
Cc: stable@vger.kernel.org # v6.13+
Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260218150940.131354-2-eichest@gmail.com
2026-03-27 13:51:20 +01:00
Fei Lv
1f6ee9be92 ovl: make fsync after metadata copy-up opt-in mount option
Commit 7d6899fb69 ("ovl: fsync after metadata copy-up") was done to
fix durability of overlayfs copy up on an upper filesystem which does
not enforce ordering on storing of metadata changes (e.g. ubifs).

In an earlier revision of the regressing commit by Lei Lv, the metadata
fsync behavior was opt-in via a new "fsync=strict" mount option.
We were hoping that the opt-in mount option could be avoided, so the
change was only made to depend on metacopy=off, in the hope of not
hurting performance of metadata heavy workloads, which are more likely
to be using metacopy=on.

This hope was proven wrong by a performance regression report from Google
COS workload after upgrade to kernel 6.12.

This is an adaptation of Lei's original "fsync=strict" mount option
to the existing upstream code.

The new mount option is mutually exclusive with the "volatile" mount
option, so the latter is now an alias to the "fsync=volatile" mount
option.

Reported-by: Chenglong Tang <chenglongtang@google.com>
Closes: https://lore.kernel.org/linux-unionfs/CAOdxtTadAFH01Vui1FvWfcmQ8jH1O45owTzUcpYbNvBxnLeM7Q@mail.gmail.com/
Link: https://lore.kernel.org/linux-unionfs/CAOQ4uxgKC1SgjMWre=fUb00v8rxtd6sQi-S+dxR8oDzAuiGu8g@mail.gmail.com/
Fixes: 7d6899fb69 ("ovl: fsync after metadata copy-up")
Depends: 50e638beb6 ("ovl: Use str_on_off() helper in ovl_show_options()")
Cc: stable@vger.kernel.org # v6.12+
Signed-off-by: Fei Lv <feilv@asrmicro.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2026-03-27 12:48:10 +01:00
Mario Limonciello
ed4da361bf Revert "ALSA: hda/intel: Add MSI X870E Tomahawk to denylist"
commit 30b3211aa2 ("ALSA: hda/intel: Add MSI X870E Tomahawk
to denylist") was added to silence a warning, but this effectively
reintroduced commit df42ee7e22 ("ALSA: hda: Add ASRock
X670E Taichi to denylist") which was already reported to cause
problems and reverted in commit ee8f161359 ("Revert "ALSA: hda:
Add ASRock X670E Taichi to denylist"")

Revert it yet again.

Cc: stable@vger.kernel.org
Reported-by: Juhyun Song <juju6985@outlook.kr>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221274
Cc: Stuart Hayhurst <stuart.a.hayhurst@gmail.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20260326190542.524515-1-mario.limonciello@amd.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-27 10:54:04 +01:00
Lianqin Hu
ee6c551a7d ALSA: usb-audio: Add iface reset and delay quirk for AB17X USB Audio
Setting up the interface when suspended/resumeing fail on this card.
Adding a reset and delay quirk will eliminate this problem.

usb 1-1: new full-speed USB device number 2 using xhci-hcd
usb 1-1: New USB device found, idVendor=001f, idProduct=0b23
usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 1-1: Product: AB17X USB Audio
usb 1-1: Manufacturer: Generic
usb 1-1: SerialNumber: 20241228172028

Signed-off-by: Lianqin Hu <hulianqin@vivo.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/PUZPR06MB6224CA59AD2B26054120B276D249A@PUZPR06MB6224.apcprd06.prod.outlook.com
2026-03-27 10:50:57 +01:00
Kshamendra Kumar Mishra
faceb5cf5d ALSA: hda/realtek: add HP Laptop 15-fd0xxx mute LED quirk
HP Laptop 15-fd0xxx with ALC236 codec does not handle the toggling of
the mute LED.
This patch adds a quirk entry for subsystem ID 0x8dd7 using
ALC236_FIXUP_HP_MUTE_LED_COEFBIT2 fixup, enabling correct mute LED
behavior.

Signed-off-by: Kshamendra Kumar Mishra <kshamendrakumarmishra@gmail.com>
Link: https://patch.msgid.link/DHAB51ISUM96.2K9SZIABIDEQ0@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-27 10:43:56 +01:00
Geoffrey D. Bennett
990a8b0732 ALSA: usb-audio: Exclude Scarlett 2i4 1st Gen from SKIP_IFACE_SETUP
Same issue that the Scarlett 2i2 1st Gen had:
QUIRK_FLAG_SKIP_IFACE_SETUP causes distorted/flanging audio on the
Scarlett 2i4 1st Gen (1235:800a).

Fixes: 38c322068a ("ALSA: usb-audio: Add QUIRK_FLAG_SKIP_IFACE_SETUP")
Reported-by: dcferreira [https://github.com/geoffreybennett/linux-fcp/issues/54]
Signed-off-by: Geoffrey D. Bennett <g@b4.vu>
Link: https://patch.msgid.link/acEkEbftzyNe8W7C@m.b4.vu
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-27 10:43:14 +01:00
César Montoya
2f388b4e8f ALSA: hda/realtek: Add mute LED quirk for HP Pavilion 15-eg0xxx
The HP Pavilion 15-eg0xxx with subsystem ID 0x103c87cb uses a Realtek
ALC287 codec with a mute LED wired to GPIO pin 4 (mask 0x10). The
existing ALC287_FIXUP_HP_GPIO_LED fixup already handles this correctly,
but the subsystem ID was missing from the quirk table.

GPIO pin confirmed via manual hda-verb testing:
  hda-verb SET_GPIO_MASK 0x10
  hda-verb SET_GPIO_DIRECTION 0x10
  hda-verb SET_GPIO_DATA 0x10

Signed-off-by: César Montoya <sprit152009@gmail.com>
Link: https://patch.msgid.link/20260321153603.12771-1-sprit152009@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-27 10:31:47 +01:00
Kailang Yang
d3be95efc6 ALSA: hda/realtek - Fixed Speaker Mute LED for HP EliteBoard G1a platform
On the HP EliteBoard G1a platform (models without a headphone jack).
the speaker mute LED failed to function. The Sysfs ctl-led info showed
empty values because the standard LED registration couldn't correctly
bind to the master switch.
Adding this patch will fix and enable the speaker mute LED feature.

Tested-by: Chris Chiu <chris.chiu@canonical.com>
Signed-off-by: Kailang Yang <kailang@realtek.com>
Link: https://lore.kernel.org/279e929e884849df84687dbd67f20037@realtek.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-27 10:27:58 +01:00
Takashi Iwai
50c8f83c41 Merge tag 'asoc-fix-v7.0-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v7.0

This is two week's worth of fixes and quirks so it's a bit larger than
you might expect, there's nothing too exciting individually and nothing
in core code.
2026-03-27 10:16:52 +01:00
Guangshuo Li
7f138de156 auxdisplay: line-display: fix NULL dereference in linedisp_release
linedisp_release() currently retrieves the enclosing struct linedisp via
to_linedisp(). That lookup depends on the attachment list, but the
attachment may already have been removed before put_device() invokes the
release callback. This can happen in linedisp_unregister(), and can also
be reached from some linedisp_register() error paths.

In that case, to_linedisp() returns NULL and linedisp_release()
dereferences it while freeing the display resources.

The struct device released here is the embedded linedisp->dev used by
linedisp_register(), so retrieve the enclosing object directly with
container_of() instead.

Fixes: 66c9380948 ("auxdisplay: linedisp: encapsulate container_of usage within to_linedisp")
Cc: stable@vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2026-03-27 09:54:31 +01:00
Sherry Yang
8b72aa5704 iommupt/amdv1: mark amdv1pt_install_leaf_entry as __always_inline
After enabling CONFIG_GCOV_KERNEL and CONFIG_GCOV_PROFILE_ALL, following
build failure is observed under GCC 14.2.1:

In function 'amdv1pt_install_leaf_entry',
    inlined from '__do_map_single_page' at drivers/iommu/generic_pt/fmt/../iommu_pt.h:650:3,
    inlined from '__map_single_page0' at drivers/iommu/generic_pt/fmt/../iommu_pt.h:661:1,
    inlined from 'pt_descend' at drivers/iommu/generic_pt/fmt/../pt_iter.h:391:9,
    inlined from '__do_map_single_page' at drivers/iommu/generic_pt/fmt/../iommu_pt.h:657:10,
    inlined from '__map_single_page1.constprop' at drivers/iommu/generic_pt/fmt/../iommu_pt.h:661:1:
././include/linux/compiler_types.h:706:45: error: call to '__compiletime_assert_71' declared with attribute error: FIELD_PREP: value too large for the field
  706 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
      |

......

drivers/iommu/generic_pt/fmt/amdv1.h:220:26: note: in expansion of macro 'FIELD_PREP'
  220 |                          FIELD_PREP(AMDV1PT_FMT_OA,
      |                          ^~~~~~~~~~

In the path '__do_map_single_page()', level 0 always invokes
'pt_install_leaf_entry(&pts, map->oa, PAGE_SHIFT, …)'. At runtime that
lands in the 'if (oasz_lg2 == isz_lg2)' arm of 'amdv1pt_install_leaf_entry()';
the contiguous-only 'else' block is unreachable for 4 KiB pages.

With CONFIG_GCOV_KERNEL + CONFIG_GCOV_PROFILE_ALL, the extra
instrumentation changes GCC's inlining so that the "dead" 'else' branch
still gets instantiated. The compiler constant-folds the contiguous OA
expression, runs the 'FIELD_PREP()' compile-time check, and produces:

    FIELD_PREP: value too large for the field

gcov-enabled builds therefore fail even though the code path never executes.

Fix this by marking amdv1pt_install_leaf_entry as __always_inline.

Fixes: dcd6a011a8 ("iommupt: Add map_pages op")
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sherry Yang <sherry.yang@oracle.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-03-27 09:41:48 +01:00
Jason Gunthorpe
ee6e69d032 iommupt: Fix short gather if the unmap goes into a large mapping
unmap has the odd behavior that it can unmap more than requested if the
ending point lands within the middle of a large or contiguous IOPTE.

In this case the gather should flush everything unmapped which can be
larger than what was requested to be unmapped. The gather was only
flushing the range requested to be unmapped, not extending to the extra
range, resulting in a short invalidation if the caller hits this special
condition.

This was found by the new invalidation/gather test I am adding in
preparation for ARMv8. Claude deduced the root cause.

As far as I remember nothing relies on unmapping a large entry, so this is
likely not a triggerable bug.

Cc: stable@vger.kernel.org
Fixes: 7c53f4238a ("iommupt: Add unmap_pages op")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-03-27 09:07:13 +01:00
Jason Gunthorpe
90c5def10b iommu: Do not call drivers for empty gathers
An empty gather is coded with start=U64_MAX, end=0 and several drivers go
on to convert that to a size with:

 end - start + 1

Which gives 2 for an empty gather. This then causes Weird Stuff to
happen (for example an UBSAN splat in VT-d) that is hopefully harmless,
but maybe not.

Prevent drivers from being called right in iommu_iotlb_sync().

Auditing shows that AMD, Intel, Mediatek and RSIC-V drivers all do things
on these empty gathers.

Further, there are several callers that can trigger empty gathers,
especially in unusual conditions. For example iommu_map_nosync() will call
a 0 size unmap on some error paths. Also in VFIO, iommupt and other
places.

Cc: stable@vger.kernel.org
Reported-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Closes: https://lore.kernel.org/r/11145826.aFP6jjVeTY@jkrzyszt-mobl2.ger.corp.intel.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-03-27 09:07:13 +01:00
Dave Airlie
83318d0c1f Merge tag 'drm-xe-fixes-2026-03-26' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
- Fix UAF in SRIOV migration restore (Winiarski)
- Updates to HW W/a (Roper)
- VMBind remap fix (Auld)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/acUgq2q2DrCUzFql@intel.com
2026-03-27 17:48:48 +10:00
Dave Airlie
aab01a8808 Merge tag 'drm-misc-fixes-2026-03-26' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
A page mapping fix for shmem fault handler, a power-off fix for ivpu, a
GFP_* flag fix for syncobj, and a MAINTAINERS update.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patch.msgid.link/20260326-lush-cuddly-limpet-ab2aa9@houat
2026-03-27 17:46:26 +10:00
Dave Airlie
355223cb84 Merge tag 'drm-intel-fixes-2026-03-26' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
- DP tunnel error handling fix
- Spurious GMBUS timeout fix
- Unlink NV12 planes earlier
- Order OP vs. timeout correctly in __wait_for()

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patch.msgid.link/acTdjAoOGkzl3dcc@jlahtine-mobl
2026-03-27 17:19:34 +10:00
Dave Airlie
7261c2fceb Merge tag 'amd-drm-fixes-7.0-2026-03-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-7.0-2026-03-25:

amdgpu:
- DSC fix
- Module parameter parsing fix
- PASID reuse fix
- drm_edid leak fix
- SMU 13.x fixes
- SMU 14.x fix
- Fence fix in amdgpu_amdkfd_submit_ib()
- LVDS fixes
- GPU page fault fix for non-4K pages

amdkfd:
- Ordering fix in kfd_ioctl_create_process()

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260325155600.4184877-1-alexander.deucher@amd.com
2026-03-27 16:57:05 +10:00
Nicholas Carlini
5170efd9c3 io_uring/fdinfo: fix OOB read in SQE_MIXED wrap check
__io_uring_show_fdinfo() iterates over pending SQEs and, for 128-byte
SQEs on an IORING_SETUP_SQE_MIXED ring, needs to detect when the second
half of the SQE would be past the end of the sq_sqes array. The current
check tests (++sq_head & sq_mask) == 0, but sq_head is only incremented
when a 128-byte SQE is encountered, not on every iteration. The actual
array index is sq_idx = (i + sq_head) & sq_mask, which can be sq_mask
(the last slot) while the wrap check passes.

Fix by checking sq_idx directly. Keep the sq_head increment so the loop
still skips the second half of the 128-byte SQE on the next iteration.

Fixes: 1cba30bf9f ("io_uring: add support for IORING_SETUP_SQE_MIXED")
Signed-off-by: Nicholas Carlini <nicholas@carlini.com>
Link: https://patch.msgid.link/20260327021823.3138396-1-nicholas@carlini.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-03-26 20:28:28 -06:00
Linus Torvalds
46b5132504 Merge tag 'v7.0-rc5-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fix from Steve French:

 - Fix rebuild of mapping table

* tag 'v7.0-rc5-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6:
  smb/client: ensure smb2_mapping_table rebuild on cmd changes
2026-03-26 14:01:26 -07:00
Linus Torvalds
d813f42193 Merge tag 'pm-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These fix two cpufreq issues, one in the core and one in the
  conservative governor, and two issues related to system sleep:

   - Restore the cpufreq core behavior changed inadvertently during the
     6.19 development cycle to call cpufreq_frequency_table_cpuinfo()
     for cpufreq policies getting re-initialized which ensures that
     policy->max and policy->cpuinfo_max_freq will be valid going
     forward (Viresh Kumar)

   - Adjust the cached requested frequency in the conservative cpufreq
     governor on policy limits changes to prevent it from becoming stale
     in some cases (Viresh Kumar)

   - Prevent pm_restore_gfp_mask() from triggering a WARN_ON() in some
     code paths in which it is legitimately called without invoking
     pm_restrict_gfp_mask() previously (Youngjun Park)

   - Update snapshot_write_finalize() to take trailing zero pages into
     account properly which prevents user space restore from failing
     subsequently in some cases (Alberto Garcia)"

* tag 'pm-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: Drop spurious WARN_ON() from pm_restore_gfp_mask()
  PM: hibernate: Drain trailing zero pages on userspace restore
  cpufreq: conservative: Reset requested_freq on limits change
  cpufreq: Don't skip cpufreq_frequency_table_cpuinfo()
2026-03-26 12:42:28 -07:00
Linus Torvalds
9c2b23a275 Merge tag 'thermal-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
 "This prevents the int340x thermal driver from taking the power slider
  offset parameter into account incorrectly in some cases (Srinivas
  Pandruvada)"

* tag 'thermal-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: intel: int340x: soc_slider: Set offset only for balanced mode
2026-03-26 12:27:17 -07:00
Linus Torvalds
2d74bd3c58 Merge tag 'acpi-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI support fix from Rafael Wysocki:
 "Prevent use-after-free from occurring on reduced-hardware ACPI
  platforms when -EPROBE_DEFER is returned by ec_install_handlers()
  during ACPI EC driver initialization (Weiming Shi)"

* tag 'acpi-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: EC: clean up handlers on probe failure in acpi_ec_setup()
2026-03-26 12:06:40 -07:00
Linus Torvalds
25b69ebe28 Merge tag 'landlock-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull Landlock fixes from Mickaël Salaün:
 "This mainly fixes Landlock TSYNC issues related to interrupts and
  unexpected task exit.

  Other fixes touch documentation and sample, and a new test extends
  coverage"

* tag 'landlock-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  landlock: Expand restrict flags example for ABI version 8
  selftests/landlock: Test tsync interruption and cancellation paths
  landlock: Clean up interrupted thread logic in TSYNC
  landlock: Serialize TSYNC thread restriction
  samples/landlock: Bump ABI version to 8
  landlock: Improve TSYNC types
  landlock: Fully release unused TSYNC work entries
  landlock: Fix formatting
2026-03-26 12:03:37 -07:00
Rafael J. Wysocki
042f99c493 Merge branch 'pm-sleep'
Merge fixes related to system sleep for 7.0-rc6:

 - Prevent pm_restore_gfp_mask() from triggering a WARN_ON() in some
   code paths in which it is legitimately called without invoking
   pm_restrict_gfp_mask() previously (Youngjun Park)

 - Update snapshot_write_finalize() to take trailing zero pages into
   account properly which prevents user space restore from failing
   subsequently in some cases (Alberto Garcia)

* pm-sleep:
  PM: sleep: Drop spurious WARN_ON() from pm_restore_gfp_mask()
  PM: hibernate: Drain trailing zero pages on userspace restore
2026-03-26 18:44:46 +01:00
Hao-Yu Yang
190a8c48ff futex: Fix UaF between futex_key_to_node_opt() and vma_replace_policy()
During futex_key_to_node_opt() execution, vma->vm_policy is read under
speculative mmap lock and RCU. Concurrently, mbind() may call
vma_replace_policy() which frees the old mempolicy immediately via
kmem_cache_free().

This creates a race where __futex_key_to_node() dereferences a freed
mempolicy pointer, causing a use-after-free read of mpol->mode.

[  151.412631] BUG: KASAN: slab-use-after-free in __futex_key_to_node (kernel/futex/core.c:349)
[  151.414046] Read of size 2 at addr ffff888001c49634 by task e/87

[  151.415969] Call Trace:

[  151.416732]  __asan_load2 (mm/kasan/generic.c:271)
[  151.416777]  __futex_key_to_node (kernel/futex/core.c:349)
[  151.416822]  get_futex_key (kernel/futex/core.c:374 kernel/futex/core.c:386 kernel/futex/core.c:593)

Fix by adding rcu to __mpol_put().

Fixes: c042c50521 ("futex: Implement FUTEX2_MPOL")
Reported-by: Hao-Yu Yang <naup96721@gmail.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Hao-Yu Yang <naup96721@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Link: https://patch.msgid.link/20260324174418.GB1850007@noisy.programming.kicks-ass.net
2026-03-26 16:13:48 +01:00
Peter Zijlstra
19f94b3905 futex: Require sys_futex_requeue() to have identical flags
Nicholas reported that his LLM found it was possible to create a UaF
when sys_futex_requeue() is used with different flags. The initial
motivation for allowing different flags was the variable sized futex,
but since that hasn't been merged (yet), simply mandate the flags are
identical, as is the case for the old style sys_futex() requeue
operations.

Fixes: 0f4b5f9722 ("futex: Add sys_futex_requeue()")
Reported-by: Nicholas Carlini <npc@anthropic.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2026-03-26 16:13:48 +01:00
Claudio Imbrenda
0a28e06575 KVM: s390: Fix KVM_S390_VCPU_FAULT ioctl
A previous commit changed the behaviour of the KVM_S390_VCPU_FAULT
ioctl. The current (wrong) implementation will trigger a guest
addressing exception if the requested address lies outside of a
memslot, unless the VM is UCONTROL.

Restore the previous behaviour by open coding the fault-in logic.

Fixes: 3762e905ec ("KVM: s390: use __kvm_faultin_pfn()")
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2026-03-26 16:12:38 +01:00
Claudio Imbrenda
a12cc7e3d6 KVM: s390: vsie: Fix guest page tables protection
When shadowing, the guest page tables are write-protected, in order to
trap changes and properly unshadow the shadow mapping for the nested
guest. Already shadowed levels are skipped, so that only the needed
levels are write protected.

Currently the levels that get write protected are exactly one level too
deep: the last level (nested guest memory) gets protected in the wrong
way, and will be protected again correctly a few lines afterwards; most
importantly, the highest non-shadowed level does *not* get write
protected.

Moreover, if the nested guest is running in a real address space, there
are no DAT tables to shadow.

Write protect the correct levels, so that all the levels that need to
be protected are protected, and avoid double protecting the last level;
skip attempting to shadow the DAT tables when the nested guest is
running in a real address space.

Fixes: e38c884df9 ("KVM: s390: Switch to new gmap")
Tested-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2026-03-26 16:12:34 +01:00
Claudio Imbrenda
19d6c5b804 KVM: s390: vsie: Fix unshadowing while shadowing
If shadowing causes the shadow gmap to get unshadowed, exit early to
prevent an attempt to dereference the parent pointer, which at this
point is NULL.

Opportunistically add some more checks to prevent NULL parents.

Fixes: a2c17f9270 ("KVM: s390: New gmap code")
Fixes: e5f98a6899 ("KVM: s390: Add some helper functions needed for vSIE")
Fixes: e38c884df9 ("KVM: s390: Switch to new gmap")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2026-03-26 16:12:30 +01:00
Claudio Imbrenda
0ec456b8a5 KVM: s390: vsie: Fix refcount overflow for shadow gmaps
In most cases gmap_put() was not called when it should have.

Add the missing gmap_put() in vsie_run().

Fixes: e38c884df9 ("KVM: s390: Switch to new gmap")
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2026-03-26 16:12:25 +01:00
Claudio Imbrenda
fd7bc612cf KVM: s390: vsie: Fix nested guest memory shadowing
Fix _do_shadow_pte() to use the correct pointer (guest pte instead of
nested guest) to set up the new pte.

Add a check to return -EOPNOTSUPP if the mapping for the nested guest
is writeable but the same page in the guest is only read-only.

Fixes: e38c884df9 ("KVM: s390: Switch to new gmap")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2026-03-26 16:12:21 +01:00
Claudio Imbrenda
0f2b760a17 KVM: s390: Correctly handle guest mappings without struct page
Introduce a new special softbit for large pages, like already presend
for normal pages, and use it to mark guest mappings that do not have
struct pages.

Whenever a leaf DAT entry becomes dirty, check the special softbit and
only call SetPageDirty() if there is an actual struct page.

Move the logic to mark pages dirty inside _gmap_ptep_xchg() and
_gmap_crstep_xchg_atomic(), to avoid needlessly duplicating the code.

Fixes: 5a74e3d934 ("KVM: s390: KVM-specific bitfields and helper functions")
Fixes: a2c17f9270 ("KVM: s390: New gmap code")
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2026-03-26 16:12:18 +01:00
Claudio Imbrenda
45921d0212 KVM: s390: Fix gmap_link()
The slow path of the fault handler ultimately called gmap_link(), which
assumed the fault was a major fault, and blindly called dat_link().

In case of minor faults, things were not always handled properly; in
particular the prefix and vsie marker bits were ignored.

Move dat_link() into gmap.c, renaming it accordingly. Once moved, the
new _gmap_link() function will be able to correctly honour the prefix
and vsie markers.

This will cause spurious unshadows in some uncommon cases.

Fixes: 94fd9b16cc ("KVM: s390: KVM page table management functions: lifecycle management")
Fixes: a2c17f9270 ("KVM: s390: New gmap code")
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2026-03-26 16:12:13 +01:00
Claudio Imbrenda
6f93d1ed6f KVM: s390: vsie: Fix check for pre-existing shadow mapping
When shadowing a nested guest, a check is performed and no shadowing is
attempted if the nested guest is already shadowed.

The existing check was incomplete; fix it by also checking whether the
leaf DAT table entry in the existing shadow gmap has the same protection
as the one specified in the guest DAT entry.

Fixes: e38c884df9 ("KVM: s390: Switch to new gmap")
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2026-03-26 16:12:07 +01:00
Claudio Imbrenda
b827ef02f4 KVM: s390: Remove non-atomic dat_crstep_xchg()
In practice dat_crstep_xchg() is racy and hard to use correctly. Simply
remove it and replace its uses with dat_crstep_xchg_atomic().

This solves some actual races that lead to system hangs / crashes.

Opportunistically fix an alignment issue in _gmap_crstep_xchg_atomic().

Fixes: 589071eaaa ("KVM: s390: KVM page table management functions: clear and replace")
Fixes: 94fd9b16cc ("KVM: s390: KVM page table management functions: lifecycle management")
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2026-03-26 16:12:03 +01:00
Biju Das
897cf98926 irqchip/renesas-rzv2h: Fix error path in rzv2h_icu_probe_common()
Replace pm_runtime_put() with pm_runtime_put_sync() when
irq_domain_create_hierarchy() fails to ensure the device suspends
synchronously before devres cleanup disables runtime PM via
pm_runtime_disable().

Fixes: 5ec8cabc3b ("irqchip/renesas-rzv2h: Use devm_pm_runtime_enable()")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260323124917.41602-1-biju.das.jz@bp.renesas.com
2026-03-26 16:12:01 +01:00
Claudio Imbrenda
0f54755343 KVM: s390: vsie: Fix dat_split_ste()
If the guest misbehaves and puts the page tables for its nested guest
inside the memory of the nested guest itself, and the guest and nested
guest are being mapped with large pages, the shadow mapping will
lose synchronization with the actual mapping, since this will cause the
large page with the vsie notification bit to be split, but the
vsie notification bit will not be propagated to the resulting small
pages.

Fix this by propagating the vsie_notif bit from large pages to normal
pages when splitting a large page.

Fixes: 2db149a0a6 ("KVM: s390: KVM page table management functions: walks")
Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2026-03-26 16:11:58 +01:00
Jassi Brar
cfe02147e8 irqchip/qcom-mpm: Add missing mailbox TX done acknowledgment
The mbox_client for qcom-mpm sends NULL doorbell messages via
mbox_send_message() but never signals TX completion.

Set knows_txdone=true and call mbox_client_txdone() after a successful
send, matching the pattern used by other Qualcomm mailbox clients (smp2p,
smsm, qcom_aoss etc).

Fixes: a6199bb514 "irqchip: Add Qualcomm MPM controller driver"
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260322171533.608436-1-jassisinghbrar@gmail.com
2026-03-26 16:11:53 +01:00
David Howells
0e764b9d46 netfs: Fix the handling of stream->front by removing it
The netfs_io_stream::front member is meant to point to the subrequest
currently being collected on a stream, but it isn't actually used this way
by direct write (which mostly ignores it).  However, there's a tracepoint
which looks at it.  Further, stream->front is actually redundant with
stream->subrequests.next.

Fix the potential problem in the direct code by just removing the member
and using stream->subrequests.next instead, thereby also simplifying the
code.

Fixes: a0b4c7a491 ("netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequence")
Reported-by: Paulo Alcantara <pc@manguebit.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/4158599.1774426817@warthog.procyon.org.uk
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-26 15:18:45 +01:00
Darrick J. Wong
e31c53a806 xfs: remove file_path tracepoint data
The xfile/xmbuf shmem file descriptions are no longer as detailed as
they were when online fsck was first merged, because moving to static
strings in commit 60382993a2 ("xfs: get rid of the
xchk_xfile_*_descr calls") removed a memory allocation and hence a
source of failure.

However this makes encoding the description in the tracepoints sort of a
waste of memory.  David Laight also points out that file_path doesn't
zero the whole buffer which causes exposure of stale trace bytes, and
Steven Rostedt wonders why we're not using a dynamic array for the file
path.

I don't think this is worth fixing, so let's just rip it out.

Cc: rostedt@goodmis.org
Cc: david.laight.linux@gmail.com
Link: https://lore.kernel.org/linux-xfs/20260323172204.work.979-kees@kernel.org/
Cc: stable@vger.kernel.org # v6.11
Fixes: 19ebc8f84e ("xfs: fix file_path handling in tracepoints")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-26 14:25:23 +01:00
Darrick J. Wong
70685c291e xfs: don't irele after failing to iget in xfs_attri_recover_work
xlog_recovery_iget* never set @ip to a valid pointer if they return
an error, so this irele will walk off a dangling pointer.  Fix that.

Cc: stable@vger.kernel.org # v6.10
Fixes: ae673f534a ("xfs: record inode generation in xattr update log intent items")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-26 14:25:06 +01:00
Jens Axboe
b59efde9e6 io_uring/fdinfo: fix SQE_MIXED SQE displaying
When displaying pending SQEs for a MIXED ring, each 128-byte SQE
increments sq_head to skip the second slot, but the loop counter is not
adjusted. This can cause the loop to read past sq_tail by one entry for
each 128-byte SQE encountered, displaying SQEs that haven't been made
consumable yet by the application.

Match the kernel's own consumption logic in io_init_req() which
decrements what's left when consuming the extra slot.

Fixes: 1cba30bf9f ("io_uring: add support for IORING_SETUP_SQE_MIXED")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-03-26 07:15:55 -06:00
Alex Williamson
e98137f0a8 vfio/pci: Fix double free in dma-buf feature
The error path through vfio_pci_core_feature_dma_buf() ignores its
own advice to only use dma_buf_put() after dma_buf_export(), instead
falling through the entire unwind chain.  In the unlikely event that
we encounter file descriptor exhaustion, this can result in an
unbalanced refcount on the vfio device and double free of allocated
objects.

Avoid this by moving the "put" directly into the error path and return
the errno rather than entering the unwind chain.

Reported-by: Renato Marziano <renato@marziano.top>
Fixes: 5d74781ebc ("vfio/pci: Add dma-buf export support for MMIO regions")
Cc: stable@vger.kernel.org
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@nvidia.com>
Link: https://lore.kernel.org/r/20260323215659.2108191-3-alex.williamson@nvidia.com
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex@shazbot.org>
2026-03-26 06:38:27 -06:00
Marc Kleine-Budde
b341c1176f spi: spi-fsl-lpspi: fix teardown order issue (UAF)
There is a teardown order issue in the driver. The SPI controller is
registered using devm_spi_register_controller(), which delays
unregistration of the SPI controller until after the fsl_lpspi_remove()
function returns.

As the fsl_lpspi_remove() function synchronously tears down the DMA
channels, a running SPI transfer triggers the following NULL pointer
dereference due to use after free:

| fsl_lpspi 42550000.spi: I/O Error in DMA RX
| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[...]
| Call trace:
|  fsl_lpspi_dma_transfer+0x260/0x340 [spi_fsl_lpspi]
|  fsl_lpspi_transfer_one+0x198/0x448 [spi_fsl_lpspi]
|  spi_transfer_one_message+0x49c/0x7c8
|  __spi_pump_transfer_message+0x120/0x420
|  __spi_sync+0x2c4/0x520
|  spi_sync+0x34/0x60
|  spidev_message+0x20c/0x378 [spidev]
|  spidev_ioctl+0x398/0x750 [spidev]
[...]

Switch from devm_spi_register_controller() to spi_register_controller() in
fsl_lpspi_probe() and add the corresponding spi_unregister_controller() in
fsl_lpspi_remove().

Fixes: 5314987de5 ("spi: imx: add lpspi bus driver")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://patch.msgid.link/20260319-spi-fsl-lpspi-fixes-v1-1-b433e435b2d8@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-26 12:27:42 +00:00
Sakari Ailus
7587fbf5ad media: ccs: Avoid deadlock in ccs_init_state()
The sub-device state lock has been already acquired when ccs_init_state()
is called. Do not try to acquire it again.

Reported-by: David Heidelberg <david@ixit.cz>
Fixes: a88883d120 ("media: ccs: Rely on sub-device state locking")
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-03-26 13:14:07 +01:00
Ricardo Ribalda
7c39f48568 media: uvcvideo: Fix bug in error path of uvc_alloc_urb_buffers
Recent cleanup introduced a bug in the error path of
uvc_alloc_urb_buffers(). If there is not enough memory for the
allocation the following error will be triggered:

[  739.196672] UBSAN: shift-out-of-bounds in mm/page_alloc.c:1403:22
[  739.196710] shift exponent 52 is too large for 32-bit type 'int'

Resulting in:
[  740.464422] BUG: unable to handle page fault for address: fffffac1c0800000

The reason for the bug is that usb_free_noncoherent is called with an
invalid size (0) instead of the actual size of the urb.

This patch takes care of that.

Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Closes: https://lore.kernel.org/linux-media/abycbXzYupZpGkvR@hyeyoo/T/#t
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Fixes: c824345288 ("media: uvcvideo: Pass allocation size directly to uvc_alloc_urb_buffer")
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patch.msgid.link/20260320-uvc-urb-free-error-v1-1-b12cc3762a19@chromium.org
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-03-26 13:14:07 +01:00
Peter Ujfalusi
d40a198e2b ASoC: SOF: ipc4-topology: Allow bytes controls without initial payload
It is unexpected, but allowed to have no initial payload for a bytes
control and the code is prepared to handle this case, but the size check
missed this corner case.

Update the check for minimal size to allow the initial size to be 0.

Cc: stable@vger.kernel.org
Fixes: a653820700 ("ASoC: SOF: ipc4-topology: Correct the allocation size for bytes controls")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Link: https://patch.msgid.link/20260326075618.1603-1-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-26 11:16:03 +00:00
Johan Hovold
8d2e0cb322 spi: fix use-after-free on managed registration failure
The SPI API is asymmetric and the controller is freed as part of
deregistration (unless it has been allocated using
devm_spi_alloc_host/target()).

A recent change converting the managed registration function to use
devm_add_action_or_reset() inadvertently introduced a (mostly
theoretical) regression where a non-devres managed controller could be
freed as part of failed registration. This in turn would lead to
use-after-free in controller driver error paths.

Fix this by taking another reference before calling
devm_add_action_or_reset() and not releasing it on errors for
non-devres allocated controllers.

An alternative would be a partial revert of the offending commit, but
it is better to handle this explicitly until the API has been fixed
(e.g. see 5e844cc37a ("spi: Introduce device-managed SPI controller
allocation")).

Fixes: b6376dbed8 ("spi: Simplify devm_spi_*_controller()")
Reported-by: Felix Gu <ustc.gu@gmail.com>
Link: https://lore.kernel.org/all/20260324145548.139952-1-ustc.gu@gmail.com/
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20260325145319.1132072-1-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-26 10:45:28 +00:00
Mark Brown
c6eea4ff84 ASoC: adau1372: Fix error handling in adau1372_set_power()
Jihed Chaibi <jihed.chaibi.dev@gmail.com> says:

adau1372_set_power() had two related error handling issues in its enable
path: clk_prepare_enable() was called but its return value discarded, and
adau1372_enable_pll() was a void function that silently swallowed lock
failures, leaving mclk enabled and adau1372->enabled set to true despite
the device being in a broken state.

Patch 1 fixes the unchecked clk_prepare_enable() by making
adau1372_set_power() return int and propagating the error.

Patch 2 converts adau1372_enable_pll() to return int and adds a full
unwind in adau1372_set_power() if PLL lock fails, reversing the regcache,
GPIO power-down, and clock state.
2026-03-26 10:33:38 +00:00
Jihed Chaibi
bfe6a264ef ASoC: adau1372: Fix clock leak on PLL lock failure
adau1372_enable_pll() was a void function that logged a dev_err() on
PLL lock timeout but did not propagate the error. As a result,
adau1372_set_power() would continue with adau1372->enabled set to true
despite the PLL being unlocked, and the mclk left enabled with no
corresponding disable on the error path.

Convert adau1372_enable_pll() to return int, using -ETIMEDOUT on lock
timeout and propagating regmap errors directly. In adau1372_set_power(),
check the return value and unwind in reverse order: restore regcache to
cache-only mode, reassert GPIO power-down, and disable the clock before
returning the error.

Signed-off-by: Jihed Chaibi <jihed.chaibi.dev@gmail.com>
Fixes: 6cd4c6459e ("ASoC: Add ADAU1372 audio CODEC support")
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20260325210704.76847-3-jihed.chaibi.dev@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-26 10:33:36 +00:00
Jihed Chaibi
326fe8104a ASoC: adau1372: Fix unchecked clk_prepare_enable() return value
adau1372_set_power() calls clk_prepare_enable() but discards the return
value. If the clock enable fails, the driver proceeds to access registers
on unpowered hardware, potentially causing silent corruption.

Make adau1372_set_power() return int and propagate the error from
clk_prepare_enable(). Update adau1372_set_bias_level() to return the
error directly for the STANDBY and OFF cases.

Signed-off-by: Jihed Chaibi <jihed.chaibi.dev@gmail.com>
Fixes: 6cd4c6459e ("ASoC: Add ADAU1372 audio CODEC support")
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20260325210704.76847-2-jihed.chaibi.dev@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-26 10:33:35 +00:00
Marc Buerg
f63a9df7e3 sysctl: fix uninitialized variable in proc_do_large_bitmap
proc_do_large_bitmap() does not initialize variable c, which is expected
to be set to a trailing character by proc_get_long().

However, proc_get_long() only sets c when the input buffer contains a
trailing character after the parsed value.

If c is not initialized it may happen to contain a '-'. If this is the
case proc_do_large_bitmap() expects to be able to parse a second part of
the input buffer. If there is no second part an unjustified -EINVAL will
be returned.

Initialize c to 0 to prevent returning -EINVAL on valid input.

Fixes: 9f977fb7ae ("sysctl: add proc_do_large_bitmap")
Signed-off-by: Marc Buerg <buermarc@googlemail.com>
Reviewed-by: Joel Granados <joel.granados@kernel.org>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
2026-03-26 09:32:19 +01:00
GuoHan Zhao
cd7e1fef5a xen/privcmd: unregister xenstore notifier on module exit
Commit 453b8fb68f ("xen/privcmd: restrict usage in
unprivileged domU") added a xenstore notifier to defer setting the
restriction target until Xenstore is ready.

XEN_PRIVCMD can be built as a module, but privcmd_exit() leaves that
notifier behind. Balance the notifier lifecycle by unregistering it on
module exit.

This is harmless even if xenstore was already ready at registration
time and the notifier was never queued on the chain.

Fixes: 453b8fb68f ("xen/privcmd: restrict usage in unprivileged domU")
Signed-off-by: GuoHan Zhao <zhaoguohan@kylinos.cn>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20260325120246.252899-1-zhaoguohan@kylinos.cn>
2026-03-26 08:57:51 +01:00
Bibo Mao
6bcfb7f46d LoongArch: KVM: Fix base address calculation in kvm_eiointc_regs_access()
In function kvm_eiointc_regs_access(), the register base address is
caculated from array base address plus offset, the offset is absolute
value from the base address. The data type of array base address is
u64, it should be converted into the "void *" type and then plus the
offset.

Cc: <stable@vger.kernel.org>
Fixes: d3e43a1f34 ("LoongArch: KVM: Use 64-bit register definition for EIOINTC").
Reported-by: Aurelien Jarno <aurel32@debian.org>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1131431
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2026-03-26 14:29:09 +08:00
Huacai Chen
b97bd69eb0 LoongArch: KVM: Handle the case that EIOINTC's coremap is empty
EIOINTC's coremap in eiointc_update_sw_coremap() can be empty, currently
we get a cpuid with -1 in this case, but we actually need 0 because it's
similar as the case that cpuid >= 4.

This fix an out-of-bounds access to kvm_arch::phyid_map::phys_map[].

Cc: <stable@vger.kernel.org>
Fixes: 3956a52bc0 ("LoongArch: KVM: Add EIOINTC read and write functions")
Reported-by: Aurelien Jarno <aurel32@debian.org>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1131431
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2026-03-26 14:29:09 +08:00
Huacai Chen
2db06c15d8 LoongArch: KVM: Make kvm_get_vcpu_by_cpuid() more robust
kvm_get_vcpu_by_cpuid() takes a cpuid parameter whose type is int, so
cpuid can be negative. Let kvm_get_vcpu_by_cpuid() return NULL for this
case so as to make it more robust.

This fix an out-of-bounds access to kvm_arch::phyid_map::phys_map[].

Cc: <stable@vger.kernel.org>
Fixes: 73516e9da5 ("LoongArch: KVM: Add vcpu mapping from physical cpuid")
Reported-by: Aurelien Jarno <aurel32@debian.org>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1131431
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2026-03-26 14:29:09 +08:00
Xi Ruoyao
e4878c37f6 LoongArch: vDSO: Emit GNU_EH_FRAME correctly
With -fno-asynchronous-unwind-tables and --no-eh-frame-hdr (the default
of the linker), the GNU_EH_FRAME segment (specified by vdso.lds.S) is
empty.  This is not valid, as the current DWARF specification mandates
the first byte of the EH frame to be the version number 1.  It causes
some unwinders to complain, for example the ClickHouse query profiler
spams the log with messages:

    clickhouse-server[365854]: libunwind: unsupported .eh_frame_hdr
    version: 127 at 7ffffffb0000

Here "127" is just the byte located at the p_vaddr (0, i.e. the
beginning of the vDSO) of the empty GNU_EH_FRAME segment. Cross-
checking with /proc/365854/maps has also proven 7ffffffb0000 is the
start of vDSO in the process VM image.

In LoongArch the -fno-asynchronous-unwind-tables option seems just a
MIPS legacy, and MIPS only uses this option to satisfy the MIPS-specific
"genvdso" program, per the commit cfd75c2db1 ("MIPS: VDSO: Explicitly
use -fno-asynchronous-unwind-tables").  IIRC it indicates some inherent
limitation of the MIPS ELF ABI and has nothing to do with LoongArch.  So
we can simply flip it over to -fasynchronous-unwind-tables and pass
--eh-frame-hdr for linking the vDSO, allowing the profilers to unwind the
stack for statistics even if the sample point is taken when the PC is in
the vDSO.

However simply adjusting the options above would exploit an issue: when
the libgcc unwinder saw the invalid GNU_EH_FRAME segment, it silently
falled back to a machine-specific routine to match the code pattern of
rt_sigreturn() and extract the registers saved in the sigframe if the
code pattern is matched.  As unwinding from signal handlers is vital for
libgcc to support pthread cancellation etc., the fall-back routine had
been silently keeping the LoongArch Linux systems functioning since
Linux 5.19.  But when we start to emit GNU_EH_FRAME with the correct
format, fall-back routine will no longer be used and libgcc will fail
to unwind the sigframe, and unwinding from signal handlers will no
longer work, causing dozens of glibc test failures.  To make it possible
to unwind from signal handlers again, it's necessary to code the unwind
info in __vdso_rt_sigreturn via .cfi_* directives.

The offsets in the .cfi_* directives depend on the layout of struct
sigframe, notably the offset of sigcontext in the sigframe.  To use the
offset in the assembly file, factor out struct sigframe into a header to
allow asm-offsets.c to output the offset for assembly.

To work around a long-term issue in the libgcc unwinder (the pc is
unconditionally substracted by 1: doing so is technically incorrect for
a signal frame), a nop instruction is included with the two real
instructions in __vdso_rt_sigreturn in the same FDE PC range.  The same
hack has been used on x86 for a long time.

Cc: stable@vger.kernel.org
Fixes: c6b99bed6b ("LoongArch: Add VDSO and VSYSCALL support")
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2026-03-26 14:29:09 +08:00
Huacai Chen
95db0c9f52 LoongArch: Workaround LS2K/LS7A GPU DMA hang bug
1. Hardware limitation: GPU, DC and VPU are typically PCI device 06.0,
06.1 and 06.2. They share some hardware resources, so when configure the
PCI 06.0 device BAR1, DMA memory access cannot be performed through this
BAR, otherwise it will cause hardware abnormalities.

2. In typical scenarios of reboot or S3/S4, DC access to memory through
BAR is not prohibited, resulting in GPU DMA hangs.

3. Workaround method: When configuring the 06.0 device BAR1, turn off
the memory access of DC, GPU and VPU (via DC's CRTC registers).

Cc: stable@vger.kernel.org
Signed-off-by: Qianhai Wu <wuqianhai@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2026-03-26 14:29:09 +08:00
Li Jun
3a28daa9b7 LoongArch: Fix missing NULL checks for kstrdup()
1. Replace "of_find_node_by_path("/")" with "of_root" to avoid multiple
calls to "of_node_put()".

2. Fix a potential kernel oops during early boot when memory allocation
fails while parsing CPU model from device tree.

Cc: stable@vger.kernel.org
Signed-off-by: Li Jun <lijun01@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2026-03-26 14:29:08 +08:00
Tejun Heo
4c56a8ac68 cgroup: Fix cgroup_drain_dying() testing the wrong condition
cgroup_drain_dying() was using cgroup_is_populated() to test whether there are
dying tasks to wait for. cgroup_is_populated() tests nr_populated_csets,
nr_populated_domain_children and nr_populated_threaded_children, but
cgroup_drain_dying() only needs to care about this cgroup's own tasks - whether
there are children is cgroup_destroy_locked()'s concern.

This caused hangs during shutdown. When systemd tried to rmdir a cgroup that had
no direct tasks but had a populated child, cgroup_drain_dying() would enter its
wait loop because cgroup_is_populated() was true from
nr_populated_domain_children. The task iterator found nothing to wait for, yet
the populated state never cleared because it was driven by live tasks in the
child cgroup.

Fix it by using cgroup_has_tasks() which only tests nr_populated_csets.

v3: Fix cgroup_is_populated() -> cgroup_has_tasks() (Sebastian).

v2: https://lore.kernel.org/r/20260323200205.1063629-1-tj@kernel.org

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Fixes: 1b164b876c ("cgroup: Wait for dying tasks to leave on rmdir")
Signed-off-by: Tejun Heo <tj@kernel.org>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2026-03-25 14:08:04 -10:00
Namjae Jeon
beef2634f8 ksmbd: fix potencial OOB in get_file_all_info() for compound requests
When a compound request consists of QUERY_DIRECTORY + QUERY_INFO
(FILE_ALL_INFORMATION) and the first command consumes nearly the entire
max_trans_size, get_file_all_info() would blindly call smbConvertToUTF16()
with PATH_MAX, causing out-of-bounds write beyond the response buffer.
In get_file_all_info(), there was a missing validation check for
the client-provided OutputBufferLength before copying the filename into
FileName field of the smb2_file_all_info structure.
If the filename length exceeds the available buffer space, it could lead to
potential buffer overflows or memory corruption during smbConvertToUTF16
conversion. This calculating the actual free buffer size using
smb2_calc_max_out_buf_len() and returning -EINVAL if the buffer is
insufficient and updating smbConvertToUTF16 to use the actual filename
length (clamped by PATH_MAX) to ensure a safe copy operation.

Cc: stable@vger.kernel.org
Fixes: e2b76ab8b5 ("ksmbd: add support for read compound")
Reported-by: Asim Viladi Oglu Manizada <manizada@pm.me>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-03-25 18:58:40 -05:00
Guenter Roeck
754bd2b4a0 hwmon: (pmbus/core) Protect regulator operations with mutex
The regulator operations pmbus_regulator_get_voltage(),
pmbus_regulator_set_voltage(), and pmbus_regulator_list_voltage()
access PMBus registers and shared data but were not protected by
the update_lock mutex. This could lead to race conditions.

However, adding mutex protection directly to these functions causes
a deadlock because pmbus_regulator_notify() (which calls
regulator_notifier_call_chain()) is often called with the mutex
already held (e.g., from pmbus_fault_handler()). If a regulator
callback then calls one of the now-protected voltage functions,
it will attempt to acquire the same mutex.

Rework pmbus_regulator_notify() to utilize a worker function to
send notifications outside of the mutex protection. Events are
stored as atomics in a per-page bitmask and processed by the worker.

Initialize the worker and its associated data during regulator
registration, and ensure it is cancelled on device removal using
devm_add_action_or_reset().

While at it, remove the unnecessary include of linux/of.h.

Cc: Sanman Pradhan <psanman@juniper.net>
Fixes: ddbb4db4ce ("hwmon: (pmbus) Add regulator support")
Reviewed-by: Sanman Pradhan <psanman@juniper.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2026-03-25 13:32:37 -07:00
Guenter Roeck
cd658475e7 hwmon: (pmbus) Introduce the concept of "write-only" attributes
Attributes intended to clear sensor history are intended to be writeable
only. Reading those attributes today results in reporting more or less
random values. To avoid ABI surprises, have those attributes explicitly
return 0 when reading.

Fixes: 787c095eda ("hwmon: (pmbus/core) Add support for rated attributes")
Reviewed-by: Sanman Pradhan <psanman@juniper.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2026-03-25 13:32:33 -07:00
Guenter Roeck
805a5bd1c3 hwmon: (pmbus) Mark lowest/average/highest/rated attributes as read-only
Writing those attributes is not supported, so mark them as read-only.

Prior to this change, attempts to write into these attributes returned
an error.

Mark boolean fields in struct pmbus_limit_attr and in struct
pmbus_sensor_attr as bit fields to reduce configuration data size.
The data is scanned only while probing, so performance is not a concern.

Fixes: 6f183d33a0 ("hwmon: (pmbus) Add support for peak attributes")
Reviewed-by: Sanman Pradhan <psanman@juniper.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2026-03-25 13:32:24 -07:00
Petr Mladek
e398978ddf workqueue: Better describe stall check
Try to be more explicit why the workqueue watchdog does not take
pool->lock by default. Spin locks are full memory barriers which
delay anything. Obviously, they would primary delay operations
on the related worker pools.

Explain why it is enough to prevent the false positive by re-checking
the timestamp under the pool->lock.

Finally, make it clear what would be the alternative solution in
__queue_work() which is a hotter path.

Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-25 05:51:02 -10:00
Shuming Fan
c673efd5db ASoC: SDCA: fix finding wrong entity
This patch fixes an issue like:
where searching for the entity 'FU 11' could incorrectly match 'FU 113' first.
The driver should first perform an exact match on the full string name.
If no exact match is found, it can then fall back to a partial match.

Fixes: 48fa77af2f ("ASoC: SDCA: Add terminal type into input/output widget name")
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Shuming Fan <shumingf@realtek.com>
Link: https://patch.msgid.link/20260325110406.3232420-1-shumingf@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-25 15:37:51 +00:00
Jianmin Lv
87a70013be MAINTAINERS: Update GPU driver maintainer information
I and Qianhai are GPU R&D engineers at Loongson, specializing
in kernel driver development. We understand that the current
Loongson GPU driver lacks dedicated maintenance resources
because of some reasons.

As Loongson GPU driver developers, we have both the capability
and the responsibility to continuously maintain the Loongson
GPU driver, ensuring minimal impact on its users. After internal
discussions, our team has decided to recommend me and Qianhai
to take over the maintenance responsibilities, and recommend
Huacai, Mingcong and Ruoyao to help to review.

And We'll continue to maintain it for current supported chips
and drive future updates according to chip support plan.

Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260320101012.22714-1-lvjianmin@loongson.cn
2026-03-25 15:10:30 +01:00
Sanman Pradhan
bf08749a6a hwmon: (adm1177) fix sysfs ABI violation and current unit conversion
The adm1177 driver exposes the current alert threshold through
hwmon_curr_max_alarm. This violates the hwmon sysfs ABI, where
*_alarm attributes are read-only status flags and writable thresholds
must use currN_max.

The driver also stores the threshold internally in microamps, while
currN_max is defined in milliamps. Convert the threshold accordingly
on both the read and write paths.

Widen the cached threshold and related calculations to 64 bits so
that small shunt resistor values do not cause truncation or overflow.
Also use 64-bit arithmetic for the mA/uA conversions, clamp writes
to the range the hardware can represent, and propagate failures from
adm1177_write_alert_thr() instead of silently ignoring them.

Update the hwmon documentation to reflect the attribute rename and
the correct units returned by the driver.

Fixes: 09b08ac9e8 ("hwmon: (adm1177) Add ADM1177 Hot Swap Controller and Digital Power Monitor driver")
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Acked-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20260325051246.28262-1-sanman.pradhan@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2026-03-25 06:50:13 -07:00
Shuming Fan
c991ca3238 ASoC: SDCA: remove the max count of initialization table
The number of the initialization table may exceed 2048.
Therefore, this patch removes the limitation and allows the driver to
allocate memory dynamically based on the size of the initialization table.

Signed-off-by: Shuming Fan <shumingf@realtek.com>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://patch.msgid.link/20260325092017.3221640-1-shumingf@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-25 12:17:43 +00:00
Matthew Auld
bfe9e314d7 drm/xe: always keep track of remap prev/next
During 3D workload, user is reporting hitting:

[  413.361679] WARNING: drivers/gpu/drm/xe/xe_vm.c:1217 at vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe], CPU#7: vkd3d_queue/9925
[  413.361944] CPU: 7 UID: 1000 PID: 9925 Comm: vkd3d_queue Kdump: loaded Not tainted 7.0.0-070000rc3-generic #202603090038 PREEMPT(lazy)
[  413.361949] RIP: 0010:vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe]
[  413.362074] RSP: 0018:ffffd4c25c3df930 EFLAGS: 00010282
[  413.362077] RAX: 0000000000000000 RBX: ffff8f3ee817ed10 RCX: 0000000000000000
[  413.362078] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  413.362079] RBP: ffffd4c25c3df980 R08: 0000000000000000 R09: 0000000000000000
[  413.362081] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f41fbf99380
[  413.362082] R13: ffff8f3ee817e968 R14: 00000000ffffffef R15: ffff8f43d00bd380
[  413.362083] FS:  00000001040ff6c0(0000) GS:ffff8f4696d89000(0000) knlGS:00000000330b0000
[  413.362085] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[  413.362086] CR2: 00007ddfc4747000 CR3: 00000002e6262005 CR4: 0000000000f72ef0
[  413.362088] PKRU: 55555554
[  413.362089] Call Trace:
[  413.362092]  <TASK>
[  413.362096]  xe_vm_bind_ioctl+0xa9a/0xc60 [xe]

Which seems to hint that the vma we are re-inserting for the ops unwind
is either invalid or overlapping with something already inserted in the
vm. It shouldn't be invalid since this is a re-insertion, so must have
worked before. Leaving the likely culprit as something already placed
where we want to insert the vma.

Following from that, for the case where we do something like a rebind in
the middle of a vma, and one or both mapped ends are already compatible,
we skip doing the rebind of those vma and set next/prev to NULL. As well
as then adjust the original unmap va range, to avoid unmapping the ends.
However, if we trigger the unwind path, we end up with three va, with
the two ends never being removed and the original va range in the middle
still being the shrunken size.

If this occurs, one failure mode is when another unwind op needs to
interact with that range, which can happen with a vector of binds. For
example, if we need to re-insert something in place of the original va.
In this case the va is still the shrunken version, so when removing it
and then doing a re-insert it can overlap with the ends, which were
never removed, triggering a warning like above, plus leaving the vm in a
bad state.

With that, we need two things here:

 1) Stop nuking the prev/next tracking for the skip cases. Instead
    relying on checking for skip prev/next, where needed. That way on the
    unwind path, we now correctly remove both ends.

 2) Undo the unmap va shrinkage, on the unwind path. With the two ends
    now removed the unmap va should expand back to the original size again,
    before re-insertion.

v2:
  - Update the explanation in the commit message, based on an actual IGT of
    triggering this issue, rather than conjecture.
  - Also undo the unmap shrinkage, for the skip case. With the two ends
    now removed, the original unmap va range should expand back to the
    original range.
v3:
  - Track the old start/range separately. vma_size/start() uses the va
    info directly.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7602
Fixes: 8f33b4f054 ("drm/xe: Avoid doing rebinds")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260318100208.78097-2-matthew.auld@intel.com
(cherry picked from commit aec6969f75afbf4e01fd5fb5850ed3e9c27043ac)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-03-25 08:05:33 -04:00
Tvrtko Ursulin
06f4297134 drm/syncobj: Fix xa_alloc allocation flags
The xarray conversion blindly and wrongly replaced idr_alloc with xa_alloc
and kept the GFP_NOWAIT. It should have been GFP_KERNEL to account for
idr_preload it removed. Fix it.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: fec2c3c01f ("drm/syncobj: Convert syncobj idr to xarray")
Reported-by: Himanshu Girotra <himanshu.girotra@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himanshu Girotra <himanshu.girotra@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Link: https://lore.kernel.org/r/20260324111019.22467-1-tvrtko.ursulin@igalia.com
2026-03-25 08:05:35 +00:00
Zhan Xusheng
5d16467ae5 alarmtimer: Fix argument order in alarm_timer_forward()
alarm_timer_forward() passes arguments to alarm_forward() in the wrong
order:

  alarm_forward(alarm, timr->it_interval, now);

However, alarm_forward() is defined as:

  u64 alarm_forward(struct alarm *alarm, ktime_t now, ktime_t interval);

and uses the second argument as the current time:

  delta = ktime_sub(now, alarm->node.expires);

Passing the interval as "now" results in incorrect delta computation,
which can lead to missed expirations or incorrect overrun accounting.

This issue has been present since the introduction of
alarm_timer_forward().

Fix this by swapping the arguments.

Fixes: e7561f1633 ("alarmtimer: Implement forward callback")
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260323061130.29991-1-zhanxusheng@xiaomi.com
2026-03-24 23:17:14 +01:00
Tejun Heo
6680c162b4 selftests/cgroup: Don't require synchronous populated update on task exit
test_cgcore_populated (test_core) and test_cgkill_{simple,tree,forkbomb}
(test_kill) check cgroup.events "populated 0" immediately after reaping
child tasks with waitpid(). This used to work because cgroup_task_exit() in
do_exit() unlinked tasks from css_sets before exit_notify() woke up
waitpid().

d245698d72 ("cgroup: Defer task cgroup unlink until after the task is done
switching out") moved the unlink to cgroup_task_dead() in
finish_task_switch(), which runs after exit_notify(). The populated counter
is now decremented after the parent's waitpid() can return, so there is no
longer a synchronous ordering guarantee. On PREEMPT_RT, where
cgroup_task_dead() is further deferred through lazy irq_work, the race
window is even larger.

The synchronous populated transition was never part of the cgroup interface
contract - it was an implementation artifact. Use cg_read_strcmp_wait() which
retries for up to 1 second, matching what these tests actually need to
verify: that the cgroup eventually becomes unpopulated after all tasks exit.

Fixes: d245698d72 ("cgroup: Defer task cgroup unlink until after the task is done switching out")
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: cgroups@vger.kernel.org
2026-03-24 10:21:57 -10:00
Tejun Heo
1b164b876c cgroup: Wait for dying tasks to leave on rmdir
a72f73c4dd ("cgroup: Don't expose dead tasks in cgroup") hid PF_EXITING
tasks from cgroup.procs so that systemd doesn't see tasks that have already
been reaped via waitpid(). However, the populated counter (nr_populated_csets)
is only decremented when the task later passes through cgroup_task_dead() in
finish_task_switch(). This means cgroup.procs can appear empty while the
cgroup is still populated, causing rmdir to fail with -EBUSY.

Fix this by making cgroup_rmdir() wait for dying tasks to fully leave. If the
cgroup is populated but all remaining tasks have PF_EXITING set (the task
iterator returns none due to the existing filter), wait for a kick from
cgroup_task_dead() and retry. The wait is brief as tasks are removed from the
cgroup's css_set between PF_EXITING assertion in do_exit() and
cgroup_task_dead() in finish_task_switch().

v2: cgroup_is_populated() true to false transition happens under css_set_lock
    not cgroup_mutex, so retest under css_set_lock before sleeping to avoid
    missed wakeups (Sebastian).

Fixes: a72f73c4dd ("cgroup: Don't expose dead tasks in cgroup")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202603222104.2c81684e-lkp@intel.com
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Bert Karwatzki <spasswolf@web.de>
Cc: Michal Koutny <mkoutny@suse.com>
Cc: cgroups@vger.kernel.org
2026-03-24 10:21:40 -10:00
Panagiotis "Ivory" Vasilopoulos
a23811061a landlock: Expand restrict flags example for ABI version 8
Add LANDLOCK_RESTRICT_SELF_TSYNC to the backwards compatibility example
for restrict flags. This introduces completeness, similar to that of
the ruleset attributes example. However, as the new example can impact
enforcement in certain cases, an appropriate warning is also included.

Additionally, I modified the two comments of the example to make them
more consistent with the ruleset attributes example's.

Signed-off-by: Panagiotis "Ivory" Vasilopoulos <git@n0toose.net>
Co-developed-by: Dan Cojocaru <dan@dcdev.ro>
Signed-off-by: Dan Cojocaru <dan@dcdev.ro>
Reviewed-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20260304-landlock-docs-add-tsync-example-v4-1-819a276f05c5@n0toose.net
[mic: Update date, improve comments consistency, fix newline issue]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2026-03-24 20:55:55 +01:00
Srinivas Pandruvada
7dfe984601 thermal: intel: int340x: soc_slider: Set offset only for balanced mode
The slider offset can be set via debugfs for balanced mode. The offset
should be only applicable in balanced mode. For other modes, it should
be 0 when writing to MMIO offset,

Fixes: 8306bcaba0 ("thermal: intel: int340x: Add module parameter to change slider offset")
Tested-by: Erin Park <erin.park@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 6.18+ <stable@vger.kernel.org> # 6.18+
[ rjw: Subject and changelog tweaks ]
Link: https://patch.msgid.link/20260324172346.3317145-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-24 19:00:12 +01:00
Alex Deucher
90d239cc53 drm/amd/display: Fix DCE LVDS handling
LVDS does not use an HPD pin so it may be invalid.  Handle
this case correctly in link encoder creation.

Fixes: 7c8fb3b8e9 ("drm/amd/display: Add hpd_source index check for DCE60/80/100/110/112/120 link encoders")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5012
Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Cc: Roman Li <roman.li@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3b5620f7ee688177fcf65cf61588c5435bce1872)
Cc: stable@vger.kernel.org
2026-03-24 13:55:47 -04:00
Donet Tom
4e9597f22a drm/amdgpu: Handle GPU page faults correctly on non-4K page systems
During a GPU page fault, the driver restores the SVM range and then maps it
into the GPU page tables. The current implementation passes a GPU-page-size
(4K-based) PFN to svm_range_restore_pages() to restore the range.

SVM ranges are tracked using system-page-size PFNs. On systems where the
system page size is larger than 4K, using GPU-page-size PFNs to restore the
range causes two problems:

Range lookup fails:
Because the restore function receives PFNs in GPU (4K) units, the SVM
range lookup does not find the existing range. This will result in a
duplicate SVM range being created.

VMA lookup failure:
The restore function also tries to locate the VMA for the faulting address.
It converts the GPU-page-size PFN into an address using the system page
size, which results in an incorrect address on non-4K page-size systems.
As a result, the VMA lookup fails with the message: "address 0xxxx VMA is
removed".

This patch passes the system-page-size PFN to svm_range_restore_pages() so
that the SVM range is restored correctly on non-4K page systems.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 074fe395fb13247b057f60004c7ebcca9f38ef46)
2026-03-24 13:55:29 -04:00
Yang Wang
28922a43fd drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v14
Forcibly disable the OD_FAN_CURVE feature when temperature or PWM range is invalid,
otherwise PMFW will reject this configuration on smu v14.0.2/14.0.3.

example:
$ sudo cat /sys/bus/pci/devices/<BDF>/gpu_od/fan_ctrl/fan_curve

OD_FAN_CURVE:
0: 0C 0%
1: 0C 0%
2: 0C 0%
3: 0C 0%
4: 0C 0%
OD_RANGE:
FAN_CURVE(hotspot temp): 0C 0C
FAN_CURVE(fan speed): 0% 0%

$ echo "0 50 40" | sudo tee fan_curve

kernel log:
[  969.761627] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!
[ 1010.897800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ab4905d466b60f170d85e19ca2a5d2b159aeb780)
Cc: stable@vger.kernel.org
2026-03-24 13:54:42 -04:00
Srinivasan Shanmugam
429aec2bc0 drm/amdkfd: Fix NULL pointer check order in kfd_ioctl_create_process
In kfd_ioctl_create_process(), the pointer 'p' is used before checking
if it is NULL.

The code accesses p->context_id before validating 'p'. This can lead
to a possible NULL pointer dereference.

Move the NULL check before using 'p' so that the pointer is validated
before access.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_chardev.c:3177 kfd_ioctl_create_process() warn: variable dereferenced before check 'p' (see line 3174)

Fixes: cc6b66d661 ("amdkfd: introduce new ioctl AMDKFD_IOC_CREATE_PROCESS")
Cc: Zhu Lingshan <lingshan.zhu@amd.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 19d4149b22f57094bfc4b86b742381b3ca394ead)
2026-03-24 13:54:19 -04:00
Alex Deucher
9da4f9964a drm/amd/display: check if ext_caps is valid in BL setup
LVDS connectors don't have extended backlight caps so check
if the pointer is valid before accessing it.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5012
Fixes: 1454642960 ("drm/amd: Re-introduce property to control adaptive backlight modulation")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3f797396d7f4eb9bb6eded184bbc6f033628a6f6)
Cc: stable@vger.kernel.org
2026-03-24 13:53:46 -04:00
Srinivasan Shanmugam
7150850146 drm/amdgpu: Fix fence put before wait in amdgpu_amdkfd_submit_ib
amdgpu_amdkfd_submit_ib() submits a GPU job and gets a fence
from amdgpu_ib_schedule(). This fence is used to wait for job
completion.

Currently, the code drops the fence reference using dma_fence_put()
before calling dma_fence_wait().

If dma_fence_put() releases the last reference, the fence may be
freed before dma_fence_wait() is called. This can lead to a
use-after-free.

Fix this by waiting on the fence first and releasing the reference
only after dma_fence_wait() completes.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c:697 amdgpu_amdkfd_submit_ib() warn: passing freed memory 'f' (line 696)

Fixes: 9ae55f030d ("drm/amdgpu: Follow up change to previous drm scheduler change.")
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8b9e5259adc385b61a6590a13b82ae0ac2bd3482)
2026-03-24 13:53:33 -04:00
Weiming Shi
f6484cadbc ACPI: EC: clean up handlers on probe failure in acpi_ec_setup()
When ec_install_handlers() returns -EPROBE_DEFER on reduced-hardware
platforms, it has already started the EC and installed the address
space handler with the struct acpi_ec pointer as handler context.
However, acpi_ec_setup() propagates the error without any cleanup.

The caller acpi_ec_add() then frees the struct acpi_ec for non-boot
instances, leaving a dangling handler context in ACPICA.

Any subsequent AML evaluation that accesses an EC OpRegion field
dispatches into acpi_ec_space_handler() with the freed pointer,
causing a use-after-free:

 BUG: KASAN: slab-use-after-free in mutex_lock (kernel/locking/mutex.c:289)
 Write of size 8 at addr ffff88800721de38 by task init/1
 Call Trace:
  <TASK>
  mutex_lock (kernel/locking/mutex.c:289)
  acpi_ec_space_handler (drivers/acpi/ec.c:1362)
  acpi_ev_address_space_dispatch (drivers/acpi/acpica/evregion.c:293)
  acpi_ex_access_region (drivers/acpi/acpica/exfldio.c:246)
  acpi_ex_field_datum_io (drivers/acpi/acpica/exfldio.c:509)
  acpi_ex_extract_from_field (drivers/acpi/acpica/exfldio.c:700)
  acpi_ex_read_data_from_field (drivers/acpi/acpica/exfield.c:327)
  acpi_ex_resolve_node_to_value (drivers/acpi/acpica/exresolv.c:392)
  </TASK>

 Allocated by task 1:
  acpi_ec_alloc (drivers/acpi/ec.c:1424)
  acpi_ec_add (drivers/acpi/ec.c:1692)

 Freed by task 1:
  kfree (mm/slub.c:6876)
  acpi_ec_add (drivers/acpi/ec.c:1751)

The bug triggers on reduced-hardware EC platforms (ec->gpe < 0)
when the GPIO IRQ provider defers probing. Once the stale handler
exists, any unprivileged sysfs read that causes AML to touch an
EC OpRegion (battery, thermal, backlight) exercises the dangling
pointer.

Fix this by calling ec_remove_handlers() in the error path of
acpi_ec_setup() before clearing first_ec. ec_remove_handlers()
checks each EC_FLAGS_* bit before acting, so it is safe to call
regardless of how far ec_install_handlers() progressed:

  -ENODEV  (handler not installed): only calls acpi_ec_stop()
  -EPROBE_DEFER (handler installed): removes handler, stops EC

Fixes: 03e9a0e057 ("ACPI: EC: Consolidate event handler installation code")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Link: https://patch.msgid.link/20260324165458.1337233-2-bestswngs@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-24 18:47:10 +01:00
Amir Goldstein
53a7c171e9 ovl: fix wrong detection of 32bit inode numbers
The implicit FILEID_INO32_GEN encoder was changed to be explicit,
so we need to fix the detection.

When mounting overlayfs with upperdir and lowerdir on different ext4
filesystems, the expected kmsg log is:

  overlayfs: "xino" feature enabled using 32 upper inode bits.

But instead, since the regressing commit, the kmsg log was:

  overlayfs: "xino" feature enabled using 2 upper inode bits.

Fixes: e21fc2038c ("exportfs: make ->encode_fh() a mandatory method for NFS export")
Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2026-03-24 16:17:26 +01:00
Danilo Krummrich
cc34d77dd4 spi: use generic driver_override infrastructure
When a driver is probed through __driver_attach(), the bus' match()
callback is called without the device lock held, thus accessing the
driver_override field without a lock, which can cause a UAF.

Fix this by using the driver-core driver_override infrastructure taking
care of proper locking internally.

Note that calling match() from __driver_attach() without the device lock
held is intentional. [1]

Also note that we do not enable the driver_override feature of struct
bus_type, as SPI - in contrast to most other buses - passes "" to
sysfs_emit() when the driver_override pointer is NULL. Thus, printing
"\n" instead of "(null)\n".

Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1]
Reported-by: Gui-Dong Han <hanguidong02@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
Fixes: 5039563e7c ("spi: Add driver_override SPI device attribute")
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260324005919.2408620-12-dakr@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-24 15:00:08 +00:00
Sanman Pradhan
b0c9d8ae71 hwmon: (peci/cputemp) Fix off-by-one in cputemp_is_visible()
cputemp_is_visible() validates the channel index against
CPUTEMP_CHANNEL_NUMS, but currently uses '>' instead of '>='.
As a result, channel == CPUTEMP_CHANNEL_NUMS is not rejected even though
valid indices are 0 .. CPUTEMP_CHANNEL_NUMS - 1.

Fix the bounds check by using '>=' so invalid channel indices are
rejected before indexing the core bitmap.

Fixes: bf3608f338 ("hwmon: peci: Add cputemp driver")
Cc: stable@vger.kernel.org
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260323002352.93417-3-sanman.pradhan@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2026-03-24 07:55:34 -07:00
Sanman Pradhan
0adc752b4f hwmon: (peci/cputemp) Fix crit_hyst returning delta instead of absolute temperature
The hwmon sysfs ABI expects tempN_crit_hyst to report the temperature at
which the critical condition clears, not the hysteresis delta from the
critical limit.

The peci cputemp driver currently returns tjmax - tcontrol for
crit_hyst_type, which is the hysteresis margin rather than the
corresponding absolute temperature.

Return tcontrol directly, and update the documentation accordingly.

Fixes: bf3608f338 ("hwmon: peci: Add cputemp driver")
Cc: stable@vger.kernel.org
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260323002352.93417-2-sanman.pradhan@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2026-03-24 07:55:34 -07:00
Sanman Pradhan
3075a3951f hwmon: (pmbus/isl68137) Add mutex protection for AVS enable sysfs attributes
The custom avs0_enable and avs1_enable sysfs attributes access PMBus
registers through the exported API helpers (pmbus_read_byte_data,
pmbus_read_word_data, pmbus_write_word_data, pmbus_update_byte_data)
without holding the PMBus update_lock mutex. These exported helpers do
not acquire the mutex internally, unlike the core's internal callers
which hold the lock before invoking them.

The store callback is especially vulnerable: it performs a multi-step
read-modify-write sequence (read VOUT_COMMAND, write VOUT_COMMAND, then
update OPERATION) where concurrent access from another thread could
interleave and corrupt the register state.

Add pmbus_lock_interruptible()/pmbus_unlock() around both the show and
store callbacks to serialize PMBus register access with the rest of the
driver.

Fixes: 038a9c3d1e ("hwmon: (pmbus/isl68137) Add driver for Intersil ISL68137 PWM Controller")
Cc: stable@vger.kernel.org
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260319173055.125271-3-sanman.pradhan@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2026-03-24 07:55:34 -07:00
Sanman Pradhan
f7e775c469 hwmon: (pmbus/ina233) Fix error handling and sign extension in shunt voltage read
ina233_read_word_data() reads MFR_READ_VSHUNT via pmbus_read_word_data()
but has two issues:

1. The return value is not checked for errors before being used in
   arithmetic. A negative error code from a failed I2C transaction is
   passed directly to DIV_ROUND_CLOSEST(), producing garbage data.

2. MFR_READ_VSHUNT is a 16-bit two's complement value. Negative shunt
   voltages (values with bit 15 set) are treated as large positive
   values since pmbus_read_word_data() returns them zero-extended in an
   int. This leads to incorrect scaling in the VIN coefficient
   conversion.

Fix both issues by adding an error check, casting to s16 for proper
sign extension, and clamping the result to a valid non-negative range.
The clamp is necessary because read_word_data callbacks must return
non-negative values on success (negative values indicate errors to the
pmbus core).

Fixes: b64b6cb163 ("hwmon: Add driver for TI INA233 Current and Power Monitor")
Cc: stable@vger.kernel.org
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260319173055.125271-2-sanman.pradhan@hpe.com
[groeck: Fixed clamp to avoid losing the sign bit]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2026-03-24 07:55:08 -07:00
Matt Roper
56781a4597 drm/xe: Implement recent spec updates to Wa_16025250150
The hardware teams noticed that the originally documented workaround
steps for Wa_16025250150 may not be sufficient to fully avoid a hardware
issue.  The workaround documentation has been augmented to suggest
programming one additional register; make the corresponding change in
the driver.

Fixes: 7654d51f1f ("drm/xe/xe2hpg: Add Wa_16025250150")
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patch.msgid.link/20260319-wa_16025250150_part2-v1-1-46b1de1a31b2@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit a31566762d4075646a8a2214586158b681e94305)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-03-24 09:29:10 -04:00
Srinivas Kandagatla
cfb385a8dc ASoC: codecs: wcd934x: fix typo in dt parsing
Looks like we ended up with a typo during device tree data parsing
as part of 4f16b6351b ("ASoC: codecs: wcd: add common helper for wcd
codecs") patch.
 This will result in not parsing the device tree data and results in
zero mic bias values.

Fix this by calling wcd_dt_parse_micbias_info instead of
wcd_dt_parse_mbhc_data.

Fixes: 4f16b6351b ("ASoC: codecs: wcd: add common helper for wcd codecs")
Cc: Stable@vger.kernel.org
Reported-by: Joel Selvaraj <foss@joelselvaraj.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://patch.msgid.link/20260323231748.2217967-1-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-24 13:09:09 +00:00
Alice Ryhl
8121353a4b rust: regulator: do not assume that regulator_get() returns non-null
The Rust `Regulator` abstraction uses `NonNull` to wrap the underlying
`struct regulator` pointer. When `CONFIG_REGULATOR` is disabled, the C
stub for `regulator_get` returns `NULL`. `from_err_ptr` does not treat
`NULL` as an error, so it was passed to `NonNull::new_unchecked`,
causing undefined behavior.

Fix this by using a raw pointer `*mut bindings::regulator` instead of
`NonNull`. This allows `inner` to be `NULL` when `CONFIG_REGULATOR` is
disabled, and leverages the C stubs which are designed to handle `NULL`
or are no-ops.

Fixes: 9b614ceada ("rust: regulator: add a bare minimum regulator abstraction")
Reported-by: Miguel Ojeda <ojeda@kernel.org>
Closes: https://lore.kernel.org/r/20260322193830.89324-1-ojeda@kernel.org
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Link: https://patch.msgid.link/20260324-regulator-fix-v1-1-a5244afa3c15@google.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-24 13:08:21 +00:00
Jihed Chaibi
91049ec2e1 ASoC: dt-bindings: stm32: Fix incorrect compatible string in stm32h7-sai match
The conditional block that defines clock constraints for the stm32h7-sai
variant references "st,stm32mph7-sai", which does not match any compatible
string in the enum. As a result, clock validation for the h7 variant is
silently skipped. Correct the compatible string to "st,stm32h7-sai".

Fixes: 8509bb1f11 ("ASoC: dt-bindings: add stm32mp25 support for sai")
Signed-off-by: Jihed Chaibi <jihed.chaibi.dev@gmail.com>
Reviewed-by: Olivier Moysan <olivier.moysan@foss.st.com>
Link: https://patch.msgid.link/20260321012011.125791-1-jihed.chaibi.dev@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-24 12:58:35 +00:00
Karol Wachowski
e8ab57b564 accel/ivpu: Add disable clock relinquish workaround for NVL-A0
Turn on disable clock relinquish workaround for Nova Lake A0.
Without this workaround NPU may not power off correctly after
inference, leading to unexpected system behavior.

Fixes: 550f4dd2ce ("accel/ivpu: Add support for Nova Lake's NPU")
Cc: <stable@vger.kernel.org> # v6.19+

Reviewed-by: Lizhi.hou <lizhi.hou@amd.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20260323095029.64613-1-karol.wachowski@linux.intel.com
2026-03-24 09:29:28 +01:00
Darrick J. Wong
f621324dfb iomap: fix lockdep complaint when reads fail
Zorro Lang reported the following lockdep splat:

"While running fstests xfs/556 on kernel 7.0.0-rc4+ (HEAD=04a9f1766954),
a lockdep warning was triggered indicating an inconsistent lock state
for sb->s_type->i_lock_key.

"The deadlock might occur because iomap_read_end_io (called from a
hardware interrupt completion path) invokes fserror_report, which then
calls igrab.  igrab attempts to acquire the i_lock spinlock. However,
the i_lock is frequently acquired in process context with interrupts
enabled. If an interrupt occurs while a process holds the i_lock, and
that interrupt handler calls fserror_report, the system deadlocks.

"I hit this warning several times by running xfs/556 (mostly) or
generic/648 on xfs. More details refer to below console log."

along with this dmesg, for which I've cleaned up the stacktraces:

 run fstests xfs/556 at 2026-03-18 20:05:30
 XFS (sda3): Mounting V5 Filesystem 396e9164-c45a-4e05-be9d-b38c2c5c6477
 XFS (sda3): Ending clean mount
 XFS (sda3): Unmounting Filesystem 396e9164-c45a-4e05-be9d-b38c2c5c6477
 XFS (sda3): Mounting V5 Filesystem bf3f89c3-3c45-4650-a9c7-744f39c0191e
 XFS (sda3): Ending clean mount
 XFS (sda3): Unmounting Filesystem bf3f89c3-3c45-4650-a9c7-744f39c0191e
 XFS (dm-0): Mounting V5 Filesystem bf3f89c3-3c45-4650-a9c7-744f39c0191e
 XFS (dm-0): Ending clean mount
 device-mapper: table: 253:0: adding target device (start sect 209 len 1) caused an alignment inconsistency
 device-mapper: table: 253:0: adding target device (start sect 210 len 62914350) caused an alignment inconsistency
 buffer_io_error: 6 callbacks suppressed
 Buffer I/O error on dev dm-0, logical block 209, async page read
 Buffer I/O error on dev dm-0, logical block 209, async page read
 XFS (dm-0): Unmounting Filesystem bf3f89c3-3c45-4650-a9c7-744f39c0191e
 XFS (dm-0): Mounting V5 Filesystem bf3f89c3-3c45-4650-a9c7-744f39c0191e
 XFS (dm-0): Ending clean mount

 ================================
 WARNING: inconsistent lock state
 7.0.0-rc4+ #1 Tainted: G S      W
 --------------------------------
 inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
 od/2368602 [HC1[1]:SC0[0]:HE0:SE1] takes:
 ff1100069f2b4a98 (&sb->s_type->i_lock_key#31){?.+.}-{3:3}, at: igrab+0x28/0x1a0
 {HARDIRQ-ON-W} state was registered at:
   __lock_acquire+0x40d/0xbd0
   lock_acquire.part.0+0xbd/0x260
   _raw_spin_lock+0x37/0x80
   unlock_new_inode+0x66/0x2a0
   xfs_iget+0x67b/0x7b0 [xfs]
   xfs_mountfs+0xde4/0x1c80 [xfs]
   xfs_fs_fill_super+0xe86/0x17a0 [xfs]
   get_tree_bdev_flags+0x312/0x590
   vfs_get_tree+0x8d/0x2f0
   vfs_cmd_create+0xb2/0x240
   __do_sys_fsconfig+0x3d8/0x9a0
   do_syscall_64+0x13a/0x1520
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
 irq event stamp: 3118
 hardirqs last  enabled at (3117): [<ffffffffb54e4ad8>] _raw_spin_unlock_irq+0x28/0x50
 hardirqs last disabled at (3118): [<ffffffffb54b84c9>] common_interrupt+0x19/0xe0
 softirqs last  enabled at (3040): [<ffffffffb290ca28>] handle_softirqs+0x6b8/0x950
 softirqs last disabled at (3023): [<ffffffffb290ce4d>] __irq_exit_rcu+0xfd/0x250

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&sb->s_type->i_lock_key#31);
   <Interrupt>
     lock(&sb->s_type->i_lock_key#31);

  *** DEADLOCK ***

 1 lock held by od/2368602:
  #0: ff1100069f2b4b58 (&sb->s_type->i_mutex_key#19){++++}-{4:4}, at: xfs_ilock+0x324/0x4b0 [xfs]

 stack backtrace:
 CPU: 15 UID: 0 PID: 2368602 Comm: od Kdump: loaded Tainted: G S      W           7.0.0-rc4+ #1 PREEMPT(full)
 Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
 Hardware name: Dell Inc. PowerEdge R660/0R5JJC, BIOS 2.1.5 03/14/2024
 Call Trace:
  <IRQ>
  dump_stack_lvl+0x6f/0xb0
  print_usage_bug.part.0+0x230/0x2c0
  mark_lock_irq+0x3ce/0x5b0
  mark_lock+0x1cb/0x3d0
  mark_usage+0x109/0x120
  __lock_acquire+0x40d/0xbd0
  lock_acquire.part.0+0xbd/0x260
  _raw_spin_lock+0x37/0x80
  igrab+0x28/0x1a0
  fserror_report+0x127/0x2d0
  iomap_finish_folio_read+0x13c/0x280
  iomap_read_end_io+0x10e/0x2c0
  clone_endio+0x37e/0x780 [dm_mod]
  blk_update_request+0x448/0xf00
  scsi_end_request+0x74/0x750
  scsi_io_completion+0xe9/0x7c0
  _scsih_io_done+0x6ba/0x1ca0 [mpt3sas]
  _base_process_reply_queue+0x249/0x15b0 [mpt3sas]
  _base_interrupt+0x95/0xe0 [mpt3sas]
  __handle_irq_event_percpu+0x1f0/0x780
  handle_irq_event+0xa9/0x1c0
  handle_edge_irq+0x2ef/0x8a0
  __common_interrupt+0xa0/0x170
  common_interrupt+0xb7/0xe0
  </IRQ>
  <TASK>
  asm_common_interrupt+0x26/0x40
 RIP: 0010:_raw_spin_unlock_irq+0x2e/0x50
 Code: 0f 1f 44 00 00 53 48 8b 74 24 08 48 89 fb 48 83 c7 18 e8 b5 73 5e fd 48 89 df e8 ed e2 5e fd e8 08 78 8f fd fb bf 01 00 00 00 <e8> 8d 56 4d fd 65 8b 05 46 d5 1d 03 85 c0 74 06 5b c3 cc cc cc cc
 RSP: 0018:ffa0000027d07538 EFLAGS: 00000206
 RAX: 0000000000000c2d RBX: ffffffffb6614bc8 RCX: 0000000000000080
 RDX: 0000000000000000 RSI: ffffffffb6306a01 RDI: 0000000000000001
 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
 R10: ffffffffb75efc67 R11: 0000000000000001 R12: ff1100015ada0000
 R13: 0000000000000083 R14: 0000000000000002 R15: ffffffffb6614c10
  folio_wait_bit_common+0x407/0x780
  filemap_update_page+0x8e7/0xbd0
  filemap_get_pages+0x904/0xc50
  filemap_read+0x320/0xc20
  xfs_file_buffered_read+0x2aa/0x380 [xfs]
  xfs_file_read_iter+0x263/0x4a0 [xfs]
  vfs_read+0x6cb/0xb70
  ksys_read+0xf9/0x1d0
  do_syscall_64+0x13a/0x1520

Zorro's diagnosis makes sense, so the solution is to kick the failed
read handling to a workqueue much like we added for writeback ioends in
commit 294f54f849 ("fserror: fix lockdep complaint when igrabbing
inode").

Cc: Zorro Lang <zlang@redhat.com>
Link: https://lore.kernel.org/linux-xfs/20260319194303.efw4wcu7c4idhthz@doltdoltdolt/
Fixes: a9d573ee88 ("iomap: report file I/O errors to the VFS")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Link: https://patch.msgid.link/20260323210017.GL6223@frogsfrogsfrogs
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-24 09:14:46 +01:00
Imre Deak
77fcf58df1 drm/i915/dp_tunnel: Fix error handling when clearing stream BW in atomic state
Clearing the DP tunnel stream BW in the atomic state involves getting
the tunnel group state, which can fail. Handle the error accordingly.

This fixes at least one issue where drm_dp_tunnel_atomic_set_stream_bw()
failed to get the tunnel group state returning -EDEADLK, which wasn't
handled. This lead to the ctx->contended warn later in modeset_lock()
while taking a WW mutex for another object in the same atomic state, and
thus within the same already contended WW context.

Moving intel_crtc_state_alloc() later would avoid freeing saved_state on
the error path; this stable patch leaves that simplification for a
follow-up.

Cc: Uma Shankar <uma.shankar@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v6.9+
Fixes: a4efae87ec ("drm/i915/dp: Compute DP tunnel BW during encoder state computation")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7617
Reviewed-by: Michał Grzelak <michal.grzelak@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patch.msgid.link/20260320092900.13210-1-imre.deak@intel.com
(cherry picked from commit fb69d0076e687421188bc8103ab0e8e5825b1df1)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2026-03-24 08:00:00 +02:00
Lukas Wunner
05f643d6f7 Documentation: PCI: Document PCIe TLP Header decoder for AER messages
The prefix/header of a TLP that caused an error may be recorded in the AER
Capability and emitted to the kernel log in raw hex format.  Document the
existence and usage of tlp-tool, which decodes the TLP Header into
human-readable form.

The TLP Header hints at the root cause of an error, yet is often ignored
because of its seeming opaqueness.  Instead, PCIe errors are frequently
worked around by a change in the kernel without fully understanding the
actual source of the problem.  With more documentation on available tools
we'll hopefully come up with better solutions.

There are also wireshark dissectors for TLPs, but it seems they expect a
complete TLP, not just the header, and they cannot grok the hex format
emitted by the kernel directly.  tlp-tool appears to be the most cut and
dried solution out there.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Maciej Grochowski <mx2pg@pm.me>
Link: https://patch.msgid.link/bf826c41b4c1d255c7dcb16e266b52f774d944ed.1774246067.git.lukas@wunner.de
2026-03-23 15:58:02 -05:00
Felix Gu
70bb843794 PCI/pwrctrl: Fix pci_pwrctrl_is_required() device node leak
The for_each_endpoint_of_node() macro requires calling of_node_put() on the
endpoint node when breaking out of the loop early.

Add of_node_put(endpoint) before the early return to release the reference.

Fixes: cf3287fb2c ("PCI/pwrctrl: Ensure that remote endpoint node parent has supply requirement")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20260323-pwctrl-v1-1-f5c03a2df7fb@gmail.com
2026-03-23 15:57:43 -05:00
Chen-Yu Tsai
2d8c5098b8 PCI/pwrctrl: Do not power off on pwrctrl device removal
With the move to explicit pwrctrl power on/off APIs, the caller, i.e., the
PCI controller driver, should manage the power state. The pwrctrl drivers
should not try to clean up or power off when they are removed, as this
might end up disabling an already disabled regulator, causing a big
warning.  This can be triggered if a PCI controller driver's .remove()
callback calls pci_pwrctrl_destroy_devices() after
pci_pwrctrl_power_off_devices().

Drop the devm cleanup parts that turn off regulators from the pwrctrl
drivers.

Fixes: b921aa3f8d ("PCI/pwrctrl: Switch to pwrctrl create, power on/off, destroy APIs")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20260226092234.3859740-1-wenst@chromium.org
2026-03-23 15:25:32 -05:00
Filipe Manana
1c37d896b1 btrfs: fix lost error when running device stats on multiple devices fs
Whenever we get an error updating the device stats item for a device in
btrfs_run_dev_stats() we allow the loop to go to the next device, and if
updating the stats item for the next device succeeds, we end up losing
the error we had from the previous device.

Fix this by breaking out of the loop once we get an error and make sure
it's returned to the caller. Since we are in the transaction commit path
(and in the critical section actually), returning the error will result
in a transaction abort.

Fixes: 733f4fbbc1 ("Btrfs: read device stats on mount, write modified ones during commit")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2026-03-23 20:12:12 +01:00
Goldwyn Rodrigues
a85b46db14 btrfs: tracepoints: get correct superblock from dentry in event btrfs_sync_file()
If overlay is used on top of btrfs, dentry->d_sb translates to overlay's
super block and fsid assignment will lead to a crash.

Use file_inode(file)->i_sb to always get btrfs_sb.

Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2026-03-23 20:11:12 +01:00
Qu Wenruo
0dcabcb920 btrfs: zlib: handle page aligned compressed size correctly
[BUG]
Since commit 3d74a7556f ("btrfs: zlib: introduce zlib_compress_bio()
helper"), there are some reports about different crashes in zlib
compression path. One of the symptoms is list corruption like the
following:

  list_del corruption. next->prev should be fffffbb340204a08, but was ffff8d6517cb7de0. (next=fffffbb3402d62c8)
  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:65!
  Oops: invalid opcode: 0000 [#1] SMP NOPTI
  CPU: 1 UID: 0 PID: 21436 Comm: kworker/u16:7 Not tainted 7.0.0-rc2-jcg+ #1 PREEMPT
  Hardware name: LENOVO 10VGS02P00/3130, BIOS M1XKT57A 02/10/2022
  Workqueue: btrfs-delalloc btrfs_work_helper [btrfs]
  RIP: 0010:__list_del_entry_valid_or_report+0xec/0xf0
  Call Trace:
   <TASK>
   btrfs_alloc_compr_folio+0xae/0xc0 [btrfs]
   zlib_compress_bio+0x39d/0x6a0 [btrfs]
   btrfs_compress_bio+0x2e3/0x3d0 [btrfs]
   compress_file_range+0x2b0/0x660 [btrfs]
   btrfs_work_helper+0xdb/0x3e0 [btrfs]
   process_one_work+0x192/0x3d0
   worker_thread+0x19a/0x310
   kthread+0xdf/0x120
   ret_from_fork+0x22e/0x310
   ret_from_fork_asm+0x1a/0x30
   </TASK>
  ---[ end trace 0000000000000000 ]---

Other symptoms include VM_BUG_ON() during folio_put() but it's rarer.

David Sterba firstly reported this during his CI runs but unfortunately
I'm unable to hit it.

Meanwhile zstd/lzo doesn't seem to have the same problem.

[CAUSE]
During zlib_compress_bio() every time the output buffer is full, we
queue the full folio into the compressed bio, and allocate a new folio
as the output folio.

After the input has finished, we loop through zlib_deflate() with
Z_FINISH to flush all output.

And when that is done, we still need to check if the last folio has any
content, and if so we still need to queue that part into the compressed
bio.

The problem is in the final folio handling, if the final folio is full
(for x86_64 the folio size is 4K), the length to queue is calculated by

  u32 cur_len = offset_in_folio(out_folio, workspace->strm.total_out);

But since total_out is 4K aligned, the resulted @cur_len will be 0, then
we hit the bio_add_folio(), which has a quirk that if bio_add_folio()
got an length 0, it will still queue the folio into the bio, but return
false.

In that case we go to out: tag, which calls btrfs_free_compr_folio() to
release @out_folio, which may put the out folio into the btrfs global
pool list.

On the other hand, that @out_folio is already added to the
compressed bio, and will later be released again by
cleanup_compressed_bio(), which results double release.

And if this time we still need to put the folio into the btrfs global
pool list, it will result a list corruption because it's already in the
list.

[FIX]
Instead of offset_inside_folio(), directly use the difference between
strm.total_out and bi_size.
So that if the last folio is completely full, we can still properly
queue the full folio other than queueing zero byte.

Fixes: 3d74a7556f ("btrfs: zlib: introduce zlib_compress_bio() helper")
Reported-by: David Sterba <dsterba@suse.com>
Reported-by: Jean-Christophe Guillain <jean-christophe@guillain.net>
Reported-by: syzbot+3c4d8371d65230f852a2@syzkaller.appspotmail.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=221176
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2026-03-23 20:11:07 +01:00
Shin'ichiro Kawasaki
a4376d9a5d btrfs: fix leak of kobject name for sub-group space_info
When create_space_info_sub_group() allocates elements of
space_info->sub_group[], kobject_init_and_add() is called for each
element via btrfs_sysfs_add_space_info_type(). However, when
check_removing_space_info() frees these elements, it does not call
btrfs_sysfs_remove_space_info() on them. As a result, kobject_put() is
not called and the associated kobj->name objects are leaked.

This memory leak is reproduced by running the blktests test case
zbd/009 on kernels built with CONFIG_DEBUG_KMEMLEAK. The kmemleak
feature reports the following error:

unreferenced object 0xffff888112877d40 (size 16):
  comm "mount", pid 1244, jiffies 4294996972
  hex dump (first 16 bytes):
    64 61 74 61 2d 72 65 6c 6f 63 00 c4 c6 a7 cb 7f  data-reloc......
  backtrace (crc 53ffde4d):
    __kmalloc_node_track_caller_noprof+0x619/0x870
    kstrdup+0x42/0xc0
    kobject_set_name_vargs+0x44/0x110
    kobject_init_and_add+0xcf/0x150
    btrfs_sysfs_add_space_info_type+0xfc/0x210 [btrfs]
    create_space_info_sub_group.constprop.0+0xfb/0x1b0 [btrfs]
    create_space_info+0x211/0x320 [btrfs]
    btrfs_init_space_info+0x15a/0x1b0 [btrfs]
    open_ctree+0x33c7/0x4a50 [btrfs]
    btrfs_get_tree.cold+0x9f/0x1ee [btrfs]
    vfs_get_tree+0x87/0x2f0
    vfs_cmd_create+0xbd/0x280
    __do_sys_fsconfig+0x3df/0x990
    do_syscall_64+0x136/0x1540
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

To avoid the leak, call btrfs_sysfs_remove_space_info() instead of
kfree() for the elements.

Fixes: f92ee31e03 ("btrfs: introduce btrfs_space_info sub-group")
Link: https://lore.kernel.org/linux-block/b9488881-f18d-4f47-91a5-3c9bf63955a5@wdc.com/
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2026-03-23 20:10:27 +01:00
Filipe Manana
5254d4181a btrfs: fix zero size inode with non-zero size after log replay
When logging that an inode exists, as part of logging a new name or
logging new dir entries for a directory, we always set the generation of
the logged inode item to 0. This is to signal during log replay (in
overwrite_item()), that we should not set the i_size since we only logged
that an inode exists, so the i_size of the inode in the subvolume tree
must be preserved (as when we log new names or that an inode exists, we
don't log extents).

This works fine except when we have already logged an inode in full mode
or it's the first time we are logging an inode created in a past
transaction, that inode has a new i_size of 0 and then we log a new name
for the inode (due to a new hardlink or a rename), in which case we log
an i_size of 0 for the inode and a generation of 0, which causes the log
replay code to not update the inode's i_size to 0 (in overwrite_item()).

An example scenario:

  mkdir /mnt/dir
  xfs_io -f -c "pwrite 0 64K" /mnt/dir/foo

  sync

  xfs_io -c "truncate 0" -c "fsync" /mnt/dir/foo

  ln /mnt/dir/foo /mnt/dir/bar

  xfs_io -c "fsync" /mnt/dir

  <power fail>

After log replay the file remains with a size of 64K. This is because when
we first log the inode, when we fsync file foo, we log its current i_size
of 0, and then when we create a hard link we log again the inode in exists
mode (LOG_INODE_EXISTS) but we set a generation of 0 for the inode item we
add to the log tree, so during log replay overwrite_item() sees that the
generation is 0 and i_size is 0 so we skip updating the inode's i_size
from 64K to 0.

Fix this by making sure at fill_inode_item() we always log the real
generation of the inode if it was logged in the current transaction with
the i_size we logged before. Also if an inode created in a previous
transaction is logged in exists mode only, make sure we log the i_size
stored in the inode item located from the commit root, so that if we log
multiple times that the inode exists we get the correct i_size.

A test case for fstests will follow soon.

Reported-by: Vyacheslav Kovalevsky <slava.kovalevskiy.2014@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/af8c15fa-4e41-4bb2-885c-0bc4e97532a6@gmail.com/
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2026-03-23 20:09:33 +01:00
Mark Harmstone
b52fe51f72 btrfs: fix super block offset in error message in btrfs_validate_super()
Fix the superblock offset mismatch error message in
btrfs_validate_super(): we changed it so that it considers all the
superblocks, but the message still assumes we're only looking at the
first one.

The change from %u to %llu is because we're changing from a constant to
a u64.

Fixes: 069ec957c3 ("btrfs: Refactor btrfs_check_super_valid")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Mark Harmstone <mark@harmstone.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2026-03-23 20:09:08 +01:00
Yang Wang
3e6dd28a11 drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v13
Forcibly disable the OD_FAN_CURVE feature when temperature or PWM range is invalid,
otherwise PMFW will reject this configuration on smu v13.0.x

example:
$ sudo cat /sys/bus/pci/devices/<BDF>/gpu_od/fan_ctrl/fan_curve

OD_FAN_CURVE:
0: 0C 0%
1: 0C 0%
2: 0C 0%
3: 0C 0%
4: 0C 0%
OD_RANGE:
FAN_CURVE(hotspot temp): 0C 0C
FAN_CURVE(fan speed): 0% 0%

$ echo "0 50 40" | sudo tee fan_curve

kernel log:
[  756.442527] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!
[  777.345800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!

Closes: https://github.com/ROCm/amdgpu/issues/208
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 470891606c5a97b1d0d937e0aa67a3bed9fcb056)
Cc: stable@vger.kernel.org
2026-03-23 14:49:15 -04:00
Asad Kamal
2f0e491fae drm/amd/pm: Return -EOPNOTSUPP for unsupported OD_MCLK on smu_v13_0_6
When SET_UCLK_MAX capability is absent, return -EOPNOTSUPP from
smu_v13_0_6_emit_clk_levels() for OD_MCLK instead of 0. This makes
unsupported OD_MCLK reporting consistent with other clock types
and allows callers to skip the entry cleanly.

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d82e0a72d9189e8acd353988e1a57f85ce479e37)
Cc: stable@vger.kernel.org
2026-03-23 14:48:52 -04:00
Asad Kamal
cdbc3b62cf drm/amd/pm: Skip redundant UCLK restore in smu_v13_0_6
Only reapply UCLK soft limits during PP_OD_RESTORE_DEFAULT when the
current max differs from the DPM table max. This avoids redundant
SMC updates and prevents -EINVAL on restore when no change is needed.

Fixes: b7a9003445 ("drm/amd/pm: Allow setting max UCLK on SMU v13.0.6")
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 17f11bbbc76c8e83c8474ea708316b1e3631d927)
2026-03-23 14:48:45 -04:00
Alex Hung
37c2caa167 drm/amd/display: Fix drm_edid leak in amdgpu_dm
[WHAT]
When a sink is connected, aconnector->drm_edid was overwritten without
freeing the previous allocation, causing a memory leak on resume.

[HOW]
Free the previous drm_edid before updating it.

Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 52024a94e7111366141cfc5d888b2ef011f879e5)
Cc: stable@vger.kernel.org
2026-03-23 14:48:32 -04:00
Eric Huang
14b81abe7b drm/amdgpu: prevent immediate PASID reuse case
PASID resue could cause interrupt issue when process
immediately runs into hw state left by previous
process exited with the same PASID, it's possible that
page faults are still pending in the IH ring buffer when
the process exits and frees up its PASID. To prevent the
case, it uses idr cyclic allocator same as kernel pid's.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8f1de51f49be692de137c8525106e0fce2d1912d)
Cc: stable@vger.kernel.org
2026-03-23 14:48:06 -04:00
Ruijing Dong
2d300ebfc4 drm/amdgpu: fix strsep() corrupting lockup_timeout on multi-GPU (v3)
amdgpu_device_get_job_timeout_settings() passes a pointer directly
to the global amdgpu_lockup_timeout[] buffer into strsep().
strsep() destructively replaces delimiter characters with '\0'
in-place.

On multi-GPU systems, this function is called once per device.
When a multi-value setting like "0,0,0,-1" is used, the first
GPU's call transforms the global buffer into "0\00\00\0-1". The
second GPU then sees only "0" (terminated at the first '\0'),
parses a single value, hits the single-value fallthrough
(index == 1), and applies timeout=0 to all rings — causing
immediate false job timeouts.

Fix this by copying into a stack-local array before calling
strsep(), so the global module parameter buffer remains intact
across calls. The buffer is AMDGPU_MAX_TIMEOUT_PARAM_LENGTH
(256) bytes, which is safe for the stack.

v2: wrap commit message to 72 columns, add Assisted-by tag.
v3: use stack array with strscpy() instead of kstrdup()/kfree()
    to avoid unnecessary heap allocation (Christian).

This patch was developed with assistance from Claude (claude-opus-4-6).

Assisted-by: Claude:claude-opus-4-6
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 94d79f51efecb74be1d88dde66bdc8bfcca17935)
Cc: stable@vger.kernel.org
2026-03-23 14:47:47 -04:00
Yussuf Khalil
aed3d041ab drm/amd/display: Do not skip unrelated mode changes in DSC validation
Starting with commit 17ce8a6907 ("drm/amd/display: Add dsc pre-validation in
atomic check"), amdgpu resets the CRTC state mode_changed flag to false when
recomputing the DSC configuration results in no timing change for a particular
stream.

However, this is incorrect in scenarios where a change in MST/DSC configuration
happens in the same KMS commit as another (unrelated) mode change. For example,
the integrated panel of a laptop may be configured differently (e.g., HDR
enabled/disabled) depending on whether external screens are attached. In this
case, plugging in external DP-MST screens may result in the mode_changed flag
being dropped incorrectly for the integrated panel if its DSC configuration
did not change during precomputation in pre_validate_dsc().

At this point, however, dm_update_crtc_state() has already created new streams
for CRTCs with DSC-independent mode changes. In turn,
amdgpu_dm_commit_streams() will never release the old stream, resulting in a
memory leak. amdgpu_dm_atomic_commit_tail() will never acquire a reference to
the new stream either, which manifests as a use-after-free when the stream gets
disabled later on:

BUG: KASAN: use-after-free in dc_stream_release+0x25/0x90 [amdgpu]
Write of size 4 at addr ffff88813d836524 by task kworker/9:9/29977

Workqueue: events drm_mode_rmfb_work_fn
Call Trace:
 <TASK>
 dump_stack_lvl+0x6e/0xa0
 print_address_description.constprop.0+0x88/0x320
 ? dc_stream_release+0x25/0x90 [amdgpu]
 print_report+0xfc/0x1ff
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? __virt_addr_valid+0x225/0x4e0
 ? dc_stream_release+0x25/0x90 [amdgpu]
 kasan_report+0xe1/0x180
 ? dc_stream_release+0x25/0x90 [amdgpu]
 kasan_check_range+0x125/0x200
 dc_stream_release+0x25/0x90 [amdgpu]
 dc_state_destruct+0x14d/0x5c0 [amdgpu]
 dc_state_release.part.0+0x4e/0x130 [amdgpu]
 dm_atomic_destroy_state+0x3f/0x70 [amdgpu]
 drm_atomic_state_default_clear+0x8ee/0xf30
 ? drm_mode_object_put.part.0+0xb1/0x130
 __drm_atomic_state_free+0x15c/0x2d0
 atomic_remove_fb+0x67e/0x980

Since there is no reliable way of figuring out whether a CRTC has unrelated
mode changes pending at the time of DSC validation, remember the value of the
mode_changed flag from before the point where a CRTC was marked as potentially
affected by a change in DSC configuration. Reset the mode_changed flag to this
earlier value instead in pre_validate_dsc().

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5004
Fixes: 17ce8a6907 ("drm/amd/display: Add dsc pre-validation in atomic check")
Signed-off-by: Yussuf Khalil <dev@pp3345.net>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit cc7c7121ae082b7b82891baa7280f1ff2608f22b)
2026-03-23 14:38:56 -04:00
Felix Gu
63542bb402 spi: meson-spicc: Fix double-put in remove path
meson_spicc_probe() registers the controller with
devm_spi_register_controller(), so teardown already drops the
controller reference via devm cleanup.

Calling spi_controller_put() again in meson_spicc_remove()
causes a double-put.

Fixes: 8311ee2164 ("spi: meson-spicc: fix memory leak in meson_spicc_remove")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260322-rockchip-v1-1-fac3f0c6dad8@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-23 18:32:05 +00:00
Cezary Rojewski
5a184f1cb4 ASoC: Intel: catpt: Fix the device initialization
The DMA mask shall be coerced before any buffer allocations for the
device are done.  At the same time explain why DMA mask of 31 bits is
used in the first place.

Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Fixes: 7a10b66a5d ("ASoC: Intel: catpt: Device driver lifecycle")
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20260320101217.1243688-1-cezary.rojewski@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-23 17:32:27 +00:00
Felix Gu
a42c9b8b0c spi: sn-f-ospi: Use devm_mutex_init() to simplify code
Switch to devm_mutex_init() to handle mutex destruction automatically.
This simplifies the error paths in probe() and removes the need for an
explicit mutex_destroy() in remove() callback.

Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Link: https://patch.msgid.link/20260319-sn-f-v1-2-33a6738d2da8@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-23 14:51:59 +00:00
Felix Gu
ef3d549e1d spi: sn-f-ospi: Fix resource leak in f_ospi_probe()
In f_ospi_probe(), when num_cs validation fails, it returns without
calling spi_controller_put() on the SPI controller, which causes a
resource leak.

Use devm_spi_alloc_host() instead of spi_alloc_host() to ensure the
SPI controller is properly freed when probe fails.

Fixes: 1b74dd64c8 ("spi: Add Socionext F_OSPI SPI flash controller driver")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Link: https://patch.msgid.link/20260319-sn-f-v1-1-33a6738d2da8@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-23 14:51:59 +00:00
Michał Winiarski
87997b6c65 drm/xe/pf: Fix use-after-free in migration restore
When an error is returned from xe_sriov_pf_migration_restore_produce(),
the data pointer is not set to NULL, which can trigger use-after-free
in subsequent .write() calls.
Set the pointer to NULL upon error to fix the problem.

Fixes: 1ed30397c0 ("drm/xe/pf: Add support for encap/decap of bitstream to/from packet")
Reported-by: Sebastian Österlund <sebastian.osterlund@intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7230
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260217154118.176902-1-michal.winiarski@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
(cherry picked from commit 4f53d8c6d23527d734fe3531d08e15cb170a0819)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-03-23 10:10:50 -04:00
Peter Zijlstra
a3e93cac25 x86/cpu: Add comment clarifying CRn pinning
To avoid future confusion on the purpose and design of the CRn pinning code.

Also note that if the attacker controls page-tables, the CRn bits lose much of
the attraction anyway.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260320092521.GG3739106@noisy.programming.kicks-ass.net
2026-03-23 14:25:53 +01:00
Nikunj A Dadhania
3645eb7e39 x86/fred: Fix early boot failures on SEV-ES/SNP guests
FRED-enabled SEV-(ES,SNP) guests fail to boot due to the following issues
in the early boot sequence:

* FRED does not have a #VC exception handler in the dispatch logic

* Early FRED #VC exceptions attempt to use uninitialized per-CPU GHCBs
  instead of boot_ghcb

Add X86_TRAP_VC case to fred_hwexc() with a new exc_vmm_communication()
function that provides the unified entry point FRED requires, dispatching
to existing user/kernel handlers based on privilege level. The function is
already declared via DECLARE_IDTENTRY_VC().

Fix early GHCB access by falling back to boot_ghcb in
__sev_{get,put}_ghcb() when per-CPU GHCBs are not yet initialized.

Fixes: 14619d912b ("x86/fred: FRED entry/exit and dispatch code")
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@kernel.org>  # 6.12+
Link: https://patch.msgid.link/20260318075654.1792916-4-nikunj@amd.com
2026-03-23 14:18:18 +01:00
Huiwen He
34420cb92d smb/client: ensure smb2_mapping_table rebuild on cmd changes
The current rule for smb2_mapping_table.c uses `$(call cmd,...)`, which
fails to track command line modifications in the Makefile (e.g., modifying
the command to `perl -d` or `perl -w` for debug will not trigger a rebuild)
and does not generate the required .cmd file for Kbuild.

Fix this by transitioning to the standard `$(call if_changed,...)` macro.
This includes adding the `FORCE` prerequisite and appending the output
file to the `targets` variable so Kbuild can track it properly.

As a result, Kbuild now automatically handles the cleaning of the
generated file, allowing us to safely drop the redundant `clean-files`
assignment.

Fixes: c527e13a7a ("cifs: Autogenerate SMB2 error mapping table")
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-03-23 08:17:26 -05:00
Borislav Petkov (AMD)
411df123c0 x86/cpu: Remove X86_CR4_FRED from the CR4 pinned bits mask
Commit in Fixes added the FRED CR4 bit to the CR4 pinned bits mask so
that whenever something else modifies CR4, that bit remains set. Which
in itself is a perfectly fine idea.

However, there's an issue when during boot FRED is initialized: first on
the BSP and later on the APs. Thus, there's a window in time when
exceptions cannot be handled.

This becomes particularly nasty when running as SEV-{ES,SNP} or TDX
guests which, when they manage to trigger exceptions during that short
window described above, triple fault due to FRED MSRs not being set up
yet.

See Link tag below for a much more detailed explanation of the
situation.

So, as a result, the commit in that Link URL tried to address this
shortcoming by temporarily disabling CR4 pinning when an AP is not
online yet.

However, that is a problem in itself because in this case, an attack on
the kernel needs to only modify the online bit - a single bit in RW
memory - and then disable CR4 pinning and then disable SM*P, leading to
more and worse things to happen to the system.

So, instead, remove the FRED bit from the CR4 pinning mask, thus
obviating the need to temporarily disable CR4 pinning.

If someone manages to disable FRED when poking at CR4, then
idt_invalidate() would make sure the system would crash'n'burn on the
first exception triggered, which is a much better outcome security-wise.

Fixes: ff45746fbf ("x86/cpu: Add X86_CR4_FRED macro")
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org> # 6.12+
Link: https://lore.kernel.org/r/177385987098.1647592.3381141860481415647.tip-bot2@tip-bot2
2026-03-23 14:12:03 +01:00
Youngjun Park
a8d51efb59 PM: sleep: Drop spurious WARN_ON() from pm_restore_gfp_mask()
Commit 35e4a69b20 ("PM: sleep: Allow pm_restrict_gfp_mask()
stacking") introduced refcount-based GFP mask management that warns
when pm_restore_gfp_mask() is called with saved_gfp_count == 0.

Some hibernation paths call pm_restore_gfp_mask() defensively where
the GFP mask may or may not be restricted depending on the execution
path. For example, the uswsusp interface invokes it in
SNAPSHOT_CREATE_IMAGE, SNAPSHOT_UNFREEZE, and snapshot_release().
Before the stacking change this was a silent no-op; it now triggers
a spurious WARNING.

Remove the WARN_ON() wrapper from the !saved_gfp_count check while
retaining the check itself, so that defensive calls remain harmless
without producing false warnings.

Fixes: 35e4a69b20 ("PM: sleep: Allow pm_restrict_gfp_mask() stacking")
Signed-off-by: Youngjun Park <youngjun.park@lge.com>
[ rjw: Subject tweak ]
Link: https://patch.msgid.link/20260322120528.750178-1-youngjun.park@lge.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-23 13:54:53 +01:00
Alberto Garcia
734eba62cd PM: hibernate: Drain trailing zero pages on userspace restore
Commit 005e8dddd4 ("PM: hibernate: don't store zero pages in the
image file") added an optimization to skip zero-filled pages in the
hibernation image. On restore, zero pages are handled internally by
snapshot_write_next() in a loop that processes them without returning
to the caller.

With the userspace restore interface, writing the last non-zero page
to /dev/snapshot is followed by the SNAPSHOT_ATOMIC_RESTORE ioctl. At
this point there are no more calls to snapshot_write_next() so any
trailing zero pages are not processed, snapshot_image_loaded() fails
because handle->cur is smaller than expected, the ioctl returns -EPERM
and the image is not restored.

The in-kernel restore path is not affected by this because the loop in
load_image() in swap.c calls snapshot_write_next() until it returns 0.
It is this final call that drains any trailing zero pages.

Fixed by calling snapshot_write_next() in snapshot_write_finalize(),
giving the kernel the chance to drain any trailing zero pages.

Fixes: 005e8dddd4 ("PM: hibernate: don't store zero pages in the image file")
Signed-off-by: Alberto Garcia <berto@igalia.com>
Acked-by: Brian Geffon <bgeffon@google.com>
Link: https://patch.msgid.link/ef5a7c5e3e3dbd17dcb20efaa0c53a47a23498bb.1773075892.git.berto@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-23 13:33:06 +01:00
Viresh Kumar
6a28fb8cb2 cpufreq: conservative: Reset requested_freq on limits change
A recently reported issue highlighted that the cached requested_freq
is not guaranteed to stay in sync with policy->cur. If the platform
changes the actual CPU frequency after the governor sets one (e.g.
due to platform-specific frequency scaling) and a re-sync occurs
later, policy->cur may diverge from requested_freq.

This can lead to incorrect behavior in the conservative governor.
For example, the governor may assume the CPU is already running at
the maximum frequency and skip further increases even though there
is still headroom.

Avoid this by resetting the cached requested_freq to policy->cur on
detecting a change in policy limits.

Reported-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Tested-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://lore.kernel.org/all/20260210115458.3493646-1-zhenglifeng1@huawei.com/
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Cc: All applicable <stable@vger.kernel.org>
Link: https://patch.msgid.link/d846a141a98ac0482f20560fcd7525c0f0ec2f30.1773999467.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-23 13:32:57 +01:00
Viresh Kumar
8f13c0c6cb cpufreq: Don't skip cpufreq_frequency_table_cpuinfo()
The commit 6db0f533d3 ("cpufreq: preserve freq_table_sorted
across suspend/hibernate") unintentionally made a change where
cpufreq_frequency_table_cpuinfo() isn't getting called anymore
for old policies getting re-initialized.

This leads to potentially invalid values of policy->max and
policy->cpuinfo_max_freq.

Fix the issue by reverting the original commit and adding the condition
for just the sorting function.

Fixes: 6db0f533d3 ("cpufreq: preserve freq_table_sorted across suspend/hibernate")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 6.19+ <stable@vger.kernel.org> # 6.19+
Link: https://patch.msgid.link/65ba5c45749267c82e8a87af3dc788b37a0b3f48.1773998611.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-23 13:32:57 +01:00
Nikunj A Dadhania
05243d490b x86/cpu: Enable FSGSBASE early in cpu_init_exception_handling()
Move FSGSBASE enablement from identify_cpu() to cpu_init_exception_handling()
to ensure it is enabled before any exceptions can occur on both boot and
secondary CPUs.

== Background ==

Exception entry code (paranoid_entry()) uses ALTERNATIVE patching based on
X86_FEATURE_FSGSBASE to decide whether to use RDGSBASE/WRGSBASE instructions
or the slower RDMSR/SWAPGS sequence for saving/restoring GSBASE.

On boot CPU, ALTERNATIVE patching happens after enabling FSGSBASE in CR4.
When the feature is available, the code is permanently patched to use
RDGSBASE/WRGSBASE, which require CR4.FSGSBASE=1 to execute without triggering

== Boot Sequence ==

Boot CPU (with CR pinning enabled):
  trap_init()
    cpu_init()                   <- Uses unpatched code (RDMSR/SWAPGS)
      x2apic_setup()
  ...
  arch_cpu_finalize_init()
    identify_boot_cpu()
      identify_cpu()
        cr4_set_bits(X86_CR4_FSGSBASE)  # Enables the feature
	# This becomes part of cr4_pinned_bits
    ...
    alternative_instructions()   <- Patches code to use RDGSBASE/WRGSBASE

Secondary CPUs (with CR pinning enabled):
  start_secondary()
    cr4_init()                   <- Code already patched, CR4.FSGSBASE=1
                                    set implicitly via cr4_pinned_bits

    cpu_init()                   <- exceptions work because FSGSBASE is
                                    already enabled

Secondary CPU (with CR pinning disabled):
  start_secondary()
    cr4_init()                   <- Code already patched, CR4.FSGSBASE=0
    cpu_init()
      x2apic_setup()
        rdmsrq(MSR_IA32_APICBASE)  <- Triggers #VC in SNP guests
          exc_vmm_communication()
            paranoid_entry()       <- Uses RDGSBASE with CR4.FSGSBASE=0
                                      (patched code)
    ...
    ap_starting()
      identify_secondary_cpu()
        identify_cpu()
	  cr4_set_bits(X86_CR4_FSGSBASE)  <- Enables the feature, which is
                                             too late

== CR Pinning ==

Currently, for secondary CPUs, CR4.FSGSBASE is set implicitly through
CR-pinning: the boot CPU sets it during identify_cpu(), it becomes part of
cr4_pinned_bits, and cr4_init() applies those pinned bits to secondary CPUs.
This works but creates an undocumented dependency between cr4_init() and the
pinning mechanism.

== Problem ==

Secondary CPUs boot after alternatives have been applied globally. They
execute already-patched paranoid_entry() code that uses RDGSBASE/WRGSBASE
instructions, which require CR4.FSGSBASE=1. Upcoming changes to CR pinning
behavior will break the implicit dependency, causing secondary CPUs to
generate #UD.

This issue manifests itself on AMD SEV-SNP guests, where the rdmsrq() in
x2apic_setup() triggers a #VC exception early during cpu_init(). The #VC
handler (exc_vmm_communication()) executes the patched paranoid_entry() path.
Without CR4.FSGSBASE enabled, RDGSBASE instructions trigger #UD.

== Fix ==

Enable FSGSBASE explicitly in cpu_init_exception_handling() before loading
exception handlers. This makes the dependency explicit and ensures both
boot and secondary CPUs have FSGSBASE enabled before paranoid_entry()
executes.

Fixes: c82965f9e5 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit")
Reported-by: Borislav Petkov <bp@alien8.de>
Suggested-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Cc: <stable@kernel.org>
Link: https://patch.msgid.link/20260318075654.1792916-2-nikunj@amd.com
2026-03-23 13:29:50 +01:00
Long Li
c6c56ff975 xfs: remove redundant validation in xlog_recover_attri_commit_pass2
Remove the redundant post-parse validation switch. By the time that
block is reached, xfs_attri_validate() has already guaranteed all name
lengths are non-zero via xfs_attri_validate_namelen(), and
xfs_attri_validate_name_iovec() has already returned -EFSCORRUPTED for
NULL names. For the REMOVE case, attr_value and value_len are
structurally guaranteed to be NULL/zero because the parsing loop only
populates them when value_len != 0. All checks in that switch are
therefore dead code.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-23 11:00:08 +01:00
Long Li
d72f2084e3 xfs: fix ri_total validation in xlog_recover_attri_commit_pass2
The ri_total checks for SET/REPLACE operations are hardcoded to 3,
but xfs_attri_item_size() only emits a value iovec when value_len > 0,
so ri_total is 2 when value_len == 0.

For PPTR_SET/PPTR_REMOVE/PPTR_REPLACE, value_len is validated by
xfs_attri_validate() to be exactly sizeof(struct xfs_parent_rec) and
is never zero, so their hardcoded checks remain correct.

This problem may cause log recovery failures. The following script can be
used to reproduce the problem:

 #!/bin/bash
 mkfs.xfs -f /dev/sda
 mount /dev/sda /mnt/test/
 touch /mnt/test/file
 for i in {1..200}; do
         attr -s "user.attr_$i" -V "value_$i" /mnt/test/file > /dev/null
 done
 echo 1 > /sys/fs/xfs/debug/larp
 echo 1 > /sys/fs/xfs/sda/errortag/larp
 attr -s "user.zero" -V "" /mnt/test/file
 echo 0 > /sys/fs/xfs/sda/errortag/larp
 umount /mnt/test
 mount /dev/sda /mnt/test/  # mount failed

Fix this by deriving the expected count dynamically as "2 + !!value_len"
for SET/REPLACE operations.

Cc: stable@vger.kernel.org # v6.9
Fixes: ad206ae50e ("xfs: check opcode and iovec count match in xlog_recover_attri_commit_pass2")
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-23 11:00:08 +01:00
Long Li
b854e1c4ef xfs: close crash window in attr dabtree inactivation
When inactivating an inode with node-format extended attributes,
xfs_attr3_node_inactive() invalidates all child leaf/node blocks via
xfs_trans_binval(), but intentionally does not remove the corresponding
entries from their parent node blocks.  The implicit assumption is that
xfs_attr_inactive() will truncate the entire attr fork to zero extents
afterwards, so log recovery will never reach the root node and follow
those stale pointers.

However, if a log shutdown occurs after the leaf/node block cancellations
commit but before the attr bmap truncation commits, this assumption
breaks.  Recovery replays the attr bmap intact (the inode still has
attr fork extents), but suppresses replay of all cancelled leaf/node
blocks, maybe leaving them as stale data on disk.  On the next mount,
xlog_recover_process_iunlinks() retries inactivation and attempts to
read the root node via the attr bmap. If the root node was not replayed,
reading the unreplayed root block triggers a metadata verification
failure immediately; if it was replayed, following its child pointers
to unreplayed child blocks triggers the same failure:

 XFS (pmem0): Metadata corruption detected at
 xfs_da3_node_read_verify+0x53/0x220, xfs_da3_node block 0x78
 XFS (pmem0): Unmount and run xfs_repair
 XFS (pmem0): First 128 bytes of corrupted metadata buffer:
 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 XFS (pmem0): metadata I/O error in "xfs_da_read_buf+0x104/0x190" at daddr 0x78 len 8 error 117

Fix this in two places:

In xfs_attr3_node_inactive(), after calling xfs_trans_binval() on a
child block, immediately remove the entry that references it from the
parent node in the same transaction.  This eliminates the window where
the parent holds a pointer to a cancelled block.  Once all children are
removed, the now-empty root node is converted to a leaf block within the
same transaction. This node-to-leaf conversion is necessary for crash
safety. If the system shutdown after the empty node is written to the
log but before the second-phase bmap truncation commits, log recovery
will attempt to verify the root block on disk. xfs_da3_node_verify()
does not permit a node block with count == 0; such a block will fail
verification and trigger a metadata corruption shutdown. on the other
hand, leaf blocks are allowed to have this transient state.

In xfs_attr_inactive(), split the attr fork truncation into two explicit
phases.  First, truncate all extents beyond the root block (the child
extents whose parent references have already been removed above).
Second, invalidate the root block and truncate the attr bmap to zero in
a single transaction.  The two operations in the second phase must be
atomic: as long as the attr bmap has any non-zero length, recovery can
follow it to the root block, so the root block invalidation must commit
together with the bmap-to-zero truncation.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-23 10:47:28 +01:00
Long Li
e65bb55d7f xfs: factor out xfs_attr3_leaf_init
Factor out wrapper xfs_attr3_leaf_init function, which exported for
external use.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Long Li <leo.lilong@huawei.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-23 10:47:28 +01:00
Long Li
ce4e789cf3 xfs: factor out xfs_attr3_node_entry_remove
Factor out wrapper xfs_attr3_node_entry_remove function, which
exported for external use.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Long Li <leo.lilong@huawei.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-23 10:47:28 +01:00
Long Li
e942498385 xfs: only assert new size for datafork during truncate extents
The assertion functions properly because we currently only truncate the
attr to a zero size. Any other new size of the attr is not preempted.
Make this assertion is specific to the datafork, preparing for
subsequent patches to truncate the attribute to a non-zero size.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Long Li <leo.lilong@huawei.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-23 10:47:27 +01:00
Ville Syrjälä
bfa71b7a9d drm/i915: Unlink NV12 planes earlier
unlink_nv12_plane() will clobber parts of the plane state
potentially already set up by plane_atomic_check(), so we
must make sure not to call the two in the wrong order.
The problem happens when a plane previously selected as
a Y plane is now configured as a normal plane by user space.
plane_atomic_check() will first compute the proper plane
state based on the userspace request, and unlink_nv12_plane()
later clears some of the state.

This used to work on account of unlink_nv12_plane() skipping
the state clearing based on the plane visibility. But I removed
that check, thinking it was an impossible situation. Now when
that situation happens unlink_nv12_plane() will just WARN
and proceed to clobber the state.

Rather than reverting to the old way of doing things, I think
it's more clear if we unlink the NV12 planes before we even
compute the new plane state.

Cc: stable@vger.kernel.org
Reported-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
Closes: https://lore.kernel.org/intel-gfx/20260212004852.1920270-1-khaled.almahallawy@intel.com/
Tested-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
Fixes: 6a01df2f1b ("drm/i915: Remove pointless visible check in unlink_nv12_plane()")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20260316163953.12905-2-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
(cherry picked from commit 017ecd04985573eeeb0745fa2c23896fb22ee0cc)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2026-03-23 09:06:43 +02:00
Ville Syrjälä
6ad2a661ff drm/i915: Order OP vs. timeout correctly in __wait_for()
Put the barrier() before the OP so that anything we read out in
OP and check in COND will actually be read out after the timeout
has been evaluated.

Currently the only place where we use OP is __intel_wait_for_register(),
but the use there is precisely susceptible to this reordering, assuming
the ktime_*() stuff itself doesn't act as a sufficient barrier:

__intel_wait_for_register(...)
{
	...
	ret = __wait_for(reg_value = intel_uncore_read_notrace(...),
 			 (reg_value & mask) == value, ...);
	...
}

Cc: stable@vger.kernel.org
Fixes: 1c3c1dc66a ("drm/i915: Add compiler barrier to wait_for")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20260313110740.24620-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit a464bace0482aa9a83e9aa7beefbaf44cd58e6cf)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2026-03-23 09:06:40 +02:00
Samasth Norway Ananda
08441f10f4 drm/i915/gmbus: fix spurious timeout on 512-byte burst reads
When reading exactly 512 bytes with burst read enabled, the
extra_byte_added path breaks out of the inner do-while without
decrementing len. The outer while(len) then re-enters and gmbus_wait()
times out since all data has been delivered. Decrement len before the
break so the outer loop terminates correctly.

Fixes: d5dc0f43f2 ("drm/i915/gmbus: Enable burst read")
Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patch.msgid.link/20260316231920.135438-2-samasth.norway.ananda@oracle.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 4ab0f09ee73fc853d00466682635f67c531f909c)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2026-03-23 09:06:36 +02:00
Namjae Jeon
0e55f63dd0 ksmbd: replace hardcoded hdr2_len with offsetof() in smb2_calc_max_out_buf_len()
After this commit (e2b76ab8b5 "ksmbd: add support for read compound"),
response buffer management was changed to use dynamic iov array.
In the new design, smb2_calc_max_out_buf_len() expects the second
argument (hdr2_len) to be the offset of ->Buffer field in the
response structure, not a hardcoded magic number.
Fix the remaining call sites to use the correct offsetof() value.

Cc: stable@vger.kernel.org
Fixes: e2b76ab8b5 ("ksmbd: add support for read compound")
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-03-22 19:10:22 -05:00
Werner Kasselman
309b44ed68 ksmbd: fix memory leaks and NULL deref in smb2_lock()
smb2_lock() has three error handling issues after list_del() detaches
smb_lock from lock_list at no_check_cl:

1) If vfs_lock_file() returns an unexpected error in the non-UNLOCK
   path, goto out leaks smb_lock and its flock because the out:
   handler only iterates lock_list and rollback_list, neither of
   which contains the detached smb_lock.

2) If vfs_lock_file() returns -ENOENT in the UNLOCK path, goto out
   leaks smb_lock and flock for the same reason.  The error code
   returned to the dispatcher is also stale.

3) In the rollback path, smb_flock_init() can return NULL on
   allocation failure.  The result is dereferenced unconditionally,
   causing a kernel NULL pointer dereference.  Add a NULL check to
   prevent the crash and clean up the bookkeeping; the VFS lock
   itself cannot be rolled back without the allocation and will be
   released at file or connection teardown.

Fix cases 1 and 2 by hoisting the locks_free_lock()/kfree() to before
the if(!rc) check in the UNLOCK branch so all exit paths share one
free site, and by freeing smb_lock and flock before goto out in the
non-UNLOCK branch.  Propagate the correct error code in both cases.
Fix case 3 by wrapping the VFS unlock in an if(rlock) guard and adding
a NULL check for locks_free_lock(rlock) in the shared cleanup.

Found via call-graph analysis using sqry.

Fixes: e2f34481b2 ("cifsd: add server-side procedures for SMB3")
Cc: stable@vger.kernel.org
Suggested-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Werner Kasselman <werner@verivus.com>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-03-22 17:15:00 -05:00
Werner Kasselman
48623ec358 ksmbd: fix use-after-free and NULL deref in smb_grant_oplock()
smb_grant_oplock() has two issues in the oplock publication sequence:

1) opinfo is linked into ci->m_op_list (via opinfo_add) before
   add_lease_global_list() is called.  If add_lease_global_list()
   fails (kmalloc returns NULL), the error path frees the opinfo
   via __free_opinfo() while it is still linked in ci->m_op_list.
   Concurrent m_op_list readers (opinfo_get_list, or direct iteration
   in smb_break_all_levII_oplock) dereference the freed node.

2) opinfo->o_fp is assigned after add_lease_global_list() publishes
   the opinfo on the global lease list.  A concurrent
   find_same_lease_key() can walk the lease list and dereference
   opinfo->o_fp->f_ci while o_fp is still NULL.

Fix by restructuring the publication sequence to eliminate post-publish
failure:

- Set opinfo->o_fp before any list publication (fixes NULL deref).
- Preallocate lease_table via alloc_lease_table() before opinfo_add()
  so add_lease_global_list() becomes infallible after publication.
- Keep the original m_op_list publication order (opinfo_add before
  lease list) so concurrent opens via same_client_has_lease() and
  opinfo_get_list() still see the in-flight grant.
- Use opinfo_put() instead of __free_opinfo() on err_out so that
  the RCU-deferred free path is used.

This also requires splitting add_lease_global_list() to take a
preallocated lease_table and changing its return type from int to void,
since it can no longer fail.

Fixes: 1dfd062caa ("ksmbd: fix use-after-free by using call_rcu() for oplock_info")
Cc: stable@vger.kernel.org
Signed-off-by: Werner Kasselman <werner@verivus.com>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-03-22 17:15:00 -05:00
Hyunwoo Kim
9bbb19d21d ksmbd: do not expire session on binding failure
When a multichannel session binding request fails (e.g. wrong password),
the error path unconditionally sets sess->state = SMB2_SESSION_EXPIRED.
However, during binding, sess points to the target session looked up via
ksmbd_session_lookup_slowpath() -- which belongs to another connection's
user. This allows a remote attacker to invalidate any active session by
simply sending a binding request with a wrong password (DoS).

Fix this by skipping session expiration when the failed request was
a binding attempt, since the session does not belong to the current
connection. The reference taken by ksmbd_session_lookup_slowpath() is
still correctly released via ksmbd_user_session_put().

Cc: stable@vger.kernel.org
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-03-22 17:15:00 -05:00
Luca Leonardo Scorcia
4cfdfeb6ac drm/mediatek: dsi: Store driver data before invoking mipi_dsi_host_register
The call to mipi_dsi_host_register triggers a callback to mtk_dsi_bind,
which uses dev_get_drvdata to retrieve the mtk_dsi struct, so this
structure needs to be stored inside the driver data before invoking it.

As drvdata is currently uninitialized it leads to a crash when
registering the DSI DRM encoder right after acquiring
the mode_config.idr_mutex, blocking all subsequent DRM operations.

Fixes the following crash during mediatek-drm probe (tested on Xiaomi
Smart Clock x04g):

Unable to handle kernel NULL pointer dereference at virtual address
 0000000000000040
[...]
Modules linked in: mediatek_drm(+) drm_display_helper cec drm_client_lib
 drm_dma_helper drm_kms_helper panel_simple
[...]
Call trace:
 drm_mode_object_add+0x58/0x98 (P)
 __drm_encoder_init+0x48/0x140
 drm_encoder_init+0x6c/0xa0
 drm_simple_encoder_init+0x20/0x34 [drm_kms_helper]
 mtk_dsi_bind+0x34/0x13c [mediatek_drm]
 component_bind_all+0x120/0x280
 mtk_drm_bind+0x284/0x67c [mediatek_drm]
 try_to_bring_up_aggregate_device+0x23c/0x320
 __component_add+0xa4/0x198
 component_add+0x14/0x20
 mtk_dsi_host_attach+0x78/0x100 [mediatek_drm]
 mipi_dsi_attach+0x2c/0x50
 panel_simple_dsi_probe+0x4c/0x9c [panel_simple]
 mipi_dsi_drv_probe+0x1c/0x28
 really_probe+0xc0/0x3dc
 __driver_probe_device+0x80/0x160
 driver_probe_device+0x40/0x120
 __device_attach_driver+0xbc/0x17c
 bus_for_each_drv+0x88/0xf0
 __device_attach+0x9c/0x1cc
 device_initial_probe+0x54/0x60
 bus_probe_device+0x34/0xa0
 device_add+0x5b0/0x800
 mipi_dsi_device_register_full+0xdc/0x16c
 mipi_dsi_host_register+0xc4/0x17c
 mtk_dsi_probe+0x10c/0x260 [mediatek_drm]
 platform_probe+0x5c/0xa4
 really_probe+0xc0/0x3dc
 __driver_probe_device+0x80/0x160
 driver_probe_device+0x40/0x120
 __driver_attach+0xc8/0x1f8
 bus_for_each_dev+0x7c/0xe0
 driver_attach+0x24/0x30
 bus_add_driver+0x11c/0x240
 driver_register+0x68/0x130
 __platform_register_drivers+0x64/0x160
 mtk_drm_init+0x24/0x1000 [mediatek_drm]
 do_one_initcall+0x60/0x1d0
 do_init_module+0x54/0x240
 load_module+0x1838/0x1dc0
 init_module_from_file+0xd8/0xf0
 __arm64_sys_finit_module+0x1b4/0x428
 invoke_syscall.constprop.0+0x48/0xc8
 do_el0_svc+0x3c/0xb8
 el0_svc+0x34/0xe8
 el0t_64_sync_handler+0xa0/0xe4
 el0t_64_sync+0x198/0x19c
Code: 52800022 941004ab 2a0003f3 37f80040 (29005a80)

Fixes: e4732b590a ("drm/mediatek: dsi: Register DSI host after acquiring clocks and PHY")
Signed-off-by: Luca Leonardo Scorcia <l.scorcia@gmail.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20260225094047.76780-1-l.scorcia@gmail.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2026-03-22 14:15:44 +00:00
Song Liu
c7f27a8ab9 workqueue: Fix false positive stall reports
On weakly ordered architectures (e.g., arm64), the lockless check in
wq_watchdog_timer_fn() can observe a reordering between the worklist
insertion and the last_progress_ts update. Specifically, the watchdog
can see a non-empty worklist (from a list_add) while reading a stale
last_progress_ts value, causing a false positive stall report.

This was confirmed by reading pool->last_progress_ts again after holding
pool->lock in wq_watchdog_timer_fn():

  workqueue watchdog: pool 7 false positive detected!
    lockless_ts=4784580465 locked_ts=4785033728
    diff=453263ms worklist_empty=0

To avoid slowing down the hot path (queue_work, etc.), recheck
last_progress_ts with pool->lock held. This will eliminate the false
positive with minimal overhead.

Remove two extra empty lines in wq_watchdog_timer_fn() as we are on it.

Fixes: 82607adcf9 ("workqueue: implement lockup detector")
Cc: stable@vger.kernel.org # v4.5+
Assisted-by: claude-code:claude-opus-4-6
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-21 18:34:59 -10:00
Cheng-Yang Chou
db08b1940f sched_ext: Fix inconsistent NUMA node lookup in scx_select_cpu_dfl()
In the WAKE_SYNC path of scx_select_cpu_dfl(), waker_node was computed
with cpu_to_node(), while node (for prev_cpu) was computed with
scx_cpu_node_if_enabled(). When scx_builtin_idle_per_node is disabled,
idle_cpumask(waker_node) is called with a real node ID even though
per-node idle tracking is disabled, resulting in undefined behavior.

Fix by using scx_cpu_node_if_enabled() for waker_node as well, ensuring
both variables are computed consistently.

Fixes: 48849271e6 ("sched_ext: idle: Per-node idle cpumasks")
Cc: stable@vger.kernel.org # v6.15+
Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-21 14:22:37 -10:00
Zubin Mithra
c3fd16c3b9 virt: tdx-guest: Fix handling of host controlled 'quote' buffer length
Validate host controlled value `quote_buf->out_len` that determines how
many bytes of the quote are copied out to guest userspace. In TDX
environments with remote attestation, quotes are not considered private,
and can be forwarded to an attestation server.

Catch scenarios where the host specifies a response length larger than
the guest's allocation, or otherwise races modifying the response while
the guest consumes it.

This prevents contents beyond the pages allocated for `quote_buf`
(up to TSM_REPORT_OUTBLOB_MAX) from being read out to guest userspace,
and possibly forwarded in attestation requests.

Recall that some deployments want per-container configs-tsm-report
interfaces, so the leak may cross container protection boundaries, not
just local root.

Fixes: f4738f56d1 ("virt: tdx-guest: Add Quote generation support using TSM_REPORTS")
Cc: stable@vger.kernel.org
Signed-off-by: Zubin Mithra <zsm@google.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2026-03-20 21:05:50 -07:00
Mike Rapoport (Microsoft)
217c0a5c17 x86/efi: efi_unmap_boot_services: fix calculation of ranges_to_free size
ranges_to_free array should have enough room to store the entire EFI
memmap plus an extra element for NULL entry.
The calculation of this array size wrongly adds 1 to the overall size
instead of adding 1 to the number of elements.

Add parentheses to properly size the array.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: a4b0bf6a40 ("x86/efi: defer freeing of boot services memory")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2026-03-20 15:31:10 +01:00
Joanne Koong
76f9377cd2 writeback: don't block sync for filesystems with no data integrity guarantees
Add a SB_I_NO_DATA_INTEGRITY superblock flag for filesystems that cannot
guarantee data persistence on sync (eg fuse). For superblocks with this
flag set, sync kicks off writeback of dirty inodes but does not wait
for the flusher threads to complete the writeback.

This replaces the per-inode AS_NO_DATA_INTEGRITY mapping flag added in
commit f9a49aa302 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings
in wait_sb_inodes()"). The flag belongs at the superblock level because
data integrity is a filesystem-wide property, not a per-inode one.
Having this flag at the superblock level also allows us to skip having
to iterate every dirty inode in wait_sb_inodes() only to skip each inode
individually.

Prior to this commit, mappings with no data integrity guarantees skipped
waiting on writeback completion but still waited on the flusher threads
to finish initiating the writeback. Waiting on the flusher threads is
unnecessary. This commit kicks off writeback but does not wait on the
flusher threads. This change properly addresses a recent report [1] for
a suspend-to-RAM hang seen on fuse-overlayfs that was caused by waiting
on the flusher threads to finish:

Workqueue: pm_fs_sync pm_fs_sync_work_fn
Call Trace:
 <TASK>
 __schedule+0x457/0x1720
 schedule+0x27/0xd0
 wb_wait_for_completion+0x97/0xe0
 sync_inodes_sb+0xf8/0x2e0
 __iterate_supers+0xdc/0x160
 ksys_sync+0x43/0xb0
 pm_fs_sync_work_fn+0x17/0xa0
 process_one_work+0x193/0x350
 worker_thread+0x1a1/0x310
 kthread+0xfc/0x240
 ret_from_fork+0x243/0x280
 ret_from_fork_asm+0x1a/0x30
 </TASK>

On fuse this is problematic because there are paths that may cause the
flusher thread to block (eg if systemd freezes the user session cgroups
first, which freezes the fuse daemon, before invoking the kernel
suspend. The kernel suspend triggers ->write_node() which on fuse issues
a synchronous setattr request, which cannot be processed since the
daemon is frozen. Or if the daemon is buggy and cannot properly complete
writeback, initiating writeback on a dirty folio already under writeback
leads to writeback_get_folio() -> folio_prepare_writeback() ->
unconditional wait on writeback to finish, which will cause a hang).
This commit restores fuse to its prior behavior before tmp folios were
removed, where sync was essentially a no-op.

[1] https://lore.kernel.org/linux-fsdevel/CAJnrk1a-asuvfrbKXbEwwDSctvemF+6zfhdnuzO65Pt8HsFSRw@mail.gmail.com/T/#m632c4648e9cafc4239299887109ebd880ac6c5c1

Fixes: 0c58a97f91 ("fuse: remove tmp folio for writebacks and internal rb tree")
Reported-by: John <therealgraysky@proton.me>
Cc: stable@vger.kernel.org
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Link: https://patch.msgid.link/20260320005145.2483161-2-joannelkoong@gmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-20 14:18:56 +01:00
Hasun Park
2594196f4e ASoC: amd: acp: add ASUS HN7306EA quirk for legacy SDW machine
Add a DMI quirk entry for ASUS HN7306EA in the ACP SoundWire legacy
machine driver.

Set driver_data to ASOC_SDW_ACP_DMIC for this board so the
platform-specific DMIC quirk path is selected.

Signed-off-by: Hasun Park <hasunpark@gmail.com>
Link: https://patch.msgid.link/20260319163321.30326-1-hasunpark@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-20 12:53:35 +00:00
Cássio Gabriel
215e5fe758 ASoC: SOF: topology: reject invalid vendor array size in token parser
sof_parse_token_sets() accepts array->size values that can be invalid
for a vendor tuple array header. In particular, a zero size does not
advance the parser state and can lead to non-progress parsing on
malformed topology data.

Validate array->size against the minimum header size and reject values
smaller than sizeof(*array) before parsing. This preserves behavior for
valid topologies and hardens malformed-input handling.

Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://patch.msgid.link/20260319-sof-topology-array-size-fix-v1-1-f9191b16b1b7@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-20 12:52:41 +00:00
Pedro Demarchi Gomes
fc3bbf34e6 drm/shmem-helper: Fix huge page mapping in fault handler
When running ./tools/testing/selftests/mm/split_huge_page_test multiple
times with /sys/kernel/mm/transparent_hugepage/shmem_enabled and
/sys/kernel/mm/transparent_hugepage/enabled set as always the following BUG
occurs:

[  232.728858] ------------[ cut here ]------------
[  232.729458] kernel BUG at mm/memory.c:2276!
[  232.729726] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  232.730217] CPU: 19 UID: 60578 PID: 1497 Comm: llvmpipe-9 Not tainted 7.0.0-rc1mm-new+ #19 PREEMPT(lazy)
[  232.730855] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-9.fc43 06/10/2025
[  232.731360] RIP: 0010:walk_to_pmd+0x29e/0x3c0
[  232.731569] Code: d8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 89 ea 48 89 de 4c 89 f7 e8 ae 85 ff ff 85 c0 0f 84 1f fe ff ff 31 db eb d0 <0f> 0b 48 89 ea 48 89 de 4c 89 f7 e8 92 8b ff ff 85 c0 75 e8 48 b8
[  232.732614] RSP: 0000:ffff8881aa6ff9a8 EFLAGS: 00010282
[  232.732991] RAX: 8000000142e002e7 RBX: ffff8881433cae10 RCX: dffffc0000000000
[  232.733362] RDX: 0000000000000000 RSI: 00007fb47840b000 RDI: 8000000142e002e7
[  232.733801] RBP: 00007fb47840b000 R08: 0000000000000000 R09: 1ffff110354dff46
[  232.734168] R10: fffffbfff0cb921d R11: 00000000910da5ce R12: 1ffffffff0c1fcdd
[  232.734459] R13: 1ffffffff0c23f36 R14: ffff888171628040 R15: 0000000000000000
[  232.734861] FS:  00007fb4907f86c0(0000) GS:ffff888791f2c000(0000) knlGS:0000000000000000
[  232.735265] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  232.735548] CR2: 00007fb47840be00 CR3: 000000015e6dc000 CR4: 00000000000006f0
[  232.736031] Call Trace:
[  232.736273]  <TASK>
[  232.736500]  get_locked_pte+0x1f/0xa0
[  232.736878]  insert_pfn+0x9f/0x350
[  232.737190]  ? __pfx_pat_pagerange_is_ram+0x10/0x10
[  232.737614]  ? __pfx_insert_pfn+0x10/0x10
[  232.737990]  ? __pfx_css_rstat_updated+0x10/0x10
[  232.738281]  ? __pfx_pfn_modify_allowed+0x10/0x10
[  232.738552]  ? lookup_memtype+0x62/0x180
[  232.738761]  vmf_insert_pfn_prot+0x14b/0x340
[  232.739012]  ? __pfx_vmf_insert_pfn_prot+0x10/0x10
[  232.739247]  ? __pfx___might_resched+0x10/0x10
[  232.739475]  drm_gem_shmem_fault.cold+0x18/0x39
[  232.739677]  ? rcu_read_unlock+0x20/0x70
[  232.739882]  __do_fault+0x251/0x7b0
[  232.740028]  do_fault+0x6e1/0xc00
[  232.740167]  ? __lock_acquire+0x590/0xc40
[  232.740335]  handle_pte_fault+0x439/0x760
[  232.740498]  ? mtree_range_walk+0x252/0xae0
[  232.740669]  ? __pfx_handle_pte_fault+0x10/0x10
[  232.740899]  __handle_mm_fault+0xa02/0xf30
[  232.741066]  ? __pfx___handle_mm_fault+0x10/0x10
[  232.741255]  ? find_vma+0xa1/0x120
[  232.741403]  handle_mm_fault+0x2bf/0x8f0
[  232.741564]  do_user_addr_fault+0x2d3/0xed0
[  232.741736]  ? trace_page_fault_user+0x1bf/0x240
[  232.741969]  exc_page_fault+0x87/0x120
[  232.742124]  asm_exc_page_fault+0x26/0x30
[  232.742288] RIP: 0033:0x7fb4d73ed546
[  232.742441] Code: 66 41 0f 6f fb 66 44 0f 6d dc 66 44 0f 6f c6 66 41 0f 6d f1 66 0f 6c fc 66 45 0f 6c c1 66 44 0f 6f c9 66 0f 6d ca 66 0f db f0 <66> 0f df 04 08 66 44 0f 6c ca 66 45 0f db c2 66 44 0f df 10 66 44
[  232.743193] RSP: 002b:00007fb4907f68a0 EFLAGS: 00010206
[  232.743565] RAX: 00007fb47840aa00 RBX: 00007fb4d73ec070 RCX: 0000000000001400
[  232.743871] RDX: 0000000000002800 RSI: 0000000000003c00 RDI: 0000000000000001
[  232.744150] RBP: 0000000000000004 R08: 0000000000001400 R09: 00007fb4d73ec060
[  232.744433] R10: 000055f0261a4288 R11: 00007fb4c013da40 R12: 0000000000000008
[  232.744712] R13: 0000000000000000 R14: 4332322132212110 R15: 0000000000000004
[  232.746616]  </TASK>
[  232.746711] Modules linked in: nft_nat nft_masq veth bridge stp llc snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore overlay rfkill nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr ppdev 9pnet_virtio 9pnet parport_pc i2c_piix4 netfs pcspkr parport i2c_smbus joydev sunrpc vfat fat loop dm_multipath nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport zram lz4hc_compress vmw_vmci lz4_compress vsock e1000 bochs serio_raw ata_generic pata_acpi scsi_dh_rdac scsi_dh_emc scsi_dh_alua i2c_dev fuse qemu_fw_cfg
[  232.749308] ---[ end trace 0000000000000000 ]---
[  232.749507] RIP: 0010:walk_to_pmd+0x29e/0x3c0
[  232.749692] Code: d8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 89 ea 48 89 de 4c 89 f7 e8 ae 85 ff ff 85 c0 0f 84 1f fe ff ff 31 db eb d0 <0f> 0b 48 89 ea 48 89 de 4c 89 f7 e8 92 8b ff ff 85 c0 75 e8 48 b8
[  232.750428] RSP: 0000:ffff8881aa6ff9a8 EFLAGS: 00010282
[  232.750645] RAX: 8000000142e002e7 RBX: ffff8881433cae10 RCX: dffffc0000000000
[  232.750954] RDX: 0000000000000000 RSI: 00007fb47840b000 RDI: 8000000142e002e7
[  232.751232] RBP: 00007fb47840b000 R08: 0000000000000000 R09: 1ffff110354dff46
[  232.751514] R10: fffffbfff0cb921d R11: 00000000910da5ce R12: 1ffffffff0c1fcdd
[  232.751837] R13: 1ffffffff0c23f36 R14: ffff888171628040 R15: 0000000000000000
[  232.752124] FS:  00007fb4907f86c0(0000) GS:ffff888791f2c000(0000) knlGS:0000000000000000
[  232.752441] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  232.752674] CR2: 00007fb47840be00 CR3: 000000015e6dc000 CR4: 00000000000006f0
[  232.752983] Kernel panic - not syncing: Fatal exception
[  232.753510] Kernel Offset: disabled
[  232.754643] ---[ end Kernel panic - not syncing: Fatal exception ]---

This happens when two concurrent page faults occur within the same PMD range.
One fault installs a PMD mapping through vmf_insert_pfn_pmd(), while the other
attempts to install a PTE mapping via vmf_insert_pfn(). The bug is
triggered because a pmd_trans_huge is not expected when walking the page
table inside vmf_insert_pfn.

Avoid this race by adding a huge_fault callback to drm_gem_shmem_vm_ops so that
PMD-sized mappings are handled through the appropriate huge page fault path.

Fixes: 211b9a39f2 ("drm/shmem-helper: Map huge pages in fault handler")
Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patch.msgid.link/20260319015224.46896-1-pedrodemargomes@gmail.com
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2026-03-20 09:15:39 +01:00
Thinh Nguyen
01f784fc9d scsi: target: file: Use kzalloc_flex for aio_cmd
The target_core_file doesn't initialize the aio_cmd->iocb for the
ki_write_stream. When a write command fd_execute_rw_aio() is executed,
we may get a bogus ki_write_stream value, causing unintended write
failure status when checking iocb->ki_write_stream > max_write_streams
in the block device.

Let's just use kzalloc_flex when allocating the aio_cmd and let
ki_write_stream=0 to fix this issue.

Fixes: 732f25a289 ("fs: add a write stream field to the kiocb")
Fixes: c27683da64 ("block: expose write streams for block device nodes")
Cc: stable@vger.kernel.org
Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://patch.msgid.link/f1a2f81c62f043e31f80bb92d5f29893400c8ee2.1773450782.git.Thinh.Nguyen@synopsys.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2026-03-19 22:07:39 -04:00
Yihang Li
d71afa9deb scsi: scsi_transport_sas: Fix the maximum channel scanning issue
After commit 37c4e72b06 ("scsi: Fix sas_user_scan() to handle wildcard
and multi-channel scans"), if the device supports multiple channels (0 to
shost->max_channel), user_scan() invokes updated sas_user_scan() to perform
the scan behavior for a specific transfer.  However, when the user
specifies shost->max_channel, it will return -EINVAL, which is not
expected.

Fix and support specifying the scan shost->max_channel for scanning.

Fixes: 37c4e72b06 ("scsi: Fix sas_user_scan() to handle wildcard and multi-channel scans")
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Link: https://patch.msgid.link/20260317063147.2182562-1-liyihang9@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2026-03-19 21:57:22 -04:00
Josef Bacik
1333eee56c scsi: target: tcm_loop: Drain commands in target_reset handler
tcm_loop_target_reset() violates the SCSI EH contract: it returns SUCCESS
without draining any in-flight commands.  The SCSI EH documentation
(scsi_eh.rst) requires that when a reset handler returns SUCCESS the driver
has made lower layers "forget about timed out scmds" and is ready for new
commands.  Every other SCSI LLD (virtio_scsi, mpt3sas, ipr, scsi_debug,
mpi3mr) enforces this by draining or completing outstanding commands before
returning SUCCESS.

Because tcm_loop_target_reset() doesn't drain, the SCSI EH reuses in-flight
scsi_cmnd structures for recovery commands (e.g. TUR) while the target core
still has async completion work queued for the old se_cmd.  The memset in
queuecommand zeroes se_lun and lun_ref_active, causing
transport_lun_remove_cmd() to skip its percpu_ref_put().  The leaked LUN
reference prevents transport_clear_lun_ref() from completing, hanging
configfs LUN unlink forever in D-state:

  INFO: task rm:264 blocked for more than 122 seconds.
  rm              D    0   264    258 0x00004000
  Call Trace:
   __schedule+0x3d0/0x8e0
   schedule+0x36/0xf0
   transport_clear_lun_ref+0x78/0x90 [target_core_mod]
   core_tpg_remove_lun+0x28/0xb0 [target_core_mod]
   target_fabric_port_unlink+0x50/0x60 [target_core_mod]
   configfs_unlink+0x156/0x1f0 [configfs]
   vfs_unlink+0x109/0x290
   do_unlinkat+0x1d5/0x2d0

Fix this by making tcm_loop_target_reset() actually drain commands:

 1. Issue TMR_LUN_RESET via tcm_loop_issue_tmr() to drain all commands that
    the target core knows about (those not yet CMD_T_COMPLETE).

 2. Use blk_mq_tagset_busy_iter() to iterate all started requests and
    flush_work() on each se_cmd — this drains any deferred completion work
    for commands that already had CMD_T_COMPLETE set before the TMR (which
    the TMR skips via __target_check_io_state()).  This is the same pattern
    used by mpi3mr, scsi_debug, and libsas to drain outstanding commands
    during reset.

Fixes: e0eb5d38b7 ("scsi: target: tcm_loop: Use block cmd allocator for se_cmds")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Link: https://patch.msgid.link/27011aa34c8f6b1b94d2e3cf5655b6d037f53428.1773706803.git.josef@toxicpanda.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2026-03-19 21:54:51 -04:00
Tyllis Xu
61d099ac4a scsi: ibmvfc: Fix OOB access in ibmvfc_discover_targets_done()
A malicious or compromised VIO server can return a num_written value in the
discover targets MAD response that exceeds max_targets. This value is
stored directly in vhost->num_targets without validation, and is then used
as the loop bound in ibmvfc_alloc_targets() to index into disc_buf[], which
is only allocated for max_targets entries. Indices at or beyond max_targets
access kernel memory outside the DMA-coherent allocation.  The
out-of-bounds data is subsequently embedded in Implicit Logout and PLOGI
MADs that are sent back to the VIO server, leaking kernel memory.

Fix by clamping num_written to max_targets before storing it.

Fixes: 072b91f9c6 ("[SCSI] ibmvfc: IBM Power Virtual Fibre Channel Adapter Client Driver")
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tyllis Xu <LivelyCarpet87@gmail.com>
Reviewed-by: Dave Marquardt <davemarq@linux.ibm.com>
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://patch.msgid.link/20260314170151.548614-1-LivelyCarpet87@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2026-03-19 21:42:35 -04:00
Greg Kroah-Hartman
7a9f448d44 scsi: ses: Handle positive SCSI error from ses_recv_diag()
ses_recv_diag() can return a positive value, which also means that an
error happened, so do not only test for negative values.

Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable <stable@kernel.org>
Assisted-by: gkh_clanker_2000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://patch.msgid.link/2026022301-bony-overstock-a07f@gregkh
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2026-03-19 21:33:28 -04:00
Mickaël Salaün
a54142d9ff selftests/landlock: Test tsync interruption and cancellation paths
Add tsync_interrupt test to exercise the signal interruption path in
landlock_restrict_sibling_threads().  When a signal interrupts
wait_for_completion_interruptible() while the calling thread waits for
sibling threads to finish credential preparation, the kernel:

1. Sets ERESTARTNOINTR to request a transparent syscall restart.
2. Calls cancel_tsync_works() to opportunistically dequeue task works
   that have not started running yet.
3. Breaks out of the preparation loop, then unblocks remaining
   task works via complete_all() and waits for them to finish.
4. Returns the error, causing abort_creds() in the syscall handler.

Specifically, cancel_tsync_works() in its entirety, the ERESTARTNOINTR
error branch in landlock_restrict_sibling_threads(), and the
abort_creds() error branch in the landlock_restrict_self() syscall
handler are timing-dependent and not exercised by the existing tsync
tests, making code coverage measurements non-deterministic.

The test spawns a signaler thread that rapidly sends SIGUSR1 to the
calling thread while it performs landlock_restrict_self() with
LANDLOCK_RESTRICT_SELF_TSYNC.  Since ERESTARTNOINTR causes a
transparent restart, userspace always sees the syscall succeed.

This is a best-effort coverage test: the interruption path is exercised
when the signal lands during the preparation wait, which depends on
thread scheduling.  The test creates enough idle sibling threads (200)
to ensure multiple serialized waves of credential preparation even on
machines with many cores (e.g., 64), widening the window for the
signaler.  Deterministic coverage would require wrapping the wait call
with ALLOW_ERROR_INJECTION() and using CONFIG_FAIL_FUNCTION.

Test coverage for security/landlock was 90.2% of 2105 lines according to
LLVM 21, and it is now 91.1% of 2105 lines with this new test.

Cc: Günther Noack <gnoack@google.com>
Cc: Justin Suess <utilityemal77@gmail.com>
Cc: Tingmao Wang <m@maowtm.org>
Cc: Yihan Ding <dingyihan@uniontech.com>
Link: https://lore.kernel.org/r/20260310190416.1913908-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2026-03-19 20:57:39 +01:00
Baojun Xu
a437601a0a ASoC: tas2781: Add null check for calibration data
For avoid null pointer problem if no calibration data exist.

Fixes: 55137f5a68 ("ASoC: tas2781: Put three different calibrated data solution into the same data structure")

Signed-off-by: Baojun Xu <baojun.xu@ti.com>
Link: https://patch.msgid.link/20260319090747.2090-1-baojun.xu@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-19 12:52:55 +00:00
David Howells
7e57523490 netfs: Fix read abandonment during retry
Under certain circumstances, all the remaining subrequests from a read
request will get abandoned during retry.  The abandonment process expects
the 'subreq' variable to be set to the place to start abandonment from, but
it doesn't always have a useful value (it will be uninitialised on the
first pass through the loop and it may point to a deleted subrequest on
later passes).

Fix the first jump to "abandon:" to set subreq to the start of the first
subrequest expected to need retry (which, in this abandonment case, turned
out unexpectedly to no longer have NEED_RETRY set).

Also clear the subreq pointer after discarding superfluous retryable
subrequests to cause an oops if we do try to access it.

Fixes: ee4cdf7ba8 ("netfs: Speed up buffered reading")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/3775287.1773848338@warthog.procyon.org.uk
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-19 11:20:21 +01:00
Jori Koolstra
8fb6857f2f vfs: fix docstring of hash_name()
The docstring of hash_name() is falsely reporting that it returns the
component length, whereas it returns a pointer to the terminating '/'
or NUL character in the pathname being resolved.

Signed-off-by: Jori Koolstra <jkoolstra@xs4all.nl>
Link: https://patch.msgid.link/20260318203953.5770-1-jkoolstra@xs4all.nl
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-19 11:18:01 +01:00
Arnd Bergmann
591721223b ALSA: asihpi: avoid write overflow check warning
clang-22 rightfully warns that the memcpy() in adapter_prepare() copies
between different structures, crossing the boundary of nested
structures inside it:

In file included from sound/pci/asihpi/hpimsgx.c:13:
In file included from include/linux/string.h:386:
include/linux/fortify-string.h:569:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning]
  569 |                         __write_overflow_field(p_size_field, size);

The two structures seem to refer to the same layout, despite the
separate definitions, so the code is in fact correct.

Avoid the warning by copying the two inner structures separately.
I see the same pattern happens in other functions in the same file,
so there is a chance that this may come back in the future, but
this instance is the only one that I saw in practice, hitting it
multiple times per day in randconfig build.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20260318124016.3488566-1-arnd@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-18 17:01:17 +01:00
Mark Brown
5e4ed0320b ASoC: fix usage of playback_only and capture_only
Shengjiu Wang <shengjiu.wang@nxp.com> says:

Check value of is_playback_only and is_capture_only in
graph_util_parse_link_direction() and initialize playback_only and
capture_only in imx-card.c
2026-03-18 13:26:52 +00:00
Shengjiu Wang
ca67bd564e ASoC: fsl: imx-card: initialize playback_only and capture_only
Fix uninitialized variable playback_only and capture_only because
graph_util_parse_link_direction() may not write them.

Fixes: 1877c3e793 ("ASoC: imx-card: Add playback_only or capture_only support")
Suggested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260318102850.2794029-3-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-18 13:26:50 +00:00
Shengjiu Wang
0e9fc79132 ASoC: simple-card-utils: Check value of is_playback_only and is_capture_only
The audio-graph-card2 gets the value of 'playback-only' and
'capture_only' property in below sequence, if there is 'playback_only' or
'capture_only' property in port_cpu and port_codec nodes, but no these
properties in ep_cpu and ep_codec nodes, the value of playback_only and
capture_only will be flushed to zero in the end.

graph_util_parse_link_direction(lnk,            &playback_only, &capture_only);
graph_util_parse_link_direction(ports_cpu,      &playback_only, &capture_only);
graph_util_parse_link_direction(ports_codec,    &playback_only, &capture_only);
graph_util_parse_link_direction(port_cpu,       &playback_only, &capture_only);
graph_util_parse_link_direction(port_codec,     &playback_only, &capture_only);
graph_util_parse_link_direction(ep_cpu,         &playback_only, &capture_only);
graph_util_parse_link_direction(ep_codec,       &playback_only, &capture_only);

So check the value of is_playback_only and is_capture_only in
graph_util_parse_link_direction() function, if they are true, then rewrite
the values, and no need to check the np variable as
of_property_read_bool() will ignore if it was NULL.

Fixes: 3cc393d223 ("ASoC: simple-card-utils: Fix pointer check in graph_util_parse_link_direction")
Fixes: 22a507d768 ("ASoC: simple-card-utils: Check device node before overwrite direction")
Suggested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260318102850.2794029-2-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-18 13:26:49 +00:00
Daniel Lezcano
8306a78a1c ALSA: usb-audio: qcom: Fix the license marking
The Copyright for Qualcomm changed its format and replaces the old
Qualcomm Innovative Center by Qualcomm Technology Inc.

Signed-off-by: Daniel Lezcano <daniel.lezcano@oss.qualcomm.com>
Link: https://patch.msgid.link/20260317180943.3062085-1-daniel.lezcano@oss.qualcomm.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-18 12:37:05 +01:00
Frank Zhang
b8bee48e38 ALSA:usb:qcom: add AUXILIARY_BUS to Kconfig dependencies
The build can fail with:

ERROR: modpost: "__auxiliary_driver_register"
[sound/usb/qcom/snd-usb-audio-qmi.ko] undefined!
ERROR: modpost: "auxiliary_driver_unregister"
[sound/usb/qcom/snd-usb-audio-qmi.ko] undefined!

Select AUXILIARY_BUS when SND_USB_AUDIO_QMI is enabled.

Signed-off-by: Frank Zhang <rmxpzlb@gmail.com>
Link: https://patch.msgid.link/20260317102527.556248-1-rmxpzlb@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-18 12:36:39 +01:00
Shiraz Saleem
e37afcb56a RDMA/irdma: Harden depth calculation functions
An issue was exposed where OS can pass in U32_MAX for SQ/RQ/SRQ size.
This can cause integer overflow and truncation of SQ/RQ/SRQ depth
returning a success when it should have failed.

Harden the functions to do all depth calculations and boundary
checking in u64 sizes.

Fixes: 563e1feb5f ("RDMA/irdma: Add SRQ support")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-18 06:20:53 -04:00
Tatyana Nikolova
7221f581ee RDMA/irdma: Return EINVAL for invalid arp index error
When rdma_connect() fails due to an invalid arp index, user space rdma core
reports ENOMEM which is confusing. Modify irdma_make_cm_node() to return the
correct error code.

Fixes: 146b9756f1 ("RDMA/irdma: Add connection manager")
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-18 06:20:53 -04:00
Anil Samal
6f52370970 RDMA/irdma: Fix deadlock during netdev reset with active connections
Resolve deadlock that occurs when user executes netdev reset while RDMA
applications (e.g., rping) are active. The netdev reset causes ice
driver to remove irdma auxiliary driver, triggering device_delete and
subsequent client removal. During client removal, uverbs_client waits
for QP reference count to reach zero while cma_client holds the final
reference, creating circular dependency and indefinite wait in iWARP
mode. Skip QP reference count wait during device reset to prevent
deadlock.

Fixes: c8f304d75f ("RDMA/irdma: Prevent QP use after free")
Signed-off-by: Anil Samal <anil.samal@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-18 06:20:53 -04:00
Tatyana Nikolova
c45c6ebd69 RDMA/irdma: Remove reset check from irdma_modify_qp_to_err()
During reset, irdma_modify_qp() to error should be called to disconnect
the QP. Without this fix, if not preceded by irdma_modify_qp() to error, the
API call irdma_destroy_qp() gets stuck waiting for the QP refcount to go
to zero, because the cm_node associated with this QP isn't disconnected.

Fixes: 915cc7ac0f ("RDMA/irdma: Add miscellaneous utility definitions")
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-18 06:20:53 -04:00
Ivan Barrera
b415399c9a RDMA/irdma: Clean up unnecessary dereference of event->cm_node
The cm_node is available and the usage of cm_node and event->cm_node
seems arbitrary. Clean up unnecessary dereference of event->cm_node.

Fixes: 146b9756f1 ("RDMA/irdma: Add connection manager")
Signed-off-by: Ivan Barrera <ivan.d.barrera@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-18 06:20:53 -04:00
Tatyana Nikolova
5e8f023973 RDMA/irdma: Remove a NOP wait_event() in irdma_modify_qp_roce()
Remove a NOP wait_event() in irdma_modify_qp_roce() which is relevant
for iWARP and likely a copy and paste artifact for RoCEv2. The wait event
is for sending a reset on a TCP connection, after the reset has been
requested in irdma_modify_qp(), which occurs only in iWarp mode.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-18 06:20:53 -04:00
Tatyana Nikolova
8c1f19a222 RDMA/irdma: Update ibqp state to error if QP is already in error state
In irdma_modify_qp() update ibqp state to error if the irdma QP is already
in error state, otherwise the ibqp state which is visible to the consumer
app remains stale.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-18 06:20:53 -04:00
Jacob Moroni
11a95521fb RDMA/irdma: Initialize free_qp completion before using it
In irdma_create_qp, if ib_copy_to_udata fails, it will call
irdma_destroy_qp to clean up which will attempt to wait on
the free_qp completion, which is not initialized yet. Fix this
by initializing the completion before the ib_copy_to_udata call.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Jacob Moroni <jmoroni@google.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-18 06:20:53 -04:00
Joanne Koong
bd71fb3fea iomap: fix invalid folio access when i_blkbits differs from I/O granularity
Commit aa35dd5cbc ("iomap: fix invalid folio access after
folio_end_read()") partially addressed invalid folio access for folios
without an ifs attached, but it did not handle the case where
1 << inode->i_blkbits matches the folio size but is different from the
granularity used for the IO, which means IO can be submitted for less
than the full folio for the !ifs case.

In this case, the condition:

  if (*bytes_submitted == folio_len)
    ctx->cur_folio = NULL;

in iomap_read_folio_iter() will not invalidate ctx->cur_folio, and
iomap_read_end() will still be called on the folio even though the IO
helper owns it and will finish the read on it.

Fix this by unconditionally invalidating ctx->cur_folio for the !ifs
case.

Reported-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/linux-fsdevel/b3dfe271-4e3d-4922-b618-e73731242bca@wdc.com/
Fixes: b2f35ac414 ("iomap: add caller-provided callbacks for read and readahead")
Cc: stable@vger.kernel.org
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Link: https://patch.msgid.link/20260317203935.830549-1-joannelkoong@gmail.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-18 10:42:08 +01:00
Bill Wendling
e5966096d0 xfs: annotate struct xfs_attr_list_context with __counted_by_ptr
Add the `__counted_by_ptr` attribute to the `buffer` field of `struct
xfs_attr_list_context`. This field is used to point to a buffer of
size `bufsize`.

The `buffer` field is assigned in:
1. `xfs_ioc_attr_list` in `fs/xfs/xfs_handle.c`
2. `xfs_xattr_list` in `fs/xfs/xfs_xattr.c`
3. `xfs_getparents` in `fs/xfs/xfs_handle.c` (implicitly initialized to NULL)

In `xfs_ioc_attr_list`, `buffer` was assigned before `bufsize`. Reorder
them to ensure `bufsize` is set before `buffer` is assigned, although
no access happens between them.

In `xfs_xattr_list`, `buffer` was assigned before `bufsize`. Reorder
them to ensure `bufsize` is set before `buffer` is assigned.

In `xfs_getparents`, `buffer` is NULL (from zero initialization) and
remains NULL. `bufsize` is set to a non-zero value, but since `buffer`
is NULL, no access occurs.

In all cases, the pointer `buffer` is not accessed before `bufsize` is set.

This patch was generated by CodeMender and reviewed by Bill Wendling.
Tested by running xfstests.

Signed-off-by: Bill Wendling <morbo@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-18 09:54:39 +01:00
Christoph Hellwig
0c98524ab2 xfs: cleanup buftarg handling in XFS_IOC_VERIFY_MEDIA
The newly added XFS_IOC_VERIFY_MEDIA is a bit unusual in how it handles
buftarg fields.  Update it to be more in line with other XFS code:

 - use btp->bt_dev instead of btp->bt_bdev->bd_dev to retrieve the device
   number for tracing
 - use btp->bt_logical_sectorsize instead of
   bdev_logical_block_size(btp->bt_bdev) to retrieve the logical sector
   size
 - compare the buftarg and not the bdev to see if there is a separate
   log buftarg

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-18 09:52:33 +01:00
hongao
268378b6ad xfs: scrub: unlock dquot before early return in quota scrub
xchk_quota_item can return early after calling xchk_fblock_process_error.
When that helper returns false, the function returned immediately without
dropping dq->q_qlock, which can leave the dquot lock held and risk lock
leaks or deadlocks in later quota operations.

Fix this by unlocking dq->q_qlock before the early return.

Signed-off-by: hongao <hongao@uniontech.com>
Fixes: 7d1f0e167a ("xfs: check the ondisk space mapping behind a dquot")
Cc: <stable@vger.kernel.org> # v6.8
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-18 09:44:46 +01:00
Yuto Ohnuki
7cac609473 xfs: refactor xfsaild_push loop into helper
Factor the loop body of xfsaild_push() into a separate
xfsaild_process_logitem() helper to improve readability.

This is a pure code movement with no functional change.

Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-18 09:40:31 +01:00
Yuto Ohnuki
394d70b86f xfs: save ailp before dropping the AIL lock in push callbacks
In xfs_inode_item_push() and xfs_qm_dquot_logitem_push(), the AIL lock
is dropped to perform buffer IO. Once the cluster buffer no longer
protects the log item from reclaim, the log item may be freed by
background reclaim or the dquot shrinker. The subsequent spin_lock()
call dereferences lip->li_ailp, which is a use-after-free.

Fix this by saving the ailp pointer in a local variable while the AIL
lock is held and the log item is guaranteed to be valid.

Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c
Fixes: 90c60e1640 ("xfs: xfs_iflush() is no longer necessary")
Cc: stable@vger.kernel.org # v5.9
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-18 09:40:31 +01:00
Yuto Ohnuki
79ef34ec05 xfs: avoid dereferencing log items after push callbacks
After xfsaild_push_item() calls iop_push(), the log item may have been
freed if the AIL lock was dropped during the push. Background inode
reclaim or the dquot shrinker can free the log item while the AIL lock
is not held, and the tracepoints in the switch statement dereference
the log item after iop_push() returns.

Fix this by capturing the log item type, flags, and LSN before calling
xfsaild_push_item(), and introducing a new xfs_ail_push_class trace
event class that takes these pre-captured values and the ailp pointer
instead of the log item pointer.

Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c
Fixes: 90c60e1640 ("xfs: xfs_iflush() is no longer necessary")
Cc: stable@vger.kernel.org # v5.9
Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-18 09:40:31 +01:00
Yuto Ohnuki
4f24a767e3 xfs: stop reclaim before pushing AIL during unmount
The unmount sequence in xfs_unmount_flush_inodes() pushed the AIL while
background reclaim and inodegc are still running. This is broken
independently of any use-after-free issues - background reclaim and
inodegc should not be running while the AIL is being pushed during
unmount, as inodegc can dirty and insert inodes into the AIL during the
flush, and background reclaim can race to abort and free dirty inodes.

Reorder xfs_unmount_flush_inodes() to stop inodegc and cancel background
reclaim before pushing the AIL. Stop inodegc before cancelling
m_reclaim_work because the inodegc worker can re-queue m_reclaim_work
via xfs_inodegc_set_reclaimable.

Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c
Fixes: 90c60e1640 ("xfs: xfs_iflush() is no longer necessary")
Cc: stable@vger.kernel.org # v5.9
Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-03-18 09:40:31 +01:00
Geoffrey D. Bennett
8780f561f6 ALSA: usb-audio: Exclude Scarlett 2i2 1st Gen from SKIP_IFACE_SETUP
The Focusrite Scarlett 2i2 1st Gen (1235:8006) produces
distorted/silent audio when QUIRK_FLAG_SKIP_IFACE_SETUP is active, as
that flag causes the feedback format to be detected as 17.15 instead
of 16.16.

Add a DEVICE_FLG entry for this device before the Focusrite VENDOR_FLG
entry so that it gets no quirk flags, overriding the vendor-wide
SKIP_IFACE_SETUP. This device doesn't have the internal mixer, Air, or
Safe modes that the quirk was designed to protect.

Fixes: 38c322068a ("ALSA: usb-audio: Add QUIRK_FLAG_SKIP_IFACE_SETUP")
Reported-by: pairomaniac [https://github.com/geoffreybennett/linux-fcp/issues/54]
Tested-by: pairomaniac [https://github.com/geoffreybennett/linux-fcp/issues/54]
Signed-off-by: Geoffrey D. Bennett <g@b4.vu>
Link: https://patch.msgid.link/abmsTjKmQMKbhYtK@m.b4.vu
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-18 08:20:43 +01:00
Ethan Tidmore
0f2055db7b RDMA/efa: Fix possible deadlock
In the error path for efa_com_alloc_comp_ctx() the semaphore assigned to
&aq->avail_cmds is not released.

Detected by Smatch:
drivers/infiniband/hw/efa/efa_com.c:662 efa_com_cmd_exec() warn:
inconsistent returns '&aq->avail_cmds'

Add release for &aq->avail_cmds in efa_com_alloc_comp_ctx() error path.

Fixes: ef3b06742c ("RDMA/efa: Fix use of completion ctx after free")
Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com>
Link: https://patch.msgid.link/20260314045730.1143862-1-ethantidmore06@gmail.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-17 15:01:05 -04:00
Chuck Lever
f28599f396 RDMA/rw: Fix MR pool exhaustion in bvec RDMA READ path
When IOVA-based DMA mapping is unavailable (e.g., IOMMU
passthrough mode), rdma_rw_ctx_init_bvec() falls back to
checking rdma_rw_io_needs_mr() with the raw bvec count.
Unlike the scatterlist path in rdma_rw_ctx_init(), which
passes a post-DMA-mapping entry count that reflects
coalescing of physically contiguous pages, the bvec path
passes the pre-mapping page count. This overstates the
number of DMA entries, causing every multi-bvec RDMA READ
to consume an MR from the QP's pool.

Under NFS WRITE workloads the server performs RDMA READs
to pull data from the client. With the inflated MR demand,
the pool is rapidly exhausted, ib_mr_pool_get() returns
NULL, and rdma_rw_init_one_mr() returns -EAGAIN. svcrdma
treats this as a DMA mapping failure, closes the connection,
and the client reconnects -- producing a cycle of 71% RPC
retransmissions and ~100 reconnections per test run. RDMA
WRITEs (NFS READ direction) are unaffected because
DMA_TO_DEVICE never triggers the max_sgl_rd check.

Remove the rdma_rw_io_needs_mr() gate from the bvec path
entirely, so that bvec RDMA operations always use the
map_wrs path (direct WR posting without MR allocation).
The bvec caller has no post-DMA-coalescing segment count
available -- xdr_buf and svc_rqst hold pages as individual
pointers, and physical contiguity is discovered only during
DMA mapping -- so the raw page count cannot serve as a
reliable input to rdma_rw_io_needs_mr(). iWARP devices,
which require MRs unconditionally, are handled by an
earlier check in rdma_rw_ctx_init_bvec() and are unaffected.

Fixes: bea28ac14c ("RDMA/core: add MR support for bvec-based RDMA operations")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260313194201.5818-3-cel@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-17 15:00:56 -04:00
Chuck Lever
00da250c21 RDMA/rw: Fall back to direct SGE on MR pool exhaustion
When IOMMU passthrough mode is active, ib_dma_map_sgtable_attrs()
produces no coalescing: each scatterlist page maps 1:1 to a DMA
entry, so sgt.nents equals the raw page count. A 1 MB transfer
yields 256 DMA entries. If that count exceeds the device's
max_sgl_rd threshold (an optimization hint from mlx5 firmware),
rdma_rw_io_needs_mr() steers the operation into the MR
registration path. Each such operation consumes one or more MRs
from a pool sized at max_rdma_ctxs -- roughly one MR per
concurrent context. Under write-intensive workloads that issue
many concurrent RDMA READs, the pool is rapidly exhausted,
ib_mr_pool_get() returns NULL, and rdma_rw_init_one_mr() returns
-EAGAIN. Upper layer protocols treat this as a fatal DMA mapping
failure and tear down the connection.

The max_sgl_rd check is a performance optimization, not a
correctness requirement: the device can handle large SGE counts
via direct posting, just less efficiently than with MR
registration. When the MR pool cannot satisfy a request, falling
back to the direct SGE (map_wrs) path avoids the connection
reset while preserving the MR optimization for the common case
where pool resources are available.

Add a fallback in rdma_rw_ctx_init() so that -EAGAIN from
rdma_rw_init_mr_wrs() triggers direct SGE posting instead of
propagating the error. iWARP devices, which mandate MR
registration for RDMA READs, and force_mr debug mode continue
to treat -EAGAIN as terminal.

Fixes: 00bd1439f4 ("RDMA/rw: Support threshold for registration vs scattering to local pages")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260313194201.5818-2-cel@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-17 15:00:38 -04:00
Zhang Heng
1f182ec9d7 ASoC: amd: yc: Add DMI quirk for Thin A15 B7VF
Add a DMI quirk for the Thin A15 B7VF fixing the issue where
the internal microphone was not detected.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=220833
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/20260316080218.2931304-1-zhangheng@kylinos.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-17 17:38:35 +00:00
Christian Brauner
c465f5591a selftests/mount_setattr: increase tmpfs size for idmapped mount tests
The mount_setattr_idmapped fixture mounts a 2 MB tmpfs at /mnt and then
creates a 2 GB sparse ext4 image at /mnt/C/ext4.img. While ftruncate()
succeeds (sparse file), mkfs.ext4 needs to write actual metadata blocks
(inode tables, journal, bitmaps) which easily exceeds the 2 MB tmpfs
limit, causing ENOSPC and failing the fixture setup for all
mount_setattr_idmapped tests.

This was introduced by commit d37d4720c3 ("selftests/mount_settattr:
ensure that ext4 filesystem can be created") which increased the image
size from 2 MB to 2 GB but didn't adjust the tmpfs size.

Bump the tmpfs size to 256 MB which is sufficient for the ext4 metadata.

Fixes: d37d4720c3 ("selftests/mount_settattr: ensure that ext4 filesystem can be created")
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-17 16:59:45 +01:00
Vee Satayamas
f200b2f9a8 ASoC: amd: yc: Add DMI quirk for ASUS EXPERTBOOK BM1403CDA
Add a DMI quirk for the Asus Expertbook BM1403CDA to resolve the issue of the
internal microphone not being detected.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=221236
Signed-off-by: Vee Satayamas <vsatayamas@gmail.com>
Reviewed-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/20260315142511.66029-2-vsatayamas@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-17 14:28:54 +00:00
Christoph Hellwig
f8b8820a4a fs: clear I_DIRTY_TIME in sync_lazytime
For file systems implementing ->sync_lazytime, I_DIRTY_TIME fails to get
cleared in sync_lazytime, and might cause additional calls to
sync_lazytime during inode deactivation.  Use the same pattern as in
__mark_inode_dirty to clear the flag under the inode lock.

Fixes: 5cf06ea56e ("fs: add a ->sync_lazytime method")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260317134409.1691317-1-hch@lst.de
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-17 15:12:15 +01:00
Sebastian Reichel
4eae391a8e ASoC: dt-bindings: rockchip: Add compatible for RK3576 SPDIF
Add a compatible string for SPDIF on RK3576, which is similar to the
one on RK3568.

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Link: https://patch.msgid.link/20260316-rk3576-spdif-v1-1-acb75088b560@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-17 12:30:51 +00:00
Tomi Valkeinen
a17ce4bc6f dmaengine: xilinx_dma: Fix reset related timeout with two-channel AXIDMA
A single AXIDMA controller can have one or two channels. When it has two
channels, the reset for both are tied together: resetting one channel
resets the other as well. This creates a problem where resetting one
channel will reset the registers for both channels, including clearing
interrupt enable bits for the other channel, which can then lead  to
timeouts as the driver is waiting for an interrupt which never comes.

The driver currently has a probe-time work around for this: when a
channel is created, the driver also resets and enables the
interrupts. With two channels the reset for the second channel will
clear the interrupt enables for the first one. The work around in the
driver is just to manually enable the interrupts again in
xilinx_dma_alloc_chan_resources().

This workaround only addresses the probe-time issue. When channels are
reset at runtime (e.g., in xilinx_dma_terminate_all() or during error
recovery), there's no corresponding mechanism to restore the other
channel's interrupt enables. This leads to one channel having its
interrupts disabled while the driver expects them to work, causing
timeouts and DMA failures.

A proper fix is a complicated matter, as we should not reset the other
channel when it's operating normally. So, perhaps, there should be some
kind of synchronization for a common reset, which is not trivial to
implement. To add to the complexity, the driver also supports other DMA
types, like VDMA, CDMA and MCDMA, which don't have a shared reset.

However, when the two-channel AXIDMA is used in the (assumably) normal
use case, providing DMA for a single memory-to-memory device, the common
reset is a bit smaller issue: when something bad happens on one channel,
or when one channel is terminated, the assumption is that we also want
to terminate the other channel. And thus resetting both at the same time
is "ok".

With that line of thinking we can implement a bit better work around
than just the current probe time work around: let's enable the
AXIDMA interrupts at xilinx_dma_start_transfer() instead.
This ensures interrupts are enabled whenever a transfer starts,
regardless of any prior resets that may have cleared them.

This approach is also more logical: enable interrupts only when needed
for a transfer, rather than at resource allocation time, and, I think,
all the other DMA types should also use this model, but I'm reluctant to
do such changes as I cannot test them.

The reset function still enables interrupts even though it's not needed
for AXIDMA anymore, but it's common code for all DMA types (VDMA, CDMA,
MCDMA), so leave it unchanged to avoid affecting other variants.

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Fixes: c0bba3a99f ("dmaengine: vdma: Add Support for Xilinx AXI Direct Memory Access Engine")
Link: https://patch.msgid.link/20260311-xilinx-dma-fix-v2-1-a725abb66e3c@ideasonboard.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-03-17 16:33:32 +05:30
Marek Vasut
c7d812e33f dmaengine: xilinx: xilinx_dma: Fix unmasked residue subtraction
The segment .control and .status fields both contain top bits which are
not part of the buffer size, the buffer size is located only in the bottom
max_buffer_len bits. To avoid interference from those top bits, mask out
the size using max_buffer_len first, and only then subtract the values.

Fixes: a575d0b4e6 ("dmaengine: xilinx_dma: Introduce xilinx_dma_get_residue")
Signed-off-by: Marek Vasut <marex@nabladev.com>
Link: https://patch.msgid.link/20260316222530.163815-1-marex@nabladev.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-03-17 16:14:34 +05:30
Marek Vasut
f61d145999 dmaengine: xilinx: xilinx_dma: Fix residue calculation for cyclic DMA
The cyclic DMA calculation is currently entirely broken and reports
residue only for the first segment. The problem is twofold.

First, when the first descriptor finishes, it is moved from active_list
to done_list, but it is never returned back into the active_list. The
xilinx_dma_tx_status() expects the descriptor to be in the active_list
to report any meaningful residue information, which never happens after
the first descriptor finishes. Fix this up in xilinx_dma_start_transfer()
and if the descriptor is cyclic, lift it from done_list and place it back
into active_list list.

Second, the segment .status fields of the descriptor remain dirty. Once
the DMA did one pass on the descriptor, the .status fields are populated
with data by the DMA, but the .status fields are not cleared before reuse
during the next cyclic DMA round. The xilinx_dma_get_residue() recognizes
that as if the descriptor was complete and had 0 residue, which is bogus.
Reinitialize the status field before placing the descriptor back into the
active_list.

Fixes: c0bba3a99f ("dmaengine: vdma: Add Support for Xilinx AXI Direct Memory Access Engine")
Signed-off-by: Marek Vasut <marex@nabladev.com>
Link: https://patch.msgid.link/20260316221943.160375-1-marex@nabladev.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-03-17 16:14:20 +05:30
Marek Vasut
e9cc95397b dmaengine: xilinx: xilinx_dma: Fix dma_device directions
Unlike chan->direction , struct dma_device .directions field is a
bitfield. Turn chan->direction into a bitfield to make it compatible
with struct dma_device .directions .

Fixes: 7e01511443 ("dmaengine: xilinx_dma: Set dma_device directions")
Signed-off-by: Marek Vasut <marex@nabladev.com>
Link: https://patch.msgid.link/20260316221728.160139-1-marex@nabladev.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-03-17 16:14:06 +05:30
Claudiu Beznea
89a8567d84 dmaengine: sh: rz-dmac: Move CHCTRL updates under spinlock
Both rz_dmac_disable_hw() and rz_dmac_irq_handle_channel() update the
CHCTRL register. To avoid concurrency issues when configuring
functionalities exposed by this registers, take the virtual channel lock.
All other CHCTRL updates were already protected by the same lock.

Previously, rz_dmac_disable_hw() disabled and re-enabled local IRQs, before
accessing CHCTRL registers but this does not ensure race-free access.
Remove the local IRQ disable/enable code as well.

Fixes: 5000d37042 ("dmaengine: sh: Add DMAC driver for RZ/G2L SoC")
Cc: stable@vger.kernel.org
Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://patch.msgid.link/20260316133252.240348-3-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-03-17 16:11:11 +05:30
Claudiu Beznea
abb863e621 dmaengine: sh: rz-dmac: Protect the driver specific lists
The driver lists (ld_free, ld_queue) are used in
rz_dmac_free_chan_resources(), rz_dmac_terminate_all(),
rz_dmac_issue_pending(), and rz_dmac_irq_handler_thread(), all under
the virtual channel lock. Take the same lock in rz_dmac_prep_slave_sg()
and rz_dmac_prep_dma_memcpy() as well to avoid concurrency issues, since
these functions also check whether the lists are empty and update or
remove list entries.

Fixes: 5000d37042 ("dmaengine: sh: Add DMAC driver for RZ/G2L SoC")
Cc: stable@vger.kernel.org
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://patch.msgid.link/20260316133252.240348-2-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-03-17 16:11:11 +05:30
Alexey Nepomnyashih
bb120ad57d ALSA: firewire-lib: fix uninitialized local variable
Similar to commit d8dc872046 ("ALSA: firewire-lib: fix uninitialized
local variable"), the local variable `curr_cycle_time` in
process_rx_packets() is declared without initialization.

When the tracepoint event is not probed, the variable may appear to be
used without being initialized. In practice the value is only relevant
when the tracepoint is enabled, however initializing it avoids potential
use of an uninitialized value and improves code safety.

Initialize `curr_cycle_time` to zero.

Fixes: fef4e61b0b ("ALSA: firewire-lib: extend tracepoints event including CYCLE_TIME of 1394 OHCI")
Cc: stable@vger.kernel.org
Signed-off-by: Alexey Nepomnyashih <sdl@nppct.ru>
Link: https://patch.msgid.link/20260316191824.83249-1-sdl@nppct.ru
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-17 09:54:48 +01:00
Zhang Heng
0bdf27abaf ALSA: hda/realtek: add quirk for ASUS Strix G16 G615JMR
The machine is equipped with ALC294 and requires the
ALC287_FIXUP_TXNW2781_I2C_ASUS quirk for the amplifier to work properly.
Since the machine's PCI SSID is also 1043:1204, HDA_CODEC_QUIRK is
used to retain the previous quirk.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=221173
Cc: <stable@vger.kernel.org>
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/20260316022843.2809968-1-zhangheng@kylinos.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-16 18:04:36 +01:00
Sean Rhodes
a6919f2a01 ALSA: hda/realtek: Sequence GPIO2 on Star Labs StarFighter
The initial StarFighter quirk fixed the runtime suspend pop by muting
speakers in the shutup callback before power-down. Further hardware
validation showed that the speaker path is controlled directly by LINE2
EAPD on NID 0x1b together with GPIO2 for the external amplifier.

Replace the shutup-delay workaround with explicit sequencing of those
controls at playback start and stop:
- assert LINE2 EAPD and drive GPIO2 high on PREPARE
- deassert LINE2 EAPD and drive GPIO2 low on CLEANUP

This avoids the runtime suspend pop without a sleep, and also fixes pops
around G3 entry and display-manager start that the original workaround
did not cover.

Fixes: 1cb3c20688 ("ALSA: hda/realtek: Fix speaker pop on Star Labs StarFighter")
Tested-by: Sean Rhodes <sean@starlabs.systems>
Signed-off-by: Sean Rhodes <sean@starlabs.systems>
Link: https://patch.msgid.link/20260315201127.33744-1-sean@starlabs.systems
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-16 18:04:00 +01:00
Andy Shevchenko
09e70e4f11 regmap: Synchronize cache for the page selector
If the selector register is represented in each page, its value
according to the debugfs is stale because it gets synchronized
only after the real page switch happens. Hence the regmap cache
initialisation from the HW inherits outdated data in the selector
register.

Synchronize cache for the page selector just in time.

Before (offset followed by hexdump, the first byte is selector):

    // Real registers
    18: 05 ff 00 00 ff 0f 00 00 f0 00 00 00
    ...
    // Virtual (per port)
    40: 05 ff 00 00 e0 e0 00 00 00 00 00 1f
    50: 00 ff 00 00 e0 e0 00 00 00 00 00 1f
    60: 01 ff 00 00 ff ff 00 00 00 00 00 00
    70: 02 ff 00 00 cf f3 00 00 00 00 00 0c
    80: 03 ff 00 00 00 00 00 00 00 00 00 ff
    90: 04 ff 00 00 ff 0f 00 00 f0 00 00 00

After:

    // Real registers
    18: 05 ff 00 00 ff 0f 00 00 f0 00 00 00
    ...
    // Virtual (per port)
    40: 00 ff 00 00 e0 e0 00 00 00 00 00 1f
    50: 01 ff 00 00 e0 e0 00 00 00 00 00 1f
    60: 02 ff 00 00 ff ff 00 00 00 00 00 00
    70: 03 ff 00 00 cf f3 00 00 00 00 00 0c
    80: 04 ff 00 00 00 00 00 00 00 00 00 ff
    90: 05 ff 00 00 ff 0f 00 00 f0 00 00 00

Fixes: 6863ca6227 ("regmap: Add support for register indirect addressing.")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20260302184753.2693803-1-andriy.shevchenko@linux.intel.com
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-16 13:50:34 +00:00
Hasun Park
399b6fd37a ASoC: amd: acp: add PX13 SoundWire machine link for rt721+tas2783x2
Add an ACP70 SoundWire machine entry for ASUS PX13
(HN7306EA/HN7306EAC) with rt721 and two TAS2783 amps on link1.

Describe rt721 with jack/DMIC endpoints on this platform and add
explicit left/right TAS2783 speaker endpoint mapping via name prefixes.

Signed-off-by: Hasun Park <hasunpark@gmail.com>
Link: https://patch.msgid.link/20260308151654.29059-3-hasunpark@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-16 00:14:35 +00:00
Hasun Park
d0426510a9 ASoC: amd: acp: add DMI override for ACP70 flag
Some ASUS ProArt PX13 systems expose ACP ACPI config flags that can
select a non-working fallback path.

Add a DMI override in snd_amd_acp_find_config() for ACP70+ boards and
return 0 so ACP ACPI flag-based selection is skipped on this platform.

This keeps machine driver selection on the intended SoundWire path.

Signed-off-by: Hasun Park <hasunpark@gmail.com>
Link: https://patch.msgid.link/20260308151654.29059-2-hasunpark@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-16 00:14:34 +00:00
Guangshuo Li
fe757092d2 ASoC: sma1307: fix double free of devm_kzalloc() memory
A previous change added NULL checks and cleanup for allocation
failures in sma1307_setting_loaded().

However, the cleanup for mode_set entries is wrong. Those entries are
allocated with devm_kzalloc(), so they are device-managed resources and
must not be freed with kfree(). Manually freeing them in the error path
can lead to a double free when devres later releases the same memory.

Drop the manual kfree() loop and let devres handle the cleanup.

Fixes: 0ec6bd1670 ("ASoC: sma1307: Add NULL check in sma1307_setting_loaded()")
Cc: stable@vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Link: https://patch.msgid.link/20260313040611.391479-1-lgs201920130244@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-16 00:05:44 +00:00
Andrii Kovalchuk
793b008cd3 ALSA: hda/realtek: Add HP ENVY Laptop 13-ba0xxx quirk
Add a PCI quirk for HP ENVY Laptop 13-ba0xxx (PCI device ID 0x8756)
to enable proper mute LED and mic mute behavior using the
ALC245_FIXUP_HP_X360_MUTE_LEDS fixup.

Signed-off-by: Andrii Kovalchuk <coderpy4@proton.me>
Link: https://patch.msgid.link/u0s-uRVegF9BN0t-4JnOUwsIAR-mVc4U4FJfJHdEHX7ro_laErHD9y35NebWybcN16gVaVHPJo1ap3AoJ1a2gqJImPvThgeNt_SYVY1KaDw=@proton.me
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-15 10:28:39 +01:00
Matthew Schwartz
59f68dc1d8 ALSA: hda/realtek: Add quirk for ASUS ROG Flow Z13-KJP GZ302EAC
Fixes lack of audio output on the ASUS ROG Flow Z13-KJP GZ302EAC model,
similar to the ASUS ROG Flow Z13 GZ302EA.

Signed-off-by: Matthew Schwartz <matthew.schwartz@linux.dev>
Link: https://patch.msgid.link/20260313172503.285846-1-matthew.schwartz@linux.dev
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-14 14:32:05 +01:00
Zhang Heng
7bae956cac ALSA: hda/realtek: add quirk for Lenovo Yoga 7 2-in-1 16AKP10
This machine is equipped with ALC287 and requires the quirk
ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN to fix the issue
where the bass speakers are not configured and the speaker
volume cannot be controlled.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=221210
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/20260313080624.1395362-1-zhangheng@kylinos.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-14 14:31:37 +01:00
Wang Jun
995a418a6c auxdisplay: lcd2s: add error handling for i2c transfers
The lcd2s_print() and lcd2s_gotoxy() functions currently ignore the
return value of lcd2s_i2c_master_send(), which can fail. This can lead
to silent data loss or incorrect cursor positioning.

Add proper error checking: if the number of bytes sent does not match
the expected length, return -EIO; otherwise propagate any error code
from the I2C transfer.

Signed-off-by: Wang Jun <1742789905@qq.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2026-03-13 11:00:06 +01:00
Yihan Ding
697f514ad9 landlock: Clean up interrupted thread logic in TSYNC
In landlock_restrict_sibling_threads(), when the calling thread is
interrupted while waiting for sibling threads to prepare, it executes
a recovery path.

Previously, this path included a wait_for_completion() call on
all_prepared to prevent a Use-After-Free of the local shared_ctx.
However, this wait is redundant. Exiting the main do-while loop
already leads to a bottom cleanup section that unconditionally waits
for all_finished. Therefore, replacing the wait with a simple break
is safe, prevents UAF, and correctly unblocks the remaining task_works.

Clean up the error path by breaking the loop and updating the
surrounding comments to accurately reflect the state machine.

Suggested-by: Günther Noack <gnoack3000@gmail.com>
Signed-off-by: Yihan Ding <dingyihan@uniontech.com>
Tested-by: Günther Noack <gnoack3000@gmail.com>
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260306021651.744723-3-dingyihan@uniontech.com
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2026-03-10 18:22:58 +01:00
Yihan Ding
ff88df67db landlock: Serialize TSYNC thread restriction
syzbot found a deadlock in landlock_restrict_sibling_threads().
When multiple threads concurrently call landlock_restrict_self() with
sibling thread restriction enabled, they can deadlock by mutually
queueing task_works on each other and then blocking in kernel space
(waiting for the other to finish).

Fix this by serializing the TSYNC operations within the same process
using the exec_update_lock. This prevents concurrent invocations
from deadlocking.

We use down_write_trylock() and restart the syscall if the lock
cannot be acquired immediately. This ensures that if a thread fails
to get the lock, it will return to userspace, allowing it to process
any pending TSYNC task_works from the lock holder, and then
transparently restart the syscall.

Fixes: 42fc7e6543 ("landlock: Multithreading support for landlock_restrict_self()")
Reported-by: syzbot+7ea2f5e9dfd468201817@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
Suggested-by: Günther Noack <gnoack3000@gmail.com>
Suggested-by: Tingmao Wang <m@maowtm.org>
Tested-by: Justin Suess <utilityemal77@gmail.com>
Signed-off-by: Yihan Ding <dingyihan@uniontech.com>
Tested-by: Günther Noack <gnoack3000@gmail.com>
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260306021651.744723-2-dingyihan@uniontech.com
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2026-03-10 18:22:56 +01:00
Tuo Li
e1c9866173 dmaengine: idxd: fix possible wrong descriptor completion in llist_abort_desc()
At the end of this function, d is the traversal cursor of flist, but the
code completes found instead. This can lead to issues such as NULL pointer
dereferences, double completion, or descriptor leaks.

Fix this by completing d instead of found in the final
list_for_each_entry_safe() loop.

Fixes: aa8d18becc ("dmaengine: idxd: add callback support for iaa crypto")
Signed-off-by: Tuo Li <islituo@gmail.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20260106032428.162445-1-islituo@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-03-09 12:27:11 +01:00
Deepanshu Kartikey
e9075e420a netfs: Fix NULL pointer dereference in netfs_unbuffered_write() on retry
When a write subrequest is marked NETFS_SREQ_NEED_RETRY, the retry path
in netfs_unbuffered_write() unconditionally calls stream->prepare_write()
without checking if it is NULL.

Filesystems such as 9P do not set the prepare_write operation, so
stream->prepare_write remains NULL. When get_user_pages() fails with
-EFAULT and the subrequest is flagged for retry, this results in a NULL
pointer dereference at fs/netfs/direct_write.c:189.

Fix this by mirroring the pattern already used in write_retry.c: if
stream->prepare_write is NULL, skip renegotiation and directly reissue
the subrequest via netfs_reissue_write(), which handles iterator reset,
IN_PROGRESS flag, stats update and reissue internally.

Fixes: a0b4c7a491 ("netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequence")
Reported-by: syzbot+7227db0fbac9f348dba0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7227db0fbac9f348dba0
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Link: https://patch.msgid.link/20260307043947.347092-1-kartikey406@gmail.com
Tested-by: syzbot+7227db0fbac9f348dba0@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-09 10:21:56 +01:00
Deepanshu Kartikey
67e467a11f netfs: Fix kernel BUG in netfs_limit_iter() for ITER_KVEC iterators
When a process crashes and the kernel writes a core dump to a 9P
filesystem, __kernel_write() creates an ITER_KVEC iterator. This
iterator reaches netfs_limit_iter() via netfs_unbuffered_write(), which
only handles ITER_FOLIOQ, ITER_BVEC and ITER_XARRAY iterator types,
hitting the BUG() for any other type.

Fix this by adding netfs_limit_kvec() following the same pattern as
netfs_limit_bvec(), since both kvec and bvec are simple segment arrays
with pointer and length fields. Dispatch it from netfs_limit_iter() when
the iterator type is ITER_KVEC.

Fixes: cae932d3ae ("netfs: Add func to calculate pagecount/size-limited span of an iterator")
Reported-by: syzbot+9c058f0d63475adc97fd@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9c058f0d63475adc97fd
Tested-by: syzbot+9c058f0d63475adc97fd@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Link: https://patch.msgid.link/20260307090041.359870-1-kartikey406@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-09 10:17:00 +01:00
Alexander Stein
e0adbf74e2 dmaengine: xilinx: xdma: Fix regmap init error handling
devm_regmap_init_mmio returns an ERR_PTR() upon error, not NULL.
Fix the error check and also fix the error message. Use the error code
from ERR_PTR() instead of the wrong value in ret.

Fixes: 17ce252266 ("dmaengine: xilinx: xdma: Add xilinx xdma driver")
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20251014061309.283468-1-alexander.stein@ew.tq-group.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-03-09 08:21:22 +01:00
LUO Haowen
3f63297ff6 dmaengine: dw-edma: Fix multiple times setting of the CYCLE_STATE and CYCLE_BIT bits for HDMA.
Others have submitted this issue (https://lore.kernel.org/dmaengine/
20240722030405.3385-1-zhengdongxiong@gxmicro.cn/),
but it has not been fixed yet. Therefore, more supplementary information
is provided here.

As mentioned in the "PCS-CCS-CB-TCB" Producer-Consumer Synchronization of
"DesignWare Cores PCI Express Controller Databook, version 6.00a":

1. The Consumer CYCLE_STATE (CCS) bit in the register only needs to be
initialized once; the value will update automatically to be
~CYCLE_BIT (CB) in the next chunk.
2. The Consumer CYCLE_BIT bit in the register is loaded from the LL
element and tested against CCS. When CB = CCS, the data transfer is
executed. Otherwise not.

The current logic sets customer (HDMA) CS and CB bits to 1 in each chunk
while setting the producer (software) CB of odd chunks to 0 and even
chunks to 1 in the linked list. This is leading to a mismatch between
the producer CB and consumer CS bits.

This issue can be reproduced by setting the transmission data size to
exceed one chunk. By the way, in the EDMA using the same "PCS-CCS-CB-TCB"
mechanism, the CS bit is only initialized once and this issue was not
found. Refer to
drivers/dma/dw-edma/dw-edma-v0-core.c:dw_edma_v0_core_start.

So fix this issue by initializing the CYCLE_STATE and CYCLE_BIT bits
only once.

Fixes: e74c39573d ("dmaengine: dw-edma: Add support for native HDMA")
Signed-off-by: LUO Haowen <luo-hw@foxmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/tencent_CB11AA9F3920C1911AF7477A9BD8EFE0AD05@qq.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-03-09 08:15:54 +01:00
Yonatan Nachum
ef3b06742c RDMA/efa: Fix use of completion ctx after free
On admin queue completion handling, if the admin command completed with
error we print data from the completion context. The issue is that we
already freed the completion context in polling/interrupts handler which
means we print data from context in an unknown state (it might be
already used again).
Change the admin submission flow so alloc/dealloc of the context will be
symmetric and dealloc will be called after any potential use of the
context.

Fixes: 68fb9f3e31 ("RDMA/efa: Remove redundant NULL pointer check of CQE")
Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com>
Reviewed-by: Michael Margolin <mrgolin@amazon.com>
Signed-off-by: Yonatan Nachum <ynachum@amazon.com>
Link: https://patch.msgid.link/20260308165350.18219-1-ynachum@amazon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-08 14:46:42 -04:00
Kamal Heib
c242e92c9d RDMA/bng_re: Fix silent failure in HWRM version query
If the firmware version query fails, the driver currently ignores the
error and continues initializing. This leaves the device in a bad state.

Fix this by making bng_re_query_hwrm_version() return the error code and
update the driver to check for this error and stop the setup process
safely if it happens.

Fixes: 745065770c ("RDMA/bng_re: Register and get the resources from bnge driver")
Signed-off-by: Kamal Heib <kheib@redhat.com>
Link: https://patch.msgid.link/20260303043645.425724-1-kheib@redhat.com
Reviewed-by: Siva Reddy Kallam <siva.kallam@broadcom.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-05 04:34:31 -05:00
Günther Noack
f8e2019c3b samples/landlock: Bump ABI version to 8
The sample tool should print a warning if it is not running on a
kernel that provides the newest Landlock ABI version.

Link: https://lore.kernel.org/all/20260218.ufao5Vaefa2u@digikod.net/
Suggested-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260220160627.53913-1-gnoack3000@gmail.com
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2026-03-04 18:28:11 +01:00
Mickaël Salaün
bb8369ead4 landlock: Improve TSYNC types
Constify pointers when it makes sense.

Consistently use size_t for loops, especially to match works->size type.

Add new lines to improve readability.

Cc: Jann Horn <jannh@google.com>
Reviewed-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20260217122341.2359582-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2026-03-04 18:28:11 +01:00
Mickaël Salaün
929553bbb4 landlock: Fully release unused TSYNC work entries
If task_work_add() failed, ctx->task is put but the tsync_works struct
is not reset to its previous state.  The first consequence is that the
kernel allocates memory for dying threads, which could lead to
user-accounted memory exhaustion (not very useful nor specific to this
case).  The second consequence is that task_work_cancel(), called by
cancel_tsync_works(), can dereference a NULL task pointer.

Fix this issues by keeping a consistent works->size wrt the added task
work.  This is done in a new tsync_works_trim() helper which also cleans
up the shared_ctx and work fields.

As a safeguard, add a pointer check to cancel_tsync_works() and update
tsync_works_release() accordingly.

Cc: Jann Horn <jannh@google.com>
Reviewed-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20260217122341.2359582-1-mic@digikod.net
[mic: Replace memset() with compound literal]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2026-03-04 18:28:10 +01:00
Mickaël Salaün
405ca72dc5 landlock: Fix formatting
Auto-format with clang-format -i security/landlock/*.[ch]

Cc: Günther Noack <gnoack@google.com>
Cc: Kees Cook <kees@kernel.org>
Fixes: 69050f8d6d ("treewide: Replace kmalloc with kmalloc_obj for non-scalar types")
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260303173632.88040-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2026-03-04 18:28:08 +01:00
Abhijit Gangurde
a08aaf3968 RDMA/ionic: Preserve and set Ethernet source MAC after ib_ud_header_init()
ionic_build_hdr() populated the Ethernet source MAC (hdr->eth.smac_h) by
passing the header’s storage directly to rdma_read_gid_l2_fields().
However, ib_ud_header_init() is called after that and re-initializes the
UD header, which wipes the previously written smac_h. As a result, packets
are emitted with an zero source MAC address on the wire.

Correct the source MAC by reading the GID-derived smac into a temporary
buffer and copy it after ib_ud_header_init() completes.

Fixes: e8521822c7 ("RDMA/ionic: Register device ops for control path")
Cc: stable@vger.kernel.org # 6.18
Signed-off-by: Abhijit Gangurde <abhijit.gangurde@amd.com>
Link: https://patch.msgid.link/20260227061809.2979990-1-abhijit.gangurde@amd.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-04 02:44:01 -05:00
Jacob Moroni
29a3edd700 RDMA/irdma: Fix double free related to rereg_user_mr
If IB_MR_REREG_TRANS is set during rereg_user_mr, the
umem will be released and a new one will be allocated
in irdma_rereg_mr_trans. If any step of irdma_rereg_mr_trans
fails after the new umem is allocated, it releases the umem,
but does not set iwmr->region to NULL. The problem is that
this failure is propagated to the user, who will then call
ibv_dereg_mr (as they should). Then, the dereg_mr path will
see a non-NULL umem and attempt to call ib_umem_release again.

Fix this by setting iwmr->region to NULL after ib_umem_release.

Fixed: 5ac388db27 ("RDMA/irdma: Add support to re-register a memory region")
Signed-off-by: Jacob Moroni <jmoroni@google.com>
Link: https://patch.msgid.link/20260227152743.1183388-1-jmoroni@google.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-04 02:44:01 -05:00
Frank Li
398c0c8bbc dt-bindings: auxdisplay: ht16k33: Use unevaluatedProperties to fix common property warning
Change additionalProperties to unevaluatedProperties because it refs to
/schemas/input/matrix-keymap.yaml.

Fix below CHECK_DTBS warnings:
arch/arm/boot/dts/nxp/imx/imx6dl-victgo.dtb: keypad@70 (holtek,ht16k33): 'keypad,num-columns', 'keypad,num-rows' do not match any of the regexes: '^pinctrl-[0-9]+$'
        from schema $id: http://devicetree.org/schemas/auxdisplay/holtek,ht16k33.yaml#

Fixes: f12b457c6b ("dt-bindings: auxdisplay: ht16k33: Convert to json-schema")
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2026-03-03 08:59:02 +01:00
Abel Vesa
81af9e40e2 phy: qcom: qmp-ufs: Fix SM8650 PCS table for Gear 4
According to internal documentation, on SM8650, when the PHY is configured
in Gear 4, the QPHY_V6_PCS_UFS_PLL_CNTL register needs to have the same
value as for Gear 5.

At the moment, there is no board that comes with a UFS 3.x device, so
this issue doesn't show up, but with the new Eliza SoC, which uses the
same init sequence as SM8650, on the MTP board, the link startup fails
with the current Gear 4 PCS table.

So fix that by moving the entry into the PCS generic table instead,
while keeping the value from Gear 5 configuration.

Cc: stable@vger.kernel.org # v6.10
Fixes: b9251e64a9 ("phy: qcom: qmp-ufs: update SM8650 tables for Gear 4 & 5")
Suggested-by: Nitin Rawat <nitin.rawat@oss.qualcomm.com>
Signed-off-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK
Link: https://patch.msgid.link/20260219-phy-qcom-qmp-ufs-fix-sm8650-pcs-g4-table-v1-1-f136505b57f6@oss.qualcomm.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-27 20:46:57 +05:30
Felix Gu
584b457f41 phy: ti: j721e-wiz: Fix device node reference leak in wiz_get_lane_phy_types()
The serdes device_node is obtained using of_get_child_by_name(),
which increments the reference count. However, it is never put,
leading to a reference leak.

Add the missing of_node_put() calls to ensure the reference count is
properly balanced.

Fixes: 7ae14cf581 ("phy: ti: j721e-wiz: Implement DisplayPort mode to the wiz driver")
Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20260212-wiz-v2-1-6e8bd4cc7a4a@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-27 20:43:29 +05:30
Yixun Lan
f0cf0a882a phy: k1-usb: add disconnect function support
A disconnect status BIT of USB2 PHY need to be cleared, otherwise
it will fail to work properly during next connection when devices
connect to roothub directly.

Fixes: fe4bc1a086 ("phy: spacemit: support K1 USB2.0 PHY controller")
Signed-off-by: Yixun Lan <dlan@kernel.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20260216152653.25244-1-dlan@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-27 20:27:00 +05:30
Vladimir Oltean
a258d843a3 phy: lynx-28g: skip CDR lock workaround for lanes disabled in the device tree
The blamed commit introduced support for specifying individual lanes as
OF nodes in the device, and these can have status = "disabled".

When that happens, for_each_available_child_of_node() skips them and
lynx_28g_probe_lane() -> devm_phy_create() is not called, so lane->phy
will be NULL. Yet it will be dereferenced in lynx_28g_cdr_lock_check(),
resulting in a crash.

This used to be well handled in v3 of that patch:
https://lore.kernel.org/linux-phy/20250926180505.760089-14-vladimir.oltean@nxp.com/
but until v5 was merged, the logic to support per-lane OF nodes was
split into a separate change, and the per-SoC compatible strings patch
was deferred to a "part 2" set. The splitting was done improperly, and
that handling of NULL lane->phy pointers was not integrated into the
proper commit.

Fixes: 7df7d58abb ("phy: lynx-28g: support individual lanes as OF PHY providers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260226182853.1103616-1-vladimir.oltean@nxp.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-27 19:21:01 +05:30
Vladimir Oltean
48fafffcf2 phy: make PHY_COMMON_PROPS Kconfig symbol conditionally user-selectable
Geert reports that enabling CONFIG_KUNIT_ALL_TESTS shouldn't enable
features that aren't enabled without it. That isn't what "*all* tests"
means, but as the prompt puts it, "All KUnit tests with satisfied
dependencies".

The impact is that enabling CONFIG_KUNIT_ALL_TESTS brings features which
cannot be disabled as built-in into the kernel.

Keep the pattern where consumer drivers have to "select PHY_COMMON_PROPS",
but if KUNIT_ALL_TESTS is enabled, also make PHY_COMMON_PROPS user
selectable, so it can be turned off.

Modify PHY_COMMON_PROPS_TEST to depend on PHY_COMMON_PROPS rather than
select it.

Fixes: e7556b59ba ("phy: add phy_get_rx_polarity() and phy_get_tx_polarity()")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Closes: https://lore.kernel.org/linux-phy/CAMuHMdUBaoYKNj52gn8DQeZFZ42Cvm6xT6fvo0-_twNv1k3Jhg@mail.gmail.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260226153315.3530378-1-vladimir.oltean@nxp.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-27 19:20:39 +05:30
Vinicius Costa Gomes
ee66bc2957 dmaengine: idxd: Fix leaking event log memory
During the device remove process, the device is reset, causing the
configuration registers to go back to their default state, which is
zero. As the driver is checking if the event log support was enabled
before deallocating, it will fail if a reset happened before.

Do not check if the support was enabled, the check for 'idxd->evl'
being valid (only allocated if the HW capability is available) is
enough.

Fixes: 244da66cda ("dmaengine: idxd: setup event log configuration")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-10-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-25 16:39:18 +05:30
Vinicius Costa Gomes
c311f5e924 dmaengine: idxd: Fix freeing the allocated ida too late
It can happen that when the cdev .release() is called, the driver
already called ida_destroy(). Move ida_free() to the _del() path.

We see with DEBUG_KOBJECT_RELEASE enabled and forcing an early PCI
unbind.

Fixes: 04922b7445 ("dmaengine: idxd: fix cdev setup and free device lifetime issues")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-9-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-25 16:39:18 +05:30
Vinicius Costa Gomes
d9cfb5193a dmaengine: idxd: Fix memory leak when a wq is reset
idxd_wq_disable_cleanup() which is called from the reset path for a
workqueue, sets the wq type to NONE, which for other parts of the
driver mean that the wq is empty (all its resources were released).

Only set the wq type to NONE after its resources are released.

Fixes: da32b28c95 ("dmaengine: idxd: cleanup workqueue config after disabling")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-8-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-25 16:39:18 +05:30
Vinicius Costa Gomes
3d33de353b dmaengine: idxd: Fix not releasing workqueue on .release()
The workqueue associated with an DSA/IAA device is not released when
the object is freed.

Fixes: 47c16ac27d ("dmaengine: idxd: fix idxd conf_dev 'struct device' lifetime")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-7-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-25 16:39:18 +05:30
Vinicius Costa Gomes
4fd3c4679f dmaengine: idxd: Wait for submitted operations on .device_synchronize()
When the dmaengine "core" asks the driver to synchronize, send a Drain
operation to the device workqueue, which will wait for the already
submitted operations to finish.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-6-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-25 16:39:18 +05:30
Vinicius Costa Gomes
2a93f5747d dmaengine: idxd: Flush all pending descriptors
When used as a dmaengine, the DMA "core" might ask the driver to
terminate all pending requests, when that happens, flush all pending
descriptors.

In this context, flush means removing the requests from the pending
lists, so even if they are completed after, the user is not notified.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-5-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-25 16:39:18 +05:30
Vinicius Costa Gomes
f019d7814b dmaengine: idxd: Flush kernel workqueues on Function Level Reset
When a Function Level Reset (FLR) happens, terminate the pending
descriptors that were issued by in-kernel users and disable the
interrupts associated with those. They will be re-enabled after FLR
finishes.

idxd_wq_flush_desc() is declared on idxd.h because it's going to be
used in by the DMA backend in a future patch.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-4-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-25 16:39:17 +05:30
Vinicius Costa Gomes
d6077df7b7 dmaengine: idxd: Fix possible invalid memory access after FLR
In the case that the first Function Level Reset (FLR) concludes
correctly, but in the second FLR the scratch area for the saved
configuration cannot be allocated, it's possible for a invalid memory
access to happen.

Always set the deallocated scratch area to NULL after FLR completes.

Fixes: 98d187a989 ("dmaengine: idxd: Enable Function Level Reset (FLR) for halt")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-3-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-25 16:39:17 +05:30
Vinicius Costa Gomes
52d2edea0d dmaengine: idxd: Fix crash when the event log is disabled
If reporting errors to the event log is not supported by the hardware,
and an error that causes Function Level Reset (FLR) is received, the
driver will try to restore the event log even if it was not allocated.

Also, only try to free the event log if it was properly allocated.

Fixes: 6078a315ae ("dmaengine: idxd: Add idxd_device_config_save() and idxd_device_config_restore() helpers")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-2-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-25 16:39:17 +05:30
Vinicius Costa Gomes
caf91cdf2d dmaengine: idxd: Fix lockdep warnings when calling idxd_device_config()
Move the check for IDXD_FLAG_CONFIGURABLE and the locking to "inside"
idxd_device_config(), as this is common to all callers, and the one
that wasn't holding the lock was an error (that was causing the
lockdep warning).

Suggested-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-1-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-25 16:39:17 +05:30
Shenghui Shi
77b19d053a dmaengine: dw-edma: fix MSI data programming for multi-IRQ case
When using MSI (not MSI-X) with multiple IRQs, the MSI data value
must be unique per vector to ensure correct interrupt delivery.
Currently, the driver fails to increment the MSI data per vector,
causing interrupts to be misrouted.

Fix this by caching the base MSI data and adjusting each vector's
data accordingly during IRQ setup.

Fixes: e63d79d1ff04 ("dmaengine: dw-edma: Add Synopsys DesignWare eDMA IP core driver")
Signed-off-by: Shenghui Shi <brody.shi@m2semi.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260209103726.414-1-brody.shi@m2semi.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-25 16:32:11 +05:30
Joy Zou
2e7b5cf72e dmaengine: fsl-edma: fix channel parameter config for fixed channel requests
Configure only the requested channel when a fixed channel is specified
to avoid modifying other channels unintentionally.

Fix parameter configuration when a fixed DMA channel is requested on
i.MX9 AON domain and i.MX8QM/QXP/DXL platforms. When a client requests
a fixed channel (e.g., channel 6), the driver traverses channels 0-5
and may unintentionally modify their configuration if they are unused.

This leads to issues such as setting the `is_multi_fifo` flag unexpectedly,
causing memcpy tests to fail when using the dmatest tool.

Only affect edma memcpy test when the channel is fixed.

Fixes: 72f5801a4e ("dmaengine: fsl-edma: integrate v3 support")
Signed-off-by: Joy Zou <joy.zou@nxp.com>
Cc: stable@vger.kernel.org
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20250917-b4-edma-chanconf-v1-1-886486e02e91@nxp.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2026-02-25 14:55:10 +05:30
345 changed files with 4783 additions and 1895 deletions

View File

@@ -316,6 +316,7 @@ Hans Verkuil <hverkuil@kernel.org> <hverkuil-cisco@xs4all.nl>
Hans Verkuil <hverkuil@kernel.org> <hansverk@cisco.com>
Hao Ge <hao.ge@linux.dev> <gehao@kylinos.cn>
Harry Yoo <harry.yoo@oracle.com> <42.hyeyoo@gmail.com>
Harry Yoo <harry@kernel.org> <harry.yoo@oracle.com>
Heiko Carstens <hca@linux.ibm.com> <h.carstens@de.ibm.com>
Heiko Carstens <hca@linux.ibm.com> <heiko.carstens@de.ibm.com>
Heiko Stuebner <heiko@sntech.de> <heiko.stuebner@bqreaders.com>

View File

@@ -85,6 +85,16 @@ In the example, 'Requester ID' means the ID of the device that sent
the error message to the Root Port. Please refer to PCIe specs for other
fields.
The 'TLP Header' is the prefix/header of the TLP that caused the error
in raw hex format. To decode the TLP Header into human-readable form
one may use tlp-tool:
https://github.com/mmpg-x86/tlp-tool
Example usage::
curl -L https://git.kernel.org/linus/2ca1c94ce0b6 | rtlp-tool --aer
AER Ratelimits
--------------

View File

@@ -66,7 +66,7 @@ then:
required:
- refresh-rate-hz
additionalProperties: false
unevaluatedProperties: false
examples:
- |

View File

@@ -42,7 +42,7 @@ properties:
- const: mgbe
- const: mac
- const: mac-divider
- const: ptp-ref
- const: ptp_ref
- const: rx-input-m
- const: rx-input
- const: tx
@@ -133,7 +133,7 @@ examples:
<&bpmp TEGRA234_CLK_MGBE0_RX_PCS_M>,
<&bpmp TEGRA234_CLK_MGBE0_RX_PCS>,
<&bpmp TEGRA234_CLK_MGBE0_TX_PCS>;
clock-names = "mgbe", "mac", "mac-divider", "ptp-ref", "rx-input-m",
clock-names = "mgbe", "mac", "mac-divider", "ptp_ref", "rx-input-m",
"rx-input", "tx", "eee-pcs", "rx-pcs-input", "rx-pcs-m",
"rx-pcs", "tx-pcs";
resets = <&bpmp TEGRA234_RESET_MGBE0_MAC>,

View File

@@ -33,6 +33,7 @@ properties:
- const: rockchip,rk3066-spdif
- items:
- enum:
- rockchip,rk3576-spdif
- rockchip,rk3588-spdif
- const: rockchip,rk3568-spdif

View File

@@ -164,7 +164,7 @@ allOf:
properties:
compatible:
contains:
const: st,stm32mph7-sai
const: st,stm32h7-sai
then:
properties:
clocks:

View File

@@ -783,6 +783,56 @@ controlled by the "uuid" mount option, which supports these values:
mounted with "uuid=on".
Durability and copy up
----------------------
The fsync(2) system call ensures that the data and metadata of a file
are safely written to the backing storage, which is expected to
guarantee the existence of the information post system crash.
Without an fsync(2) call, there is no guarantee that the observed
data after a system crash will be either the old or the new data, but
in practice, the observed data after crash is often the old or new data
or a mix of both.
When an overlayfs file is modified for the first time, copy up will
create a copy of the lower file and its parent directories in the upper
layer. Since the Linux filesystem API does not enforce any particular
ordering on storing changes without explicit fsync(2) calls, in case
of a system crash, the upper file could end up with no data at all
(i.e. zeros), which would be an unusual outcome. To avoid this
experience, overlayfs calls fsync(2) on the upper file before completing
data copy up with rename(2) or link(2) to make the copy up "atomic".
By default, overlayfs does not explicitly call fsync(2) on copied up
directories or on metadata-only copy up, so it provides no guarantee to
persist the user's modification unless the user calls fsync(2).
The fsync during copy up only guarantees that if a copy up is observed
after a crash, the observed data is not zeroes or intermediate values
from the copy up staging area.
On traditional local filesystems with a single journal (e.g. ext4, xfs),
fsync on a file also persists the parent directory changes, because they
are usually modified in the same transaction, so metadata durability during
data copy up effectively comes for free. Overlayfs further limits risk by
disallowing network filesystems as upper layer.
Overlayfs can be tuned to prefer performance or durability when storing
to the underlying upper layer. This is controlled by the "fsync" mount
option, which supports these values:
- "auto": (default)
Call fsync(2) on upper file before completion of data copy up.
No explicit fsync(2) on directory or metadata-only copy up.
- "strict":
Call fsync(2) on upper file and directories before completion of any
copy up.
- "volatile": [*]
Prefer performance over durability (see `Volatile mount`_)
[*] The mount option "volatile" is an alias to "fsync=volatile".
Volatile mount
--------------

View File

@@ -27,10 +27,10 @@ for details.
Sysfs entries
-------------
The following attributes are supported. Current maxim attribute
The following attributes are supported. Current maximum attribute
is read-write, all other attributes are read-only.
in0_input Measured voltage in microvolts.
in0_input Measured voltage in millivolts.
curr1_input Measured current in microamperes.
curr1_max_alarm Overcurrent alarm in microamperes.
curr1_input Measured current in milliamperes.
curr1_max Overcurrent shutdown threshold in milliamperes.

View File

@@ -51,8 +51,9 @@ temp1_max Provides thermal control temperature of the CPU package
temp1_crit Provides shutdown temperature of the CPU package which
is also known as the maximum processor junction
temperature, Tjmax or Tprochot.
temp1_crit_hyst Provides the hysteresis value from Tcontrol to Tjmax of
the CPU package.
temp1_crit_hyst Provides the hysteresis temperature of the CPU
package. Returns Tcontrol, the temperature at which
the critical condition clears.
temp2_label "DTS"
temp2_input Provides current temperature of the CPU package scaled
@@ -62,8 +63,9 @@ temp2_max Provides thermal control temperature of the CPU package
temp2_crit Provides shutdown temperature of the CPU package which
is also known as the maximum processor junction
temperature, Tjmax or Tprochot.
temp2_crit_hyst Provides the hysteresis value from Tcontrol to Tjmax of
the CPU package.
temp2_crit_hyst Provides the hysteresis temperature of the CPU
package. Returns Tcontrol, the temperature at which
the critical condition clears.
temp3_label "Tcontrol"
temp3_input Provides current Tcontrol temperature of the CPU

View File

@@ -8,7 +8,7 @@ Landlock: unprivileged access control
=====================================
:Author: Mickaël Salaün
:Date: January 2026
:Date: March 2026
The goal of Landlock is to enable restriction of ambient rights (e.g. global
filesystem or network access) for a set of processes. Because Landlock
@@ -197,12 +197,27 @@ similar backwards compatibility check is needed for the restrict flags
.. code-block:: c
__u32 restrict_flags = LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON;
if (abi < 7) {
/* Clear logging flags unsupported before ABI 7. */
__u32 restrict_flags =
LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON |
LANDLOCK_RESTRICT_SELF_TSYNC;
switch (abi) {
case 1 ... 6:
/* Removes logging flags for ABI < 7 */
restrict_flags &= ~(LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF |
LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON |
LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF);
__attribute__((fallthrough));
case 7:
/*
* Removes multithreaded enforcement flag for ABI < 8
*
* WARNING: Without this flag, calling landlock_restrict_self(2) is
* only equivalent if the calling process is single-threaded. Below
* ABI v8 (and as of ABI v8, when not using this flag), a Landlock
* policy would only be enforced for the calling thread and its
* children (and not for all threads, including parents and siblings).
*/
restrict_flags &= ~LANDLOCK_RESTRICT_SELF_TSYNC;
}
The next step is to restrict the current thread from gaining more privileges

View File

@@ -8628,8 +8628,14 @@ F: drivers/gpu/drm/lima/
F: include/uapi/drm/lima_drm.h
DRM DRIVERS FOR LOONGSON
M: Jianmin Lv <lvjianmin@loongson.cn>
M: Qianhai Wu <wuqianhai@loongson.cn>
R: Huacai Chen <chenhuacai@kernel.org>
R: Mingcong Bai <jeffbai@aosc.io>
R: Xi Ruoyao <xry111@xry111.site>
R: Icenowy Zheng <zhengxingda@iscas.ac.cn>
L: dri-devel@lists.freedesktop.org
S: Orphan
S: Maintained
T: git https://gitlab.freedesktop.org/drm/misc/kernel.git
F: drivers/gpu/drm/loongson/
@@ -9613,7 +9619,12 @@ F: include/linux/ext2*
EXT4 FILE SYSTEM
M: "Theodore Ts'o" <tytso@mit.edu>
M: Andreas Dilger <adilger.kernel@dilger.ca>
R: Andreas Dilger <adilger.kernel@dilger.ca>
R: Baokun Li <libaokun@linux.alibaba.com>
R: Jan Kara <jack@suse.cz>
R: Ojaswin Mujoo <ojaswin@linux.ibm.com>
R: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
R: Zhang Yi <yi.zhang@huawei.com>
L: linux-ext4@vger.kernel.org
S: Maintained
W: http://ext4.wiki.kernel.org
@@ -12009,7 +12020,6 @@ I2C SUBSYSTEM
M: Wolfram Sang <wsa+renesas@sang-engineering.com>
L: linux-i2c@vger.kernel.org
S: Maintained
W: https://i2c.wiki.kernel.org/
Q: https://patchwork.ozlabs.org/project/linux-i2c/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git
F: Documentation/i2c/
@@ -12035,7 +12045,6 @@ I2C SUBSYSTEM HOST DRIVERS
M: Andi Shyti <andi.shyti@kernel.org>
L: linux-i2c@vger.kernel.org
S: Maintained
W: https://i2c.wiki.kernel.org/
Q: https://patchwork.ozlabs.org/project/linux-i2c/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux.git
F: Documentation/devicetree/bindings/i2c/
@@ -16877,7 +16886,7 @@ M: Lorenzo Stoakes <ljs@kernel.org>
R: Rik van Riel <riel@surriel.com>
R: Liam R. Howlett <Liam.Howlett@oracle.com>
R: Vlastimil Babka <vbabka@kernel.org>
R: Harry Yoo <harry.yoo@oracle.com>
R: Harry Yoo <harry@kernel.org>
R: Jann Horn <jannh@google.com>
L: linux-mm@kvack.org
S: Maintained
@@ -21066,8 +21075,7 @@ F: include/uapi/linux/atmppp.h
F: net/atm/pppoatm.c
PPP OVER ETHERNET
M: Michal Ostrowski <mostrows@earthlink.net>
S: Maintained
S: Orphan
F: drivers/net/ppp/pppoe.c
F: drivers/net/ppp/pppox.c
@@ -22121,7 +22129,7 @@ S: Supported
F: drivers/infiniband/sw/rdmavt
RDS - RELIABLE DATAGRAM SOCKETS
M: Allison Henderson <allison.henderson@oracle.com>
M: Allison Henderson <achender@kernel.org>
L: netdev@vger.kernel.org
L: linux-rdma@vger.kernel.org
L: rds-devel@oss.oracle.com (moderated for non-subscribers)
@@ -24343,7 +24351,7 @@ F: drivers/nvmem/layouts/sl28vpd.c
SLAB ALLOCATOR
M: Vlastimil Babka <vbabka@kernel.org>
M: Harry Yoo <harry.yoo@oracle.com>
M: Harry Yoo <harry@kernel.org>
M: Andrew Morton <akpm@linux-foundation.org>
R: Hao Li <hao.li@linux.dev>
R: Christoph Lameter <cl@gentwo.org>

View File

@@ -2,7 +2,7 @@
VERSION = 7
PATCHLEVEL = 0
SUBLEVEL = 0
EXTRAVERSION = -rc5
EXTRAVERSION = -rc6
NAME = Baby Opossum Posse
# *DOCUMENTATION*

View File

@@ -41,4 +41,40 @@
.cfi_endproc; \
SYM_END(name, SYM_T_NONE)
/*
* This is for the signal handler trampoline, which is used as the return
* address of the signal handlers in userspace instead of called normally.
* The long standing libgcc bug https://gcc.gnu.org/PR124050 requires a
* nop between .cfi_startproc and the actual address of the trampoline, so
* we cannot simply use SYM_FUNC_START.
*
* This wrapper also contains all the .cfi_* directives for recovering
* the content of the GPRs and the "return address" (where the rt_sigreturn
* syscall will jump to), assuming there is a struct rt_sigframe (where
* a struct sigcontext containing those information we need to recover) at
* $sp. The "DWARF for the LoongArch(TM) Architecture" manual states
* column 0 is for $zero, but it does not make too much sense to
* save/restore the hardware zero register. Repurpose this column here
* for the return address (here it's not the content of $ra we cannot use
* the default column 3).
*/
#define SYM_SIGFUNC_START(name) \
.cfi_startproc; \
.cfi_signal_frame; \
.cfi_def_cfa 3, RT_SIGFRAME_SC; \
.cfi_return_column 0; \
.cfi_offset 0, SC_PC; \
\
.irp num, 1, 2, 3, 4, 5, 6, 7, 8, \
9, 10, 11, 12, 13, 14, 15, 16, \
17, 18, 19, 20, 21, 22, 23, 24, \
25, 26, 27, 28, 29, 30, 31; \
.cfi_offset \num, SC_REGS + \num * SZREG; \
.endr; \
\
nop; \
SYM_START(name, SYM_L_GLOBAL, SYM_A_ALIGN)
#define SYM_SIGFUNC_END(name) SYM_FUNC_END(name)
#endif

View File

@@ -0,0 +1,9 @@
/* SPDX-License-Identifier: GPL-2.0+ */
#include <asm/siginfo.h>
#include <asm/ucontext.h>
struct rt_sigframe {
struct siginfo rs_info;
struct ucontext rs_uctx;
};

View File

@@ -16,6 +16,7 @@
#include <asm/ptrace.h>
#include <asm/processor.h>
#include <asm/ftrace.h>
#include <asm/sigframe.h>
#include <vdso/datapage.h>
static void __used output_ptreg_defines(void)
@@ -220,6 +221,7 @@ static void __used output_sc_defines(void)
COMMENT("Linux sigcontext offsets.");
OFFSET(SC_REGS, sigcontext, sc_regs);
OFFSET(SC_PC, sigcontext, sc_pc);
OFFSET(RT_SIGFRAME_SC, rt_sigframe, rs_uctx.uc_mcontext);
BLANK();
}

View File

@@ -42,16 +42,15 @@ static int __init init_cpu_fullname(void)
int cpu, ret;
char *cpuname;
const char *model;
struct device_node *root;
/* Parsing cpuname from DTS model property */
root = of_find_node_by_path("/");
ret = of_property_read_string(root, "model", &model);
ret = of_property_read_string(of_root, "model", &model);
if (ret == 0) {
cpuname = kstrdup(model, GFP_KERNEL);
if (!cpuname)
return -ENOMEM;
loongson_sysconf.cpuname = strsep(&cpuname, " ");
}
of_node_put(root);
if (loongson_sysconf.cpuname && !strncmp(loongson_sysconf.cpuname, "Loongson", 8)) {
for (cpu = 0; cpu < NR_CPUS; cpu++)

View File

@@ -35,6 +35,7 @@
#include <asm/cpu-features.h>
#include <asm/fpu.h>
#include <asm/lbt.h>
#include <asm/sigframe.h>
#include <asm/ucontext.h>
#include <asm/vdso.h>
@@ -51,11 +52,6 @@
#define lock_lbt_owner() ({ preempt_disable(); pagefault_disable(); })
#define unlock_lbt_owner() ({ pagefault_enable(); preempt_enable(); })
struct rt_sigframe {
struct siginfo rs_info;
struct ucontext rs_uctx;
};
struct _ctx_layout {
struct sctx_info *addr;
unsigned int size;

View File

@@ -83,7 +83,7 @@ static inline void eiointc_update_sw_coremap(struct loongarch_eiointc *s,
if (!(s->status & BIT(EIOINTC_ENABLE_CPU_ENCODE))) {
cpuid = ffs(cpuid) - 1;
cpuid = (cpuid >= 4) ? 0 : cpuid;
cpuid = ((cpuid < 0) || (cpuid >= 4)) ? 0 : cpuid;
}
vcpu = kvm_get_vcpu_by_cpuid(s->kvm, cpuid);
@@ -472,34 +472,34 @@ static int kvm_eiointc_regs_access(struct kvm_device *dev,
switch (addr) {
case EIOINTC_NODETYPE_START ... EIOINTC_NODETYPE_END:
offset = (addr - EIOINTC_NODETYPE_START) / 4;
p = s->nodetype + offset * 4;
p = (void *)s->nodetype + offset * 4;
break;
case EIOINTC_IPMAP_START ... EIOINTC_IPMAP_END:
offset = (addr - EIOINTC_IPMAP_START) / 4;
p = &s->ipmap + offset * 4;
p = (void *)&s->ipmap + offset * 4;
break;
case EIOINTC_ENABLE_START ... EIOINTC_ENABLE_END:
offset = (addr - EIOINTC_ENABLE_START) / 4;
p = s->enable + offset * 4;
p = (void *)s->enable + offset * 4;
break;
case EIOINTC_BOUNCE_START ... EIOINTC_BOUNCE_END:
offset = (addr - EIOINTC_BOUNCE_START) / 4;
p = s->bounce + offset * 4;
p = (void *)s->bounce + offset * 4;
break;
case EIOINTC_ISR_START ... EIOINTC_ISR_END:
offset = (addr - EIOINTC_ISR_START) / 4;
p = s->isr + offset * 4;
p = (void *)s->isr + offset * 4;
break;
case EIOINTC_COREISR_START ... EIOINTC_COREISR_END:
if (cpu >= s->num_cpu)
return -EINVAL;
offset = (addr - EIOINTC_COREISR_START) / 4;
p = s->coreisr[cpu] + offset * 4;
p = (void *)s->coreisr[cpu] + offset * 4;
break;
case EIOINTC_COREMAP_START ... EIOINTC_COREMAP_END:
offset = (addr - EIOINTC_COREMAP_START) / 4;
p = s->coremap + offset * 4;
p = (void *)s->coremap + offset * 4;
break;
default:
kvm_err("%s: unknown eiointc register, addr = %d\n", __func__, addr);

View File

@@ -588,6 +588,9 @@ struct kvm_vcpu *kvm_get_vcpu_by_cpuid(struct kvm *kvm, int cpuid)
{
struct kvm_phyid_map *map;
if (cpuid < 0)
return NULL;
if (cpuid >= KVM_MAX_PHYID)
return NULL;

View File

@@ -5,9 +5,11 @@
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/acpi.h>
#include <linux/delay.h>
#include <linux/types.h>
#include <linux/pci.h>
#include <linux/vgaarb.h>
#include <linux/io-64-nonatomic-lo-hi.h>
#include <asm/cacheflush.h>
#include <asm/loongson.h>
@@ -15,6 +17,9 @@
#define PCI_DEVICE_ID_LOONGSON_DC1 0x7a06
#define PCI_DEVICE_ID_LOONGSON_DC2 0x7a36
#define PCI_DEVICE_ID_LOONGSON_DC3 0x7a46
#define PCI_DEVICE_ID_LOONGSON_GPU1 0x7a15
#define PCI_DEVICE_ID_LOONGSON_GPU2 0x7a25
#define PCI_DEVICE_ID_LOONGSON_GPU3 0x7a35
int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
int reg, int len, u32 *val)
@@ -99,3 +104,78 @@ static void pci_fixup_vgadev(struct pci_dev *pdev)
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_DC1, pci_fixup_vgadev);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_DC2, pci_fixup_vgadev);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_DC3, pci_fixup_vgadev);
#define CRTC_NUM_MAX 2
#define CRTC_OUTPUT_ENABLE 0x100
static void loongson_gpu_fixup_dma_hang(struct pci_dev *pdev, bool on)
{
u32 i, val, count, crtc_offset, device;
void __iomem *crtc_reg, *base, *regbase;
static u32 crtc_status[CRTC_NUM_MAX] = { 0 };
base = pdev->bus->ops->map_bus(pdev->bus, pdev->devfn + 1, 0);
device = readw(base + PCI_DEVICE_ID);
regbase = ioremap(readq(base + PCI_BASE_ADDRESS_0) & ~0xffull, SZ_64K);
if (!regbase) {
pci_err(pdev, "Failed to ioremap()\n");
return;
}
switch (device) {
case PCI_DEVICE_ID_LOONGSON_DC2:
crtc_reg = regbase + 0x1240;
crtc_offset = 0x10;
break;
case PCI_DEVICE_ID_LOONGSON_DC3:
crtc_reg = regbase;
crtc_offset = 0x400;
break;
}
for (i = 0; i < CRTC_NUM_MAX; i++, crtc_reg += crtc_offset) {
val = readl(crtc_reg);
if (!on)
crtc_status[i] = val;
/* No need to fixup if the status is off at startup. */
if (!(crtc_status[i] & CRTC_OUTPUT_ENABLE))
continue;
if (on)
val |= CRTC_OUTPUT_ENABLE;
else
val &= ~CRTC_OUTPUT_ENABLE;
mb();
writel(val, crtc_reg);
for (count = 0; count < 40; count++) {
val = readl(crtc_reg) & CRTC_OUTPUT_ENABLE;
if ((on && val) || (!on && !val))
break;
udelay(1000);
}
pci_info(pdev, "DMA hang fixup at reg[0x%lx]: 0x%x\n",
(unsigned long)crtc_reg & 0xffff, readl(crtc_reg));
}
iounmap(regbase);
}
static void pci_fixup_dma_hang_early(struct pci_dev *pdev)
{
loongson_gpu_fixup_dma_hang(pdev, false);
}
DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_GPU2, pci_fixup_dma_hang_early);
DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_GPU3, pci_fixup_dma_hang_early);
static void pci_fixup_dma_hang_final(struct pci_dev *pdev)
{
loongson_gpu_fixup_dma_hang(pdev, true);
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_GPU2, pci_fixup_dma_hang_final);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_GPU3, pci_fixup_dma_hang_final);

View File

@@ -26,7 +26,7 @@ cflags-vdso := $(ccflags-vdso) \
$(filter -W%,$(filter-out -Wa$(comma)%,$(KBUILD_CFLAGS))) \
-std=gnu11 -fms-extensions -O2 -g -fno-strict-aliasing -fno-common -fno-builtin \
-fno-stack-protector -fno-jump-tables -DDISABLE_BRANCH_PROFILING \
$(call cc-option, -fno-asynchronous-unwind-tables) \
$(call cc-option, -fasynchronous-unwind-tables) \
$(call cc-option, -fno-stack-protector)
aflags-vdso := $(ccflags-vdso) \
-D__ASSEMBLY__ -Wa,-gdwarf-2
@@ -41,7 +41,7 @@ endif
# VDSO linker flags.
ldflags-y := -Bsymbolic --no-undefined -soname=linux-vdso.so.1 \
$(filter -E%,$(KBUILD_CFLAGS)) -shared --build-id -T
$(filter -E%,$(KBUILD_CFLAGS)) -shared --build-id --eh-frame-hdr -T
#
# Shared build commands.

View File

@@ -12,13 +12,13 @@
#include <asm/regdef.h>
#include <asm/asm.h>
#include <asm/asm-offsets.h>
.section .text
.cfi_sections .debug_frame
SYM_FUNC_START(__vdso_rt_sigreturn)
SYM_SIGFUNC_START(__vdso_rt_sigreturn)
li.w a7, __NR_rt_sigreturn
syscall 0
SYM_FUNC_END(__vdso_rt_sigreturn)
SYM_SIGFUNC_END(__vdso_rt_sigreturn)

View File

@@ -62,8 +62,8 @@ do { \
* @size: number of elements in array
*/
#define array_index_mask_nospec array_index_mask_nospec
static inline unsigned long array_index_mask_nospec(unsigned long index,
unsigned long size)
static __always_inline unsigned long array_index_mask_nospec(unsigned long index,
unsigned long size)
{
unsigned long mask;

View File

@@ -271,6 +271,7 @@ SYM_CODE_START(system_call)
xgr %r9,%r9
xgr %r10,%r10
xgr %r11,%r11
xgr %r12,%r12
la %r2,STACK_FRAME_OVERHEAD(%r15) # pointer to pt_regs
mvc __PT_R8(64,%r2),__LC_SAVE_AREA(%r13)
MBEAR %r2,%r13
@@ -407,6 +408,7 @@ SYM_CODE_START(\name)
xgr %r6,%r6
xgr %r7,%r7
xgr %r10,%r10
xgr %r12,%r12
xc __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)
mvc __PT_R8(64,%r11),__LC_SAVE_AREA(%r13)
MBEAR %r11,%r13
@@ -496,6 +498,7 @@ SYM_CODE_START(mcck_int_handler)
xgr %r6,%r6
xgr %r7,%r7
xgr %r10,%r10
xgr %r12,%r12
stmg %r8,%r9,__PT_PSW(%r11)
xc __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)
xc __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)

View File

@@ -13,6 +13,7 @@
*/
#include <linux/cpufeature.h>
#include <linux/nospec.h>
#include <linux/errno.h>
#include <linux/sched.h>
#include <linux/mm.h>
@@ -131,8 +132,10 @@ void noinstr __do_syscall(struct pt_regs *regs, int per_trap)
if (unlikely(test_and_clear_pt_regs_flag(regs, PIF_SYSCALL_RET_SET)))
goto out;
regs->gprs[2] = -ENOSYS;
if (likely(nr < NR_syscalls))
if (likely(nr < NR_syscalls)) {
nr = array_index_nospec(nr, NR_syscalls);
regs->gprs[2] = sys_call_table[nr](regs);
}
out:
syscall_exit_to_user_mode(regs);
}

View File

@@ -134,32 +134,6 @@ int dat_set_asce_limit(struct kvm_s390_mmu_cache *mc, union asce *asce, int newt
return 0;
}
/**
* dat_crstep_xchg() - Exchange a gmap CRSTE with another.
* @crstep: Pointer to the CRST entry
* @new: Replacement entry.
* @gfn: The affected guest address.
* @asce: The ASCE of the address space.
*
* Context: This function is assumed to be called with kvm->mmu_lock held.
*/
void dat_crstep_xchg(union crste *crstep, union crste new, gfn_t gfn, union asce asce)
{
if (crstep->h.i) {
WRITE_ONCE(*crstep, new);
return;
} else if (cpu_has_edat2()) {
crdte_crste(crstep, *crstep, new, gfn, asce);
return;
}
if (machine_has_tlb_guest())
idte_crste(crstep, gfn, IDTE_GUEST_ASCE, asce, IDTE_GLOBAL);
else
idte_crste(crstep, gfn, 0, NULL_ASCE, IDTE_GLOBAL);
WRITE_ONCE(*crstep, new);
}
/**
* dat_crstep_xchg_atomic() - Atomically exchange a gmap CRSTE with another.
* @crstep: Pointer to the CRST entry.
@@ -175,8 +149,8 @@ void dat_crstep_xchg(union crste *crstep, union crste new, gfn_t gfn, union asce
*
* Return: %true if the exchange was successful.
*/
bool dat_crstep_xchg_atomic(union crste *crstep, union crste old, union crste new, gfn_t gfn,
union asce asce)
bool __must_check dat_crstep_xchg_atomic(union crste *crstep, union crste old, union crste new,
gfn_t gfn, union asce asce)
{
if (old.h.i)
return arch_try_cmpxchg((long *)crstep, &old.val, new.val);
@@ -292,6 +266,7 @@ static int dat_split_ste(struct kvm_s390_mmu_cache *mc, union pmd *pmdp, gfn_t g
pt->ptes[i].val = init.val | i * PAGE_SIZE;
/* No need to take locks as the page table is not installed yet. */
pgste_init.prefix_notif = old.s.fc1.prefix_notif;
pgste_init.vsie_notif = old.s.fc1.vsie_notif;
pgste_init.pcl = uses_skeys && init.h.i;
dat_init_pgstes(pt, pgste_init.val);
} else {
@@ -893,7 +868,8 @@ static long _dat_slot_crste(union crste *crstep, gfn_t gfn, gfn_t next, struct d
/* This table entry needs to be updated. */
if (walk->start <= gfn && walk->end >= next) {
dat_crstep_xchg_atomic(crstep, crste, new_crste, gfn, walk->asce);
if (!dat_crstep_xchg_atomic(crstep, crste, new_crste, gfn, walk->asce))
return -EINVAL;
/* A lower level table was present, needs to be freed. */
if (!crste.h.fc && !crste.h.i) {
if (is_pmd(crste))
@@ -1021,67 +997,21 @@ bool dat_test_age_gfn(union asce asce, gfn_t start, gfn_t end)
return _dat_walk_gfn_range(start, end, asce, &test_age_ops, 0, NULL) > 0;
}
int dat_link(struct kvm_s390_mmu_cache *mc, union asce asce, int level,
bool uses_skeys, struct guest_fault *f)
{
union crste oldval, newval;
union pte newpte, oldpte;
union pgste pgste;
int rc = 0;
rc = dat_entry_walk(mc, f->gfn, asce, DAT_WALK_ALLOC_CONTINUE, level, &f->crstep, &f->ptep);
if (rc == -EINVAL || rc == -ENOMEM)
return rc;
if (rc)
return -EAGAIN;
if (WARN_ON_ONCE(unlikely(get_level(f->crstep, f->ptep) > level)))
return -EINVAL;
if (f->ptep) {
pgste = pgste_get_lock(f->ptep);
oldpte = *f->ptep;
newpte = _pte(f->pfn, f->writable, f->write_attempt | oldpte.s.d, !f->page);
newpte.s.sd = oldpte.s.sd;
oldpte.s.sd = 0;
if (oldpte.val == _PTE_EMPTY.val || oldpte.h.pfra == f->pfn) {
pgste = __dat_ptep_xchg(f->ptep, pgste, newpte, f->gfn, asce, uses_skeys);
if (f->callback)
f->callback(f);
} else {
rc = -EAGAIN;
}
pgste_set_unlock(f->ptep, pgste);
} else {
oldval = READ_ONCE(*f->crstep);
newval = _crste_fc1(f->pfn, oldval.h.tt, f->writable,
f->write_attempt | oldval.s.fc1.d);
newval.s.fc1.sd = oldval.s.fc1.sd;
if (oldval.val != _CRSTE_EMPTY(oldval.h.tt).val &&
crste_origin_large(oldval) != crste_origin_large(newval))
return -EAGAIN;
if (!dat_crstep_xchg_atomic(f->crstep, oldval, newval, f->gfn, asce))
return -EAGAIN;
if (f->callback)
f->callback(f);
}
return rc;
}
static long dat_set_pn_crste(union crste *crstep, gfn_t gfn, gfn_t next, struct dat_walk *walk)
{
union crste crste = READ_ONCE(*crstep);
union crste newcrste, oldcrste;
int *n = walk->priv;
if (!crste.h.fc || crste.h.i || crste.h.p)
return 0;
do {
oldcrste = READ_ONCE(*crstep);
if (!oldcrste.h.fc || oldcrste.h.i || oldcrste.h.p)
return 0;
if (oldcrste.s.fc1.prefix_notif)
break;
newcrste = oldcrste;
newcrste.s.fc1.prefix_notif = 1;
} while (!dat_crstep_xchg_atomic(crstep, oldcrste, newcrste, gfn, walk->asce));
*n = 2;
if (crste.s.fc1.prefix_notif)
return 0;
crste.s.fc1.prefix_notif = 1;
dat_crstep_xchg(crstep, crste, gfn, walk->asce);
return 0;
}

View File

@@ -160,14 +160,14 @@ union pmd {
unsigned long :44; /* HW */
unsigned long : 3; /* Unused */
unsigned long : 1; /* HW */
unsigned long s : 1; /* Special */
unsigned long w : 1; /* Writable soft-bit */
unsigned long r : 1; /* Readable soft-bit */
unsigned long d : 1; /* Dirty */
unsigned long y : 1; /* Young */
unsigned long prefix_notif : 1; /* Guest prefix invalidation notification */
unsigned long : 3; /* HW */
unsigned long prefix_notif : 1; /* Guest prefix invalidation notification */
unsigned long vsie_notif : 1; /* Referenced in a shadow table */
unsigned long : 1; /* Unused */
unsigned long : 4; /* HW */
unsigned long sd : 1; /* Soft-Dirty */
unsigned long pr : 1; /* Present */
@@ -183,14 +183,14 @@ union pud {
unsigned long :33; /* HW */
unsigned long :14; /* Unused */
unsigned long : 1; /* HW */
unsigned long s : 1; /* Special */
unsigned long w : 1; /* Writable soft-bit */
unsigned long r : 1; /* Readable soft-bit */
unsigned long d : 1; /* Dirty */
unsigned long y : 1; /* Young */
unsigned long prefix_notif : 1; /* Guest prefix invalidation notification */
unsigned long : 3; /* HW */
unsigned long prefix_notif : 1; /* Guest prefix invalidation notification */
unsigned long vsie_notif : 1; /* Referenced in a shadow table */
unsigned long : 1; /* Unused */
unsigned long : 4; /* HW */
unsigned long sd : 1; /* Soft-Dirty */
unsigned long pr : 1; /* Present */
@@ -254,14 +254,14 @@ union crste {
struct {
unsigned long :47;
unsigned long : 1; /* HW (should be 0) */
unsigned long s : 1; /* Special */
unsigned long w : 1; /* Writable */
unsigned long r : 1; /* Readable */
unsigned long d : 1; /* Dirty */
unsigned long y : 1; /* Young */
unsigned long prefix_notif : 1; /* Guest prefix invalidation notification */
unsigned long : 3; /* HW */
unsigned long prefix_notif : 1; /* Guest prefix invalidation notification */
unsigned long vsie_notif : 1; /* Referenced in a shadow table */
unsigned long : 1;
unsigned long : 4; /* HW */
unsigned long sd : 1; /* Soft-Dirty */
unsigned long pr : 1; /* Present */
@@ -540,8 +540,6 @@ int dat_set_slot(struct kvm_s390_mmu_cache *mc, union asce asce, gfn_t start, gf
u16 type, u16 param);
int dat_set_prefix_notif_bit(union asce asce, gfn_t gfn);
bool dat_test_age_gfn(union asce asce, gfn_t start, gfn_t end);
int dat_link(struct kvm_s390_mmu_cache *mc, union asce asce, int level,
bool uses_skeys, struct guest_fault *f);
int dat_perform_essa(union asce asce, gfn_t gfn, int orc, union essa_state *state, bool *dirty);
long dat_reset_cmma(union asce asce, gfn_t start_gfn);
@@ -938,11 +936,14 @@ static inline bool dat_pudp_xchg_atomic(union pud *pudp, union pud old, union pu
return dat_crstep_xchg_atomic(_CRSTEP(pudp), _CRSTE(old), _CRSTE(new), gfn, asce);
}
static inline void dat_crstep_clear(union crste *crstep, gfn_t gfn, union asce asce)
static inline union crste dat_crstep_clear_atomic(union crste *crstep, gfn_t gfn, union asce asce)
{
union crste newcrste = _CRSTE_EMPTY(crstep->h.tt);
union crste oldcrste, empty = _CRSTE_EMPTY(crstep->h.tt);
dat_crstep_xchg(crstep, newcrste, gfn, asce);
do {
oldcrste = READ_ONCE(*crstep);
} while (!dat_crstep_xchg_atomic(crstep, oldcrste, empty, gfn, asce));
return oldcrste;
}
static inline int get_level(union crste *crstep, union pte *ptep)

View File

@@ -1436,13 +1436,21 @@ static int _do_shadow_pte(struct gmap *sg, gpa_t raddr, union pte *ptep_h, union
if (!pgste_get_trylock(ptep_h, &pgste))
return -EAGAIN;
newpte = _pte(f->pfn, f->writable, !p, 0);
newpte.s.d |= ptep->s.d;
newpte.s.sd |= ptep->s.sd;
newpte.h.p &= ptep->h.p;
pgste = _gmap_ptep_xchg(sg->parent, ptep_h, newpte, pgste, f->gfn, false);
pgste.vsie_notif = 1;
newpte = _pte(f->pfn, f->writable, !p, ptep_h->s.s);
newpte.s.d |= ptep_h->s.d;
newpte.s.sd |= ptep_h->s.sd;
newpte.h.p &= ptep_h->h.p;
if (!newpte.h.p && !f->writable) {
rc = -EOPNOTSUPP;
} else {
pgste = _gmap_ptep_xchg(sg->parent, ptep_h, newpte, pgste, f->gfn, false);
pgste.vsie_notif = 1;
}
pgste_set_unlock(ptep_h, pgste);
if (rc)
return rc;
if (!sg->parent)
return -EAGAIN;
newpte = _pte(f->pfn, 0, !p, 0);
if (!pgste_get_trylock(ptep, &pgste))
@@ -1456,7 +1464,7 @@ static int _do_shadow_pte(struct gmap *sg, gpa_t raddr, union pte *ptep_h, union
static int _do_shadow_crste(struct gmap *sg, gpa_t raddr, union crste *host, union crste *table,
struct guest_fault *f, bool p)
{
union crste newcrste;
union crste newcrste, oldcrste;
gfn_t gfn;
int rc;
@@ -1469,16 +1477,28 @@ static int _do_shadow_crste(struct gmap *sg, gpa_t raddr, union crste *host, uni
if (rc)
return rc;
newcrste = _crste_fc1(f->pfn, host->h.tt, f->writable, !p);
newcrste.s.fc1.d |= host->s.fc1.d;
newcrste.s.fc1.sd |= host->s.fc1.sd;
newcrste.h.p &= host->h.p;
newcrste.s.fc1.vsie_notif = 1;
newcrste.s.fc1.prefix_notif = host->s.fc1.prefix_notif;
_gmap_crstep_xchg(sg->parent, host, newcrste, f->gfn, false);
do {
/* _gmap_crstep_xchg_atomic() could have unshadowed this shadow gmap */
if (!sg->parent)
return -EAGAIN;
oldcrste = READ_ONCE(*host);
newcrste = _crste_fc1(f->pfn, oldcrste.h.tt, f->writable, !p);
newcrste.s.fc1.d |= oldcrste.s.fc1.d;
newcrste.s.fc1.sd |= oldcrste.s.fc1.sd;
newcrste.h.p &= oldcrste.h.p;
newcrste.s.fc1.vsie_notif = 1;
newcrste.s.fc1.prefix_notif = oldcrste.s.fc1.prefix_notif;
newcrste.s.fc1.s = oldcrste.s.fc1.s;
if (!newcrste.h.p && !f->writable)
return -EOPNOTSUPP;
} while (!_gmap_crstep_xchg_atomic(sg->parent, host, oldcrste, newcrste, f->gfn, false));
if (!sg->parent)
return -EAGAIN;
newcrste = _crste_fc1(f->pfn, host->h.tt, 0, !p);
dat_crstep_xchg(table, newcrste, gpa_to_gfn(raddr), sg->asce);
newcrste = _crste_fc1(f->pfn, oldcrste.h.tt, 0, !p);
gfn = gpa_to_gfn(raddr);
while (!dat_crstep_xchg_atomic(table, READ_ONCE(*table), newcrste, gfn, sg->asce))
;
return 0;
}
@@ -1502,21 +1522,31 @@ static int _gaccess_do_shadow(struct kvm_s390_mmu_cache *mc, struct gmap *sg,
if (rc)
return rc;
/* A race occourred. The shadow mapping is already valid, nothing to do */
if ((ptep && !ptep->h.i) || (!ptep && crste_leaf(*table)))
/* A race occurred. The shadow mapping is already valid, nothing to do */
if ((ptep && !ptep->h.i && ptep->h.p == w->p) ||
(!ptep && crste_leaf(*table) && !table->h.i && table->h.p == w->p))
return 0;
gl = get_level(table, ptep);
/* In case of a real address space */
if (w->level <= LEVEL_MEM) {
l = TABLE_TYPE_PAGE_TABLE;
hl = TABLE_TYPE_REGION1;
goto real_address_space;
}
/*
* Skip levels that are already protected. For each level, protect
* only the page containing the entry, not the whole table.
*/
for (i = gl ; i >= w->level; i--) {
rc = gmap_protect_rmap(mc, sg, entries[i - 1].gfn, gpa_to_gfn(saddr),
entries[i - 1].pfn, i, entries[i - 1].writable);
rc = gmap_protect_rmap(mc, sg, entries[i].gfn, gpa_to_gfn(saddr),
entries[i].pfn, i + 1, entries[i].writable);
if (rc)
return rc;
if (!sg->parent)
return -EAGAIN;
}
rc = dat_entry_walk(NULL, entries[LEVEL_MEM].gfn, sg->parent->asce, DAT_WALK_LEAF,
@@ -1528,6 +1558,7 @@ static int _gaccess_do_shadow(struct kvm_s390_mmu_cache *mc, struct gmap *sg,
/* Get the smallest granularity */
l = min3(gl, hl, w->level);
real_address_space:
flags = DAT_WALK_SPLIT_ALLOC | (uses_skeys(sg->parent) ? DAT_WALK_USES_SKEYS : 0);
/* If necessary, create the shadow mapping */
if (l < gl) {

View File

@@ -313,13 +313,16 @@ static long gmap_clear_young_crste(union crste *crstep, gfn_t gfn, gfn_t end, st
struct clear_young_pte_priv *priv = walk->priv;
union crste crste, new;
crste = READ_ONCE(*crstep);
do {
crste = READ_ONCE(*crstep);
if (!crste.h.fc)
return 0;
if (!crste.s.fc1.y && crste.h.i)
return 0;
if (crste_prefix(crste) && !gmap_mkold_prefix(priv->gmap, gfn, end))
break;
if (!crste.h.fc)
return 0;
if (!crste.s.fc1.y && crste.h.i)
return 0;
if (!crste_prefix(crste) || gmap_mkold_prefix(priv->gmap, gfn, end)) {
new = crste;
new.h.i = 1;
new.s.fc1.y = 0;
@@ -328,8 +331,8 @@ static long gmap_clear_young_crste(union crste *crstep, gfn_t gfn, gfn_t end, st
folio_set_dirty(phys_to_folio(crste_origin_large(crste)));
new.s.fc1.d = 0;
new.h.p = 1;
dat_crstep_xchg(crstep, new, gfn, walk->asce);
}
} while (!dat_crstep_xchg_atomic(crstep, crste, new, gfn, walk->asce));
priv->young = 1;
return 0;
}
@@ -391,14 +394,18 @@ static long _gmap_unmap_crste(union crste *crstep, gfn_t gfn, gfn_t next, struct
{
struct gmap_unmap_priv *priv = walk->priv;
struct folio *folio = NULL;
union crste old = *crstep;
if (crstep->h.fc) {
if (crstep->s.fc1.pr && test_bit(GMAP_FLAG_EXPORT_ON_UNMAP, &priv->gmap->flags))
folio = phys_to_folio(crste_origin_large(*crstep));
gmap_crstep_xchg(priv->gmap, crstep, _CRSTE_EMPTY(crstep->h.tt), gfn);
if (folio)
uv_convert_from_secure_folio(folio);
}
if (!old.h.fc)
return 0;
if (old.s.fc1.pr && test_bit(GMAP_FLAG_EXPORT_ON_UNMAP, &priv->gmap->flags))
folio = phys_to_folio(crste_origin_large(old));
/* No races should happen because kvm->mmu_lock is held in write mode */
KVM_BUG_ON(!gmap_crstep_xchg_atomic(priv->gmap, crstep, old, _CRSTE_EMPTY(old.h.tt), gfn),
priv->gmap->kvm);
if (folio)
uv_convert_from_secure_folio(folio);
return 0;
}
@@ -474,23 +481,24 @@ static long _crste_test_and_clear_softdirty(union crste *table, gfn_t gfn, gfn_t
if (fatal_signal_pending(current))
return 1;
crste = READ_ONCE(*table);
if (!crste.h.fc)
return 0;
if (crste.h.p && !crste.s.fc1.sd)
return 0;
do {
crste = READ_ONCE(*table);
if (!crste.h.fc)
return 0;
if (crste.h.p && !crste.s.fc1.sd)
return 0;
/*
* If this large page contains one or more prefixes of vCPUs that are
* currently running, do not reset the protection, leave it marked as
* dirty.
*/
if (!crste.s.fc1.prefix_notif || gmap_mkold_prefix(gmap, gfn, end)) {
/*
* If this large page contains one or more prefixes of vCPUs that are
* currently running, do not reset the protection, leave it marked as
* dirty.
*/
if (crste.s.fc1.prefix_notif && !gmap_mkold_prefix(gmap, gfn, end))
break;
new = crste;
new.h.p = 1;
new.s.fc1.sd = 0;
gmap_crstep_xchg(gmap, table, new, gfn);
}
} while (!gmap_crstep_xchg_atomic(gmap, table, crste, new, gfn));
for ( ; gfn < end; gfn++)
mark_page_dirty(gmap->kvm, gfn);
@@ -511,7 +519,7 @@ void gmap_sync_dirty_log(struct gmap *gmap, gfn_t start, gfn_t end)
_dat_walk_gfn_range(start, end, gmap->asce, &walk_ops, 0, gmap);
}
static int gmap_handle_minor_crste_fault(union asce asce, struct guest_fault *f)
static int gmap_handle_minor_crste_fault(struct gmap *gmap, struct guest_fault *f)
{
union crste newcrste, oldcrste = READ_ONCE(*f->crstep);
@@ -536,10 +544,8 @@ static int gmap_handle_minor_crste_fault(union asce asce, struct guest_fault *f)
newcrste.s.fc1.d = 1;
newcrste.s.fc1.sd = 1;
}
if (!oldcrste.s.fc1.d && newcrste.s.fc1.d)
SetPageDirty(phys_to_page(crste_origin_large(newcrste)));
/* In case of races, let the slow path deal with it. */
return !dat_crstep_xchg_atomic(f->crstep, oldcrste, newcrste, f->gfn, asce);
return !gmap_crstep_xchg_atomic(gmap, f->crstep, oldcrste, newcrste, f->gfn);
}
/* Trying to write on a read-only page, let the slow path deal with it. */
return 1;
@@ -568,8 +574,6 @@ static int _gmap_handle_minor_pte_fault(struct gmap *gmap, union pgste *pgste,
newpte.s.d = 1;
newpte.s.sd = 1;
}
if (!oldpte.s.d && newpte.s.d)
SetPageDirty(pfn_to_page(newpte.h.pfra));
*pgste = gmap_ptep_xchg(gmap, f->ptep, newpte, *pgste, f->gfn);
return 0;
@@ -606,7 +610,7 @@ int gmap_try_fixup_minor(struct gmap *gmap, struct guest_fault *fault)
fault->callback(fault);
pgste_set_unlock(fault->ptep, pgste);
} else {
rc = gmap_handle_minor_crste_fault(gmap->asce, fault);
rc = gmap_handle_minor_crste_fault(gmap, fault);
if (!rc && fault->callback)
fault->callback(fault);
}
@@ -623,10 +627,61 @@ static inline bool gmap_1m_allowed(struct gmap *gmap, gfn_t gfn)
return test_bit(GMAP_FLAG_ALLOW_HPAGE_1M, &gmap->flags);
}
static int _gmap_link(struct kvm_s390_mmu_cache *mc, struct gmap *gmap, int level,
struct guest_fault *f)
{
union crste oldval, newval;
union pte newpte, oldpte;
union pgste pgste;
int rc = 0;
rc = dat_entry_walk(mc, f->gfn, gmap->asce, DAT_WALK_ALLOC_CONTINUE, level,
&f->crstep, &f->ptep);
if (rc == -ENOMEM)
return rc;
if (KVM_BUG_ON(rc == -EINVAL, gmap->kvm))
return rc;
if (rc)
return -EAGAIN;
if (KVM_BUG_ON(get_level(f->crstep, f->ptep) > level, gmap->kvm))
return -EINVAL;
if (f->ptep) {
pgste = pgste_get_lock(f->ptep);
oldpte = *f->ptep;
newpte = _pte(f->pfn, f->writable, f->write_attempt | oldpte.s.d, !f->page);
newpte.s.sd = oldpte.s.sd;
oldpte.s.sd = 0;
if (oldpte.val == _PTE_EMPTY.val || oldpte.h.pfra == f->pfn) {
pgste = gmap_ptep_xchg(gmap, f->ptep, newpte, pgste, f->gfn);
if (f->callback)
f->callback(f);
} else {
rc = -EAGAIN;
}
pgste_set_unlock(f->ptep, pgste);
} else {
do {
oldval = READ_ONCE(*f->crstep);
newval = _crste_fc1(f->pfn, oldval.h.tt, f->writable,
f->write_attempt | oldval.s.fc1.d);
newval.s.fc1.s = !f->page;
newval.s.fc1.sd = oldval.s.fc1.sd;
if (oldval.val != _CRSTE_EMPTY(oldval.h.tt).val &&
crste_origin_large(oldval) != crste_origin_large(newval))
return -EAGAIN;
} while (!gmap_crstep_xchg_atomic(gmap, f->crstep, oldval, newval, f->gfn));
if (f->callback)
f->callback(f);
}
return rc;
}
int gmap_link(struct kvm_s390_mmu_cache *mc, struct gmap *gmap, struct guest_fault *f)
{
unsigned int order;
int rc, level;
int level;
lockdep_assert_held(&gmap->kvm->mmu_lock);
@@ -638,16 +693,14 @@ int gmap_link(struct kvm_s390_mmu_cache *mc, struct gmap *gmap, struct guest_fau
else if (order >= get_order(_SEGMENT_SIZE) && gmap_1m_allowed(gmap, f->gfn))
level = TABLE_TYPE_SEGMENT;
}
rc = dat_link(mc, gmap->asce, level, uses_skeys(gmap), f);
KVM_BUG_ON(rc == -EINVAL, gmap->kvm);
return rc;
return _gmap_link(mc, gmap, level, f);
}
static int gmap_ucas_map_one(struct kvm_s390_mmu_cache *mc, struct gmap *gmap,
gfn_t p_gfn, gfn_t c_gfn, bool force_alloc)
{
union crste newcrste, oldcrste;
struct page_table *pt;
union crste newcrste;
union crste *crstep;
union pte *ptep;
int rc;
@@ -673,7 +726,11 @@ static int gmap_ucas_map_one(struct kvm_s390_mmu_cache *mc, struct gmap *gmap,
&crstep, &ptep);
if (rc)
return rc;
dat_crstep_xchg(crstep, newcrste, c_gfn, gmap->asce);
do {
oldcrste = READ_ONCE(*crstep);
if (oldcrste.val == newcrste.val)
break;
} while (!dat_crstep_xchg_atomic(crstep, oldcrste, newcrste, c_gfn, gmap->asce));
return 0;
}
@@ -777,8 +834,10 @@ static void gmap_ucas_unmap_one(struct gmap *gmap, gfn_t c_gfn)
int rc;
rc = dat_entry_walk(NULL, c_gfn, gmap->asce, 0, TABLE_TYPE_SEGMENT, &crstep, &ptep);
if (!rc)
dat_crstep_xchg(crstep, _PMD_EMPTY, c_gfn, gmap->asce);
if (rc)
return;
while (!dat_crstep_xchg_atomic(crstep, READ_ONCE(*crstep), _PMD_EMPTY, c_gfn, gmap->asce))
;
}
void gmap_ucas_unmap(struct gmap *gmap, gfn_t c_gfn, unsigned long count)
@@ -1017,8 +1076,8 @@ static void gmap_unshadow_level(struct gmap *sg, gfn_t r_gfn, int level)
dat_ptep_xchg(ptep, _PTE_EMPTY, r_gfn, sg->asce, uses_skeys(sg));
return;
}
crste = READ_ONCE(*crstep);
dat_crstep_clear(crstep, r_gfn, sg->asce);
crste = dat_crstep_clear_atomic(crstep, r_gfn, sg->asce);
if (crste_leaf(crste) || crste.h.i)
return;
if (is_pmd(crste))
@@ -1101,6 +1160,7 @@ struct gmap_protect_asce_top_level {
static inline int __gmap_protect_asce_top_level(struct kvm_s390_mmu_cache *mc, struct gmap *sg,
struct gmap_protect_asce_top_level *context)
{
struct gmap *parent;
int rc, i;
guard(write_lock)(&sg->kvm->mmu_lock);
@@ -1108,7 +1168,12 @@ static inline int __gmap_protect_asce_top_level(struct kvm_s390_mmu_cache *mc, s
if (kvm_s390_array_needs_retry_safe(sg->kvm, context->seq, context->f))
return -EAGAIN;
scoped_guard(spinlock, &sg->parent->children_lock) {
parent = READ_ONCE(sg->parent);
if (!parent)
return -EAGAIN;
scoped_guard(spinlock, &parent->children_lock) {
if (READ_ONCE(sg->parent) != parent)
return -EAGAIN;
for (i = 0; i < CRST_TABLE_PAGES; i++) {
if (!context->f[i].valid)
continue;
@@ -1191,6 +1256,9 @@ struct gmap *gmap_create_shadow(struct kvm_s390_mmu_cache *mc, struct gmap *pare
struct gmap *sg, *new;
int rc;
if (WARN_ON(!parent))
return ERR_PTR(-EINVAL);
scoped_guard(spinlock, &parent->children_lock) {
sg = gmap_find_shadow(parent, asce, edat_level);
if (sg) {

View File

@@ -185,6 +185,8 @@ static inline union pgste _gmap_ptep_xchg(struct gmap *gmap, union pte *ptep, un
else
_gmap_handle_vsie_unshadow_event(gmap, gfn);
}
if (!ptep->s.d && newpte.s.d && !newpte.s.s)
SetPageDirty(pfn_to_page(newpte.h.pfra));
return __dat_ptep_xchg(ptep, pgste, newpte, gfn, gmap->asce, uses_skeys(gmap));
}
@@ -194,35 +196,42 @@ static inline union pgste gmap_ptep_xchg(struct gmap *gmap, union pte *ptep, uni
return _gmap_ptep_xchg(gmap, ptep, newpte, pgste, gfn, true);
}
static inline void _gmap_crstep_xchg(struct gmap *gmap, union crste *crstep, union crste ne,
gfn_t gfn, bool needs_lock)
static inline bool __must_check _gmap_crstep_xchg_atomic(struct gmap *gmap, union crste *crstep,
union crste oldcrste, union crste newcrste,
gfn_t gfn, bool needs_lock)
{
unsigned long align = 8 + (is_pmd(*crstep) ? 0 : 11);
unsigned long align = is_pmd(newcrste) ? _PAGE_ENTRIES : _PAGE_ENTRIES * _CRST_ENTRIES;
if (KVM_BUG_ON(crstep->h.tt != oldcrste.h.tt || newcrste.h.tt != oldcrste.h.tt, gmap->kvm))
return true;
lockdep_assert_held(&gmap->kvm->mmu_lock);
if (!needs_lock)
lockdep_assert_held(&gmap->children_lock);
gfn = ALIGN_DOWN(gfn, align);
if (crste_prefix(*crstep) && (ne.h.p || ne.h.i || !crste_prefix(ne))) {
ne.s.fc1.prefix_notif = 0;
if (crste_prefix(oldcrste) && (newcrste.h.p || newcrste.h.i || !crste_prefix(newcrste))) {
newcrste.s.fc1.prefix_notif = 0;
gmap_unmap_prefix(gmap, gfn, gfn + align);
}
if (crste_leaf(*crstep) && crstep->s.fc1.vsie_notif &&
(ne.h.p || ne.h.i || !ne.s.fc1.vsie_notif)) {
ne.s.fc1.vsie_notif = 0;
if (crste_leaf(oldcrste) && oldcrste.s.fc1.vsie_notif &&
(newcrste.h.p || newcrste.h.i || !newcrste.s.fc1.vsie_notif)) {
newcrste.s.fc1.vsie_notif = 0;
if (needs_lock)
gmap_handle_vsie_unshadow_event(gmap, gfn);
else
_gmap_handle_vsie_unshadow_event(gmap, gfn);
}
dat_crstep_xchg(crstep, ne, gfn, gmap->asce);
if (!oldcrste.s.fc1.d && newcrste.s.fc1.d && !newcrste.s.fc1.s)
SetPageDirty(phys_to_page(crste_origin_large(newcrste)));
return dat_crstep_xchg_atomic(crstep, oldcrste, newcrste, gfn, gmap->asce);
}
static inline void gmap_crstep_xchg(struct gmap *gmap, union crste *crstep, union crste ne,
gfn_t gfn)
static inline bool __must_check gmap_crstep_xchg_atomic(struct gmap *gmap, union crste *crstep,
union crste oldcrste, union crste newcrste,
gfn_t gfn)
{
return _gmap_crstep_xchg(gmap, crstep, ne, gfn, true);
return _gmap_crstep_xchg_atomic(gmap, crstep, oldcrste, newcrste, gfn, true);
}
/**

View File

@@ -5520,9 +5520,21 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
}
#endif
case KVM_S390_VCPU_FAULT: {
idx = srcu_read_lock(&vcpu->kvm->srcu);
r = vcpu_dat_fault_handler(vcpu, arg, 0);
srcu_read_unlock(&vcpu->kvm->srcu, idx);
gpa_t gaddr = arg;
scoped_guard(srcu, &vcpu->kvm->srcu) {
r = vcpu_ucontrol_translate(vcpu, &gaddr);
if (r)
break;
r = kvm_s390_faultin_gfn_simple(vcpu, NULL, gpa_to_gfn(gaddr), false);
if (r == PGM_ADDRESSING)
r = -EFAULT;
if (r <= 0)
break;
r = -EIO;
KVM_BUG_ON(r, vcpu->kvm);
}
break;
}
case KVM_ENABLE_CAP:

View File

@@ -1328,7 +1328,7 @@ static void unregister_shadow_scb(struct kvm_vcpu *vcpu)
static int vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
{
struct kvm_s390_sie_block *scb_s = &vsie_page->scb_s;
struct gmap *sg;
struct gmap *sg = NULL;
int rc = 0;
while (1) {
@@ -1368,6 +1368,8 @@ static int vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
sg = gmap_put(sg);
cond_resched();
}
if (sg)
sg = gmap_put(sg);
if (rc == -EFAULT) {
/*

View File

@@ -121,6 +121,9 @@ noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
WARN_ON(!irqs_disabled());
if (!sev_cfg.ghcbs_initialized)
return boot_ghcb;
data = this_cpu_read(runtime_data);
ghcb = &data->ghcb_page;
@@ -164,6 +167,9 @@ noinstr void __sev_put_ghcb(struct ghcb_state *state)
WARN_ON(!irqs_disabled());
if (!sev_cfg.ghcbs_initialized)
return;
data = this_cpu_read(runtime_data);
ghcb = &data->ghcb_page;

View File

@@ -177,6 +177,16 @@ static noinstr void fred_extint(struct pt_regs *regs)
}
}
#ifdef CONFIG_AMD_MEM_ENCRYPT
noinstr void exc_vmm_communication(struct pt_regs *regs, unsigned long error_code)
{
if (user_mode(regs))
return user_exc_vmm_communication(regs, error_code);
else
return kernel_exc_vmm_communication(regs, error_code);
}
#endif
static noinstr void fred_hwexc(struct pt_regs *regs, unsigned long error_code)
{
/* Optimize for #PF. That's the only exception which matters performance wise */
@@ -207,6 +217,10 @@ static noinstr void fred_hwexc(struct pt_regs *regs, unsigned long error_code)
#ifdef CONFIG_X86_CET
case X86_TRAP_CP: return exc_control_protection(regs, error_code);
#endif
#ifdef CONFIG_AMD_MEM_ENCRYPT
case X86_TRAP_VC: return exc_vmm_communication(regs, error_code);
#endif
default: return fred_bad_type(regs, error_code);
}

View File

@@ -433,7 +433,20 @@ static __always_inline void setup_lass(struct cpuinfo_x86 *c)
/* These bits should not change their value after CPU init is finished. */
static const unsigned long cr4_pinned_mask = X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
X86_CR4_FSGSBASE | X86_CR4_CET | X86_CR4_FRED;
X86_CR4_FSGSBASE | X86_CR4_CET;
/*
* The CR pinning protects against ROP on the 'mov %reg, %CRn' instruction(s).
* Since you can ROP directly to these instructions (barring shadow stack),
* any protection must follow immediately and unconditionally after that.
*
* Specifically, the CR[04] write functions below will have the value
* validation controlled by the @cr_pinning static_branch which is
* __ro_after_init, just like the cr4_pinned_bits value.
*
* Once set, an attacker will have to defeat page-tables to get around these
* restrictions. Which is a much bigger ask than 'simple' ROP.
*/
static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
static unsigned long cr4_pinned_bits __ro_after_init;
@@ -2050,12 +2063,6 @@ static void identify_cpu(struct cpuinfo_x86 *c)
setup_umip(c);
setup_lass(c);
/* Enable FSGSBASE instructions if available. */
if (cpu_has(c, X86_FEATURE_FSGSBASE)) {
cr4_set_bits(X86_CR4_FSGSBASE);
elf_hwcap2 |= HWCAP2_FSGSBASE;
}
/*
* The vendor-specific functions might have changed features.
* Now we do "generic changes."
@@ -2416,6 +2423,18 @@ void cpu_init_exception_handling(bool boot_cpu)
/* GHCB needs to be setup to handle #VC. */
setup_ghcb();
/*
* On CPUs with FSGSBASE support, paranoid_entry() uses
* ALTERNATIVE-patched RDGSBASE/WRGSBASE instructions. Secondary CPUs
* boot after alternatives are patched globally, so early exceptions
* execute patched code that depends on FSGSBASE. Enable the feature
* before any exceptions occur.
*/
if (cpu_feature_enabled(X86_FEATURE_FSGSBASE)) {
cr4_set_bits(X86_CR4_FSGSBASE);
elf_hwcap2 |= HWCAP2_FSGSBASE;
}
if (cpu_feature_enabled(X86_FEATURE_FRED)) {
/* The boot CPU has enabled FRED during early boot */
if (!boot_cpu)

View File

@@ -3044,12 +3044,6 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
bool prefetch = !fault || fault->prefetch;
bool write_fault = fault && fault->write;
if (unlikely(is_noslot_pfn(pfn))) {
vcpu->stat.pf_mmio_spte_created++;
mark_mmio_spte(vcpu, sptep, gfn, pte_access);
return RET_PF_EMULATE;
}
if (is_shadow_present_pte(*sptep)) {
if (prefetch && is_last_spte(*sptep, level) &&
pfn == spte_to_pfn(*sptep))
@@ -3066,13 +3060,22 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
child = spte_to_child_sp(pte);
drop_parent_pte(vcpu->kvm, child, sptep);
flush = true;
} else if (WARN_ON_ONCE(pfn != spte_to_pfn(*sptep))) {
} else if (pfn != spte_to_pfn(*sptep)) {
WARN_ON_ONCE(vcpu->arch.mmu->root_role.direct);
drop_spte(vcpu->kvm, sptep);
flush = true;
} else
was_rmapped = 1;
}
if (unlikely(is_noslot_pfn(pfn))) {
vcpu->stat.pf_mmio_spte_created++;
mark_mmio_spte(vcpu, sptep, gfn, pte_access);
if (flush)
kvm_flush_remote_tlbs_gfn(vcpu->kvm, gfn, level);
return RET_PF_EMULATE;
}
wrprot = make_spte(vcpu, sp, slot, pte_access, gfn, pfn, *sptep, prefetch,
false, host_writable, &spte);

View File

@@ -424,7 +424,7 @@ void __init efi_unmap_boot_services(void)
if (efi_enabled(EFI_DBG))
return;
sz = sizeof(*ranges_to_free) * efi.memmap.nr_map + 1;
sz = sizeof(*ranges_to_free) * (efi.memmap.nr_map + 1);
ranges_to_free = kzalloc(sz, GFP_KERNEL);
if (!ranges_to_free) {
pr_err("Failed to allocate storage for freeable EFI regions\n");

View File

@@ -35,6 +35,7 @@
#define IVPU_HW_IP_60XX 60
#define IVPU_HW_IP_REV_LNL_B0 4
#define IVPU_HW_IP_REV_NVL_A0 0
#define IVPU_HW_BTRS_MTL 1
#define IVPU_HW_BTRS_LNL 2

View File

@@ -70,8 +70,10 @@ static void wa_init(struct ivpu_device *vdev)
if (ivpu_hw_btrs_gen(vdev) == IVPU_HW_BTRS_MTL)
vdev->wa.interrupt_clear_with_0 = ivpu_hw_btrs_irqs_clear_with_0_mtl(vdev);
if (ivpu_device_id(vdev) == PCI_DEVICE_ID_LNL &&
ivpu_revision(vdev) < IVPU_HW_IP_REV_LNL_B0)
if ((ivpu_device_id(vdev) == PCI_DEVICE_ID_LNL &&
ivpu_revision(vdev) < IVPU_HW_IP_REV_LNL_B0) ||
(ivpu_device_id(vdev) == PCI_DEVICE_ID_NVL &&
ivpu_revision(vdev) == IVPU_HW_IP_REV_NVL_A0))
vdev->wa.disable_clock_relinquish = true;
if (ivpu_test_mode & IVPU_TEST_MODE_CLK_RELINQ_ENABLE)

View File

@@ -1656,6 +1656,8 @@ static int acpi_ec_setup(struct acpi_ec *ec, struct acpi_device *device, bool ca
ret = ec_install_handlers(ec, device, call_reg);
if (ret) {
ec_remove_handlers(ec);
if (ec == first_ec)
first_ec = NULL;

View File

@@ -99,8 +99,13 @@ static int lcd2s_print(struct charlcd *lcd, int c)
{
struct lcd2s_data *lcd2s = lcd->drvdata;
u8 buf[2] = { LCD2S_CMD_WRITE, c };
int ret;
lcd2s_i2c_master_send(lcd2s->i2c, buf, sizeof(buf));
ret = lcd2s_i2c_master_send(lcd2s->i2c, buf, sizeof(buf));
if (ret < 0)
return ret;
if (ret != sizeof(buf))
return -EIO;
return 0;
}
@@ -108,9 +113,13 @@ static int lcd2s_gotoxy(struct charlcd *lcd, unsigned int x, unsigned int y)
{
struct lcd2s_data *lcd2s = lcd->drvdata;
u8 buf[3] = { LCD2S_CMD_CUR_POS, y + 1, x + 1 };
int ret;
lcd2s_i2c_master_send(lcd2s->i2c, buf, sizeof(buf));
ret = lcd2s_i2c_master_send(lcd2s->i2c, buf, sizeof(buf));
if (ret < 0)
return ret;
if (ret != sizeof(buf))
return -EIO;
return 0;
}

View File

@@ -365,7 +365,7 @@ static DEFINE_IDA(linedisp_id);
static void linedisp_release(struct device *dev)
{
struct linedisp *linedisp = to_linedisp(dev);
struct linedisp *linedisp = container_of(dev, struct linedisp, dev);
kfree(linedisp->map);
kfree(linedisp->message);

View File

@@ -1545,6 +1545,7 @@ static int _regmap_select_page(struct regmap *map, unsigned int *reg,
unsigned int val_num)
{
void *orig_work_buf;
unsigned int selector_reg;
unsigned int win_offset;
unsigned int win_page;
bool page_chg;
@@ -1563,10 +1564,31 @@ static int _regmap_select_page(struct regmap *map, unsigned int *reg,
return -EINVAL;
}
/* It is possible to have selector register inside data window.
In that case, selector register is located on every page and
it needs no page switching, when accessed alone. */
/*
* Calculate the address of the selector register in the corresponding
* data window if it is located on every page.
*/
page_chg = in_range(range->selector_reg, range->window_start, range->window_len);
if (page_chg)
selector_reg = range->range_min + win_page * range->window_len +
range->selector_reg - range->window_start;
/*
* It is possible to have selector register inside data window.
* In that case, selector register is located on every page and it
* needs no page switching, when accessed alone.
*
* Nevertheless we should synchronize the cache values for it.
* This can't be properly achieved if the selector register is
* the first and the only one to be read inside the data window.
* That's why we update it in that case as well.
*
* However, we specifically avoid updating it for the default page,
* when it's overlapped with the real data window, to prevent from
* infinite looping.
*/
if (val_num > 1 ||
(page_chg && selector_reg != range->selector_reg) ||
range->window_start + win_offset != range->selector_reg) {
/* Use separate work_buf during page switching */
orig_work_buf = map->work_buf;
@@ -1575,7 +1597,7 @@ static int _regmap_select_page(struct regmap *map, unsigned int *reg,
ret = _regmap_update_bits(map, range->selector_reg,
range->selector_mask,
win_page << range->selector_shift,
&page_chg, false);
NULL, false);
map->work_buf = orig_work_buf;

View File

@@ -109,9 +109,6 @@ static int h4_recv(struct hci_uart *hu, const void *data, int count)
{
struct h4_struct *h4 = hu->priv;
if (!test_bit(HCI_UART_REGISTERED, &hu->flags))
return -EUNATCH;
h4->rx_skb = h4_recv_buf(hu, h4->rx_skb, data, count,
h4_recv_pkts, ARRAY_SIZE(h4_recv_pkts));
if (IS_ERR(h4->rx_skb)) {

View File

@@ -1427,12 +1427,9 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
* If there is a problem with its frequency table, take it
* offline and drop it.
*/
if (policy->freq_table_sorted != CPUFREQ_TABLE_SORTED_ASCENDING &&
policy->freq_table_sorted != CPUFREQ_TABLE_SORTED_DESCENDING) {
ret = cpufreq_table_validate_and_sort(policy);
if (ret)
goto out_offline_policy;
}
ret = cpufreq_table_validate_and_sort(policy);
if (ret)
goto out_offline_policy;
/* related_cpus should at least include policy->cpus. */
cpumask_copy(policy->related_cpus, policy->cpus);

View File

@@ -313,6 +313,17 @@ static void cs_start(struct cpufreq_policy *policy)
dbs_info->requested_freq = policy->cur;
}
static void cs_limits(struct cpufreq_policy *policy)
{
struct cs_policy_dbs_info *dbs_info = to_dbs_info(policy->governor_data);
/*
* The limits have changed, so may have the current frequency. Reset
* requested_freq to avoid any unintended outcomes due to the mismatch.
*/
dbs_info->requested_freq = policy->cur;
}
static struct dbs_governor cs_governor = {
.gov = CPUFREQ_DBS_GOVERNOR_INITIALIZER("conservative"),
.kobj_type = { .default_groups = cs_groups },
@@ -322,6 +333,7 @@ static struct dbs_governor cs_governor = {
.init = cs_init,
.exit = cs_exit,
.start = cs_start,
.limits = cs_limits,
};
#define CPU_FREQ_GOV_CONSERVATIVE (cs_governor.gov)

View File

@@ -563,6 +563,7 @@ EXPORT_SYMBOL_GPL(cpufreq_dbs_governor_stop);
void cpufreq_dbs_governor_limits(struct cpufreq_policy *policy)
{
struct dbs_governor *gov = dbs_governor_of(policy);
struct policy_dbs_info *policy_dbs;
/* Protect gov->gdbs_data against cpufreq_dbs_governor_exit() */
@@ -574,6 +575,8 @@ void cpufreq_dbs_governor_limits(struct cpufreq_policy *policy)
mutex_lock(&policy_dbs->update_mutex);
cpufreq_policy_apply_limits(policy);
gov_update_sample_delay(policy_dbs, 0);
if (gov->limits)
gov->limits(policy);
mutex_unlock(&policy_dbs->update_mutex);
out:

View File

@@ -138,6 +138,7 @@ struct dbs_governor {
int (*init)(struct dbs_data *dbs_data);
void (*exit)(struct dbs_data *dbs_data);
void (*start)(struct cpufreq_policy *policy);
void (*limits)(struct cpufreq_policy *policy);
};
static inline struct dbs_governor *dbs_governor_of(struct cpufreq_policy *policy)

View File

@@ -360,6 +360,10 @@ int cpufreq_table_validate_and_sort(struct cpufreq_policy *policy)
if (policy_has_boost_freq(policy))
policy->boost_supported = true;
if (policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_ASCENDING ||
policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_DESCENDING)
return 0;
return set_freq_table_sorted(policy);
}

View File

@@ -844,6 +844,7 @@ static int dw_edma_irq_request(struct dw_edma *dw,
{
struct dw_edma_chip *chip = dw->chip;
struct device *dev = dw->chip->dev;
struct msi_desc *msi_desc;
u32 wr_mask = 1;
u32 rd_mask = 1;
int i, err = 0;
@@ -895,9 +896,12 @@ static int dw_edma_irq_request(struct dw_edma *dw,
&dw->irq[i]);
if (err)
goto err_irq_free;
if (irq_get_msi_desc(irq))
msi_desc = irq_get_msi_desc(irq);
if (msi_desc) {
get_cached_msi_msg(irq, &dw->irq[i].msi);
if (!msi_desc->pci.msi_attrib.is_msix)
dw->irq[i].msi.data = dw->irq[0].msi.data + i;
}
}
dw->nr_irqs = i;

View File

@@ -252,10 +252,10 @@ static void dw_hdma_v0_core_start(struct dw_edma_chunk *chunk, bool first)
lower_32_bits(chunk->ll_region.paddr));
SET_CH_32(dw, chan->dir, chan->id, llp.msb,
upper_32_bits(chunk->ll_region.paddr));
/* Set consumer cycle */
SET_CH_32(dw, chan->dir, chan->id, cycle_sync,
HDMA_V0_CONSUMER_CYCLE_STAT | HDMA_V0_CONSUMER_CYCLE_BIT);
}
/* Set consumer cycle */
SET_CH_32(dw, chan->dir, chan->id, cycle_sync,
HDMA_V0_CONSUMER_CYCLE_STAT | HDMA_V0_CONSUMER_CYCLE_BIT);
dw_hdma_v0_sync_ll_data(chunk);

View File

@@ -317,10 +317,8 @@ static struct dma_chan *fsl_edma3_xlate(struct of_phandle_args *dma_spec,
return NULL;
i = fsl_chan - fsl_edma->chans;
fsl_chan->priority = dma_spec->args[1];
fsl_chan->is_rxchan = dma_spec->args[2] & FSL_EDMA_RX;
fsl_chan->is_remote = dma_spec->args[2] & FSL_EDMA_REMOTE;
fsl_chan->is_multi_fifo = dma_spec->args[2] & FSL_EDMA_MULTI_FIFO;
if (!b_chmux && i != dma_spec->args[0])
continue;
if ((dma_spec->args[2] & FSL_EDMA_EVEN_CH) && (i & 0x1))
continue;
@@ -328,17 +326,15 @@ static struct dma_chan *fsl_edma3_xlate(struct of_phandle_args *dma_spec,
if ((dma_spec->args[2] & FSL_EDMA_ODD_CH) && !(i & 0x1))
continue;
if (!b_chmux && i == dma_spec->args[0]) {
chan = dma_get_slave_channel(chan);
chan->device->privatecnt++;
return chan;
} else if (b_chmux && !fsl_chan->srcid) {
/* if controller support channel mux, choose a free channel */
chan = dma_get_slave_channel(chan);
chan->device->privatecnt++;
fsl_chan->srcid = dma_spec->args[0];
return chan;
}
fsl_chan->srcid = dma_spec->args[0];
fsl_chan->priority = dma_spec->args[1];
fsl_chan->is_rxchan = dma_spec->args[2] & FSL_EDMA_RX;
fsl_chan->is_remote = dma_spec->args[2] & FSL_EDMA_REMOTE;
fsl_chan->is_multi_fifo = dma_spec->args[2] & FSL_EDMA_MULTI_FIFO;
chan = dma_get_slave_channel(chan);
chan->device->privatecnt++;
return chan;
}
return NULL;
}

View File

@@ -158,11 +158,7 @@ static const struct device_type idxd_cdev_file_type = {
static void idxd_cdev_dev_release(struct device *dev)
{
struct idxd_cdev *idxd_cdev = dev_to_cdev(dev);
struct idxd_cdev_context *cdev_ctx;
struct idxd_wq *wq = idxd_cdev->wq;
cdev_ctx = &ictx[wq->idxd->data->type];
ida_free(&cdev_ctx->minor_ida, idxd_cdev->minor);
kfree(idxd_cdev);
}
@@ -582,11 +578,15 @@ int idxd_wq_add_cdev(struct idxd_wq *wq)
void idxd_wq_del_cdev(struct idxd_wq *wq)
{
struct idxd_cdev_context *cdev_ctx;
struct idxd_cdev *idxd_cdev;
idxd_cdev = wq->idxd_cdev;
wq->idxd_cdev = NULL;
cdev_device_del(&idxd_cdev->cdev, cdev_dev(idxd_cdev));
cdev_ctx = &ictx[wq->idxd->data->type];
ida_free(&cdev_ctx->minor_ida, idxd_cdev->minor);
put_device(cdev_dev(idxd_cdev));
}

View File

@@ -175,6 +175,7 @@ void idxd_wq_free_resources(struct idxd_wq *wq)
free_descs(wq);
dma_free_coherent(dev, wq->compls_size, wq->compls, wq->compls_addr);
sbitmap_queue_free(&wq->sbq);
wq->type = IDXD_WQT_NONE;
}
EXPORT_SYMBOL_NS_GPL(idxd_wq_free_resources, "IDXD");
@@ -382,7 +383,6 @@ static void idxd_wq_disable_cleanup(struct idxd_wq *wq)
lockdep_assert_held(&wq->wq_lock);
wq->state = IDXD_WQ_DISABLED;
memset(wq->wqcfg, 0, idxd->wqcfg_size);
wq->type = IDXD_WQT_NONE;
wq->threshold = 0;
wq->priority = 0;
wq->enqcmds_retries = IDXD_ENQCMDS_RETRIES;
@@ -831,8 +831,7 @@ static void idxd_device_evl_free(struct idxd_device *idxd)
struct device *dev = &idxd->pdev->dev;
struct idxd_evl *evl = idxd->evl;
gencfg.bits = ioread32(idxd->reg_base + IDXD_GENCFG_OFFSET);
if (!gencfg.evl_en)
if (!evl)
return;
mutex_lock(&evl->lock);
@@ -1125,7 +1124,11 @@ int idxd_device_config(struct idxd_device *idxd)
{
int rc;
lockdep_assert_held(&idxd->dev_lock);
guard(spinlock)(&idxd->dev_lock);
if (!test_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags))
return 0;
rc = idxd_wqs_setup(idxd);
if (rc < 0)
return rc;
@@ -1332,6 +1335,11 @@ void idxd_wq_free_irq(struct idxd_wq *wq)
free_irq(ie->vector, ie);
idxd_flush_pending_descs(ie);
/* The interrupt might have been already released by FLR */
if (ie->int_handle == INVALID_INT_HANDLE)
return;
if (idxd->request_int_handles)
idxd_device_release_int_handle(idxd, ie->int_handle, IDXD_IRQ_MSIX);
idxd_device_clear_perm_entry(idxd, ie);
@@ -1340,6 +1348,23 @@ void idxd_wq_free_irq(struct idxd_wq *wq)
ie->pasid = IOMMU_PASID_INVALID;
}
void idxd_wq_flush_descs(struct idxd_wq *wq)
{
struct idxd_irq_entry *ie = &wq->ie;
struct idxd_device *idxd = wq->idxd;
guard(mutex)(&wq->wq_lock);
if (wq->state != IDXD_WQ_ENABLED || wq->type != IDXD_WQT_KERNEL)
return;
idxd_flush_pending_descs(ie);
if (idxd->request_int_handles)
idxd_device_release_int_handle(idxd, ie->int_handle, IDXD_IRQ_MSIX);
idxd_device_clear_perm_entry(idxd, ie);
ie->int_handle = INVALID_INT_HANDLE;
}
int idxd_wq_request_irq(struct idxd_wq *wq)
{
struct idxd_device *idxd = wq->idxd;
@@ -1454,11 +1479,7 @@ int idxd_drv_enable_wq(struct idxd_wq *wq)
}
}
rc = 0;
spin_lock(&idxd->dev_lock);
if (test_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags))
rc = idxd_device_config(idxd);
spin_unlock(&idxd->dev_lock);
rc = idxd_device_config(idxd);
if (rc < 0) {
dev_dbg(dev, "Writing wq %d config failed: %d\n", wq->id, rc);
goto err;
@@ -1533,7 +1554,6 @@ void idxd_drv_disable_wq(struct idxd_wq *wq)
idxd_wq_reset(wq);
idxd_wq_free_resources(wq);
percpu_ref_exit(&wq->wq_active);
wq->type = IDXD_WQT_NONE;
wq->client_count = 0;
}
EXPORT_SYMBOL_NS_GPL(idxd_drv_disable_wq, "IDXD");
@@ -1554,10 +1574,7 @@ int idxd_device_drv_probe(struct idxd_dev *idxd_dev)
}
/* Device configuration */
spin_lock(&idxd->dev_lock);
if (test_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags))
rc = idxd_device_config(idxd);
spin_unlock(&idxd->dev_lock);
rc = idxd_device_config(idxd);
if (rc < 0)
return -ENXIO;

View File

@@ -194,6 +194,22 @@ static void idxd_dma_release(struct dma_device *device)
kfree(idxd_dma);
}
static int idxd_dma_terminate_all(struct dma_chan *c)
{
struct idxd_wq *wq = to_idxd_wq(c);
idxd_wq_flush_descs(wq);
return 0;
}
static void idxd_dma_synchronize(struct dma_chan *c)
{
struct idxd_wq *wq = to_idxd_wq(c);
idxd_wq_drain(wq);
}
int idxd_register_dma_device(struct idxd_device *idxd)
{
struct idxd_dma_dev *idxd_dma;
@@ -224,6 +240,8 @@ int idxd_register_dma_device(struct idxd_device *idxd)
dma->device_issue_pending = idxd_dma_issue_pending;
dma->device_alloc_chan_resources = idxd_dma_alloc_chan_resources;
dma->device_free_chan_resources = idxd_dma_free_chan_resources;
dma->device_terminate_all = idxd_dma_terminate_all;
dma->device_synchronize = idxd_dma_synchronize;
rc = dma_async_device_register(dma);
if (rc < 0) {

View File

@@ -803,6 +803,7 @@ void idxd_wq_quiesce(struct idxd_wq *wq);
int idxd_wq_init_percpu_ref(struct idxd_wq *wq);
void idxd_wq_free_irq(struct idxd_wq *wq);
int idxd_wq_request_irq(struct idxd_wq *wq);
void idxd_wq_flush_descs(struct idxd_wq *wq);
/* submission */
int idxd_submit_desc(struct idxd_wq *wq, struct idxd_desc *desc);

View File

@@ -973,7 +973,8 @@ static void idxd_device_config_restore(struct idxd_device *idxd,
idxd->rdbuf_limit = idxd_saved->saved_idxd.rdbuf_limit;
idxd->evl->size = saved_evl->size;
if (idxd->evl)
idxd->evl->size = saved_evl->size;
for (i = 0; i < idxd->max_groups; i++) {
struct idxd_group *saved_group, *group;
@@ -1104,12 +1105,10 @@ static void idxd_reset_done(struct pci_dev *pdev)
idxd_device_config_restore(idxd, idxd->idxd_saved);
/* Re-configure IDXD device if allowed. */
if (test_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags)) {
rc = idxd_device_config(idxd);
if (rc < 0) {
dev_err(dev, "HALT: %s config fails\n", idxd_name);
goto out;
}
rc = idxd_device_config(idxd);
if (rc < 0) {
dev_err(dev, "HALT: %s config fails\n", idxd_name);
goto out;
}
/* Bind IDXD device to driver. */
@@ -1147,6 +1146,7 @@ static void idxd_reset_done(struct pci_dev *pdev)
}
out:
kfree(idxd->idxd_saved);
idxd->idxd_saved = NULL;
}
static const struct pci_error_handlers idxd_error_handler = {

View File

@@ -397,6 +397,17 @@ static void idxd_device_flr(struct work_struct *work)
dev_err(&idxd->pdev->dev, "FLR failed\n");
}
static void idxd_wqs_flush_descs(struct idxd_device *idxd)
{
int i;
for (i = 0; i < idxd->max_wqs; i++) {
struct idxd_wq *wq = idxd->wqs[i];
idxd_wq_flush_descs(wq);
}
}
static irqreturn_t idxd_halt(struct idxd_device *idxd)
{
union gensts_reg gensts;
@@ -415,6 +426,11 @@ static irqreturn_t idxd_halt(struct idxd_device *idxd)
} else if (gensts.reset_type == IDXD_DEVICE_RESET_FLR) {
idxd->state = IDXD_DEV_HALTED;
idxd_mask_error_interrupts(idxd);
/* Flush all pending descriptors, and disable
* interrupts, they will be re-enabled when FLR
* concludes.
*/
idxd_wqs_flush_descs(idxd);
dev_dbg(&idxd->pdev->dev,
"idxd halted, doing FLR. After FLR, configs are restored\n");
INIT_WORK(&idxd->work, idxd_device_flr);

View File

@@ -138,7 +138,7 @@ static void llist_abort_desc(struct idxd_wq *wq, struct idxd_irq_entry *ie,
*/
list_for_each_entry_safe(d, t, &flist, list) {
list_del_init(&d->list);
idxd_dma_complete_txd(found, IDXD_COMPLETE_ABORT, true,
idxd_dma_complete_txd(d, IDXD_COMPLETE_ABORT, true,
NULL, NULL);
}
}

View File

@@ -1836,6 +1836,7 @@ static void idxd_conf_device_release(struct device *dev)
{
struct idxd_device *idxd = confdev_to_idxd(dev);
destroy_workqueue(idxd->wq);
kfree(idxd->groups);
bitmap_free(idxd->wq_enable_map);
kfree(idxd->wqs);

View File

@@ -10,6 +10,7 @@
*/
#include <linux/bitfield.h>
#include <linux/cleanup.h>
#include <linux/dma-mapping.h>
#include <linux/dmaengine.h>
#include <linux/interrupt.h>
@@ -296,13 +297,10 @@ static void rz_dmac_disable_hw(struct rz_dmac_chan *channel)
{
struct dma_chan *chan = &channel->vc.chan;
struct rz_dmac *dmac = to_rz_dmac(chan->device);
unsigned long flags;
dev_dbg(dmac->dev, "%s channel %d\n", __func__, channel->index);
local_irq_save(flags);
rz_dmac_ch_writel(channel, CHCTRL_DEFAULT, CHCTRL, 1);
local_irq_restore(flags);
}
static void rz_dmac_set_dmars_register(struct rz_dmac *dmac, int nr, u32 dmars)
@@ -447,6 +445,7 @@ static int rz_dmac_alloc_chan_resources(struct dma_chan *chan)
if (!desc)
break;
/* No need to lock. This is called only for the 1st client. */
list_add_tail(&desc->node, &channel->ld_free);
channel->descs_allocated++;
}
@@ -502,18 +501,21 @@ rz_dmac_prep_dma_memcpy(struct dma_chan *chan, dma_addr_t dest, dma_addr_t src,
dev_dbg(dmac->dev, "%s channel: %d src=0x%pad dst=0x%pad len=%zu\n",
__func__, channel->index, &src, &dest, len);
if (list_empty(&channel->ld_free))
return NULL;
scoped_guard(spinlock_irqsave, &channel->vc.lock) {
if (list_empty(&channel->ld_free))
return NULL;
desc = list_first_entry(&channel->ld_free, struct rz_dmac_desc, node);
desc = list_first_entry(&channel->ld_free, struct rz_dmac_desc, node);
desc->type = RZ_DMAC_DESC_MEMCPY;
desc->src = src;
desc->dest = dest;
desc->len = len;
desc->direction = DMA_MEM_TO_MEM;
desc->type = RZ_DMAC_DESC_MEMCPY;
desc->src = src;
desc->dest = dest;
desc->len = len;
desc->direction = DMA_MEM_TO_MEM;
list_move_tail(channel->ld_free.next, &channel->ld_queue);
}
list_move_tail(channel->ld_free.next, &channel->ld_queue);
return vchan_tx_prep(&channel->vc, &desc->vd, flags);
}
@@ -529,27 +531,29 @@ rz_dmac_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
int dma_length = 0;
int i = 0;
if (list_empty(&channel->ld_free))
return NULL;
scoped_guard(spinlock_irqsave, &channel->vc.lock) {
if (list_empty(&channel->ld_free))
return NULL;
desc = list_first_entry(&channel->ld_free, struct rz_dmac_desc, node);
desc = list_first_entry(&channel->ld_free, struct rz_dmac_desc, node);
for_each_sg(sgl, sg, sg_len, i) {
dma_length += sg_dma_len(sg);
for_each_sg(sgl, sg, sg_len, i)
dma_length += sg_dma_len(sg);
desc->type = RZ_DMAC_DESC_SLAVE_SG;
desc->sg = sgl;
desc->sgcount = sg_len;
desc->len = dma_length;
desc->direction = direction;
if (direction == DMA_DEV_TO_MEM)
desc->src = channel->src_per_address;
else
desc->dest = channel->dst_per_address;
list_move_tail(channel->ld_free.next, &channel->ld_queue);
}
desc->type = RZ_DMAC_DESC_SLAVE_SG;
desc->sg = sgl;
desc->sgcount = sg_len;
desc->len = dma_length;
desc->direction = direction;
if (direction == DMA_DEV_TO_MEM)
desc->src = channel->src_per_address;
else
desc->dest = channel->dst_per_address;
list_move_tail(channel->ld_free.next, &channel->ld_queue);
return vchan_tx_prep(&channel->vc, &desc->vd, flags);
}
@@ -561,8 +565,8 @@ static int rz_dmac_terminate_all(struct dma_chan *chan)
unsigned int i;
LIST_HEAD(head);
rz_dmac_disable_hw(channel);
spin_lock_irqsave(&channel->vc.lock, flags);
rz_dmac_disable_hw(channel);
for (i = 0; i < DMAC_NR_LMDESC; i++)
lmdesc[i].header = 0;
@@ -699,7 +703,9 @@ static void rz_dmac_irq_handle_channel(struct rz_dmac_chan *channel)
if (chstat & CHSTAT_ER) {
dev_err(dmac->dev, "DMAC err CHSTAT_%d = %08X\n",
channel->index, chstat);
rz_dmac_ch_writel(channel, CHCTRL_DEFAULT, CHCTRL, 1);
scoped_guard(spinlock_irqsave, &channel->vc.lock)
rz_dmac_ch_writel(channel, CHCTRL_DEFAULT, CHCTRL, 1);
goto done;
}

View File

@@ -1234,8 +1234,8 @@ static int xdma_probe(struct platform_device *pdev)
xdev->rmap = devm_regmap_init_mmio(&pdev->dev, reg_base,
&xdma_regmap_config);
if (!xdev->rmap) {
xdma_err(xdev, "config regmap failed: %d", ret);
if (IS_ERR(xdev->rmap)) {
xdma_err(xdev, "config regmap failed: %pe", xdev->rmap);
goto failed;
}
INIT_LIST_HEAD(&xdev->dma_dev.channels);

View File

@@ -997,16 +997,16 @@ static u32 xilinx_dma_get_residue(struct xilinx_dma_chan *chan,
struct xilinx_cdma_tx_segment,
node);
cdma_hw = &cdma_seg->hw;
residue += (cdma_hw->control - cdma_hw->status) &
chan->xdev->max_buffer_len;
residue += (cdma_hw->control & chan->xdev->max_buffer_len) -
(cdma_hw->status & chan->xdev->max_buffer_len);
} else if (chan->xdev->dma_config->dmatype ==
XDMA_TYPE_AXIDMA) {
axidma_seg = list_entry(entry,
struct xilinx_axidma_tx_segment,
node);
axidma_hw = &axidma_seg->hw;
residue += (axidma_hw->control - axidma_hw->status) &
chan->xdev->max_buffer_len;
residue += (axidma_hw->control & chan->xdev->max_buffer_len) -
(axidma_hw->status & chan->xdev->max_buffer_len);
} else {
aximcdma_seg =
list_entry(entry,
@@ -1014,8 +1014,8 @@ static u32 xilinx_dma_get_residue(struct xilinx_dma_chan *chan,
node);
aximcdma_hw = &aximcdma_seg->hw;
residue +=
(aximcdma_hw->control - aximcdma_hw->status) &
chan->xdev->max_buffer_len;
(aximcdma_hw->control & chan->xdev->max_buffer_len) -
(aximcdma_hw->status & chan->xdev->max_buffer_len);
}
}
@@ -1235,14 +1235,6 @@ static int xilinx_dma_alloc_chan_resources(struct dma_chan *dchan)
dma_cookie_init(dchan);
if (chan->xdev->dma_config->dmatype == XDMA_TYPE_AXIDMA) {
/* For AXI DMA resetting once channel will reset the
* other channel as well so enable the interrupts here.
*/
dma_ctrl_set(chan, XILINX_DMA_REG_DMACR,
XILINX_DMA_DMAXR_ALL_IRQ_MASK);
}
if ((chan->xdev->dma_config->dmatype == XDMA_TYPE_CDMA) && chan->has_sg)
dma_ctrl_set(chan, XILINX_DMA_REG_DMACR,
XILINX_CDMA_CR_SGMODE);
@@ -1564,8 +1556,29 @@ static void xilinx_dma_start_transfer(struct xilinx_dma_chan *chan)
if (chan->err)
return;
if (list_empty(&chan->pending_list))
if (list_empty(&chan->pending_list)) {
if (chan->cyclic) {
struct xilinx_dma_tx_descriptor *desc;
struct list_head *entry;
desc = list_last_entry(&chan->done_list,
struct xilinx_dma_tx_descriptor, node);
list_for_each(entry, &desc->segments) {
struct xilinx_axidma_tx_segment *axidma_seg;
struct xilinx_axidma_desc_hw *axidma_hw;
axidma_seg = list_entry(entry,
struct xilinx_axidma_tx_segment,
node);
axidma_hw = &axidma_seg->hw;
axidma_hw->status = 0;
}
list_splice_tail_init(&chan->done_list, &chan->active_list);
chan->desc_pendingcount = 0;
chan->idle = false;
}
return;
}
if (!chan->idle)
return;
@@ -1591,6 +1604,7 @@ static void xilinx_dma_start_transfer(struct xilinx_dma_chan *chan)
head_desc->async_tx.phys);
reg &= ~XILINX_DMA_CR_DELAY_MAX;
reg |= chan->irq_delay << XILINX_DMA_CR_DELAY_SHIFT;
reg |= XILINX_DMA_DMAXR_ALL_IRQ_MASK;
dma_ctrl_write(chan, XILINX_DMA_REG_DMACR, reg);
xilinx_dma_start(chan);
@@ -3024,7 +3038,7 @@ static int xilinx_dma_chan_probe(struct xilinx_dma_device *xdev,
return -EINVAL;
}
xdev->common.directions |= chan->direction;
xdev->common.directions |= BIT(chan->direction);
/* Request the interrupt */
chan->irq = of_irq_get(node, chan->tdest);

View File

@@ -692,9 +692,9 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev,
goto err_ib_sched;
}
/* Drop the initial kref_init count (see drm_sched_main as example) */
dma_fence_put(f);
ret = dma_fence_wait(f, false);
/* Drop the returned fence reference after the wait completes */
dma_fence_put(f);
err_ib_sched:
amdgpu_job_free(job);

View File

@@ -4207,7 +4207,8 @@ fail:
static int amdgpu_device_get_job_timeout_settings(struct amdgpu_device *adev)
{
char *input = amdgpu_lockup_timeout;
char buf[AMDGPU_MAX_TIMEOUT_PARAM_LENGTH];
char *input = buf;
char *timeout_setting = NULL;
int index = 0;
long timeout;
@@ -4217,9 +4218,17 @@ static int amdgpu_device_get_job_timeout_settings(struct amdgpu_device *adev)
adev->gfx_timeout = adev->compute_timeout = adev->sdma_timeout =
adev->video_timeout = msecs_to_jiffies(2000);
if (!strnlen(input, AMDGPU_MAX_TIMEOUT_PARAM_LENGTH))
if (!strnlen(amdgpu_lockup_timeout, AMDGPU_MAX_TIMEOUT_PARAM_LENGTH))
return 0;
/*
* strsep() destructively modifies its input by replacing delimiters
* with '\0'. Use a stack copy so the global module parameter buffer
* remains intact for multi-GPU systems where this function is called
* once per device.
*/
strscpy(buf, amdgpu_lockup_timeout, sizeof(buf));
while ((timeout_setting = strsep(&input, ",")) &&
strnlen(timeout_setting, AMDGPU_MAX_TIMEOUT_PARAM_LENGTH)) {
ret = kstrtol(timeout_setting, 0, &timeout);

View File

@@ -35,10 +35,13 @@
* PASIDs are global address space identifiers that can be shared
* between the GPU, an IOMMU and the driver. VMs on different devices
* may use the same PASID if they share the same address
* space. Therefore PASIDs are allocated using a global IDA. VMs are
* looked up from the PASID per amdgpu_device.
* space. Therefore PASIDs are allocated using IDR cyclic allocator
* (similar to kernel PID allocation) which naturally delays reuse.
* VMs are looked up from the PASID per amdgpu_device.
*/
static DEFINE_IDA(amdgpu_pasid_ida);
static DEFINE_IDR(amdgpu_pasid_idr);
static DEFINE_SPINLOCK(amdgpu_pasid_idr_lock);
/* Helper to free pasid from a fence callback */
struct amdgpu_pasid_cb {
@@ -50,8 +53,8 @@ struct amdgpu_pasid_cb {
* amdgpu_pasid_alloc - Allocate a PASID
* @bits: Maximum width of the PASID in bits, must be at least 1
*
* Allocates a PASID of the given width while keeping smaller PASIDs
* available if possible.
* Uses kernel's IDR cyclic allocator (same as PID allocation).
* Allocates sequentially with automatic wrap-around.
*
* Returns a positive integer on success. Returns %-EINVAL if bits==0.
* Returns %-ENOSPC if no PASID was available. Returns %-ENOMEM on
@@ -59,14 +62,15 @@ struct amdgpu_pasid_cb {
*/
int amdgpu_pasid_alloc(unsigned int bits)
{
int pasid = -EINVAL;
int pasid;
for (bits = min(bits, 31U); bits > 0; bits--) {
pasid = ida_alloc_range(&amdgpu_pasid_ida, 1U << (bits - 1),
(1U << bits) - 1, GFP_KERNEL);
if (pasid != -ENOSPC)
break;
}
if (bits == 0)
return -EINVAL;
spin_lock(&amdgpu_pasid_idr_lock);
pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1,
1U << bits, GFP_KERNEL);
spin_unlock(&amdgpu_pasid_idr_lock);
if (pasid >= 0)
trace_amdgpu_pasid_allocated(pasid);
@@ -81,7 +85,10 @@ int amdgpu_pasid_alloc(unsigned int bits)
void amdgpu_pasid_free(u32 pasid)
{
trace_amdgpu_pasid_freed(pasid);
ida_free(&amdgpu_pasid_ida, pasid);
spin_lock(&amdgpu_pasid_idr_lock);
idr_remove(&amdgpu_pasid_idr, pasid);
spin_unlock(&amdgpu_pasid_idr_lock);
}
static void amdgpu_pasid_free_cb(struct dma_fence *fence,
@@ -616,3 +623,15 @@ void amdgpu_vmid_mgr_fini(struct amdgpu_device *adev)
}
}
}
/**
* amdgpu_pasid_mgr_cleanup - cleanup PASID manager
*
* Cleanup the IDR allocator.
*/
void amdgpu_pasid_mgr_cleanup(void)
{
spin_lock(&amdgpu_pasid_idr_lock);
idr_destroy(&amdgpu_pasid_idr);
spin_unlock(&amdgpu_pasid_idr_lock);
}

View File

@@ -74,6 +74,7 @@ int amdgpu_pasid_alloc(unsigned int bits);
void amdgpu_pasid_free(u32 pasid);
void amdgpu_pasid_free_delayed(struct dma_resv *resv,
u32 pasid);
void amdgpu_pasid_mgr_cleanup(void);
bool amdgpu_vmid_had_gpu_reset(struct amdgpu_device *adev,
struct amdgpu_vmid *id);

View File

@@ -2898,6 +2898,7 @@ void amdgpu_vm_manager_fini(struct amdgpu_device *adev)
xa_destroy(&adev->vm_manager.pasids);
amdgpu_vmid_mgr_fini(adev);
amdgpu_pasid_mgr_cleanup();
}
/**
@@ -2973,14 +2974,14 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
if (!root)
return false;
addr /= AMDGPU_GPU_PAGE_SIZE;
if (is_compute_context && !svm_range_restore_pages(adev, pasid, vmid,
node_id, addr, ts, write_fault)) {
node_id, addr >> PAGE_SHIFT, ts, write_fault)) {
amdgpu_bo_unref(&root);
return true;
}
addr /= AMDGPU_GPU_PAGE_SIZE;
r = amdgpu_bo_reserve(root, true);
if (r)
goto error_unref;

View File

@@ -3170,11 +3170,11 @@ static int kfd_ioctl_create_process(struct file *filep, struct kfd_process *p, v
struct kfd_process *process;
int ret;
/* Each FD owns only one kfd_process */
if (p->context_id != KFD_CONTEXT_ID_PRIMARY)
if (!filep->private_data || !p)
return -EINVAL;
if (!filep->private_data || !p)
/* Each FD owns only one kfd_process */
if (p->context_id != KFD_CONTEXT_ID_PRIMARY)
return -EINVAL;
mutex_lock(&kfd_processes_mutex);

View File

@@ -3909,8 +3909,9 @@ void amdgpu_dm_update_connector_after_detect(
aconnector->dc_sink = sink;
dc_sink_retain(aconnector->dc_sink);
drm_edid_free(aconnector->drm_edid);
aconnector->drm_edid = NULL;
if (sink->dc_edid.length == 0) {
aconnector->drm_edid = NULL;
hdmi_cec_unset_edid(aconnector);
if (aconnector->dc_link->aux_mode) {
drm_dp_cec_unset_edid(&aconnector->dm_dp_aux.aux);
@@ -5422,7 +5423,7 @@ static void setup_backlight_device(struct amdgpu_display_manager *dm,
caps = &dm->backlight_caps[aconnector->bl_idx];
/* Only offer ABM property when non-OLED and user didn't turn off by module parameter */
if (!caps->ext_caps->bits.oled && amdgpu_dm_abm_level < 0)
if (caps->ext_caps && !caps->ext_caps->bits.oled && amdgpu_dm_abm_level < 0)
drm_object_attach_property(&aconnector->base.base,
dm->adev->mode_info.abm_level_property,
ABM_SYSFS_CONTROL);
@@ -12523,6 +12524,11 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
}
if (dc_resource_is_dsc_encoding_supported(dc)) {
for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
dm_new_crtc_state->mode_changed_independent_from_dsc = new_crtc_state->mode_changed;
}
for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
if (drm_atomic_crtc_needs_modeset(new_crtc_state)) {
ret = add_affected_mst_dsc_crtcs(state, crtc);

View File

@@ -984,6 +984,7 @@ struct dm_crtc_state {
bool freesync_vrr_info_changed;
bool mode_changed_independent_from_dsc;
bool dsc_force_changed;
bool vrr_supported;
struct mod_freesync_config freesync_config;

View File

@@ -1744,9 +1744,11 @@ int pre_validate_dsc(struct drm_atomic_state *state,
int ind = find_crtc_index_in_state_by_stream(state, stream);
if (ind >= 0) {
struct dm_crtc_state *dm_new_crtc_state = to_dm_crtc_state(state->crtcs[ind].new_state);
DRM_INFO_ONCE("%s:%d MST_DSC no mode changed for stream 0x%p\n",
__func__, __LINE__, stream);
state->crtcs[ind].new_state->mode_changed = 0;
dm_new_crtc_state->base.mode_changed = dm_new_crtc_state->mode_changed_independent_from_dsc;
}
}
}

View File

@@ -650,9 +650,6 @@ static struct link_encoder *dce100_link_encoder_create(
return &enc110->base;
}
if (enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs))
return NULL;
link_regs_id =
map_transmitter_id_to_phy_instance(enc_init_data->transmitter);
@@ -661,7 +658,8 @@ static struct link_encoder *dce100_link_encoder_create(
&link_enc_feature,
&link_enc_regs[link_regs_id],
&link_enc_aux_regs[enc_init_data->channel - 1],
&link_enc_hpd_regs[enc_init_data->hpd_source]);
enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ?
NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]);
return &enc110->base;
}

View File

@@ -671,7 +671,7 @@ static struct link_encoder *dce110_link_encoder_create(
kzalloc_obj(struct dce110_link_encoder);
int link_regs_id;
if (!enc110 || enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs))
if (!enc110)
return NULL;
link_regs_id =
@@ -682,7 +682,8 @@ static struct link_encoder *dce110_link_encoder_create(
&link_enc_feature,
&link_enc_regs[link_regs_id],
&link_enc_aux_regs[enc_init_data->channel - 1],
&link_enc_hpd_regs[enc_init_data->hpd_source]);
enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ?
NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]);
return &enc110->base;
}

View File

@@ -632,7 +632,7 @@ static struct link_encoder *dce112_link_encoder_create(
kzalloc_obj(struct dce110_link_encoder);
int link_regs_id;
if (!enc110 || enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs))
if (!enc110)
return NULL;
link_regs_id =
@@ -643,7 +643,8 @@ static struct link_encoder *dce112_link_encoder_create(
&link_enc_feature,
&link_enc_regs[link_regs_id],
&link_enc_aux_regs[enc_init_data->channel - 1],
&link_enc_hpd_regs[enc_init_data->hpd_source]);
enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ?
NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]);
return &enc110->base;
}

View File

@@ -716,7 +716,7 @@ static struct link_encoder *dce120_link_encoder_create(
kzalloc_obj(struct dce110_link_encoder);
int link_regs_id;
if (!enc110 || enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs))
if (!enc110)
return NULL;
link_regs_id =
@@ -727,7 +727,8 @@ static struct link_encoder *dce120_link_encoder_create(
&link_enc_feature,
&link_enc_regs[link_regs_id],
&link_enc_aux_regs[enc_init_data->channel - 1],
&link_enc_hpd_regs[enc_init_data->hpd_source]);
enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ?
NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]);
return &enc110->base;
}

View File

@@ -746,18 +746,16 @@ static struct link_encoder *dce60_link_encoder_create(
return &enc110->base;
}
if (enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs))
return NULL;
link_regs_id =
map_transmitter_id_to_phy_instance(enc_init_data->transmitter);
dce60_link_encoder_construct(enc110,
enc_init_data,
&link_enc_feature,
&link_enc_regs[link_regs_id],
&link_enc_aux_regs[enc_init_data->channel - 1],
&link_enc_hpd_regs[enc_init_data->hpd_source]);
enc_init_data,
&link_enc_feature,
&link_enc_regs[link_regs_id],
&link_enc_aux_regs[enc_init_data->channel - 1],
enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ?
NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]);
return &enc110->base;
}

View File

@@ -752,9 +752,6 @@ static struct link_encoder *dce80_link_encoder_create(
return &enc110->base;
}
if (enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs))
return NULL;
link_regs_id =
map_transmitter_id_to_phy_instance(enc_init_data->transmitter);
@@ -763,7 +760,8 @@ static struct link_encoder *dce80_link_encoder_create(
&link_enc_feature,
&link_enc_regs[link_regs_id],
&link_enc_aux_regs[enc_init_data->channel - 1],
&link_enc_hpd_regs[enc_init_data->hpd_source]);
enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ?
NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]);
return &enc110->base;
}

View File

@@ -59,6 +59,10 @@
#define to_amdgpu_device(x) (container_of(x, struct amdgpu_device, pm.smu_i2c))
static void smu_v13_0_0_get_od_setting_limits(struct smu_context *smu,
int od_feature_bit,
int32_t *min, int32_t *max);
static const struct smu_feature_bits smu_v13_0_0_dpm_features = {
.bits = {
SMU_FEATURE_BIT_INIT(FEATURE_DPM_GFXCLK_BIT),
@@ -1043,8 +1047,35 @@ static bool smu_v13_0_0_is_od_feature_supported(struct smu_context *smu,
PPTable_t *pptable = smu->smu_table.driver_pptable;
const OverDriveLimits_t * const overdrive_upperlimits =
&pptable->SkuTable.OverDriveLimitsBasicMax;
int32_t min_value, max_value;
bool feature_enabled;
return overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit);
switch (od_feature_bit) {
case PP_OD_FEATURE_FAN_CURVE_BIT:
feature_enabled = !!(overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit));
if (feature_enabled) {
smu_v13_0_0_get_od_setting_limits(smu, PP_OD_FEATURE_FAN_CURVE_TEMP,
&min_value, &max_value);
if (!min_value && !max_value) {
feature_enabled = false;
goto out;
}
smu_v13_0_0_get_od_setting_limits(smu, PP_OD_FEATURE_FAN_CURVE_PWM,
&min_value, &max_value);
if (!min_value && !max_value) {
feature_enabled = false;
goto out;
}
}
break;
default:
feature_enabled = !!(overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit));
break;
}
out:
return feature_enabled;
}
static void smu_v13_0_0_get_od_setting_limits(struct smu_context *smu,

View File

@@ -1391,7 +1391,7 @@ static int smu_v13_0_6_emit_clk_levels(struct smu_context *smu,
break;
case SMU_OD_MCLK:
if (!smu_v13_0_6_cap_supported(smu, SMU_CAP(SET_UCLK_MAX)))
return 0;
return -EOPNOTSUPP;
size += sysfs_emit_at(buf, size, "%s:\n", "OD_MCLK");
size += sysfs_emit_at(buf, size, "0: %uMhz\n1: %uMhz\n",
@@ -2122,6 +2122,7 @@ static int smu_v13_0_6_usr_edit_dpm_table(struct smu_context *smu,
{
struct smu_dpm_context *smu_dpm = &(smu->smu_dpm);
struct smu_13_0_dpm_context *dpm_context = smu_dpm->dpm_context;
struct smu_dpm_table *uclk_table = &dpm_context->dpm_tables.uclk_table;
struct smu_umd_pstate_table *pstate_table = &smu->pstate_table;
uint32_t min_clk;
uint32_t max_clk;
@@ -2221,14 +2222,16 @@ static int smu_v13_0_6_usr_edit_dpm_table(struct smu_context *smu,
if (ret)
return ret;
min_clk = SMU_DPM_TABLE_MIN(
&dpm_context->dpm_tables.uclk_table);
max_clk = SMU_DPM_TABLE_MAX(
&dpm_context->dpm_tables.uclk_table);
ret = smu_v13_0_6_set_soft_freq_limited_range(
smu, SMU_UCLK, min_clk, max_clk, false);
if (ret)
return ret;
if (SMU_DPM_TABLE_MAX(uclk_table) !=
pstate_table->uclk_pstate.curr.max) {
min_clk = SMU_DPM_TABLE_MIN(&dpm_context->dpm_tables.uclk_table);
max_clk = SMU_DPM_TABLE_MAX(&dpm_context->dpm_tables.uclk_table);
ret = smu_v13_0_6_set_soft_freq_limited_range(smu,
SMU_UCLK, min_clk,
max_clk, false);
if (ret)
return ret;
}
smu_v13_0_reset_custom_level(smu);
}
break;

View File

@@ -59,6 +59,10 @@
#define to_amdgpu_device(x) (container_of(x, struct amdgpu_device, pm.smu_i2c))
static void smu_v13_0_7_get_od_setting_limits(struct smu_context *smu,
int od_feature_bit,
int32_t *min, int32_t *max);
static const struct smu_feature_bits smu_v13_0_7_dpm_features = {
.bits = {
SMU_FEATURE_BIT_INIT(FEATURE_DPM_GFXCLK_BIT),
@@ -1053,8 +1057,35 @@ static bool smu_v13_0_7_is_od_feature_supported(struct smu_context *smu,
PPTable_t *pptable = smu->smu_table.driver_pptable;
const OverDriveLimits_t * const overdrive_upperlimits =
&pptable->SkuTable.OverDriveLimitsBasicMax;
int32_t min_value, max_value;
bool feature_enabled;
return overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit);
switch (od_feature_bit) {
case PP_OD_FEATURE_FAN_CURVE_BIT:
feature_enabled = !!(overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit));
if (feature_enabled) {
smu_v13_0_7_get_od_setting_limits(smu, PP_OD_FEATURE_FAN_CURVE_TEMP,
&min_value, &max_value);
if (!min_value && !max_value) {
feature_enabled = false;
goto out;
}
smu_v13_0_7_get_od_setting_limits(smu, PP_OD_FEATURE_FAN_CURVE_PWM,
&min_value, &max_value);
if (!min_value && !max_value) {
feature_enabled = false;
goto out;
}
}
break;
default:
feature_enabled = !!(overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit));
break;
}
out:
return feature_enabled;
}
static void smu_v13_0_7_get_od_setting_limits(struct smu_context *smu,

View File

@@ -56,6 +56,10 @@
#define to_amdgpu_device(x) (container_of(x, struct amdgpu_device, pm.smu_i2c))
static void smu_v14_0_2_get_od_setting_limits(struct smu_context *smu,
int od_feature_bit,
int32_t *min, int32_t *max);
static const struct smu_feature_bits smu_v14_0_2_dpm_features = {
.bits = { SMU_FEATURE_BIT_INIT(FEATURE_DPM_GFXCLK_BIT),
SMU_FEATURE_BIT_INIT(FEATURE_DPM_UCLK_BIT),
@@ -922,8 +926,35 @@ static bool smu_v14_0_2_is_od_feature_supported(struct smu_context *smu,
PPTable_t *pptable = smu->smu_table.driver_pptable;
const OverDriveLimits_t * const overdrive_upperlimits =
&pptable->SkuTable.OverDriveLimitsBasicMax;
int32_t min_value, max_value;
bool feature_enabled;
return overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit);
switch (od_feature_bit) {
case PP_OD_FEATURE_FAN_CURVE_BIT:
feature_enabled = !!(overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit));
if (feature_enabled) {
smu_v14_0_2_get_od_setting_limits(smu, PP_OD_FEATURE_FAN_CURVE_TEMP,
&min_value, &max_value);
if (!min_value && !max_value) {
feature_enabled = false;
goto out;
}
smu_v14_0_2_get_od_setting_limits(smu, PP_OD_FEATURE_FAN_CURVE_PWM,
&min_value, &max_value);
if (!min_value && !max_value) {
feature_enabled = false;
goto out;
}
}
break;
default:
feature_enabled = !!(overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit));
break;
}
out:
return feature_enabled;
}
static void smu_v14_0_2_get_od_setting_limits(struct smu_context *smu,

View File

@@ -550,27 +550,27 @@ int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev,
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_dumb_create);
static bool drm_gem_shmem_try_map_pmd(struct vm_fault *vmf, unsigned long addr,
struct page *page)
static vm_fault_t try_insert_pfn(struct vm_fault *vmf, unsigned int order,
unsigned long pfn)
{
if (!order) {
return vmf_insert_pfn(vmf->vma, vmf->address, pfn);
#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
unsigned long pfn = page_to_pfn(page);
unsigned long paddr = pfn << PAGE_SHIFT;
bool aligned = (addr & ~PMD_MASK) == (paddr & ~PMD_MASK);
} else if (order == PMD_ORDER) {
unsigned long paddr = pfn << PAGE_SHIFT;
bool aligned = (vmf->address & ~PMD_MASK) == (paddr & ~PMD_MASK);
if (aligned &&
pmd_none(*vmf->pmd) &&
folio_test_pmd_mappable(page_folio(page))) {
pfn &= PMD_MASK >> PAGE_SHIFT;
if (vmf_insert_pfn_pmd(vmf, pfn, false) == VM_FAULT_NOPAGE)
return true;
}
if (aligned &&
folio_test_pmd_mappable(page_folio(pfn_to_page(pfn)))) {
pfn &= PMD_MASK >> PAGE_SHIFT;
return vmf_insert_pfn_pmd(vmf, pfn, false);
}
#endif
return false;
}
return VM_FAULT_FALLBACK;
}
static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
static vm_fault_t drm_gem_shmem_any_fault(struct vm_fault *vmf, unsigned int order)
{
struct vm_area_struct *vma = vmf->vma;
struct drm_gem_object *obj = vma->vm_private_data;
@@ -581,6 +581,9 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
pgoff_t page_offset;
unsigned long pfn;
if (order && order != PMD_ORDER)
return VM_FAULT_FALLBACK;
/* Offset to faulty address in the VMA. */
page_offset = vmf->pgoff - vma->vm_pgoff;
@@ -593,13 +596,8 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
goto out;
}
if (drm_gem_shmem_try_map_pmd(vmf, vmf->address, pages[page_offset])) {
ret = VM_FAULT_NOPAGE;
goto out;
}
pfn = page_to_pfn(pages[page_offset]);
ret = vmf_insert_pfn(vma, vmf->address, pfn);
ret = try_insert_pfn(vmf, order, pfn);
out:
dma_resv_unlock(shmem->base.resv);
@@ -607,6 +605,11 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
return ret;
}
static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
{
return drm_gem_shmem_any_fault(vmf, 0);
}
static void drm_gem_shmem_vm_open(struct vm_area_struct *vma)
{
struct drm_gem_object *obj = vma->vm_private_data;
@@ -643,6 +646,9 @@ static void drm_gem_shmem_vm_close(struct vm_area_struct *vma)
const struct vm_operations_struct drm_gem_shmem_vm_ops = {
.fault = drm_gem_shmem_fault,
#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
.huge_fault = drm_gem_shmem_any_fault,
#endif
.open = drm_gem_shmem_vm_open,
.close = drm_gem_shmem_vm_close,
};

View File

@@ -602,7 +602,7 @@ int drm_syncobj_get_handle(struct drm_file *file_private,
drm_syncobj_get(syncobj);
ret = xa_alloc(&file_private->syncobj_xa, handle, syncobj, xa_limit_32b,
GFP_NOWAIT);
GFP_KERNEL);
if (ret)
drm_syncobj_put(syncobj);
@@ -716,7 +716,7 @@ static int drm_syncobj_fd_to_handle(struct drm_file *file_private,
drm_syncobj_get(syncobj);
ret = xa_alloc(&file_private->syncobj_xa, handle, syncobj, xa_limit_32b,
GFP_NOWAIT);
GFP_KERNEL);
if (ret)
drm_syncobj_put(syncobj);

View File

@@ -4602,6 +4602,7 @@ intel_crtc_prepare_cleared_state(struct intel_atomic_state *state,
struct intel_crtc_state *crtc_state =
intel_atomic_get_new_crtc_state(state, crtc);
struct intel_crtc_state *saved_state;
int err;
saved_state = intel_crtc_state_alloc(crtc);
if (!saved_state)
@@ -4610,7 +4611,12 @@ intel_crtc_prepare_cleared_state(struct intel_atomic_state *state,
/* free the old crtc_state->hw members */
intel_crtc_free_hw_state(crtc_state);
intel_dp_tunnel_atomic_clear_stream_bw(state, crtc_state);
err = intel_dp_tunnel_atomic_clear_stream_bw(state, crtc_state);
if (err) {
kfree(saved_state);
return err;
}
/* FIXME: before the switch to atomic started, a new pipe_config was
* kzalloc'd. Code that depends on any field being zero should be

View File

@@ -621,19 +621,27 @@ int intel_dp_tunnel_atomic_compute_stream_bw(struct intel_atomic_state *state,
*
* Clear any DP tunnel stream BW requirement set by
* intel_dp_tunnel_atomic_compute_stream_bw().
*
* Returns 0 in case of success, a negative error code otherwise.
*/
void intel_dp_tunnel_atomic_clear_stream_bw(struct intel_atomic_state *state,
struct intel_crtc_state *crtc_state)
int intel_dp_tunnel_atomic_clear_stream_bw(struct intel_atomic_state *state,
struct intel_crtc_state *crtc_state)
{
struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
int err;
if (!crtc_state->dp_tunnel_ref.tunnel)
return;
return 0;
err = drm_dp_tunnel_atomic_set_stream_bw(&state->base,
crtc_state->dp_tunnel_ref.tunnel,
crtc->pipe, 0);
if (err)
return err;
drm_dp_tunnel_atomic_set_stream_bw(&state->base,
crtc_state->dp_tunnel_ref.tunnel,
crtc->pipe, 0);
drm_dp_tunnel_ref_put(&crtc_state->dp_tunnel_ref);
return 0;
}
/**

View File

@@ -40,8 +40,8 @@ int intel_dp_tunnel_atomic_compute_stream_bw(struct intel_atomic_state *state,
struct intel_dp *intel_dp,
const struct intel_connector *connector,
struct intel_crtc_state *crtc_state);
void intel_dp_tunnel_atomic_clear_stream_bw(struct intel_atomic_state *state,
struct intel_crtc_state *crtc_state);
int intel_dp_tunnel_atomic_clear_stream_bw(struct intel_atomic_state *state,
struct intel_crtc_state *crtc_state);
int intel_dp_tunnel_atomic_add_state_for_crtc(struct intel_atomic_state *state,
struct intel_crtc *crtc);
@@ -88,9 +88,12 @@ intel_dp_tunnel_atomic_compute_stream_bw(struct intel_atomic_state *state,
return 0;
}
static inline void
static inline int
intel_dp_tunnel_atomic_clear_stream_bw(struct intel_atomic_state *state,
struct intel_crtc_state *crtc_state) {}
struct intel_crtc_state *crtc_state)
{
return 0;
}
static inline int
intel_dp_tunnel_atomic_add_state_for_crtc(struct intel_atomic_state *state,

View File

@@ -496,8 +496,10 @@ gmbus_xfer_read_chunk(struct intel_display *display,
val = intel_de_read_fw(display, GMBUS3(display));
do {
if (extra_byte_added && len == 1)
if (extra_byte_added && len == 1) {
len--;
break;
}
*buf++ = val & 0xff;
val >>= 8;

View File

@@ -436,11 +436,16 @@ void intel_plane_copy_hw_state(struct intel_plane_state *plane_state,
drm_framebuffer_get(plane_state->hw.fb);
}
static void unlink_nv12_plane(struct intel_crtc_state *crtc_state,
struct intel_plane_state *plane_state);
void intel_plane_set_invisible(struct intel_crtc_state *crtc_state,
struct intel_plane_state *plane_state)
{
struct intel_plane *plane = to_intel_plane(plane_state->uapi.plane);
unlink_nv12_plane(crtc_state, plane_state);
crtc_state->active_planes &= ~BIT(plane->id);
crtc_state->scaled_planes &= ~BIT(plane->id);
crtc_state->nv12_planes &= ~BIT(plane->id);
@@ -1513,6 +1518,9 @@ static void unlink_nv12_plane(struct intel_crtc_state *crtc_state,
struct intel_display *display = to_intel_display(plane_state);
struct intel_plane *plane = to_intel_plane(plane_state->uapi.plane);
if (!plane_state->planar_linked_plane)
return;
plane_state->planar_linked_plane = NULL;
if (!plane_state->is_y_plane)
@@ -1550,8 +1558,7 @@ static int icl_check_nv12_planes(struct intel_atomic_state *state,
if (plane->pipe != crtc->pipe)
continue;
if (plane_state->planar_linked_plane)
unlink_nv12_plane(crtc_state, plane_state);
unlink_nv12_plane(crtc_state, plane_state);
}
if (!crtc_state->nv12_planes)

View File

@@ -25,9 +25,9 @@
might_sleep(); \
for (;;) { \
const bool expired__ = ktime_after(ktime_get_raw(), end__); \
OP; \
/* Guarantee COND check prior to timeout */ \
barrier(); \
OP; \
if (COND) { \
ret__ = 0; \
break; \

View File

@@ -1236,6 +1236,11 @@ static int mtk_dsi_probe(struct platform_device *pdev)
dsi->host.ops = &mtk_dsi_ops;
dsi->host.dev = dev;
init_waitqueue_head(&dsi->irq_wait_queue);
platform_set_drvdata(pdev, dsi);
ret = mipi_dsi_host_register(&dsi->host);
if (ret < 0)
return dev_err_probe(dev, ret, "Failed to register DSI host\n");
@@ -1247,10 +1252,6 @@ static int mtk_dsi_probe(struct platform_device *pdev)
return dev_err_probe(&pdev->dev, ret, "Failed to request DSI irq\n");
}
init_waitqueue_head(&dsi->irq_wait_queue);
platform_set_drvdata(pdev, dsi);
dsi->bridge.of_node = dev->of_node;
dsi->bridge.type = DRM_MODE_CONNECTOR_DSI;

View File

@@ -553,6 +553,7 @@
#define ENABLE_SMP_LD_RENDER_SURFACE_CONTROL REG_BIT(44 - 32)
#define FORCE_SLM_FENCE_SCOPE_TO_TILE REG_BIT(42 - 32)
#define FORCE_UGM_FENCE_SCOPE_TO_TILE REG_BIT(41 - 32)
#define L3_128B_256B_WRT_DIS REG_BIT(40 - 32)
#define MAXREQS_PER_BANK REG_GENMASK(39 - 32, 37 - 32)
#define DISABLE_128B_EVICTION_COMMAND_UDW REG_BIT(36 - 32)

View File

@@ -1442,9 +1442,9 @@ static int op_check_svm_userptr(struct xe_vm *vm, struct xe_vma_op *op,
err = vma_check_userptr(vm, op->map.vma, pt_update);
break;
case DRM_GPUVA_OP_REMAP:
if (op->remap.prev)
if (op->remap.prev && !op->remap.skip_prev)
err = vma_check_userptr(vm, op->remap.prev, pt_update);
if (!err && op->remap.next)
if (!err && op->remap.next && !op->remap.skip_next)
err = vma_check_userptr(vm, op->remap.next, pt_update);
break;
case DRM_GPUVA_OP_UNMAP:
@@ -2198,12 +2198,12 @@ static int op_prepare(struct xe_vm *vm,
err = unbind_op_prepare(tile, pt_update_ops, old);
if (!err && op->remap.prev) {
if (!err && op->remap.prev && !op->remap.skip_prev) {
err = bind_op_prepare(vm, tile, pt_update_ops,
op->remap.prev, false);
pt_update_ops->wait_vm_bookkeep = true;
}
if (!err && op->remap.next) {
if (!err && op->remap.next && !op->remap.skip_next) {
err = bind_op_prepare(vm, tile, pt_update_ops,
op->remap.next, false);
pt_update_ops->wait_vm_bookkeep = true;
@@ -2428,10 +2428,10 @@ static void op_commit(struct xe_vm *vm,
unbind_op_commit(vm, tile, pt_update_ops, old, fence, fence2);
if (op->remap.prev)
if (op->remap.prev && !op->remap.skip_prev)
bind_op_commit(vm, tile, pt_update_ops, op->remap.prev,
fence, fence2, false);
if (op->remap.next)
if (op->remap.next && !op->remap.skip_next)
bind_op_commit(vm, tile, pt_update_ops, op->remap.next,
fence, fence2, false);
break;

View File

@@ -341,6 +341,8 @@ ssize_t xe_sriov_packet_write_single(struct xe_device *xe, unsigned int vfid,
ret = xe_sriov_pf_migration_restore_produce(xe, vfid, *data);
if (ret) {
xe_sriov_packet_free(*data);
*data = NULL;
return ret;
}

View File

@@ -2554,7 +2554,6 @@ static int xe_vma_op_commit(struct xe_vm *vm, struct xe_vma_op *op)
if (!err && op->remap.skip_prev) {
op->remap.prev->tile_present =
tile_present;
op->remap.prev = NULL;
}
}
if (op->remap.next) {
@@ -2564,11 +2563,13 @@ static int xe_vma_op_commit(struct xe_vm *vm, struct xe_vma_op *op)
if (!err && op->remap.skip_next) {
op->remap.next->tile_present =
tile_present;
op->remap.next = NULL;
}
}
/* Adjust for partial unbind after removing VMA from VM */
/*
* Adjust for partial unbind after removing VMA from VM. In case
* of unwind we might need to undo this later.
*/
if (!err) {
op->base.remap.unmap->va->va.addr = op->remap.start;
op->base.remap.unmap->va->va.range = op->remap.range;
@@ -2687,6 +2688,8 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct drm_gpuva_ops *ops,
op->remap.start = xe_vma_start(old);
op->remap.range = xe_vma_size(old);
op->remap.old_start = op->remap.start;
op->remap.old_range = op->remap.range;
flags |= op->base.remap.unmap->va->flags & XE_VMA_CREATE_MASK;
if (op->base.remap.prev) {
@@ -2835,8 +2838,19 @@ static void xe_vma_op_unwind(struct xe_vm *vm, struct xe_vma_op *op,
xe_svm_notifier_lock(vm);
vma->gpuva.flags &= ~XE_VMA_DESTROYED;
xe_svm_notifier_unlock(vm);
if (post_commit)
if (post_commit) {
/*
* Restore the old va range, in case of the
* prev/next skip optimisation. Otherwise what
* we re-insert here could be smaller than the
* original range.
*/
op->base.remap.unmap->va->va.addr =
op->remap.old_start;
op->base.remap.unmap->va->va.range =
op->remap.old_range;
xe_vm_insert_vma(vm, vma);
}
}
break;
}

View File

@@ -373,6 +373,10 @@ struct xe_vma_op_remap {
u64 start;
/** @range: range of the VMA unmap */
u64 range;
/** @old_start: Original start of the VMA we unmap */
u64 old_start;
/** @old_range: Original range of the VMA we unmap */
u64 old_range;
/** @skip_prev: skip prev rebind */
bool skip_prev;
/** @skip_next: skip next rebind */

View File

@@ -247,7 +247,8 @@ static const struct xe_rtp_entry_sr gt_was[] = {
LSN_DIM_Z_WGT_MASK,
LSN_LNI_WGT(1) | LSN_LNE_WGT(1) |
LSN_DIM_X_WGT(1) | LSN_DIM_Y_WGT(1) |
LSN_DIM_Z_WGT(1)))
LSN_DIM_Z_WGT(1)),
SET(LSC_CHICKEN_BIT_0_UDW, L3_128B_256B_WRT_DIS))
},
/* Xe2_HPM */

View File

@@ -10,6 +10,8 @@
#include <linux/hwmon.h>
#include <linux/i2c.h>
#include <linux/init.h>
#include <linux/math64.h>
#include <linux/minmax.h>
#include <linux/module.h>
#include <linux/regulator/consumer.h>
@@ -33,7 +35,7 @@
struct adm1177_state {
struct i2c_client *client;
u32 r_sense_uohm;
u32 alert_threshold_ua;
u64 alert_threshold_ua;
bool vrange_high;
};
@@ -48,7 +50,7 @@ static int adm1177_write_cmd(struct adm1177_state *st, u8 cmd)
}
static int adm1177_write_alert_thr(struct adm1177_state *st,
u32 alert_threshold_ua)
u64 alert_threshold_ua)
{
u64 val;
int ret;
@@ -91,8 +93,8 @@ static int adm1177_read(struct device *dev, enum hwmon_sensor_types type,
*val = div_u64((105840000ull * dummy),
4096 * st->r_sense_uohm);
return 0;
case hwmon_curr_max_alarm:
*val = st->alert_threshold_ua;
case hwmon_curr_max:
*val = div_u64(st->alert_threshold_ua, 1000);
return 0;
default:
return -EOPNOTSUPP;
@@ -126,9 +128,10 @@ static int adm1177_write(struct device *dev, enum hwmon_sensor_types type,
switch (type) {
case hwmon_curr:
switch (attr) {
case hwmon_curr_max_alarm:
adm1177_write_alert_thr(st, val);
return 0;
case hwmon_curr_max:
val = clamp_val(val, 0,
div_u64(105840000ULL, st->r_sense_uohm));
return adm1177_write_alert_thr(st, (u64)val * 1000);
default:
return -EOPNOTSUPP;
}
@@ -156,7 +159,7 @@ static umode_t adm1177_is_visible(const void *data,
if (st->r_sense_uohm)
return 0444;
return 0;
case hwmon_curr_max_alarm:
case hwmon_curr_max:
if (st->r_sense_uohm)
return 0644;
return 0;
@@ -170,7 +173,7 @@ static umode_t adm1177_is_visible(const void *data,
static const struct hwmon_channel_info * const adm1177_info[] = {
HWMON_CHANNEL_INFO(curr,
HWMON_C_INPUT | HWMON_C_MAX_ALARM),
HWMON_C_INPUT | HWMON_C_MAX),
HWMON_CHANNEL_INFO(in,
HWMON_I_INPUT),
NULL
@@ -192,7 +195,8 @@ static int adm1177_probe(struct i2c_client *client)
struct device *dev = &client->dev;
struct device *hwmon_dev;
struct adm1177_state *st;
u32 alert_threshold_ua;
u64 alert_threshold_ua;
u32 prop;
int ret;
st = devm_kzalloc(dev, sizeof(*st), GFP_KERNEL);
@@ -208,22 +212,26 @@ static int adm1177_probe(struct i2c_client *client)
if (device_property_read_u32(dev, "shunt-resistor-micro-ohms",
&st->r_sense_uohm))
st->r_sense_uohm = 0;
if (device_property_read_u32(dev, "adi,shutdown-threshold-microamp",
&alert_threshold_ua)) {
if (st->r_sense_uohm)
/*
* set maximum default value from datasheet based on
* shunt-resistor
*/
alert_threshold_ua = div_u64(105840000000,
st->r_sense_uohm);
else
alert_threshold_ua = 0;
if (!device_property_read_u32(dev, "adi,shutdown-threshold-microamp",
&prop)) {
alert_threshold_ua = prop;
} else if (st->r_sense_uohm) {
/*
* set maximum default value from datasheet based on
* shunt-resistor
*/
alert_threshold_ua = div_u64(105840000000ULL,
st->r_sense_uohm);
} else {
alert_threshold_ua = 0;
}
st->vrange_high = device_property_read_bool(dev,
"adi,vrange-high-enable");
if (alert_threshold_ua && st->r_sense_uohm)
adm1177_write_alert_thr(st, alert_threshold_ua);
if (alert_threshold_ua && st->r_sense_uohm) {
ret = adm1177_write_alert_thr(st, alert_threshold_ua);
if (ret)
return ret;
}
ret = adm1177_write_cmd(st, ADM1177_CMD_V_CONT |
ADM1177_CMD_I_CONT |

View File

@@ -131,7 +131,7 @@ static int get_temp_target(struct peci_cputemp *priv, enum peci_temp_target_type
*val = priv->temp.target.tjmax;
break;
case crit_hyst_type:
*val = priv->temp.target.tjmax - priv->temp.target.tcontrol;
*val = priv->temp.target.tcontrol;
break;
default:
ret = -EOPNOTSUPP;
@@ -319,7 +319,7 @@ static umode_t cputemp_is_visible(const void *data, enum hwmon_sensor_types type
{
const struct peci_cputemp *priv = data;
if (channel > CPUTEMP_CHANNEL_NUMS)
if (channel >= CPUTEMP_CHANNEL_NUMS)
return 0;
if (channel < channel_core)

View File

@@ -72,7 +72,8 @@ static int ina233_read_word_data(struct i2c_client *client, int page,
/* Adjust returned value to match VIN coefficients */
/* VIN: 1.25 mV VSHUNT: 2.5 uV LSB */
ret = DIV_ROUND_CLOSEST(ret * 25, 12500);
ret = clamp_val(DIV_ROUND_CLOSEST((s16)ret * 25, 12500),
S16_MIN, S16_MAX) & 0xffff;
break;
default:
ret = -ENODATA;

Some files were not shown because too many files have changed in this diff Show More