Pull perf fixes from Ingo Molnar:
"Two fixlets"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/hwbp: Simplify the perf-hwbp code, fix documentation
perf/x86/intel: Fix linear IP of PEBS real_ip on Haswell and later CPUs
Pull x86 PTI fixes from Ingo Molnar:
"Two fixes: a relatively simple objtool fix that makes Clang built
kernels work with ORC debug info, plus an alternatives macro fix"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/alternatives: Fixup alternative_call_2
objtool: Add Clang support
Pull Kbuild fixes from Masahiro Yamada:
- fix missed rebuild of TRIM_UNUSED_KSYMS
- fix rpm-pkg for GNU tar >= 1.29
- include scripts/dtc/include-prefixes/* to kernel header deb-pkg
- add -no-integrated-as option ealier to fix building with Clang
- fix netfilter Makefile for parallel building
* tag 'kbuild-fixes-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
netfilter: nf_nat_snmp_basic: add correct dependency to Makefile
kbuild: rpm-pkg: Support GNU tar >= 1.29
builddeb: Fix header package regarding dtc source links
kbuild: set no-integrated-as before incl. arch Makefile
kbuild: make scripts/adjust_autoksyms.sh robust against timestamp races
Pull networking fixes from David Miller:
1) Fix RCU locking in xfrm_local_error(), from Taehee Yoo.
2) Fix return value assignments and thus error checking in
iwl_mvm_start_ap_ibss(), from Johannes Berg.
3) Don't count header length twice in vti4, from Stefano Brivio.
4) Fix deadlock in rt6_age_examine_exception, from Eric Dumazet.
5) Fix out-of-bounds access in nf_sk_lookup_slow{v4,v6}() from Subash
Abhinov.
6) Check nladdr size in netlink_connect(), from Alexander Potapenko.
7) VF representor SQ numbers are 32 not 16 bits, in mlx5 driver, from
Or Gerlitz.
8) Out of bounds read in skb_network_protocol(), from Eric Dumazet.
9) r8169 driver sets driver data pointer after register_netdev() which
is too late. Fix from Heiner Kallweit.
10) Fix memory leak in mlx4 driver, from Moshe Shemesh.
11) The multi-VLAN decap fix added a regression when dealing with device
that lack a MAC header, such as tun. Fix from Toshiaki Makita.
12) Fix integer overflow in dynamic interrupt coalescing code. From Tal
Gilboa.
13) Use after free in vrf code, from David Ahern.
14) IPV6 route leak between VRFs fix, also from David Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits)
net: mvneta: fix enable of all initialized RXQs
net/ipv6: Fix route leaking between VRFs
vrf: Fix use after free and double free in vrf_finish_output
ipv6: sr: fix seg6 encap performances with TSO enabled
net/dim: Fix int overflow
vlan: Fix vlan insertion for packets without ethernet header
net: Fix untag for vlan packets without ethernet header
atm: iphase: fix spelling mistake: "Receiverd" -> "Received"
vhost: validate log when IOTLB is enabled
qede: Do not drop rx-checksum invalidated packets.
hv_netvsc: enable multicast if necessary
ip_tunnel: Resolve ipsec merge conflict properly.
lan78xx: Crash in lan78xx_writ_reg (Workqueue: events lan78xx_deferred_multicast_write)
qede: Fix barrier usage after tx doorbell write.
vhost: correctly remove wait queue during poll failure
net/mlx4_core: Fix memory leak while delete slave's resources
net/mlx4_en: Fix mixed PFC and Global pause user control requests
net/smc: use announced length in sock_recvmsg()
llc: properly handle dev_queue_xmit() return value
strparser: Fix sign of err codes
...
In mvneta_port_up() we enable relevant RX and TX port queues by write
queues bit map to an appropriate register.
q_map must be ZERO in the beginning of this process.
Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Donald reported that IPv6 route leaking between VRFs is not working.
The root cause is the strict argument in the call to rt6_lookup when
validating the nexthop spec.
ip6_route_check_nh validates the gateway and device (if given) of a
route spec. It in turn could call rt6_lookup (e.g., lookup in a given
table did not succeed so it falls back to a full lookup) and if so
sets the strict argument to 1. That means if the egress device is given,
the route lookup needs to return a result with the same device. This
strict requirement does not work with VRFs (IPv4 or IPv6) because the
oif in the flow struct is overridden with the index of the VRF device
to trigger a match on the l3mdev rule and force the lookup to its table.
The right long term solution is to add an l3mdev index to the flow
struct such that the oif is not overridden. That solution will not
backport well, so this patch aims for a simpler solution to relax the
strict argument if the route spec device is an l3mdev slave. As done
in other places, use the FLOWI_FLAG_SKIP_NH_OIF to know that the
RT6_LOOKUP_F_IFACE flag needs to be removed.
Fixes: ca254490c8 ("net: Add VRF support to IPv6 stack")
Reported-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Miguel reported an skb use after free / double free in vrf_finish_output
when neigh_output returns an error. The vrf driver should return after
the call to neigh_output as it takes over the skb on error path as well.
Patch is a simplified version of Miguel's patch which was written for 4.9,
and updated to top of tree.
Fixes: 8f58336d3f ("net: Add ethernet header for pass through VRF device")
Signed-off-by: Miguel Fadon Perlines <mfadon@teldat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enabling TSO can lead to abysmal performances when using seg6 in
encap mode, such as with the ixgbe driver. This patch adds a call to
iptunnel_handle_offloads() to remove the encapsulation bit if needed.
Before:
root@comp4-seg6bpf:~# iperf3 -c fc00::55
Connecting to host fc00::55, port 5201
[ 4] local fc45::4 port 36592 connected to fc00::55 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 196 KBytes 1.60 Mbits/sec 47 6.66 KBytes
[ 4] 1.00-2.00 sec 304 KBytes 2.49 Mbits/sec 100 5.33 KBytes
[ 4] 2.00-3.00 sec 284 KBytes 2.32 Mbits/sec 92 5.33 KBytes
After:
root@comp4-seg6bpf:~# iperf3 -c fc00::55
Connecting to host fc00::55, port 5201
[ 4] local fc45::4 port 43062 connected to fc00::55 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 1.03 GBytes 8.89 Gbits/sec 0 743 KBytes
[ 4] 1.00-2.00 sec 1.03 GBytes 8.87 Gbits/sec 0 743 KBytes
[ 4] 2.00-3.00 sec 1.03 GBytes 8.87 Gbits/sec 0 743 KBytes
Reported-by: Tom Herbert <tom@quantonium.net>
Fixes: 6c8702c60b ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
Signed-off-by: David Lebrun <dlebrun@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull ceph fix from Ilya Dryomov:
"A fix for a dio-enabled loop on ceph deadlock from Zheng, marked for
stable"
* tag 'ceph-for-4.16-rc8' of git://github.com/ceph/ceph-client:
ceph: only dirty ITER_IOVEC pages for direct read
Pull KVM fixes from Radim Krčmář:
"PPC:
- Fix a bug causing occasional machine check exceptions on POWER8
hosts (introduced in 4.16-rc1)
x86:
- Fix a guest crashing regression with nested VMX and restricted
guest (introduced in 4.16-rc1)
- Fix dependency check for pv tlb flush (the wrong dependency that
effectively disabled the feature was added in 4.16-rc4, the
original feature in 4.16-rc1, so it got decent testing)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Fix pv tlb flush dependencies
KVM: nVMX: sync vmcs02 segment regs prior to vmx_set_cr0
KVM: PPC: Book3S HV: Fix duplication of host SLB entries
Pull i2c fix from Wolfram Sang:
"A simple but worthwhile I2C driver fix for 4.16"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: i2c-stm32f7: fix no check on returned setup
Pull sound fixes from Takashi Iwai:
"Very small fixes (all one-liners) at this time.
One fix is for a PCM core stuff to correct the mmap behavior on
non-x86. It doesn't show on most machines but mostly only for exotic
non-interleaved formats"
* tag 'sound-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: pcm: potential uninitialized return values
ALSA: pcm: Use dma_bytes as size parameter in dma_mmap_coherent()
ALSA: usb-audio: Add native DSD support for TEAC UD-301
When calculating difference between samples, the values
are multiplied by 100. Large values may cause int overflow
when multiplied (usually on first iteration).
Fixed by forcing 100 to be of type unsigned long.
Fixes: 4c4dbb4a73 ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Toshiaki Makita says:
====================
Fix vlan tag handling for vlan packets without ethernet headers
Eric Dumazet reported syzbot found a new bug which leads to underflow of
size argument of memmove(), causing crash[1]. This can be triggered by tun
devices.
The underflow happened because skb_vlan_untag() did not expect vlan packets
without ethernet headers, and tun can produce such packets.
I also checked vlan_insert_inner_tag() and found a similar bug.
This series fixes these problems.
[1] https://marc.info/?l=linux-netdev&m=152221753920510&w=2
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In some situation vlan packets do not have ethernet headers. One example
is packets from tun devices. Users can specify vlan protocol in tun_pi
field instead of IP protocol. When we have a vlan device with reorder_hdr
disabled on top of the tun device, such packets from tun devices are
untagged in skb_vlan_untag() and vlan headers will be inserted back in
vlan_insert_inner_tag().
vlan_insert_inner_tag() however did not expect packets without ethernet
headers, so in such a case size argument for memmove() underflowed.
We don't need to copy headers for packets which do not have preceding
headers of vlan headers, so skip memmove() in that case.
Also don't write vlan protocol in skb->data when it does not have enough
room for it.
Fixes: cbe7128c4b ("vlan: Fix out of order vlan headers with reorder header off")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a page is already locked, attempting to dirty it leads to a deadlock
in lock_page(). This is what currently happens to ITER_BVEC pages when
a dio-enabled loop device is backed by ceph:
$ losetup --direct-io /dev/loop0 /mnt/cephfs/img
$ xfs_io -c 'pread 0 4k' /dev/loop0
Follow other file systems and only dirty ITER_IOVEC pages.
Cc: stable@kernel.org
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Pull device mapper fixes from Mike Snitzer:
- Fix a DM multipath regression introduced in a v4.16-rc6 commit:
restore support for loading, and attaching, scsi_dh modules during
multipath table load. Otherwise some users may find themselves unable
to boot, as was reported today:
https://marc.info/?l=linux-scsi&m=152231276114962&w=2
- Fix a DM core ioctl permission check regression introduced in a
v4.16-rc5 commit.
* tag 'for-4.16/dm-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: fix dropped return code from dm_get_bdev_for_ioctl
dm mpath: fix support for loading scsi_dh modules during table load
Pull rdma fixes from Jason Gunthorpe:
"It has been fairly silent lately on our -rc front. Big queue of
patches on the mailing list going to for-next though.
Bug fixes:
- qedr driver bugfixes causing application hangs, wrong uapi errnos,
and a race condition
- three syzkaller found bugfixes in the ucma uapi
Regression fixes for things introduced in 4.16:
- Crash on error introduced in mlx5 UMR flow
- Crash on module unload/etc introduced by bad interaction of
restrack and mlx5 patches this cycle
- Typo in a two line syzkaller bugfix causing a bad regression
- Coverity report of nonsense code in hns driver"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/ucma: Introduce safer rdma_addr_size() variants
RDMA/hns: ensure for-loop actually iterates and free's buffers
RDMA/ucma: Check that device exists prior to accessing it
RDMA/ucma: Check that device is connected prior to access it
RDMA/rdma_cm: Fix use after free race with process_one_req
RDMA/qedr: Fix QP state initialization race
RDMA/qedr: Fix rc initialization on CNQ allocation failure
RDMA/qedr: fix QP's ack timeout configuration
RDMA/ucma: Correct option size check using optlen
RDMA/restrack: Move restrack_clean to be symmetrical to restrack_init
IB/mlx5: Don't clean uninitialized UMR resources
Pull MTD fixes from Boris Brezillon:
"Two fixes, one in the atmel NAND driver and another one in the
CFI/JEDEC code.
Summary:
- Fix a bug in Atmel ECC engine driver
- Fix a bug in the CFI/JEDEC driver"
* tag 'mtd/fixes-for-4.16' of git://git.infradead.org/linux-mtd:
mtd: jedec_probe: Fix crash in jedec_read_mfr()
mtd: nand: atmel: Fix get_sectorsize() function
dm_get_bdev_for_ioctl()'s return of 0 or 1 must be the result from
prepare_ioctl (1 means the ioctl was issued to a partition, 0 means it
wasn't). Unfortunately commit 519049afea ("dm: use blkdev_get rather
than bdgrab when issuing pass-through ioctl") reused the variable 'r'
to store the return from blkdev_get() that follows prepare_ioctl()
-- whereby dropping prepare_ioctl()'s result on the floor.
This can lead to an ioctl or persistent reservation being issued to a
partition going unnoticed, which implies the extra permission check for
CAP_SYS_RAWIO is skipped.
Fix this by using a different variable to store blkdev_get()'s return.
Fixes: 519049afea ("dm: use blkdev_get rather than bdgrab when issuing pass-through ioctl")
Reported-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Daniel Borkman says:
====================
pull-request: bpf 2018-03-29
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix nfp to properly check max insn count while emitting
instructions in the JIT which was wrongly comparing bytes
against number of instructions before, from Jakub.
2) Fix for bpftool to avoid usage of hex numbers in JSON
output since JSON doesn't accept hex numbers with 0x
prefix, also from Jakub.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The ability to have multipath dynamically attach a scsi_dh, that the user
specified in the multipath table, was broken by commit e8f74a0f00 ("dm
mpath: eliminate need to use scsi_device_from_queue").
Restore the ability to load, and attach, a particular scsi_dh module if
one is specified (as noticed by checking m->hw_handler_name).
Fixes: e8f74a0f00 ("dm mpath: eliminate need to use scsi_device_from_queue")
Reported-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Vq log_base is the userspace address of bitmap which has nothing to do
with IOTLB. So it needs to be validated unconditionally otherwise we
may try use 0 as log_base which may lead to pin pages that will lead
unexpected result (e.g trigger BUG_ON() in set_bit_to_user()).
Fixes: 6b1e6cc785 ("vhost: new device IOTLB API")
Reported-by: syzbot+6304bf97ef436580fede@syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Today, driver drops received packets which are indicated as
invalid checksum by the device. Instead of dropping such packets,
pass them to the stack with CHECKSUM_NONE indication in skb.
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It turns out that the loop where we read manufacturer
jedec_read_mfd() can under some circumstances get a
CFI_MFR_CONTINUATION repeatedly, making the loop go
over all banks and eventually hit the end of the
map and crash because of an access violation:
Unable to handle kernel paging request at virtual address c4980000
pgd = (ptrval)
[c4980000] *pgd=03808811, *pte=00000000, *ppte=00000000
Internal error: Oops: 7 [#1] PREEMPT ARM
CPU: 0 PID: 1 Comm: swapper Not tainted 4.16.0-rc1+ #150
Hardware name: Gemini (Device Tree)
PC is at jedec_probe_chip+0x6ec/0xcd0
LR is at 0x4
pc : [<c03a2bf4>] lr : [<00000004>] psr: 60000013
sp : c382dd18 ip : 0000ffff fp : 00000000
r10: c0626388 r9 : 00020000 r8 : c0626340
r7 : 00000000 r6 : 00000001 r5 : c3a71afc r4 : c382dd70
r3 : 00000001 r2 : c4900000 r1 : 00000002 r0 : 00080000
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 0000397f Table: 00004000 DAC: 00000053
Process swapper (pid: 1, stack limit = 0x(ptrval))
Fix this by breaking the loop with a return 0 if
the offset exceeds the map size.
Fixes: 5c9c11e1c4 ("[MTD] [NOR] Add support for flash chips with ID in bank other than 0")
Cc: <stable@vger.kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
My recent change to netvsc drive in how receive flags are handled
broke multicast. The Hyper-v/Azure virtual interface there is not a
multicast filter list, filtering is only all or none. The driver must
enable all multicast if any multicast address is present.
Fixes: 009f766ca2 ("hv_netvsc: filter multicast/broadcast")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Description:
Crash was reported with syzkaller pointing to lan78xx_write_reg routine.
Root-cause:
Proper cleanup of workqueues and init/setup routines was not happening
in failure conditions.
Fix:
Handled the error conditions by cleaning up the queues and init/setup
routines.
Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Raghuram Chary J <raghuramchary.jallipalli@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2018-03-29
1) Fix a rcu_read_lock/rcu_read_unlock imbalance
in the error path of xfrm_local_error().
From Taehee Yoo.
2) Some VTI MTU fixes. From Stefano Brivio.
3) Fix a too early overwritten skb control buffer
on xfrm transport mode.
Please note that this pull request has a merge conflict
in net/ipv4/ip_tunnel.c.
The conflict is between
commit f6cc9c054e ("ip_tunnel: Emit events for post-register MTU changes")
from the net tree and
commit 24fc79798b ("ip_tunnel: Clamp MTU to bounds on new link")
from the ipsec tree.
It can be solved as it is currently done in linux-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull drm fixes from Dave Airlie:
"Nothing serious, two amdkfd and two tegra fixes"
* tag 'drm-fixes-for-v4.16-rc8' of git://people.freedesktop.org/~airlied/linux:
drm/tegra: dc: Using NULL instead of plain integer
drm/amdkfd: Deallocate SDMA queues correctly
drm/amdkfd: Fix scratch memory with HWS enabled
drm/tegra: dc: Use correct format array for Tegra124
nf_nat_snmp_basic_main.c includes a generated header, but the
necessary dependency is missing in Makefile. This could cause
build error in parallel building.
Remove a weird line, and add a correct one.
Fixes: cc2d58634e ("netfilter: nf_nat_snmp_basic: use asn1 decoder library")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Merge misc fixes from Andrew Morton:
"8 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
MAINTAINERS: demote ARM port to "odd fixes"
MAINTAINERS: correct rmk's email address
mm/kmemleak.c: wait for scan completion before disabling free
mm/memcontrol.c: fix parameter description mismatch
mm/vmstat.c: fix vmstat_update() preemption BUG
mm/page_owner: fix recursion bug after changing skip entries
ipc/shm.c: add split function to shm_vm_ops
mm, slab: memcg_link the SLAB's kmem_cache
drm/tegra: Fixes for v4.16
This contains two small fixes, one which fixes a typo that causes a
crash with the new framebuffer modifier query support and another that
fixes a build warning.
* tag 'drm/tegra/for-4.16-fixes' of git://anongit.freedesktop.org/tegra/linux:
drm/tegra: dc: Using NULL instead of plain integer
drm/tegra: dc: Use correct format array for Tegra124
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 4.16. Apologies if this is a bit big at
rc7, but they're all reasonably important fixes. None are actually for
new code, so they aren't indicative of 4.16 being in bad shape from
our point of view.
- Fix missing AT_BASE_PLATFORM (in auxv) when we're using a new
firmware interface for describing CPU features.
- Fix lost pending interrupts due to a race in our interrupt
soft-masking code.
- A workaround for a nest MMU bug with TLB invalidations on Power9.
- A workaround for broadcast TLB invalidations on Power9.
- Fix a bug in our instruction SLB miss handler, when handling bad
addresses (eg. >= TASK_SIZE), which could corrupt non-volatile user
GPRs.
Thanks to: Aneesh Kumar K.V, Balbir Singh, Benjamin Herrenschmidt,
Nicholas Piggin"
* tag 'powerpc-4.16-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Fix i-side SLB miss bad address handler saving nonvolatile GPRs
powerpc/mm: Fixup tlbie vs store ordering issue on POWER9
powerpc/mm/radix: Move the functions that does the actual tlbie closer
powerpc/mm/radix: Remove unused code
powerpc/mm: Workaround Nest MMU bug with TLB invalidations
powerpc/mm: Add tracking of the number of coprocessors using a context
powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened
powerpc/64s: Fix NULL AT_BASE_PLATFORM when using DT CPU features
Pull ARM SoC fixes from Arnd Bergmann:
"Here are are a couple of last-minute fixes for 4.16, mostly for
regressions. As usual, the majory are device tree changes:
- USB 3 support on rk3399 didn't work and is being reverted for now
- One fix for an old suspend/resume bug on rk3399
- A few regulator related fixes on Banana Pi M2, and on imx7d-sdb
- A boot regression fix for all Aspeed SoCs failing to find their
memory
- One more dtc warning fix
The other changes are:
- A few updates to the MAINTAINERS file
- A revert for an incorrect orion5x cleanup
- Two power management fixes for OMAP"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: OMAP: Fix SRAM W+X mapping
ARM: dts: aspeed: Add default memory node
mailmap: Update email address for Gregory CLEMENT
ARM: davinci: fix the GPIO lookup for omapl138-hawk
MAINTAINERS: Update Tegra IOMMU maintainer
ARM: dts: imx7d-sdb: Fix regulator-usb-otg2-vbus node name
ARM: ux500: Fix PMU IRQ regression
ARM: dts: rockchip: Add missing #sound-dai-cells on rk3288
Revert "arm64: dts: rockchip: add usb3-phy otg-port support for rk3399"
arm64: dts: rockchip: Fix rk3399-gru-* s2r (pinctrl hogs, wifi reset)
ARM: OMAP: Fix dmtimer init for omap1
MAINTAINERS: update email address for Maxime Ripard
ARM: dts: sun6i: a31s: bpi-m2: add missing regulators
ARM: dts: sun6i: a31s: bpi-m2: improve pmic properties
As of the start of 2018, I am no longer paid to support the core 32-bit
ARM architecture code. This means that this code is no longer
commercially supported, and is now only supported through voluntary
effort.
I will continue to merge patches as and when able, but this will be at a
lower priority than before (which means a longer latency.) I have also
be scaled back the amount of time spent reading email, so email that is
intended for my attention needs to make itself plainly obvious, or I
will miss it.
In an attempt to reduce the amount of email Cc'd to me, exclude
arch/arm/boot/dts from the maintainers patterns, but add entries for the
SolidRun platforms I look after.
Link: http://lkml.kernel.org/r/E1ezkgn-0002fO-52@rmk-PC.armlinux.org.uk
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A crash is observed when kmemleak_scan accesses the object->pointer,
likely due to the following race.
TASK A TASK B TASK C
kmemleak_write
(with "scan" and
NOT "scan=on")
kmemleak_scan()
create_object
kmem_cache_alloc fails
kmemleak_disable
kmemleak_do_cleanup
kmemleak_free_enabled = 0
kfree
kmemleak_free bails out
(kmemleak_free_enabled is 0)
slub frees object->pointer
update_checksum
crash - object->pointer
freed (DEBUG_PAGEALLOC)
kmemleak_do_cleanup waits for the scan thread to complete, but not for
direct call to kmemleak_scan via kmemleak_write. So add a wait for
kmemleak_scan completion before disabling kmemleak_free, and while at it
fix the comment on stop_scan_thread.
[vinmenon@codeaurora.org: fix stop_scan_thread comment]
Link: http://lkml.kernel.org/r/1522219972-22809-1-git-send-email-vinmenon@codeaurora.org
Link: http://lkml.kernel.org/r/1522063429-18992-1-git-send-email-vinmenon@codeaurora.org
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Attempting to hotplug CPUs with CONFIG_VM_EVENT_COUNTERS enabled can
cause vmstat_update() to report a BUG due to preemption not being
disabled around smp_processor_id().
Discovered on Ubiquiti EdgeRouter Pro with Cavium Octeon II processor.
BUG: using smp_processor_id() in preemptible [00000000] code:
kworker/1:1/269
caller is vmstat_update+0x50/0xa0
CPU: 0 PID: 269 Comm: kworker/1:1 Not tainted
4.16.0-rc4-Cavium-Octeon-00009-gf83bbd5-dirty #1
Workqueue: mm_percpu_wq vmstat_update
Call Trace:
show_stack+0x94/0x128
dump_stack+0xa4/0xe0
check_preemption_disabled+0x118/0x120
vmstat_update+0x50/0xa0
process_one_work+0x144/0x348
worker_thread+0x150/0x4b8
kthread+0x110/0x140
ret_from_kernel_thread+0x14/0x1c
Link: http://lkml.kernel.org/r/1520881552-25659-1-git-send-email-steven.hill@cavium.com
Signed-off-by: Steven J. Hill <steven.hill@cavium.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes commit 5f48f0bd4e ("mm, page_owner: skip unnecessary
stack_trace entries").
Because if we skip first two entries then logic of checking count value
as 2 for recursion is broken and code will go in one depth recursion.
so we need to check only one call of _RET_IP(__set_page_owner) while
checking for recursion.
Current Backtrace while checking for recursion:-
(save_stack) from (__set_page_owner) // (But recursion returns true here)
(__set_page_owner) from (get_page_from_freelist)
(get_page_from_freelist) from (__alloc_pages_nodemask)
(__alloc_pages_nodemask) from (depot_save_stack)
(depot_save_stack) from (save_stack) // recursion should return true here
(save_stack) from (__set_page_owner)
(__set_page_owner) from (get_page_from_freelist)
(get_page_from_freelist) from (__alloc_pages_nodemask+)
(__alloc_pages_nodemask) from (depot_save_stack)
(depot_save_stack) from (save_stack)
(save_stack) from (__set_page_owner)
(__set_page_owner) from (get_page_from_freelist)
Correct Backtrace with fix:
(save_stack) from (__set_page_owner) // recursion returned true here
(__set_page_owner) from (get_page_from_freelist)
(get_page_from_freelist) from (__alloc_pages_nodemask+)
(__alloc_pages_nodemask) from (depot_save_stack)
(depot_save_stack) from (save_stack)
(save_stack) from (__set_page_owner)
(__set_page_owner) from (get_page_from_freelist)
Link: http://lkml.kernel.org/r/1521607043-34670-1-git-send-email-maninder1.s@samsung.com
Fixes: 5f48f0bd4e ("mm, page_owner: skip unnecessary stack_trace entries")
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Signed-off-by: Vaneet Narang <v.narang@samsung.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@techadventures.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ayush Mittal <ayush.m@samsung.com>
Cc: Prakash Gupta <guptap@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: Vasyl Gomonovych <gomonovych@gmail.com>
Cc: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: <pankaj.m@samsung.com>
Cc: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If System V shmget/shmat operations are used to create a hugetlbfs
backed mapping, it is possible to munmap part of the mapping and split
the underlying vma such that it is not huge page aligned. This will
untimately result in the following BUG:
kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/mm/hugetlb.c:3310!
Oops: Exception in kernel mode, sig: 5 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: kcm nfc af_alg caif_socket caif phonet fcrypt
CPU: 18 PID: 43243 Comm: trinity-subchil Tainted: G C E 4.15.0-10-generic #11-Ubuntu
NIP: c00000000036e764 LR: c00000000036ee48 CTR: 0000000000000009
REGS: c000003fbcdcf810 TRAP: 0700 Tainted: G C E (4.15.0-10-generic)
MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24002222 XER: 20040000
CFAR: c00000000036ee44 SOFTE: 1
NIP __unmap_hugepage_range+0xa4/0x760
LR __unmap_hugepage_range_final+0x28/0x50
Call Trace:
0x7115e4e00000 (unreliable)
__unmap_hugepage_range_final+0x28/0x50
unmap_single_vma+0x11c/0x190
unmap_vmas+0x94/0x140
exit_mmap+0x9c/0x1d0
mmput+0xa8/0x1d0
do_exit+0x360/0xc80
do_group_exit+0x60/0x100
SyS_exit_group+0x24/0x30
system_call+0x58/0x6c
---[ end trace ee88f958a1c62605 ]---
This bug was introduced by commit 31383c6865 ("mm, hugetlbfs:
introduce ->split() to vm_operations_struct"). A split function was
added to vm_operations_struct to determine if a mapping can be split.
This was mostly for device-dax and hugetlbfs mappings which have
specific alignment constraints.
Mappings initiated via shmget/shmat have their original vm_ops
overwritten with shm_vm_ops. shm_vm_ops functions will call back to the
original vm_ops if needed. Add such a split function to shm_vm_ops.
Link: http://lkml.kernel.org/r/20180321161314.7711-1-mike.kravetz@oracle.com
Fixes: 31383c6865 ("mm, hugetlbfs: introduce ->split() to vm_operations_struct")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are several places in the ucma ABI where userspace can pass in a
sockaddr but set the address family to AF_IB. When that happens,
rdma_addr_size() will return a size bigger than sizeof struct sockaddr_in6,
and the ucma kernel code might end up copying past the end of a buffer
not sized for a struct sockaddr_ib.
Fix this by introducing new variants
int rdma_addr_size_in6(struct sockaddr_in6 *addr);
int rdma_addr_size_kss(struct __kernel_sockaddr_storage *addr);
that are type-safe for the types used in the ucma ABI and return 0 if the
size computed is bigger than the size of the type passed in. We can use
these new variants to check what size userspace has passed in before
copying any addresses.
Reported-by: <syzbot+6800425d54ed3ed8135d@syzkaller.appspotmail.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
PV TLB FLUSH can only be turned on when steal time is enabled.
The condition got reversed during conflict resolution.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Fixes: 4f2f61fc50 ("KVM: X86: Avoid traversing all the cpus for pv tlb flush when steal time is disabled")
[Rebased on top of kvm/master and reworded the commit message. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Pull ARM fixes from Russell King:
"A small number of small fixes for ARM, mostly for some build issues.
One fix for a regression caused by the cpu hotplug conversion from a
few kernel versions ago"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8750/1: deflate_xip_data.sh: minor fixes
ARM: 8748/1: mm: Define vdso_start, vdso_end as array
ARM: 8747/1: make CONFIG_DEBUG_WX depend on MMU
ARM: 8746/1: vfp: Go back to clearing vfp_current_hw_state[]
Pull SCSI fixes from James Bottomley:
"Two driver fixes (ibmvfc, iscsi_tcp) and a USB fix for devices that
give the wrong return to Read Capacity and cause a huge log spew.
The remaining five patches all try to fix commit 84676c1f21
("genirq/affinity: assign vectors to all possible CPUs") which broke
the non-mq I/O path"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: iscsi_tcp: set BDI_CAP_STABLE_WRITES when data digest enabled
scsi: sd: Remember that READ CAPACITY(16) succeeded
scsi: ibmvfc: Avoid unnecessary port relogin
scsi: virtio_scsi: unify scsi_host_template
scsi: virtio_scsi: fix IO hang caused by automatic irq vector affinity
scsi: core: introduce force_blk_mq
scsi: megaraid_sas: fix selection of reply queue
scsi: hpsa: fix selection of reply queue
The current for-loop zeros variable i and only loops once, hence
not all the buffers are free'd. Fix this by setting i correctly.
Detected by CoverityScan, CID#1463415 ("Operands don't affect result")
Fixes: a5073d6054 ("RDMA/hns: Add eq support of hip08")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Since commit c5ad119fb6
("net: sched: pfifo_fast use skb_array") driver is exposed
to an issue where it is hitting NULL skbs while handling TX
completions. Driver uses mmiowb() to flush the writes to the
doorbell bar which is a write-combined bar, however on x86
mmiowb() does not flush the write combined buffer.
This patch fixes this problem by replacing mmiowb() with wmb()
after the write combined doorbell write so that writes are
flushed and synchronized from more than one processor.
V1->V2:
-------
This patch was marked as "superseded" in patchwork.
(Not really sure for what reason).Resending it as v2.
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We tried to remove vq poll from wait queue, but do not check whether
or not it was in a list before. This will lead double free. Fixing
this by switching to use vhost_poll_stop() which zeros poll->wqh after
removing poll from waitqueue to make sure it won't be freed twice.
Cc: Darren Kenny <darren.kenny@oracle.com>
Reported-by: syzbot+c0272972b01b872e604a@syzkaller.appspotmail.com
Fixes: 2b8b328b61 ("vhost_net: handle polling errors when setting backend")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a change in how command line parsing is done in this version.
Excludes and includes are now ordered with the file list. Since
the spec file puts the file list before the exclude list it means newer
tar ignores the excludes and packs all the build output into the
kernel-devel RPM resulting in a huge package.
Simple argument re-ordering fixes the problem.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tariq Toukan says:
====================
mlx4 misc fixes for 4.16
This patchset contains misc bug fixes from the team
to the mlx4 Core and Eth drivers.
Patch 1 by Eran fixes a control mix of PFC and Global pauses, please queue it
to -stable for >= v4.8.
Patch 2 by Moshe fixes a resource leak in slave's delete flow, please queue it
to -stable for >= v4.5.
Series generated against net commit:
3c82b372a9 net: dsa: mt7530: fix module autoloading for OF platform drivers
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
mlx4_delete_all_resources_for_slave in resource tracker should free all
memory allocated for a slave.
While releasing memory of fs_rule, it misses releasing memory of
fs_rule->mirr_mbox.
Fixes: 78efed2751 ('net/mlx4_core: Support mirroring VF DMFS rules on both ports')
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Global pause and PFC configuration should be mutually exclusive (i.e. only
one of them at most can be set). However, once PFC was turned off,
driver automatically turned Global pause on. This is a bug.
Fix the driver behaviour to turn off PFC/Global once the user turned the
other on.
This also fixed a weird behaviour that at a current time, the profile
had both PFC and global pause configuration turned on, which is
Hardware-wise impossible and caused returning false positive indication
to query tools.
In addition, fix error code when setting global pause or PFC to change
metadata only upon successful change.
Also, removed useless debug print.
Fixes: af7d518526 ("net/mlx4_en: Add DCB PFC support through CEE netlink commands")
Fixes: c27a02cd94 ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Not every CLC proposal message needs the maximum buffer length.
Due to the MSG_WAITALL flag, it is important to use the peeked
real length when receiving the message.
Fixes: d63d271ce2 ("smc: switch to sock_recvmsg()")
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
llc_conn_send_pdu() pushes the skb into write queue and
calls llc_conn_send_pdus() to flush them out. However, the
status of dev_queue_xmit() is not returned to caller,
in this case, llc_conn_state_process().
llc_conn_state_process() needs hold the skb no matter
success or failure, because it still uses it after that,
therefore we should hold skb before dev_queue_xmit() when
that skb is the one being processed by llc_conn_state_process().
For other callers, they can just pass NULL and ignore
the return value as they are.
Reported-by: Noam Rathaus <noamr@beyondsecurity.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2018-03-23
The following series includes fixes for mlx5 netdev and eswitch.
v1->v2:
- Fixed commit message quotation marks in patch #7
For -stable v4.12
('net/mlx5e: Avoid using the ipv6 stub in the TC offload neigh update path')
('net/mlx5e: Fix traffic being dropped on VF representor')
For -stable v4.13
('net/mlx5e: Fix memory usage issues in offloading TC flows')
('net/mlx5e: Verify coalescing parameters in range')
For -stable v4.14
('net/mlx5e: Don't override vport admin link state in switchdev mode')
For -stable v4.15
('108b2b6d5c02 net/mlx5e: Sync netdev vxlan ports at open')
Please pull and let me know if there's any problem.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
strp_parser_err is called with a negative code everywhere, which then
calls abort_parser with a negative code. strp_msg_timeout calls
abort_parser directly with a positive code. Negate ETIMEDOUT
to match signed-ness of other calls.
The default abort_parser callback, strp_abort_strp, sets
sk->sk_err to err. Also negate the error here so sk_err always
holds a positive value, as the rest of the net code expects. Currently
a negative sk_err can result in endless loops, or user code that
thinks it actually sent/received err bytes.
Found while testing net/tls_sw recv path.
Fixes: 43a0c6751a ("strparser: Stream parser for messages")
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes a bug in the tcf_dump_walker function that can cause some actions
to not be reported when dumping a large number of actions. This issue
became more aggrevated when cookies feature was added. In particular
this issue is manifest when large cookie values are assigned to the
actions and when enough actions are created that the resulting table
must be dumped in multiple batches.
The number of actions returned in each batch is limited by the total
number of actions and the memory buffer size. With small cookies
the numeric limit is reached before the buffer size limit, which avoids
the code path triggering this bug. When large cookies are used buffer
fills before the numeric limit, and the erroneous code path is hit.
For example after creating 32 csum actions with the cookie
aaaabbbbccccdddd
$ tc actions ls action csum
total acts 26
action order 0: csum (tcp) action continue
index 1 ref 1 bind 0
cookie aaaabbbbccccdddd
.....
action order 25: csum (tcp) action continue
index 26 ref 1 bind 0
cookie aaaabbbbccccdddd
total acts 6
action order 0: csum (tcp) action continue
index 28 ref 1 bind 0
cookie aaaabbbbccccdddd
......
action order 5: csum (tcp) action continue
index 32 ref 1 bind 0
cookie aaaabbbbccccdddd
Note that the action with index 27 is omitted from the report.
Fixes: 4b3550ef53 ("[NET_SCHED]: Use nla_nest_start/nla_nest_end")"
Signed-off-by: Craig Dillabaugh <cdillaba@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pci_set_drvdata() is called only after registering the net_device,
therefore we could run into a NPE if one of the functions using
driver_data is called before it's set.
Fix this by calling pci_set_drvdata() before registering the
net_device.
This fix is a candidate for stable. As far as I can see the
bug has been there in kernel version 3.2 already, therefore
I can't provide a reference which commit is fixed by it.
The fix may need small adjustments per kernel version because
due to other changes the label which is jumped to if
register_netdev() fails has changed over time.
Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Wunderlich says:
====================
Here are some batman-adv bugfixes:
- fix multicast-via-unicast transmissions for AP isolation and gateway
extension, by Linus Luessing (2 patches)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Smatch complains that "tmp" can be uninitialized if we do a zero size
write.
Fixes: 02a5d6925c ("ALSA: pcm: Avoid potential races between OSS ioctls and read/write")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pull "Allwinner Fixes for 4.16" from Maxime Ripard:
The first and second patches fix the regulator support for the Bananapi M2
board.
The last one updates my email address in MAINTAINERS.
* tag 'sunxi-fixes-for-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
MAINTAINERS: update email address for Maxime Ripard
ARM: dts: sun6i: a31s: bpi-m2: add missing regulators
ARM: dts: sun6i: a31s: bpi-m2: improve pmic properties
Pull "Two fixes for omap variants for v4.16-rc cycle" from Tony Lindgren:
Fix insecure W+X mapping warning for SRAM for omaps that
don't yet use drivers/misc/*sram*.c code. An earlier attempt
at fixing this turned out to cause problems with PM on omap3,
this version works with PM on omap3.
Also fix dmtimer probe for omap16xx devices that was noticed
with the pending dmtimer move to drivers. It seems this has
been broken for a while and is a non-critical for booting.
It is needed for PM on omap16xx though.
* tag 'omap-for-v4.16/sram-fix-signed' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP: Fix SRAM W+X mapping
ARM: OMAP: Fix dmtimer init for omap1
Pull "ARM: tegra: Miscellaneous changes for v4.17-rc1" from Thierry Reding:
This contains a single patch to update the MAINTAINERS entry for the
Tegra SMMU driver.
* tag 'tegra-for-4.17-misc' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
MAINTAINERS: Update Tegra IOMMU maintainer
The following pattern fails to compile while the same pattern
with alternative_call() does:
if (...)
alternative_call_2(...);
else
alternative_call_2(...);
as it expands into
if (...)
{
}; <===
else
{
};
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20180114120504.GA11368@avx2
- Programming VMID correctly for scratch memory with HWS
- deallocating SDMA queues correctly in various situations
* tag 'drm-amdkfd-fixes-2018-03-25' of git://people.freedesktop.org/~gabbayo/linux:
drm/amdkfd: Deallocate SDMA queues correctly
drm/amdkfd: Fix scratch memory with HWS enabled
this patch fix a bug in how the pebs->real_ip is handled in the PEBS
handler. real_ip only exists in Haswell and later processor. It is
actually the eventing IP, i.e., where the event occurred. As opposed
to the pebs->ip which is the PEBS interrupt IP which is always off
by one.
The problem is that the real_ip just like the IP needs to be fixed up
because PEBS does not record all the machine state registers, and
in particular the code segement (cs). This is why we have the set_linear_ip()
function. The problem was that set_linear_ip() was only used on the pebs->ip
and not the pebs->real_ip.
We have profiles which ran into invalid callstacks because of this.
Here is an example:
..... 0: ffffffffffffff80 recent entry, marker kernel v
..... 1: 000000000040044d <= user address in kernel space!
..... 2: fffffffffffffe00 marker enter user v
..... 3: 000000000040044d
..... 4: 00000000004004b6 oldest entry
Debugging output in get_perf_callchain():
[ 857.769909] CALLCHAIN: CPU8 ip=40044d regs->cs=10 user_mode(regs)=0
The problem is that the kernel entry in 1: points to a user level
address. How can that be?
The reason is that with PEBS sampling the instruction that caused the event
to occur and the instruction where the CPU was when the interrupt was posted
may be far apart. And sometime during that time window, the privilege level may
change. This happens, for instance, when the PEBS sample is taken close to a
kernel entry point. Here PEBS, eventing IP (real_ip) captured a user level
instruction. But by the time the PMU interrupt fired, the processor had already
entered kernel space. This is why the debug output shows a user address with
user_mode() false.
The problem comes from PEBS not recording the code segment (cs) register.
The register is used in x86_64 to determine if executing in kernel vs user
space. This is okay because the kernel has a software workaround called
set_linear_ip(). But the issue in setup_pebs_sample_data() is that
set_linear_ip() is never called on the real_ip value when it is available
(Haswell and later) and precise_ip > 1.
This patch fixes this problem and eliminates the callchain discrepancy.
The patch restructures the code around set_linear_ip() to minimize the number
of times the IP has to be set.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1521788507-10231-1-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since the ORC unwinder was made the default on x86_64, Clang-built
defconfig kernels have triggered some new objtool warnings:
drivers/gpu/drm/i915/i915_gpu_error.o: warning: objtool: i915_error_printf()+0x6c: return with modified stack frame
drivers/gpu/drm/i915/intel_display.o: warning: objtool: pipe_config_err()+0xa6: return with modified stack frame
The problem is that objtool has never seen clang-built binaries before.
Shockingly enough, objtool is apparently able to follow the code flow
mostly fine, except for one instruction sequence. Instead of a LEAVE
instruction, clang restores RSP and RBP the long way:
67c: 48 89 ec mov %rbp,%rsp
67f: 5d pop %rbp
Teach objtool about this new code sequence.
Reported-and-test-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/fce88ce81c356eedcae7f00ed349cfaddb3363cc.1521741586.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When mlx5_core is loaded it is expected to sync ports
with all vxlan devices so it can support vxlan encap/decap.
This is done via udp_tunnel_get_rx_info(). Currently this
call is set in mlx5e_nic_enable() and if the netdev is not in
NETREG_REGISTERED state it will not be called.
Normally on load the netdev state is not NETREG_REGISTERED
so udp_tunnel_get_rx_info() will not be called.
Moving udp_tunnel_get_rx_info() to mlx5e_open() so
it will be called on netdev UP event and allow encap/decap.
Fixes: 610e89e05c ("net/mlx5e: Don't sync netdev state when not registered")
Signed-off-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently we use the global ipv6_stub var to access the ipv6 global
nd table. This practice gets us to troubles when the stub is only partially
set e.g when ipv6 is loaded under the disabled policy. In this case, as of commit
343d60aada ("ipv6: change ipv6_stub_impl.ipv6_dst_lookup to take net argument")
the stub is not null, but stub->nd_tbl is and we crash.
As we can access the ipv6 nd_tbl directly, the fix is just to avoid the
reference through the stub. There is one place in the code where we
issue ipv6 route lookup and keep doing it through the stub, but that
mentioned commit makes sure we get -EAFNOSUPPORT from the stack.
Fixes: 232c001398 ("net/mlx5e: Add support to neighbour update flow")
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
For NIC flows, the parsed attributes are not freed when we exit
successfully from mlx5e_configure_flower().
There is possible double free for eswitch flows. If error is returned
from rhashtable_insert_fast(), the parse attrs will be freed in
mlx5e_tc_del_flow(), but they will be freed again before exiting
mlx5e_configure_flower().
To fix both issues we do the following:
(1) change the condition that determines if to issue the free call to
check if this flow is NIC flow, or it does not have encap action.
(2) reorder the code such that that the check and free calls are done
before we attempt to add into the hash table.
Fixes: 232c001398 ('net/mlx5e: Add support to neighbour update flow')
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Increase representor netdev RQ size to avoid dropped packets.
The current size (two) is just too small to keep up with
conventional slow path traffic patterns.
Also match the SQ size to the RQ size.
Fixes: cb67b83292 ("net/mlx5e: Introduce SRIOV VF representors")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add check of coalescing parameters received through ethtool are within
range of values supported by the HW.
Driver gets the coalescing rx/tx-usecs and rx/tx-frames as set by the
users through ethtool. The ethtool support up to 32 bit value for each.
However, mlx5 modify cq limits the coalescing time parameter to 12 bit
and coalescing frames parameters to 16 bits.
Return out of range error if user tries to set these parameters to
higher values.
Fixes: f62b8bb8f2 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality')
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add dependancy for switchdev to be congfigured as any user-space control
plane SW is expected to use the HW switchdev ID to locate the representors
related to VFs of a certain PF and apply SW/offloaded switching on them.
Fixes: e80541ecab ('net/mlx5: Add CONFIG_MLX5_ESWITCH Kconfig')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
SQs are 32 and not 16 bits, hence it's wrong to use only 16 bits to
store the sq number for which are going to set steering rule, fix that.
Fixes: cb67b83292 ('net/mlx5e: Introduce SRIOV VF representors')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The vport admin original link state will be re-applied after returning
back to legacy mode, it is not right to change the admin link state value
when in switchdev mode.
Use direct vport commands to alter logical vport state in netdev
representor open/close flows rather than the administrative eswitch API.
Fixes: 20a1ea6747 ('net/mlx5e: Support VF vport link state control for SRIOV switchdev mode')
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
It's required to create a modules.alias via MODULE_DEVICE_TABLE helper
for the OF platform driver. Otherwise, module autoloading cannot work.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
MODULE_ALIAS exports information to allow the module to be auto-loaded at
boot for the drivers registered using legacy platform registration.
However, currently the driver is always used by DT-only platform,
MODULE_ALIAS is redundant and should be removed properly.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
We try to hold TX virtqueue mutex in vhost_net_rx_peek_head_len()
after RX virtqueue mutex is held in handle_rx(). This requires an
appropriate lock nesting notation to calm down deadlock detector.
Fixes: 0308813724 ("vhost_net: basic polling support")
Reported-by: syzbot+7f073540b1384a614e09@syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The same fix as in 'bonding: move dev_mc_sync after master_upper_dev_link
in bond_enslave' is needed for team driver.
The panic can be reproduced easily:
ip link add team1 type team
ip link set team1 up
ip link add link team1 vlan1 type vlan id 80
ip link set vlan1 master team1
Fixes: cb41c997d4 ("team: team should sync the port's uc/mc addrs when add a port")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long says:
====================
bonding: a bunch of fixes for dev hwaddr sync in bond_enslave
This patchset is mainly to fix a crash when adding vlan as slave of
bond which is also the parent link in patch 2/3, and also fix some
err process problems in bond_enslave in patch 1/3 and 3/3.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When dev_set_promiscuity(1) succeeds but dev_set_allmulti(1) fails,
dev_set_promiscuity(-1) should be done before going to the err path.
Otherwise, dev->promiscuity will leak.
Fixes: 7e1a1ac1fb ("bonding: Check return of dev_set_promiscuity/allmulti")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Beniamino found a crash when adding vlan as slave of bond which is also
the parent link:
ip link add bond1 type bond
ip link set bond1 up
ip link add link bond1 vlan1 type vlan id 80
ip link set vlan1 master bond1
The call trace is as below:
[<ffffffffa850842a>] queued_spin_lock_slowpath+0xb/0xf
[<ffffffffa8515680>] _raw_spin_lock+0x20/0x30
[<ffffffffa83f6f07>] dev_mc_sync+0x37/0x80
[<ffffffffc08687dc>] vlan_dev_set_rx_mode+0x1c/0x30 [8021q]
[<ffffffffa83efd2a>] __dev_set_rx_mode+0x5a/0xa0
[<ffffffffa83f7138>] dev_mc_sync_multiple+0x78/0x80
[<ffffffffc084127c>] bond_enslave+0x67c/0x1190 [bonding]
[<ffffffffa8401909>] do_setlink+0x9c9/0xe50
[<ffffffffa8403bf2>] rtnl_newlink+0x522/0x880
[<ffffffffa8403ff7>] rtnetlink_rcv_msg+0xa7/0x260
[<ffffffffa8424ecb>] netlink_rcv_skb+0xab/0xc0
[<ffffffffa83fe498>] rtnetlink_rcv+0x28/0x30
[<ffffffffa8424850>] netlink_unicast+0x170/0x210
[<ffffffffa8424bf8>] netlink_sendmsg+0x308/0x420
[<ffffffffa83cc396>] sock_sendmsg+0xb6/0xf0
This is actually a dead lock caused by sync slave hwaddr from master when
the master is the slave's 'slave'. This dead loop check is actually done
by netdev_master_upper_dev_link. However, Commit 1f718f0f4f ("bonding:
populate neighbour's private on enslave") moved it after dev_mc_sync.
This patch is to fix it by moving dev_mc_sync after master_upper_dev_link,
so that this loop check would be earlier than dev_mc_sync. It also moves
if (mode == BOND_MODE_8023AD) into if (!bond_uses_primary) clause as an
improvement.
Note team driver also has this issue, I will fix it in another patch.
Fixes: 1f718f0f4f ("bonding: populate neighbour's private on enslave")
Reported-by: Beniamino Galvani <bgalvani@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
vlan_vids_add_by_dev is called right after dev hwaddr sync, so on
the err path it should unsync dev hwaddr. Otherwise, the slave
dev's hwaddr will never be unsync when this err happens.
Fixes: 1ff412ad77 ("bonding: change the bond's vlan syncing functions with the standard ones")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the qdisc lock was dropped in pfifo_fast we allow multiple
enqueue threads and dequeue threads to run in parallel. On the
enqueue side the skb bit ooo_okay is used to ensure all related
skbs are enqueued in-order. On the dequeue side though there is
no similar logic. What we observe is with fewer queues than CPUs
it is possible to re-order packets when two instances of
__qdisc_run() are running in parallel. Each thread will dequeue
a skb and then whichever thread calls the ndo op first will
be sent on the wire. This doesn't typically happen because
qdisc_run() is usually triggered by the same core that did the
enqueue. However, drivers will trigger __netif_schedule()
when queues are transitioning from stopped to awake using the
netif_tx_wake_* APIs. When this happens netif_schedule() calls
qdisc_run() on the same CPU that did the netif_tx_wake_* which
is usually done in the interrupt completion context. This CPU
is selected with the irq affinity which is unrelated to the
enqueue operations.
To resolve this we add a RUNNING bit to the qdisc to ensure
only a single dequeue per qdisc is running. Enqueue and dequeue
operations can still run in parallel and also on multi queue
NICs we can still have a dequeue in-flight per qdisc, which
is typically per CPU.
Fixes: c5ad119fb6 ("net: sched: pfifo_fast use skb_array")
Reported-by: Jakob Unterwurzacher <jakob.unterwurzacher@theobroma-systems.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
BroadMobi BM806U is an Qualcomm MDM9225 based 3G/4G modem.
Tested hardware BM806U is mounted on D-Link DWR-921-C3 router.
The USB id is added to qmi_wwan.c to allow QMI communication with
the BM806U.
Tested on 4.14 kernel and OpenWRT.
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
and switch to https where possible.
All links have been eyeballed to verify that the domains have
not changed, etc.
Signed-off-by: Sanjeev Gupta <ghane0@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When trying to use the driver (e.g. aplay *.wav), the 4MiB DMA buffer
will get mmapp'ed in 16KiB chunks. But this fails with the 2nd 16KiB
area, as the page offset is outside of the VMA range (size), which is
currently used as size parameter in snd_pcm_lib_default_mmap(). By
using the DMA buffer size (dma_bytes) instead, the complete DMA buffer
can be mmapp'ed and the issue is fixed.
This issue was detected on an ARM platform (TI AM57xx) using the RME
HDSP MADI PCIe soundcard.
Fixes: 657b1989da ("ALSA: pcm - Use dma_mmap_coherent() if available")
Signed-off-by: Stefan Roese <sr@denx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Kalle Valo says:
====================
wireless-drivers fixes for 4.16
Some fixes for 4.16, only for iwlwifi and brcmfmac this time. All
pretty small.
iwlwifi
* fix an issue with the multicast queue
* fix IGTK handling
* fix some missing return value checks
* add support for a HW workaround for issues on some platforms
* a couple of fixes for channel-switch
* a few fixes for the aggregation handling code
brcmfmac
* drop Inter-Access Point Protocol packets by default
* fix check for ISO3166 regulatory code
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
KMSAN reports use of uninitialized memory in the case when |alen| is
smaller than sizeof(struct sockaddr_nl), and therefore |nladdr| isn't
fully copied from the userspace.
Signed-off-by: Alexander Potapenko <glider@google.com>
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Description:
EEE does not work with lan7800 when AutoSpeed is not set.
(This can happen when EEPROM is not populated or configured incorrectly)
Root-Cause:
When EEE is enabled, the mac config register ASD is not set
i.e. in default state, causing EEE fail.
Fix:
Set the register when eeprom is not present.
Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Raghuram Chary J <raghuramchary.jallipalli@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the SMC experimental TCP option in a SYN packet is lost on
the server side when SYN Cookies are active. However, the corresponding
SYNACK sent back to the client contains the SMC option. This causes an
inconsistent view of the SMC capabilities on the client and server.
This patch disables the SMC option in the SYNACK when SYN Cookies are
active to avoid this issue.
Fixes: 60e2a77807 ("tcp: TCP experimental option for SMC")
Signed-off-by: Hans Wippel <hwippel@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SLB bad address handler's trap number fixup does not preserve the
low bit that indicates nonvolatile GPRs have not been saved. This
leads save_nvgprs to skip saving them, and subsequent functions and
return from interrupt will think they are saved.
This causes kernel branch-to-garbage debugging to not have correct
registers, can also cause userspace to have its registers clobbered
after a segfault.
Fixes: f0f558b131 ("powerpc/mm: Preserve CFAR value on SLB miss caused by access to bogus address")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull dmaengine fix from Vinod Koul:
"One small fix for stm32-dmamux fixing buffer overflow"
* tag 'dmaengine-fix-4.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/slave-dma:
dmaengine: stm32-dmamux: fix a potential buffer overflow
Pull x86 and PTI fixes from Ingo Molnar:
"Misc fixes:
- fix EFI pagetables freeing
- fix vsyscall pagetable setting on Xen PV guests
- remove ancient CONFIG_X86_PPRO_FENCE=y - x86 is TSO again
- fix two binutils (ld) development version related incompatibilities
- clean up breakpoint handling
- fix an x86 self-test"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/entry/64: Don't use IST entry for #BP stack
x86/efi: Free efi_pgd with free_pages()
x86/vsyscall/64: Use proper accessor to update P4D entry
x86/cpu: Remove the CONFIG_X86_PPRO_FENCE=y quirk
x86/boot/64: Verify alignment of the LOAD segment
x86/build/64: Force the linker to use 2MB page size
selftests/x86/ptrace_syscall: Fix for yet more glibc interference
Pull scheduler fixes from Ingo Molnar:
"Two sched debug output related fixes: a console output fix and
formatting fixes"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/debug: Adjust newlines for better alignment
sched/debug: Fix per-task line continuation for console output
Pull locking fixes from Ingo Molnar:
"Two fixes: tighten up a jump-labels warning to not trigger on certain
modules and fix confusing (and non-existent) mutex API documentation"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
jump_label: Disable jump labels in __exit code
locking/mutex: Improve documentation
Pull mqueuefs revert from Eric Biederman:
"This fixes a regression that came in the merge window for v4.16.
The problem is that the permissions for mounting and using the
mqueuefs filesystem are broken. The necessary permission check is
missing letting people who should not be able to mount mqueuefs mount
mqueuefs. The field sb->s_user_ns is set incorrectly not allowing the
mounter of mqueuefs to remount and otherwise have proper control over
the filesystem.
Al Viro and I see the path to the necessary fixes differently and I am
not even certain at this point he actually sees all of the necessary
fixes. Given a couple weeks we can probably work something out but I
don't see the review being resolved in time for the final v4.16. I
don't want v4.16 shipping with a nasty regression. So unfortunately I
am sending a revert"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
Revert "mqueue: switch to on-demand creation of internal mount"
This reverts commit 36735a6a2b.
Aleksa Sarai <asarai@suse.de> writes:
> [REGRESSION v4.16-rc6] [PATCH] mqueue: forbid unprivileged user access to internal mount
>
> Felix reported weird behaviour on 4.16.0-rc6 with regards to mqueue[1],
> which was introduced by 36735a6a2b ("mqueue: switch to on-demand
> creation of internal mount").
>
> Basically, the reproducer boils down to being able to mount mqueue if
> you create a new user namespace, even if you don't unshare the IPC
> namespace.
>
> Previously this was not possible, and you would get an -EPERM. The mount
> is the *host* mqueue mount, which is being cached and just returned from
> mqueue_mount(). To be honest, I'm not sure if this is safe or not (or if
> it was intentional -- since I'm not familiar with mqueue).
>
> To me it looks like there is a missing permission check. I've included a
> patch below that I've compile-tested, and should block the above case.
> Can someone please tell me if I'm missing something? Is this actually
> safe?
>
> [1]: https://github.com/docker/docker/issues/36674
The issue is a lot deeper than a missing permission check. sb->s_user_ns
was is improperly set as well. So in addition to the filesystem being
mounted when it should not be mounted, so things are not allow that should
be.
We are practically to the release of 4.16 and there is no agreement between
Al Viro and myself on what the code should looks like to fix things properly.
So revert the code to what it was before so that we can take our time
and discuss this properly.
Fixes: 36735a6a2b ("mqueue: switch to on-demand creation of internal mount")
Reported-by: Felix Abecassis <fabecassis@nvidia.com>
Reported-by: Aleksa Sarai <asarai@suse.de>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree,
they are:
1) Don't pick fixed hash implementation for NFT_SET_EVAL sets, otherwise
userspace hits EOPNOTSUPP with valid rules using the meter statement,
from Florian Westphal.
2) If you send a batch that flushes the existing ruleset (that contains
a NAT chain) and the new ruleset definition comes with a new NAT
chain, don't bogusly hit EBUSY. Also from Florian.
3) Missing netlink policy attribute validation, from Florian.
4) Detach conntrack template from skbuff if IP_NODEFRAG is set on,
from Paolo Abeni.
5) Cache device names in flowtable object, otherwise we may end up
walking over devices going aways given no rtnl_lock is held.
6) Fix incorrect net_device ingress with ingress hooks.
7) Fix crash when trying to read more data than available in UDP
packets from the nf_socket infrastructure, from Subash.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
skb_header_pointer will copy data into a buffer if data is non linear,
otherwise it will return a pointer in the linear section of the data.
nf_sk_lookup_slow_v{4,6} always copies data of size udphdr but later
accesses memory within the size of tcphdr (th->doff) in case of TCP
packets. This causes a crash when running with KASAN with the following
call stack -
BUG: KASAN: stack-out-of-bounds in xt_socket_lookup_slow_v4+0x524/0x718
net/netfilter/xt_socket.c:178
Read of size 2 at addr ffffffe3d417a87c by task syz-executor/28971
CPU: 2 PID: 28971 Comm: syz-executor Tainted: G B W O 4.9.65+ #1
Call trace:
[<ffffff9467e8d390>] dump_backtrace+0x0/0x428 arch/arm64/kernel/traps.c:76
[<ffffff9467e8d7e0>] show_stack+0x28/0x38 arch/arm64/kernel/traps.c:226
[<ffffff946842d9b8>] __dump_stack lib/dump_stack.c:15 [inline]
[<ffffff946842d9b8>] dump_stack+0xd4/0x124 lib/dump_stack.c:51
[<ffffff946811d4b0>] print_address_description+0x68/0x258 mm/kasan/report.c:248
[<ffffff946811d8c8>] kasan_report_error mm/kasan/report.c:347 [inline]
[<ffffff946811d8c8>] kasan_report.part.2+0x228/0x2f0 mm/kasan/report.c:371
[<ffffff946811df44>] kasan_report+0x5c/0x70 mm/kasan/report.c:372
[<ffffff946811bebc>] check_memory_region_inline mm/kasan/kasan.c:308 [inline]
[<ffffff946811bebc>] __asan_load2+0x84/0x98 mm/kasan/kasan.c:739
[<ffffff94694d6f04>] __tcp_hdrlen include/linux/tcp.h:35 [inline]
[<ffffff94694d6f04>] xt_socket_lookup_slow_v4+0x524/0x718 net/netfilter/xt_socket.c:178
Fix this by copying data into appropriate size headers based on protocol.
Fixes: a583636a83 ("inet: refactor inet[6]_lookup functions to take skb")
Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
NFP program allocation length is in bytes and NFP program length
is in instructions, fix the comparison of the two.
Fixes: 9314c442d7 ("nfp: bpf: move translation prepare to offload.c")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Pull pin control fixes from Linus Walleij:
"Two fixes for pin control for v4.16:
- Renesas SH-PFC: remove a duplicate clkout pin which was causing
crashes
- fix Samsung out of bounds exceptions"
* tag 'pinctrl-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: samsung: Validate alias coming from DT
pinctrl: sh-pfc: r8a7795: remove duplicate of CLKOUT pin in pinmux_pins[]
Send nm complaints about broken pipe (when sed exits early) to /dev/null.
All errors should be printed to stderr.
Don't trap on normal exit so the trap can return an error code.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Define vdso_start, vdso_end as array to avoid compile-time analysis error
for the case of built with CONFIG_FORTIFY_SOURCE.
and, since vdso_start, vdso_end are used in vdso.c only,
move extern-declaration from vdso.h to vdso.c.
If kernel is built with CONFIG_FORTIFY_SOURCE,
compile-time error happens at this code.
- if (memcmp(&vdso_start, "177ELF", 4))
The size of "&vdso_start" is recognized as 1 byte, but n is 4,
So that compile-time error is reported.
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jinbum Park <jinb.park7@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Without CONFIG_MMU, this results in a build failure:
./arch/arm/include/asm/memory.h:92:23: error: initializer element is not constant
#define VECTORS_BASE vectors_base
arch/arm/mm/dump.c:32:4: note: in expansion of macro 'VECTORS_BASE'
{ VECTORS_BASE, "Vectors" },
arch/arm/mm/dump.c:71:11: error: 'L_PTE_USER' undeclared here (not in a function); did you mean 'VTIME_USER'?
.mask = L_PTE_USER,
^~~~~~~~~~
Obviously the feature only makes sense with an MMU, so let's add the
dependency here.
Fixes: a8e53c151f ("ARM: 8737/1: mm: dump: add checking for writable and executable")
Acked-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Commit 384b38b669 ("ARM: 7873/1: vfp: clear vfp_current_hw_state
for dying cpu") fixed the cpu dying notifier by clearing
vfp_current_hw_state[]. However commit e5b61bafe7 ("arm: Convert VFP
hotplug notifiers to state machine") incorrectly used the original
vfp_force_reload() function in the cpu dying notifier.
Fix it by going back to clearing vfp_current_hw_state[].
Fixes: e5b61bafe7 ("arm: Convert VFP hotplug notifiers to state machine")
Cc: linux-stable <stable@vger.kernel.org>
Reported-by: Kohji Okuno <okuno.kohji@jp.panasonic.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
DHCP connectivity issues can currently occur if the following conditions
are met:
1) A DHCP packet from a client to a server
2) This packet has a multicast destination
3) This destination has a matching entry in the translation table
(FF:FF:FF:FF:FF:FF for IPv4, 33:33:00:01:00:02/33:33:00:01:00:03
for IPv6)
4) The orig-node determined by TT for the multicast destination
does not match the orig-node determined by best-gateway-selection
In this case the DHCP packet will be dropped.
The "gateway-out-of-range" check is supposed to only be applied to
unicasted DHCP packets to a specific DHCP server.
In that case dropping the the unicasted frame forces the client to
retry via a broadcasted one, but now directed to the new best
gateway.
A DHCP packet with broadcast/multicast destination is already ensured to
always be delivered to the best gateway. Dropping a multicasted
DHCP packet here will only prevent completing DHCP as there is no
other fallback.
So far, it seems the unicast check was implicitly performed by
expecting the batadv_transtable_search() to return NULL for multicast
destinations. However, a multicast address could have always ended up in
the translation table and in fact is now common.
To fix this potential loss of a DHCP client-to-server packet to a
multicast address this patch adds an explicit multicast destination
check to reliably bail out of the gateway-out-of-range check for such
destinations.
The issue and fix were tested in the following three node setup:
- Line topology, A-B-C
- A: gateway client, DHCP client
- B: gateway server, hop-penalty increased: 30->60, DHCP server
- C: gateway server, code modifications to announce FF:FF:FF:FF:FF:FF
Without this patch, A would never transmit its DHCP Discover packet
due to an always "out-of-range" condition. With this patch,
a full DHCP handshake between A and B was possible again.
Fixes: be7af5cf9c ("batman-adv: refactoring gateway handling code")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
For multicast frames AP isolation is only supposed to be checked on
the receiving nodes and never on the originating one.
Furthermore, the isolation or wifi flag bits should only be intepreted
as such for unicast and never multicast TT entries.
By injecting flags to the multicast TT entry claimed by a single
target node it was verified in tests that this multicast address
becomes unreachable, leading to packet loss.
Omitting the "src" parameter to the batadv_transtable_search() call
successfully skipped the AP isolation check and made the target
reachable again.
Fixes: 1d8ab8d3c1 ("batman-adv: Modified forwarding behaviour for multicast packets")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Pull kprobe fixes from Steven Rostedt:
"The documentation for kprobe events says that symbol offets can take
both a + and - sign to get to befor and after the symbol address.
But in actuality, the code does not support the minus. This fixes that
issue, and adds a few more selftests to kprobe events"
* tag 'trace-v4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
selftests: ftrace: Add a testcase for probepoint
selftests: ftrace: Add a testcase for string type with kprobe_event
selftests: ftrace: Add probe event argument syntax testcase
tracing: probeevent: Fix to support minus offset from symbol
There's nothing IST-worthy about #BP/int3. We don't allow kprobes
in the small handful of places in the kernel that run at CPL0 with
an invalid stack, and 32-bit kernels have used normal interrupt
gates for #BP forever.
Furthermore, we don't allow kprobes in places that have usergs while
in kernel mode, so "paranoid" is also unnecessary.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Deallocate SDMA queues during abnormal process termination and when
queue creation fails after the SDMA allocation.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Pull MIPS fixes from James Hogan:
"Another miscellaneous pile of MIPS fixes for 4.16:
- lantiq: fixes for clocks and Amazon SE (4.14)
- ralink: fix booting on MT7621 (4.5)
- ralink: fix halt (3.9)"
* tag 'mips_fixes_4.16_5' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
MIPS: ralink: Fix booting on MT7621
MIPS: ralink: Remove ralink_halt()
MIPS: lantiq: ase: Enable MFD_SYSCON
MIPS: lantiq: Enable AHB Bus for USB
MIPS: lantiq: Fix Danube USB clock
Pull VFIO fix from Alex Williamson:
"Revert masking INTx where it cannot be enabled - it plays poorly with
SR-IOV VFs and presumes DisINTx support"
* tag 'vfio-v4.16-rc7' of git://github.com/awilliam/linux-vfio:
Revert: "vfio-pci: Mask INTx if a device is not capabable of enabling it"
Pull MTD fixes from Boris Brezillon:
- Fix several problems in the fsl_ifc NAND controller driver
- Fix misuse of mtd_ooblayout_ecc() in mtdchar.c
* tag 'mtd/fixes-for-4.16-rc7' of git://git.infradead.org/linux-mtd:
mtd: nand: fsl_ifc: Read ECCSTAT0 and ECCSTAT1 registers for IFC 2.0
mtd: nand: fsl_ifc: Fix eccstat array overflow for IFC ver >= 2.0.0
mtd: nand: fsl_ifc: Fix nand waitfunc return value
mtdchar: fix usage of mtd_ooblayout_ecc()
Pull staging/IIO fixes from Greg KH:
"Here are a few small staging and IIO fixes for various reported
issues.
All of them are tiny, the majority being iio driver fixes for small
issues, and one staging driver fix for a memory corruption issue.
All have been in linux-next with no reported issues"
* tag 'staging-4.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: ncpfs: memory corruption in ncp_read_kernel()
iio: st_pressure: st_accel: pass correct platform data to init
Revert "iio: accel: st_accel: remove redundant pointer pdata"
iio: adc: meson-saradc: unlock on error in meson_sar_adc_lock()
dt-bindings: iio: adc: sd-modulator: fix io-channel-cells
iio: adc: stm32-dfsdm: fix multiple channel initialization
iio: adc: stm32-dfsdm: fix clock source selection
iio: adc: stm32-dfsdm: fix call to stop channel
iio: adc: stm32-dfsdm: fix compatible data use
iio: chemical: ccs811: Corrected firmware boot/application mode transition
Pull hyperv fix from Greg KH:
"This is a single hyperv bugfix for 4.16-rc7.
It resolves an issue with the ring-buffer signaling to resolve
reported problems.
It's been in linux-next for a while now with no reported issues"
* tag 'char-misc-4.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Drivers: hv: vmbus: Fix ring buffer signaling
Pull media fixes from Mauro Carvalho Chehab:
"Three fixes:
- dvb: fix a Kconfig typo on a help text
- tegra-cec: reset rx_buf_cnt when start bit detected
- rc: lirc does not use LIRC_CAN_SEND_SCANCODE feature"
* tag 'media/v4.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: dvb: fix a Kconfig typo
media: tegra-cec: reset rx_buf_cnt when start bit detected
media: rc: lirc does not use LIRC_CAN_SEND_SCANCODE feature
Segment registers must be synchronized prior to any code that may
trigger a call to emulation_required()/guest_state_valid(), e.g.
vmx_set_cr0(). Because preparing vmcs02 writes segmentation fields
directly, i.e. doesn't use vmx_set_segment(), emulation_required
will not be re-evaluated when synchronizing the segment registers,
which can result in L0 incorrectly starting emulation of L2.
Fixes: 8665c3f973 ("KVM: nVMX: initialize descriptor cache fields in prepare_vmcs02_full")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
[Move all of prepare_vmcs02_full earlier, not just segment registers. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull sound fixes from Takashi Iwai:
"Things look calming down, but people were still busy to plaster over
small holes:
- Two fixes to harden against races in aloop driver
- A correction of a long-standing bug in USB-audio UAC2 processing
unit parser
- As usual suspects, HD-audio: a workaround for Coffee Lake
controller and a few other device-specific fixes
All small and for stable"
* tag 'sound-4.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: aloop: Fix access to not-yet-ready substream via cable
ALSA: aloop: Sync stale timer before release
ALSA: hda/realtek - Fix speaker no sound after system resume
ALSA: hda/realtek - Fix Dell headset Mic can't record
ALSA: hda - Force polling mode on CFL for fixing codec communication
ALSA: usb-audio: Fix parsing descriptor of UAC2 processing unit
ALSA: hda/realtek - Always immediately update mute LED with pin VREF
Ido Schimmel says:
====================
mlxsw: Handle changes to MTU in GRE tunnels
Petr says:
When offloading GRE tunnels, the MTU setting is kept fixed after the
initial offload even as the slow-path configuration changed. Worse: the
offloaded MTU setting is actually just a transient value set at the time
of NETDEV_REGISTER of the tunnel. As of commit ffc2b6ee41 ("ip_gre:
fix IFLA_MTU ignored on NEWLINK"), that transient value is zero, and
unless there's e.g. a VRF migration that prompts re-offload, it stays at
zero, and all GRE packets end up trapping.
Thus, in patch #1, change the way the MTU is changed post-registration,
so that the full event protocol is observed. That way the drivers get to
see the change and have a chance to react.
In the remaining two patches, implement support for MTU change in mlxsw
driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Update MTU of overlay loopback in accordance with the setting on the
tunnel netdevice.
Fixes: 0063587d35 ("mlxsw: spectrum: Support decap-only IP-in-IP tunnels")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the function so that it can be called without forward declaration
from a function that will be added in a follow-up patch.
Fixes: 0063587d35 ("mlxsw: spectrum: Support decap-only IP-in-IP tunnels")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For tunnels created with IFLA_MTU, MTU of the netdevice is set by
rtnl_create_link() (called from rtnl_newlink()) before the device is
registered. However without IFLA_MTU that's not done.
rtnl_newlink() proceeds by calling struct rtnl_link_ops.newlink, which
via ip_tunnel_newlink() calls register_netdevice(), and that emits
NETDEV_REGISTER. Thus any listeners that inspect the netdevice get the
MTU of 0.
After ip_tunnel_newlink() corrects the MTU after registering the
netdevice, but since there's no event, the listeners don't get to know
about the MTU until something else happens--such as a NETDEV_UP event.
That's not ideal.
So instead of setting the MTU directly, go through dev_set_mtu(), which
takes care of distributing the necessary NETDEV_PRECHANGEMTU and
NETDEV_CHANGEMTU events.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On POWER9, under some circumstances, a broadcast TLB invalidation
might complete before all previous stores have drained, potentially
allowing stale stores from becoming visible after the invalidation.
This works around it by doubling up those TLB invalidations which was
verified by HW to be sufficient to close the risk window.
This will be documented in a yet-to-be-published errata.
Fixes: 1a472c9dba ("powerpc/mm/radix: Add tlbflush routines")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Enable the feature in the DT CPU features code for all Power9,
rename the feature to CPU_FTR_P9_TLBIE_BUG per benh.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
A recent commit introduced a new struct xfrm_trans_cb
that is used with the sk_buff control buffer. Unfortunately
it placed the structure in front of the control buffer and
overlooked that the IPv4/IPv6 control buffer is still needed
for some layer 4 protocols. As a result the IPv4/IPv6 control
buffer is overwritten with this structure. Fix this by setting
a apropriate header in front of the structure.
Fixes acf568ee85 ("xfrm: Reinject transport-mode packets ...")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
On POWER9 the Nest MMU may fail to invalidate some translations when
doing a tlbie "by PID" or "by LPID" that is targeted at the TLB only
and not the page walk cache.
This works around it by forcing such invalidations to escalate to
RIC=2 (full invalidation of TLB *and* PWC) when a coprocessor is in
use for the context.
Fixes: 03b8abedf4 ("cxl: Enable global TLBIs for cxl contexts")
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[balbirs: fixed spelling and coding style to quiesce checkpatch.pl]
Tested-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently, when using coprocessors (which use the Nest MMU), we
simply increment the active_cpu count to force all TLB invalidations
to be come broadcast.
Unfortunately, due to an errata in POWER9, we will need to know
more specifically that coprocessors are in use.
This maintains a separate copros counter in the MMU context for
that purpose.
NB. The commit mentioned in the fixes tag below is not at fault for
the bug we're fixing in this commit and the next, but this fix applies
on top the infrastructure it introduced.
Fixes: 03b8abedf4 ("cxl: Enable global TLBIs for cxl contexts")
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Since commit 6964e6a4e4 ("KVM: PPC: Book3S HV: Do SLB load/unload
with guest LPCR value loaded", 2018-01-11), we have been seeing
occasional machine check interrupts on POWER8 systems when running
KVM guests, due to SLB multihit errors.
This turns out to be due to the guest exit code reloading the host
SLB entries from the SLB shadow buffer when the SLB was not previously
cleared in the guest entry path. This can happen because the path
which skips from the guest entry code to the guest exit code without
entering the guest now does the skip before the SLB is cleared and
loaded with guest values, but the host values are loaded after the
point in the guest exit path that we skip to.
To fix this, we move the code that reloads the host SLB values up
so that it occurs just before the point in the guest exit code (the
label guest_bypass:) where we skip to from the guest entry path.
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Fixes: 6964e6a4e4 ("KVM: PPC: Book3S HV: Do SLB load/unload with guest LPCR value loaded")
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Merge misc fixes from Andrew Morton:
"13 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm, thp: do not cause memcg oom for thp
mm/vmscan: wake up flushers for legacy cgroups too
Revert "mm: page_alloc: skip over regions of invalid pfns where possible"
mm/shmem: do not wait for lock_page() in shmem_unused_huge_shrink()
mm/thp: do not wait for lock_page() in deferred_split_scan()
mm/khugepaged.c: convert VM_BUG_ON() to collapse fail
x86/mm: implement free pmd/pte page interfaces
mm/vmalloc: add interfaces to free unmapped page table
h8300: remove extraneous __BIG_ENDIAN definition
hugetlbfs: check for pgoff value overflow
lockdep: fix fs_reclaim warning
MAINTAINERS: update Mark Fasheh's e-mail
mm/mempolicy.c: avoid use uninitialized preferred_node
Pull libnvdimm fixes from Dan Williams:
"Two regression fixes, two bug fixes for older issues, two fixes for
new functionality added this cycle that have userspace ABI concerns,
and a small cleanup. These have appeared in a linux-next release and
have a build success report from the 0day robot.
* The 4.16 rework of altmap handling led to some configurations
leaking page table allocations due to freeing from the altmap
reservation rather than the page allocator.
The impact without the fix is leaked memory and a WARN() message
when tearing down libnvdimm namespaces. The rework also missed a
place where error handling code needed to be removed that can lead
to a crash if devm_memremap_pages() fails.
* acpi_map_pxm_to_node() had a latent bug whereby it could
misidentify the closest online node to a given proximity domain.
* Block integrity handling was reworked several kernels back to allow
calling add_disk() after setting up the integrity profile.
The nd_btt and nd_blk drivers are just now catching up to fix
automatic partition detection at driver load time.
* The new peristence_domain attribute, a platform indicator of
whether cpu caches are powerfail protected for example, is meant to
be a single value enum and not a set of flags.
This oversight was caught while reviewing new userspace code in
libndctl to communicate the attribute.
Fix this new enabling up so that we are not stuck with an unwanted
userspace ABI"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
libnvdimm, nfit: fix persistence domain reporting
libnvdimm, region: hide persistence_domain when unknown
acpi, numa: fix pxm to online numa node associations
x86, memremap: fix altmap accounting at free
libnvdimm: remove redundant assignment to pointer 'dev'
libnvdimm, {btt, blk}: do integrity setup before add_disk()
kernel/memremap: Remove stale devres_free() call
Pull drm fixes from Dave Airlie:
"A bunch of fixes all over the place (core, i915, amdgpu, imx, sun4i,
ast, tegra, vmwgfx), nothing too serious or worrying at this stage.
- one uapi fix to stop multi-planar images with getfb
- Sun4i error path and clock fixes
- udl driver mmap offset fix
- i915 DP MST and GPU reset fixes
- vmwgfx mutex and black screen fixes
- imx array underflow fix and vblank fix
- amdgpu: display fixes
- exynos devicetree fix
- ast mode fix"
* tag 'drm-fixes-for-v4.16-rc7' of git://people.freedesktop.org/~airlied/linux: (29 commits)
drm/ast: Fixed 1280x800 Display Issue
drm: udl: Properly check framebuffer mmap offsets
drm/i915: Specify which engines to reset following semaphore/event lockups
drm/vmwgfx: Fix a destoy-while-held mutex problem.
drm/vmwgfx: Fix black screen and device errors when running without fbdev
drm: Reject getfb for multi-plane framebuffers
drm/amd/display: Add one to EDID's audio channel count when passing to DC
drm/amd/display: We shouldn't set format_default on plane as atomic driver
drm/amd/display: Fix FMT truncation programming
drm/amd/display: Allow truncation to 10 bits
drm/sun4i: hdmi: Fix another error handling path in 'sun4i_hdmi_bind()'
drm/sun4i: hdmi: Fix an error handling path in 'sun4i_hdmi_bind()'
drm/i915/dp: Write to SET_POWER dpcd to enable MST hub.
drm/amd/display: fix dereferencing possible ERR_PTR()
drm/amd/display: Refine disable VGA
drm/tegra: Shutdown on driver unbind
drm/tegra: dsi: Don't disable regulator on ->exit()
drm/tegra: dc: Detach IOMMU group from domain only once
dt-bindings: exynos: Document #sound-dai-cells property of the HDMI node
drm/imx: move arming of the vblank event to atomic_flush
...
Commit 2516035499 ("mm, thp: remove __GFP_NORETRY from khugepaged and
madvised allocations") changed the page allocator to no longer detect
thp allocations based on __GFP_NORETRY.
It did not, however, modify the mem cgroup try_charge() path to avoid
oom kill for either khugepaged collapsing or thp faulting. It is never
expected to oom kill a process to allocate a hugepage for thp; reclaim
is governed by the thp defrag mode and MADV_HUGEPAGE, but allocations
(and charging) should fallback instead of oom killing processes.
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1803191409420.124411@chino.kir.corp.google.com
Fixes: 2516035499 ("mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations")
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 726d061fbd ("mm: vmscan: kick flushers when we encounter dirty
pages on the LRU") added flusher invocation to shrink_inactive_list()
when many dirty pages on the LRU are encountered.
However, shrink_inactive_list() doesn't wake up flushers for legacy
cgroup reclaim, so the next commit bbef938429 ("mm: vmscan: remove old
flusher wakeup from direct reclaim path") removed the only source of
flusher's wake up in legacy mem cgroup reclaim path.
This leads to premature OOM if there is too many dirty pages in cgroup:
# mkdir /sys/fs/cgroup/memory/test
# echo $$ > /sys/fs/cgroup/memory/test/tasks
# echo 50M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
# dd if=/dev/zero of=tmp_file bs=1M count=100
Killed
dd invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0
Call Trace:
dump_stack+0x46/0x65
dump_header+0x6b/0x2ac
oom_kill_process+0x21c/0x4a0
out_of_memory+0x2a5/0x4b0
mem_cgroup_out_of_memory+0x3b/0x60
mem_cgroup_oom_synchronize+0x2ed/0x330
pagefault_out_of_memory+0x24/0x54
__do_page_fault+0x521/0x540
page_fault+0x45/0x50
Task in /test killed as a result of limit of /test
memory: usage 51200kB, limit 51200kB, failcnt 73
memory+swap: usage 51200kB, limit 9007199254740988kB, failcnt 0
kmem: usage 296kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /test: cache:49632KB rss:1056KB rss_huge:0KB shmem:0KB
mapped_file:0KB dirty:49500KB writeback:0KB swap:0KB inactive_anon:0KB
active_anon:1168KB inactive_file:24760KB active_file:24960KB unevictable:0KB
Memory cgroup out of memory: Kill process 3861 (bash) score 88 or sacrifice child
Killed process 3876 (dd) total-vm:8484kB, anon-rss:1052kB, file-rss:1720kB, shmem-rss:0kB
oom_reaper: reaped process 3876 (dd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Wake up flushers in legacy cgroup reclaim too.
Link: http://lkml.kernel.org/r/20180315164553.17856-1-aryabinin@virtuozzo.com
Fixes: bbef938429 ("mm: vmscan: remove old flusher wakeup from direct reclaim path")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Tested-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
khugepaged is not yet able to convert PTE-mapped huge pages back to PMD
mapped. We do not collapse such pages. See check
khugepaged_scan_pmd().
But if between khugepaged_scan_pmd() and __collapse_huge_page_isolate()
somebody managed to instantiate THP in the range and then split the PMD
back to PTEs we would have a problem --
VM_BUG_ON_PAGE(PageCompound(page)) will get triggered.
It's possible since we drop mmap_sem during collapse to re-take for
write.
Replace the VM_BUG_ON() with graceful collapse fail.
Link: http://lkml.kernel.org/r/20180315152353.27989-1-kirill.shutemov@linux.intel.com
Fixes: b1caa957ae ("khugepaged: ignore pmd tables with THP mapped with ptes")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may
create pud/pmd mappings. A kernel panic was observed on arm64 systems
with Cortex-A75 in the following steps as described by Hanjun Guo.
1. ioremap a 4K size, valid page table will build,
2. iounmap it, pte0 will set to 0;
3. ioremap the same address with 2M size, pgd/pmd is unchanged,
then set the a new value for pmd;
4. pte0 is leaked;
5. CPU may meet exception because the old pmd is still in TLB,
which will lead to kernel panic.
This panic is not reproducible on x86. INVLPG, called from iounmap,
purges all levels of entries associated with purged address on x86. x86
still has memory leak.
The patch changes the ioremap path to free unmapped page table(s) since
doing so in the unmap path has the following issues:
- The iounmap() path is shared with vunmap(). Since vmap() only
supports pte mappings, making vunmap() to free a pte page is an
overhead for regular vmap users as they do not need a pte page freed
up.
- Checking if all entries in a pte page are cleared in the unmap path
is racy, and serializing this check is expensive.
- The unmap path calls free_vmap_area_noflush() to do lazy TLB purges.
Clearing a pud/pmd entry before the lazy TLB purges needs extra TLB
purge.
Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which
clear a given pud/pmd entry and free up a page for the lower level
entries.
This patch implements their stub functions on x86 and arm64, which work
as workaround.
[akpm@linux-foundation.org: fix typo in pmd_free_pte_page() stub]
Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com
Fixes: e61ce6ade4 ("mm: change ioremap to set up huge I/O mappings")
Reported-by: Lei Li <lious.lilei@hisilicon.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Wang Xuefeng <wxf.wang@hisilicon.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Chintan Pandya <cpandya@codeaurora.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A bugfix I did earlier caused a build regression on h8300, which defines
the __BIG_ENDIAN macro in a slightly different way than the generic
code:
arch/h8300/include/asm/byteorder.h:5:0: warning: "__BIG_ENDIAN" redefined
We don't need to define it here, as the same macro is already provided
by the linux/byteorder/big_endian.h, and that version does not conflict.
While this is a v4.16 regression, my earlier patch also got backported
to the 4.14 and 4.15 stable kernels, so we need the fixup there as well.
Link: http://lkml.kernel.org/r/20180313120752.2645129-1-arnd@arndb.de
Fixes: 101110f627 ("Kbuild: always define endianess in kconfig.h")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A vma with vm_pgoff large enough to overflow a loff_t type when
converted to a byte offset can be passed via the remap_file_pages system
call. The hugetlbfs mmap routine uses the byte offset to calculate
reservations and file size.
A sequence such as:
mmap(0x20a00000, 0x600000, 0, 0x66033, -1, 0);
remap_file_pages(0x20a00000, 0x600000, 0, 0x20000000000000, 0);
will result in the following when task exits/file closed,
kernel BUG at mm/hugetlb.c:749!
Call Trace:
hugetlbfs_evict_inode+0x2f/0x40
evict+0xcb/0x190
__dentry_kill+0xcb/0x150
__fput+0x164/0x1e0
task_work_run+0x84/0xa0
exit_to_usermode_loop+0x7d/0x80
do_syscall_64+0x18b/0x190
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
The overflowed pgoff value causes hugetlbfs to try to set up a mapping
with a negative range (end < start) that leaves invalid state which
causes the BUG.
The previous overflow fix to this code was incomplete and did not take
the remap_file_pages system call into account.
[mike.kravetz@oracle.com: v3]
Link: http://lkml.kernel.org/r/20180309002726.7248-1-mike.kravetz@oracle.com
[akpm@linux-foundation.org: include mmdebug.h]
[akpm@linux-foundation.org: fix -ve left shift count on sh]
Link: http://lkml.kernel.org/r/20180308210502.15952-1-mike.kravetz@oracle.com
Fixes: 045c7a3f53 ("hugetlbfs: fix offset overflow in hugetlbfs mmap")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Nic Losby <blurbdust@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Jones reported fs_reclaim lockdep warnings.
============================================
WARNING: possible recursive locking detected
4.15.0-rc9-backup-debug+ #1 Not tainted
--------------------------------------------
sshd/24800 is trying to acquire lock:
(fs_reclaim){+.+.}, at: [<0000000084f438c2>] fs_reclaim_acquire.part.102+0x5/0x30
but task is already holding lock:
(fs_reclaim){+.+.}, at: [<0000000084f438c2>] fs_reclaim_acquire.part.102+0x5/0x30
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(fs_reclaim);
lock(fs_reclaim);
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by sshd/24800:
#0: (sk_lock-AF_INET6){+.+.}, at: [<000000001a069652>] tcp_sendmsg+0x19/0x40
#1: (fs_reclaim){+.+.}, at: [<0000000084f438c2>] fs_reclaim_acquire.part.102+0x5/0x30
stack backtrace:
CPU: 3 PID: 24800 Comm: sshd Not tainted 4.15.0-rc9-backup-debug+ #1
Call Trace:
dump_stack+0xbc/0x13f
__lock_acquire+0xa09/0x2040
lock_acquire+0x12e/0x350
fs_reclaim_acquire.part.102+0x29/0x30
kmem_cache_alloc+0x3d/0x2c0
alloc_extent_state+0xa7/0x410
__clear_extent_bit+0x3ea/0x570
try_release_extent_mapping+0x21a/0x260
__btrfs_releasepage+0xb0/0x1c0
btrfs_releasepage+0x161/0x170
try_to_release_page+0x162/0x1c0
shrink_page_list+0x1d5a/0x2fb0
shrink_inactive_list+0x451/0x940
shrink_node_memcg.constprop.88+0x4c9/0x5e0
shrink_node+0x12d/0x260
try_to_free_pages+0x418/0xaf0
__alloc_pages_slowpath+0x976/0x1790
__alloc_pages_nodemask+0x52c/0x5c0
new_slab+0x374/0x3f0
___slab_alloc.constprop.81+0x47e/0x5a0
__slab_alloc.constprop.80+0x32/0x60
__kmalloc_track_caller+0x267/0x310
__kmalloc_reserve.isra.40+0x29/0x80
__alloc_skb+0xee/0x390
sk_stream_alloc_skb+0xb8/0x340
tcp_sendmsg_locked+0x8e6/0x1d30
tcp_sendmsg+0x27/0x40
inet_sendmsg+0xd0/0x310
sock_write_iter+0x17a/0x240
__vfs_write+0x2ab/0x380
vfs_write+0xfb/0x260
SyS_write+0xb6/0x140
do_syscall_64+0x1e5/0xc05
entry_SYSCALL64_slow_path+0x25/0x25
This warning is caused by commit d92a8cfcb3 ("locking/lockdep:
Rework FS_RECLAIM annotation") which replaced the use of
lockdep_{set,clear}_current_reclaim_state() in __perform_reclaim()
and lockdep_trace_alloc() in slab_pre_alloc_hook() with
fs_reclaim_acquire()/ fs_reclaim_release().
Since __kmalloc_reserve() from __alloc_skb() adds __GFP_NOMEMALLOC |
__GFP_NOWARN to gfp_mask, and all reclaim path simply propagates
__GFP_NOMEMALLOC, fs_reclaim_acquire() in slab_pre_alloc_hook() is
trying to grab the 'fake' lock again when __perform_reclaim() already
grabbed the 'fake' lock.
The
/* this guy won't enter reclaim */
if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
return false;
test which causes slab_pre_alloc_hook() to try to grab the 'fake' lock
was added by commit cf40bd16fd ("lockdep: annotate reclaim context
(__GFP_NOFS)"). But that test is outdated because PF_MEMALLOC thread
won't enter reclaim regardless of __GFP_NOMEMALLOC after commit
341ce06f69 ("page allocator: calculate the alloc_flags for allocation
only once") added the PF_MEMALLOC safeguard (
/* Avoid recursion of direct reclaim */
if (p->flags & PF_MEMALLOC)
goto nopage;
in __alloc_pages_slowpath()).
Thus, let's fix outdated test by removing __GFP_NOMEMALLOC test and
allow __need_fs_reclaim() to return false.
Link: http://lkml.kernel.org/r/201802280650.FJC73911.FOSOMLJVFFQtHO@I-love.SAKURA.ne.jp
Fixes: d92a8cfcb3 ("locking/lockdep: Rework FS_RECLAIM annotation")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Tested-by: Dave Jones <davej@codemonkey.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Nikolay Borisov <nborisov@suse.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The original ast driver cannot display properly if the resolution is 1280x800 and the pixel clock is 83.5MHz.
Here is the update to fix it.
Signed-off-by: Y.C. Chen <yc_chen@aspeedtech.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Pull ACPI fixes from Rafael Wysocki:
"These revert one recent commit that added incorrect battery quirks for
some Asus systems and fix an off-by-one error in the watchdog driver
based on the ACPI WDAT table.
Specifics:
- Revert the recent change adding battery quirks for Asus GL502VSK
and UX305LA as these quirks turn out to be inadequate and possibly
premature (Daniel Drake).
- Fix an off-by-one error in the resource allocation part of the
watchdog driver based on the ACPI WDAT table (Takashi Iwai)"
* tag 'acpi-4.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / watchdog: Fix off-by-one error at resource assignment
Revert "ACPI / battery: Add quirk for Asus GL502VSK and UX305LA"
Pull modules fix from Jessica Yu:
"Propagate error in modules_open() to avoid possible later NULL
dereference if seq_open() had failed"
* tag 'modules-for-v4.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
module: propagate error in modules_open()
force_external_irq_replay() can be called in the do_IRQ path with
interrupts hard enabled and soft disabled if may_hard_irq_enable() set
MSR[EE]=1. It updates local_paca->irq_happened with a load, modify,
store sequence. If a maskable interrupt hits during this sequence, it
will go to the masked handler to be marked pending in irq_happened.
This update will be lost when the interrupt returns and the store
instruction executes. This can result in unpredictable latencies,
timeouts, lockups, etc.
Fix this by ensuring hard interrupts are disabled before modifying
irq_happened.
This could cause any maskable asynchronous interrupt to get lost, but
it was noticed on P9 SMP system doing RDMA NVMe target over 100GbE,
so very high external interrupt rate and high IPI rate. The hang was
bisected down to enabling doorbell interrupts for IPIs. These provided
an interrupt type that could run at high rates in the do_IRQ path,
stressing the race.
Fixes: 1d607bb3bd ("powerpc/irq: Add mechanism to force a replay of interrupts")
Cc: stable@vger.kernel.org # v4.8+
Reported-by: Carol L. Soto <clsoto@us.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull networking fixes from David Miller:
1) Always validate XFRM esn replay attribute, from Florian Westphal.
2) Fix RCU read lock imbalance in xfrm_get_tos(), from Xin Long.
3) Don't try to get firmware dump if not loaded in iwlwifi, from Shaul
Triebitz.
4) Fix BPF helpers to deal with SCTP GSO SKBs properly, from Daniel
Axtens.
5) Fix some interrupt handling issues in e1000e driver, from Benjamin
Poitier.
6) Use strlcpy() in several ethtool get_strings methods, from Florian
Fainelli.
7) Fix rhlist dup insertion, from Paul Blakey.
8) Fix SKB leak in netem packet scheduler, from Alexey Kodanev.
9) Fix driver unload crash when link is up in smsc911x, from Jeremy
Linton.
10) Purge out invalid socket types in l2tp_tunnel_create(), from Eric
Dumazet.
11) Need to purge the write queue when TCP connections are aborted,
otherwise userspace using MSG_ZEROCOPY can't close the fd. From
Soheil Hassas Yeganeh.
12) Fix double free in error path of team driver, from Arkadi
Sharshevsky.
13) Filter fixes for hv_netvsc driver, from Stephen Hemminger.
14) Fix non-linear packet access in ipv6 ndisc code, from Lorenzo
Bianconi.
15) Properly filter out unsupported feature flags in macvlan driver,
from Shannon Nelson.
16) Don't request loading the diag module for a protocol if the protocol
itself is not even registered. From Xin Long.
17) If datagram connect fails in ipv6, make sure the socket state is
consistent afterwards. From Paolo Abeni.
18) Use after free in qed driver, from Dan Carpenter.
19) If received ipv4 PMTU is less than the min pmtu, lock the mtu in the
entry. From Sabrina Dubroca.
20) Fix sleep in atomic in tg3 driver, from Jonathan Toppins.
21) Fix vlan in vlan untagging in some situations, from Toshiaki Makita.
22) Fix double SKB free in genlmsg_mcast(). From Nicolas Dichtel.
23) Fix NULL derefs in error paths of tcf_*_init(), from Davide Caratti.
24) Unbalanced PM runtime calls in FEC driver, from Florian Fainelli.
25) Memory leak in gemini driver, from Igor Pylypiv.
26) IDR leaks in error paths of tcf_*_init() functions, from Davide
Caratti.
27) Need to use GFP_ATOMIC in seg6_build_state(), from David Lebrun.
28) Missing dev_put() in error path of macsec_newlink(), from Dan
Carpenter.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (201 commits)
macsec: missing dev_put() on error in macsec_newlink()
net: dsa: Fix functional dsa-loop dependency on FIXED_PHY
hv_netvsc: common detach logic
hv_netvsc: change GPAD teardown order on older versions
hv_netvsc: use RCU to fix concurrent rx and queue changes
hv_netvsc: disable NAPI before channel close
net/ipv6: Handle onlink flag with multipath routes
ppp: avoid loop in xmit recursion detection code
ipv6: sr: fix NULL pointer dereference when setting encap source address
ipv6: sr: fix scheduling in RCU when creating seg6 lwtunnel state
net: aquantia: driver version bump
net: aquantia: Implement pci shutdown callback
net: aquantia: Allow live mac address changes
net: aquantia: Add tx clean budget and valid budget handling logic
net: aquantia: Change inefficient wait loop on fw data reads
net: aquantia: Fix a regression with reset on old firmware
net: aquantia: Fix hardware reset when SPI may rarely hangup
s390/qeth: on channel error, reject further cmd requests
s390/qeth: lock read device while queueing next buffer
s390/qeth: when thread completes, wake up all waiters
...
Pull MMC fixes from Ulf Hansson:
"A couple of MMC fixes intended for v4.16-rc7:
MMC host:
- dw_mmc: Fix the suspend/resume issue for Exynos5433
- dw_mmc: Fix the DTO/CTO timeout overflow calculation for 32-bit
systems
- dw_mmc: Make PIO mode work when failing with idmac when
dw_mci_reset occurs
- sdhci-acpi: Re-allow IRQ 0 to fix broken probe
MMC core:
- Update EXT_CSD caches to correctly switch partition for ioctl calls
- Fix tracepoint print of blk_addr and blksz
- Disable HPI on broken Micron (Numonyx) eMMC cards"
* tag 'mmc-v4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-acpi: Fix IRQ 0
mmc: dw_mmc: fix falling from idmac to PIO mode when dw_mci_reset occurs
mmc: core: Fix tracepoint print of blk_addr and blksz
mmc: core: Disable HPI for certain Micron (Numonyx) eMMC cards
mmc: dw_mmc: exynos: fix the suspend/resume issue for exynos5433
mmc: block: fix updating ext_csd caches on ioctl call
mmc: dw_mmc: Fix the DTO/CTO timeout overflow calculation for 32-bit systems
Main change is a patch to reject getfb call for multiplanar framebuffers,
then we have a couple of error path fixes on the sun4i driver. Still on that
driver there is a clk fix and finally a mmap offset fix on the udl driver.
* tag 'drm-misc-fixes-2018-03-22' of git://anongit.freedesktop.org/drm/drm-misc:
drm: udl: Properly check framebuffer mmap offsets
drm: Reject getfb for multi-plane framebuffers
drm/sun4i: hdmi: Fix another error handling path in 'sun4i_hdmi_bind()'
drm/sun4i: hdmi: Fix an error handling path in 'sun4i_hdmi_bind()'
drm/sun4i: Fix an error handling path in 'sun4i_drv_bind()'
drm/sun4i: Fix exclusivity of the TCON clocks
One fix for DP MST and one fix for GPU reset on hang check.
* tag 'drm-intel-fixes-2018-03-21' of git://anongit.freedesktop.org/drm/drm-intel:
drm/i915: Specify which engines to reset following semaphore/event lockups
drm/i915/dp: Write to SET_POWER dpcd to enable MST hub.
Two vmwgfx fixes for 4.16. Both cc'd stable.
* 'vmwgfx-fixes-4.16' of git://people.freedesktop.org/~thomash/linux:
drm/vmwgfx: Fix a destoy-while-held mutex problem.
drm/vmwgfx: Fix black screen and device errors when running without fbdev
drm/imx: fixes for early vblank event issue, array underflow error
- fix an array underflow error by reordering the range check before the array
subscript in ipu-prg.
- make some local functions static in ipuv3-plane.
- add a missng header for ipu_planes_assign_pre in ipuv3-plane.
- move arming of the vblank event from atomic_begin to atomic_flush, to avoid
signalling atomic commit completion to userspace before plane atomic_update
has finished, due to a race condition that is likely to be hit on i.MX6QP on
PRE enabled channels.
* tag 'imx-drm-fixes-2018-03-22' of git://git.pengutronix.de/git/pza/linux:
drm/imx: move arming of the vblank event to atomic_flush
drm/imx: ipuv3-plane: Include "imx-drm.h" header file
drm/imx: ipuv3-plane: Make functions static when possible
gpu: ipu-v3: prg: avoid possible array underflow
We moved the dev_hold(real_dev); call earlier in the function but forgot
to update the error paths.
Fixes: 0759e552bc ("macsec: fix negative refcnt on parent link")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
Two more fixes (in three patches):
* ath9k_htc doesn't like QoS NDP frames, use regular ones
* hwsim: set up wmediumd for radios created later
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We have a functional dependency on the FIXED_PHY MDIO bus because we register
fixed PHY devices "the old way" which only works if the code that does this has
had a chance to run before the fixed MDIO bus is probed. Make sure we account
for that and have dsa_loop_bdinfo.o be either built-in or modular depending on
whether CONFIG_FIXED_PHY reflects that too.
Fixes: 98cd1552ea ("net: dsa: Mock-up driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger says:
====================
hv_netvsc: fix races during shutdown and changes
This set of patches fixes issues identified by Vitaly Kuznetsov and
Mohammed Gamal related to state changes in Hyper-v network driver.
A lot of the issues are because setting up the netvsc device requires
a second step (in work queue) to get all the sub-channels running.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Make common function for detaching internals of device
during changes to MTU and RSS. Make sure no more packets
are transmitted and all packets have been received before
doing device teardown.
Change the wait logic to be common and use usleep_range().
Changes transmit enabling logic so that transmit queues are disabled
during the period when lower device is being changed. And enabled
only after sub channels are setup. This avoids issue where it could
be that a packet was being sent while subchannel was not initialized.
Fixes: 8195b1396e ("hv_netvsc: fix deadlock on hotplug")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On older versions of Windows, the host ignores messages after
vmbus channel is closed.
Workaround this by doing what Windows does and send the teardown
before close on older versions of NVSP protocol.
Reported-by: Mohammed Gamal <mgamal@redhat.com>
Fixes: 0cf737808a ("hv_netvsc: netvsc_teardown_gpadl() split")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The receive processing may continue to happen while the
internal network device state is in RCU grace period.
The internal RNDIS structure is associated with the
internal netvsc_device structure; both have the same
RCU lifetime.
Defer freeing all associated parts until after grace
period.
Fixes: 0cf737808a ("hv_netvsc: netvsc_teardown_gpadl() split")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This makes sure that no CPU is still process packets when
the channel is closed.
Fixes: 76bb5db5c7 ("netvsc: fix use after free on module removal")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For multipath routes the ONLINK flag can be specified per nexthop in
rtnh_flags or globally in rtm_flags. Update ip6_route_multipath_add
to consider the ONLINK setting coming from rtnh_flags. Each loop over
nexthops the config for the sibling route is initialized to the global
config and then per nexthop settings overlayed. The flag is 'or'ed into
fib6_config to handle the ONLINK flag coming from either rtm_flags or
rtnh_flags.
Fixes: fc1e64e109 ("net/ipv6: Add support for onlink flag")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We already detect situations where a PPP channel sends packets back to
its upper PPP device. While this is enough to avoid deadlocking on xmit
locks, this doesn't prevent packets from looping between the channel
and the unit.
The problem is that ppp_start_xmit() enqueues packets in ppp->file.xq
before checking for xmit recursion. Therefore, __ppp_xmit_process()
might dequeue a packet from ppp->file.xq and send it on the channel
which, in turn, loops it back on the unit. Then ppp_start_xmit()
queues the packet back to ppp->file.xq and __ppp_xmit_process() picks
it up and sends it again through the channel. Therefore, the packet
will loop between __ppp_xmit_process() and ppp_start_xmit() until some
other part of the xmit path drops it.
For L2TP, we rapidly fill the skb's headroom and pppol2tp_xmit() drops
the packet after a few iterations. But PPTP reallocates the headroom
if necessary, letting the loop run and exhaust the machine resources
(as reported in https://bugzilla.kernel.org/show_bug.cgi?id=199109).
Fix this by letting __ppp_xmit_process() enqueue the skb to
ppp->file.xq, so that we can check for recursion before adding it to
the queue. Now ppp_xmit_process() can drop the packet when recursion is
detected.
__ppp_channel_push() is a bit special. It calls __ppp_xmit_process()
without having any actual packet to send. This is used by
ppp_output_wakeup() to re-enable transmission on the parent unit (for
implementations like ppp_async.c, where the .start_xmit() function
might not consume the skb, leaving it in ppp->xmit_pending and
disabling transmission).
Therefore, __ppp_xmit_process() needs to handle the case where skb is
NULL, dequeuing as many packets as possible from ppp->file.xq.
Reported-by: xu heng <xuheng333@zoho.com>
Fixes: 55454a5658 ("ppp: avoid dealock on recursive xmit")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh says:
====================
Aquantia atlantic hot fixes 03-2018
This is a set of atlantic driver hot fixes for various areas:
Some issues with hardware reset covered,
Fixed napi_poll flood happening on some traffic conditions,
Allow system to change MAC address on live device,
Add pci shutdown handler.
patch v2:
- reverse christmas tree
- remove driver private parameter, replacing it with define.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We should close link and all NIC operations during shutdown.
On some systems graceful reboot never closes NIC interface on its own,
but only indicates pci device shutdown. Without explicit handler, NIC
rx rings continued to transfer DMA data into prepared buffers while CPU
rebooted already. That caused memory corruptions on soft reboot.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is nothing prevents us from changing MAC on the running interface.
Allow this with ndev priv flag.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should report to napi full budget only when we have more job to do.
Before this fix, on any tx queue cleanup we forced napi to do poll again.
Thats a waste of cpu resources and caused storming with napi polls when
there was at least one tx on each interrupt.
With this fix we report full budget only when there is more job on TX
to do. Or, as before, when rx budget was fully consumed.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
B1 hardware changes behavior of mailbox interface, it has busy bit
always raised. Data ready condition should be detected by increment
of address register.
Old code has empty `for` loop, and that caused cpu overloads on B1
hardware. aq_nic_service_timer_cb consumed ~100ms because of that.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FW 1.5.58 and below needs a fixed delay even after 0x18 register
is filled. Otherwise, setting MPI_INIT state too fast causes
traffic hang.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Under some circumstances (notably using thunderbolt interface) SPI
on chip reset may be in active transaction.
Here we forcibly cleanup SPI to prevent possible hangups.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann says:
====================
s390/qeth: fixes 2018-03-20
Please apply one final set of qeth patches for 4.16.
All of these fix long-standing bugs, so please queue them up for -stable
as well.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When the IRQ handler determines that one of the cmd IO channels has
failed and schedules recovery, block any further cmd requests from
being submitted. The request would inevitably stall, and prevent the
recovery from making progress until the request times out.
This sort of error was observed after Live Guest Relocation, where
the pending IO on the READ channel intentionally gets terminated to
kick-start recovery. Simultaneously the guest executed SIOCETHTOOL,
triggering qeth to issue a QUERY CARD INFO command. The command
then stalled in the inoperabel WRITE channel.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For calling ccw_device_start(), issue_next_read() needs to hold the
device's ccwlock.
This is satisfied for the IRQ handler path (where qeth_irq() gets called
under the ccwlock), but we need explicit locking for the initial call by
the MPC initialization.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
qeth_wait_for_threads() is potentially called by multiple users, make
sure to notify all of them after qeth_clear_thread_running_bit()
adjusted the thread_running_mask. With no timeout, callers would
otherwise stall.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On removal, a qeth card's netdevice is currently not properly freed
because the call chain looks as follows:
qeth_core_remove_device(card)
lx_remove_device(card)
unregister_netdev(card->dev)
card->dev = NULL !!!
qeth_core_free_card(card)
if (card->dev) !!!
free_netdev(card->dev)
Fix it by free'ing the netdev straight after unregistering. This also
fixes the sysfs-driven layer switch case (qeth_dev_layer2_store()),
where the need to free the current netdevice was not considered at all.
Note that free_netdev() takes care of the netif_napi_del() for us too.
Fixes: 4a71df5004 ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kevin Hao says:
====================
net: phy: Add general dummy stubs for MMD register access
v2:
As suggested by Andrew:
- Add general dummy stubs
- Also use that for the micrel phy
This patch series fix the Ethernet broken on the mpc8315erdb board introduced
by commit b6b5e8a691 ("gianfar: Disable EEE autoneg by default").
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The new general dummy stubs for MMD register access were introduced.
Use that for the codes reuse.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Ethernet on mpc8315erdb is broken since commit b6b5e8a691
("gianfar: Disable EEE autoneg by default"). The reason is that
even though the rtl8211b doesn't support the MMD extended registers
access, it does return some random values if we trying to access
the MMD register via indirect method. This makes it seem that the
EEE is supported by this phy device. And the subsequent writing to
the MMD registers does cause the phy malfunction. So use the dummy
stubs for the MMD register access to fix this issue.
Fixes: b6b5e8a691 ("gianfar: Disable EEE autoneg by default")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For some phy devices, even though they don't support the MMD extended
register access, it does have some side effect if we are trying to
read/write the MMD registers via indirect method. So introduce general
dummy stubs for MMD register access which these devices can use to avoid
such side effect.
Fixes: b6b5e8a691 ("gianfar: Disable EEE autoneg by default")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Wunderlich says:
====================
Here are some batman-adv bugfixes:
- fix possible IPv6 packet loss when multicast extension is used, by Linus Luessing
- fix SKB handling issues for TTVN and DAT, by Matthias Schiffer (two patches)
- fix include for eventpoll, by Sven Eckelmann
- fix skb checksum for ttvn reroutes, by Sven Eckelmann
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Use tegra124_(primary|overlay)_formats for Tegra124, otherwise the count
specified in the Tegra124 SoC info structure will be different from the
array size and cause a crash.
Fixes: 511c7023cf ("drm/tegra: dc: Support more formats")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The netfilter netdevice event handler hold the nfnl_lock mutex, this
avoids races with a device going away while such device is being
attached to hooks from the netlink control plane. Therefore, either
control plane bails out with ENOENT or netdevice event path waits until
the hook that is attached to net_device is registered.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Devices going away have to grab the nfnl_lock from the netdev event path
to avoid races with control plane updates.
However, netlink dumps in netfilter do not hold nfnl_lock mutex. Cache
the device name into the objects to avoid an use-after-free situation
for a device that is going away.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In loopback_open() and loopback_close(), we assign and release the
substream object to the corresponding cable in a racy way. It's
neither locked nor done in the right position. The open callback
assigns the substream before its preparation finishes, hence the other
side of the cable may pick it up, which may lead to the invalid memory
access.
This patch addresses these: move the assignment to the end of the open
callback, and wrap with cable->lock for avoiding concurrent accesses.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The aloop driver tries to stop the pending timer via timer_del() in
the trigger callback and in the close callback. The former is
correct, as it's an atomic operation, while the latter expects that
the timer gets really removed and proceeds the resource releases after
that. But timer_del() doesn't synchronize, hence the running timer
may still access the released resources.
A similar situation can be also seen in the prepare callback after
trigger(STOP) where the prepare tries to re-initialize the things
while a timer is still running.
The problems like the above are seen indirectly in some syzkaller
reports (although it's not 100% clear whether this is the only cause,
as the race condition is quite narrow and not always easy to
trigger).
For addressing these issues, this patch adds the explicit alls of
timer_del_sync() in some places, so that the pending timer is properly
killed / synced.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
It will have a chance speaker no sound after system resume.
To toggle NID 0x53 index 0x2 bit 15 will solve this issue.
This usage will also suitable with ALC256.
Fixes: 4a219ef8f3 ("ALSA: hda/realtek - Add ALC256 HP depop function")
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This platform was hardware fixed type for CTIA type for headset port.
Assigned 0x19 verb will fix can't record issue.
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The bitfield dma_inuse is allocated of size dma_requests bits, thus a
valid bit address is from 0 to (dma_requests - 1).
When find_first_zero_bit() fails, it returns dma_requests as invalid
address.
Using such address for the following set_bit() is incorrect and, if
dma_requests is a multiple of BITS_PER_LONG, it will cause a buffer
overflow.
Currently this driver is only used in DT stm32h743.dtsi where a safe value
dma_requests=16 is not triggering the buffer overflow.
Fixed by checking the return value of find_first_zero_bit() _before_
using it.
Signed-off-by: Antonio Borneo <borneo.antonio@gmail.com>
Signed-off-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This reverts commit 2170dd0431
The intent of commit 2170dd0431 ("vfio-pci: Mask INTx if a device is
not capabable of enabling it") was to disallow the user from seeing
that the device supports INTx if the platform is incapable of enabling
it. The detection of this case however incorrectly includes devices
which natively do not support INTx, such as SR-IOV VFs, and further
discussions reveal gaps even for the target use case.
Reported-by: Arjun Vynipadath <arjun@chelsio.com>
Fixes: 2170dd0431 ("vfio-pci: Mask INTx if a device is not capabable of enabling it")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Since commit 3af5a67c86 ("MIPS: Fix early CM probing") the MT7621 has
not been able to boot.
This commit caused mips_cm_probe() to be called before
mt7621.c::proc_soc_init().
prom_soc_init() has a comment explaining that mips_cm_probe() "wipes out
the bootloader config" and means that configuration registers are no
longer available. It has some code to re-enable this config.
Before this re-enable code is run, the sysc register cannot be read, so
when SYSC_REG_CHIP_NAME0 is read, a garbage value is returned and
panic() is called.
If we move the config-repair code to the top of prom_soc_init(), the
registers can be read and boot can proceed.
Very occasionally, the first register read after the reconfiguration
returns garbage, so add a call to __sync().
Fixes: 3af5a67c86 ("MIPS: Fix early CM probing")
Signed-off-by: NeilBrown <neil@brown.name>
Reviewed-by: Matt Redfearn <matt.redfearn@mips.com>
Cc: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.5+
Patchwork: https://patchwork.linux-mips.org/patch/18859/
Signed-off-by: James Hogan <jhogan@kernel.org>
ralink_halt() does nothing that machine_halt() doesn't already do, so it
adds no value.
It actually causes incorrect behaviour due to the "unreachable()" at the
end. This tells the compiler that the end of the function will never be
reached, which isn't true. The compiler responds by not adding a
'return' instruction, so control simply moves on to whatever bytes come
afterwards in memory. In my tested, that was the ralink_restart()
function. This means that an attempt to 'halt' the machine would
actually cause a reboot.
So remove ralink_halt() so that a 'halt' really does halt.
Fixes: c06e836ada ("MIPS: ralink: adds reset code")
Signed-off-by: NeilBrown <neil@brown.name>
Cc: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.9+
Patchwork: https://patchwork.linux-mips.org/patch/18851/
Signed-off-by: James Hogan <jhogan@kernel.org>
A few more fixes for 4.16. Mostly for displays:
- A fix for DP handling on radeon
- Fix banding on eDP panels
- Fix HBR audio
- Fix for disabling VGA mode on Raven that leads to a corrupt or
blank display on some platforms
* 'drm-fixes-4.16' of git://people.freedesktop.org/~agd5f/linux:
drm/amd/display: Add one to EDID's audio channel count when passing to DC
drm/amd/display: We shouldn't set format_default on plane as atomic driver
drm/amd/display: Fix FMT truncation programming
drm/amd/display: Allow truncation to 10 bits
drm/amd/display: fix dereferencing possible ERR_PTR()
drm/amd/display: Refine disable VGA
drm/amdgpu: Use atomic function to disable crtcs with dc enabled
drm/radeon: Don't turn off DP sink when disconnected
Davide Caratti says:
====================
fix idr leak in actions
This series fixes situations where a temporary failure to install a TC
action results in the permanent impossibility to reuse the configured
value of 'index'.
Thanks to Cong Wang for the initial review.
v2: fix build error in act_ipt.c, reported by kbuild test robot
====================
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcf_skbmod_init() can fail after the idr has been successfully reserved.
When this happens, every subsequent attempt to configure skbmod rules
using the same idr value will systematically fail with -ENOSPC, unless
the first attempt was done using the 'replace' keyword:
# tc action add action skbmod swap mac index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action skbmod swap mac index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action skbmod swap mac index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in tcf_skbmod_init(), ensuring that tcf_idr_release() is called
on the error path when the idr has been reserved, but not yet inserted.
Also, don't test 'ovr' in the error path, to avoid a 'replace' failure
implicitly become a 'delete' that leaks refcount in act_skbmod module:
# rmmod act_skbmod; modprobe act_skbmod
# tc action add action skbmod swap mac index 100
# tc action add action skbmod swap mac continue index 100
RTNETLINK answers: File exists
We have an error talking to the kernel
# tc action replace action skbmod swap mac continue index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action list action skbmod
#
# rmmod act_skbmod
rmmod: ERROR: Module act_skbmod is in use
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcf_vlan_init() can fail after the idr has been successfully reserved.
When this happens, every subsequent attempt to configure vlan rules using
the same idr value will systematically fail with -ENOSPC, unless the first
attempt was done using the 'replace' keyword.
# tc action add action vlan pop index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action vlan pop index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action vlan pop index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in tcf_vlan_init(), ensuring that tcf_idr_release() is called on
the error path when the idr has been reserved, but not yet inserted. Also,
don't test 'ovr' in the error path, to avoid a 'replace' failure implicitly
become a 'delete' that leaks refcount in act_vlan module:
# rmmod act_vlan; modprobe act_vlan
# tc action add action vlan push id 5 index 100
# tc action replace action vlan push id 7 index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action list action vlan
#
# rmmod act_vlan
rmmod: ERROR: Module act_vlan is in use
Fixes: 4c5b9d9642 ("act_vlan: VLAN action rewrite to use RCU lock/unlock and update")
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__tcf_ipt_init() can fail after the idr has been successfully reserved.
When this happens, subsequent attempts to configure xt/ipt rules using
the same idr value systematically fail with -ENOSPC:
# tc action add action xt -j LOG --log-prefix test1 index 100
tablename: mangle hook: NF_IP_POST_ROUTING
target: LOG level warning prefix "test1" index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
Command "(null)" is unknown, try "tc actions help".
# tc action add action xt -j LOG --log-prefix test1 index 100
tablename: mangle hook: NF_IP_POST_ROUTING
target: LOG level warning prefix "test1" index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
Command "(null)" is unknown, try "tc actions help".
# tc action add action xt -j LOG --log-prefix test1 index 100
tablename: mangle hook: NF_IP_POST_ROUTING
target: LOG level warning prefix "test1" index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in the error path of __tcf_ipt_init(), calling tcf_idr_release()
in place of tcf_idr_cleanup(). Since tcf_ipt_release() can now be called
when tcfi_t is NULL, we also need to protect calls to ipt_destroy_target()
to avoid NULL pointer dereference.
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcf_pedit_init() can fail to allocate 'keys' after the idr has been
successfully reserved. When this happens, subsequent attempts to configure
a pedit rule using the same idr value systematically fail with -ENOSPC:
# tc action add action pedit munge ip ttl set 63 index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action pedit munge ip ttl set 63 index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action pedit munge ip ttl set 63 index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in the error path of tcf_act_pedit_init(), calling
tcf_idr_release() in place of tcf_idr_cleanup().
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The persistence domain is a point in the platform where once writes
reach that destination the platform claims it will make them persistent
relative to power loss. In the ACPI NFIT this is currently communicated
as 2 bits in the "NFIT - Platform Capabilities Structure". The bits
comprise a hierarchy, i.e. bit0 "CPU Cache Flush to NVDIMM Durability on
Power Loss Capable" implies bit1 "Memory Controller Flush to NVDIMM
Durability on Power Loss Capable".
Commit 96c3a23905 "libnvdimm: expose platform persistence attr..."
shows the persistence domain as flags, but it's really an enumerated
hierarchy.
Fix this newly introduced user ABI to show the closest available
persistence domain before userspace develops dependencies on seeing, or
needing to develop code to tolerate, the raw NFIT flags communicated
through the libnvdimm-generic region attribute.
Fixes: 96c3a23905 ("libnvdimm: expose platform persistence attr...")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
tcf_act_police_init() can fail after the idr has been successfully
reserved (e.g., qdisc_get_rtab() may return NULL). When this happens,
subsequent attempts to configure a police rule using the same idr value
systematiclly fail with -ENOSPC:
# tc action add action police rate 1000 burst 1000 drop index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action police rate 1000 burst 1000 drop index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action police rate 1000 burst 1000 drop index 100
RTNETLINK answers: No space left on device
...
Fix this in the error path of tcf_act_police_init(), calling
tcf_idr_release() in place of tcf_idr_cleanup().
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
if the kernel fails to duplicate 'sdata', creation of a new action fails
with -ENOMEM. However, subsequent attempts to install the same action
using the same value of 'index' systematically fail with -ENOSPC, and
that value of 'index' will no more be usable by act_simple, until rmmod /
insmod of act_simple.ko is done:
# tc actions add action simple sdata hello index 100
# tc actions list action simple
action order 0: Simple <hello>
index 100 ref 1 bind 0
# tc actions flush action simple
# tc actions add action simple sdata hello index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc actions flush action simple
# tc actions add action simple sdata hello index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc actions add action simple sdata hello index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in the error path of tcf_simp_init(), calling tcf_idr_release()
in place of tcf_idr_cleanup().
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
when the following command sequence is entered
# tc action add action bpf bytecode '4,40 0 0 12,31 0 1 2048,6 0 0 262144,6 0 0 0' index 100
RTNETLINK answers: Invalid argument
We have an error talking to the kernel
# tc action add action bpf bytecode '4,40 0 0 12,21 0 1 2048,6 0 0 262144,6 0 0 0' index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
act_bpf correctly refuses to install the first TC rule, because 31 is not
a valid instruction. However, it refuses to install the second TC rule,
even if the BPF code is correct. Furthermore, it's no more possible to
install any other rule having the same value of 'index' until act_bpf
module is unloaded/inserted again. After the idr has been reserved, call
tcf_idr_release() instead of tcf_idr_cleanup(), to fix this issue.
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial fix to spelling mistakes in DP_ERR error message text and
comments
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Once the FW is transitioned to error, FLUSH cqes can be received.
We want the driver to be aware of the fact that QP is already in error.
Without this fix, a user may see false error messages in the dmesg log,
mentioning that a FLUSH cqe was received while QP is not in error state.
Fixes: cecbcddf ("qedr: Add support for QP verbs")
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Return code wasn't set properly when CNQ allocation failed.
This only affect error message logging, currently user will
receive an error message that says the qedr driver load failed
with rc '0', instead of ENOMEM
Fixes: ec72fce4 ("qedr: Add support for RoCE HW init")
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
QPs that were configured with ack timeout value lower than 1
msec will not implement re-transmission timeout.
This means that if a packet / ACK were dropped, the QP
will not retransmit this packet.
This can lead to an application hang.
Fixes: cecbcddf6 ("qedr: Add support for QP verbs")
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The option size check is using optval instead of optlen
causing the set option call to fail. Use the correct
field, optlen, for size check.
Fixes: 6a21dfc0d0 ("RDMA/ucma: Limit possible option size")
Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In case we failed to create UMR resources, mark them as invalid so we
won't try to destroy them on the unwind path.
Add the relevant checks to destroy_umrc_res(), this is done for the
unlikely event ib_register_device() or create_umr_res() err out and we
try to destroy invalid objects.
Fixes: 42cea83f95 ("IB/mlx5: Fix cleanup order on unload")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Daniel Borkmann says:
====================
pull-request: bpf 2018-03-21
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Follow-up fix to the fault injection framework to prevent jump
optimization on the kprobe by installing a dummy post-handler,
from Masami.
2) Drop bpf_perf_prog_read_value helper from tracepoint type programs
which was mistakenly added there and would otherwise crash due to
wrong input context, from Yonghong.
3) Fix a crash in BPF fs when compiled with clang. Code appears to
be fine just that clang tries to overly aggressive optimize in
non C conform ways, therefore fix the kernel's Makefile to
generally prevent such issues, from Daniel.
4) Skip unnecessary capability checks in bpf syscall, which is otherwise
triggering unnecessary security hooks on capability checking and
causing false alarms on unprivileged processes trying to access
CAP_SYS_ADMIN restricted infra, from Chenbo.
5) Fix the test_bpf.ko module when CONFIG_BPF_JIT_ALWAYS_ON is set
with regards to a test case that is really just supposed to fail
on x8_64 JIT but not others, from Thadeu.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We are still using custom SRAM code for some SoCs and are not marking
the PM code mapped to SRAM as read-only and executable after we're
done. With CONFIG_DEBUG_WX=y, we will get "Found insecure W+X mapping
at address" warning.
Let's fix this issue the same way as commit 728bbe75c8 ("misc: sram:
Introduce support code for protect-exec sram type") is doing for
drivers/misc/sram-exec.c.
On omap3, we need to restore SRAM when returning from off mode after
idle, so init time configuration is not enough.
And as we no longer have users for omap_sram_push_address() we can
make it static while at it.
Note that eventually we should be using sram-exec.c for all SoCs.
Cc: stable@vger.kernel.org # v4.12+
Cc: Dave Gerlach <d-gerlach@ti.com>
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Set the wmediumd to the net's wmediumd when the radio gets created.
Radios created after HWSIM_CMD_REGISTER don't currently get their
data->wmediumd set and the userspace would need to reconnect to
netlink to be able to call HWSIM_CMD_REGISTER again.
Alternatively I think data->netgroup and data->wmedium could be
replaced with a pointer to hwsim_net.
Signed-off-by: Andrew Zaborowski <andrew.zaborowski@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Commit 7b6ddeaf27 ("mac80211: use QoS NDP for AP probing") added an
argument qos_ok to ieee80211_nullfunc_get to support QoS NDP. Despite
the claim in the commit log "Change all the drivers to *not* allow
QoS NDP for now, even though it looks like most of them should be OK
with that", this commit enables QoS NDP in response to beacons (see
change to mlme.c:ieee80211_send_nullfunc), causing ath9k_htc to lose
IP connectivity. See:
https://patchwork.kernel.org/patch/10241109/https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=891060
Introduce a hardware flag to allow such buggy drivers to override the
correct default behaviour of mac80211 of sending QoS NDP packets.
Signed-off-by: Ben Caradoc-Davies <ben@transient.nz>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When validating legacy surfaces, the backup bo might be destroyed at
surface validate time. However, the kms resource validation code may have
the bo reserved, so we will destroy a locked mutex. While there shouldn't
be any other users of that mutex when it is destroyed, it causes a lock
leak and thus throws a lockdep error.
Fix this by having the kms resource validation code hold a reference to
the bo while we have it reserved. We do this by introducing a validation
context which might come in handy when the kms code is extended to validate
multiple resources or buffers.
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
When we are running without fbdev, transitioning from the login screen to
X or gnome-shell/wayland will cause a vt switch and the driver will disable
svga mode, losing all modesetting resources. However, the kms atomic state
does not reflect that and may think that a crtc is still turned on, which
will cause device errors when we try to bind an fb to the crtc, and the
screen will remain black.
Fix this by turning off all kms resources before disabling svga mode.
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
We've observed too long probe time with Coffee Lake (CFL) machines,
and the likely cause is some communication problem between the
HD-audio controller and the codec chips. While the controller expects
an IRQ wakeup for each codec response, it seems sometimes missing, and
it takes one second for the controller driver to time out and read the
response in the polling mode.
Although we aren't sure about the real culprit yet, in this patch, we
put a workaround by forcing the polling mode as default for CFL
machines; the polling mode itself isn't too heavy, and much better
than other workarounds initially suggested (e.g. disabling
power-save), at least.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199007
Fixes: e79b0006c4 ("ALSA: hda - Add Coffelake PCI ID")
Reported-and-tested-by: Hui Wang <hui.wang@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Due to missing information in Hardware manual, current
implementation doesn't read ECCSTAT0 and ECCSTAT1 registers
for IFC 2.0.
Add support to read ECCSTAT0 and ECCSTAT1 registers during
ecccheck for IFC 2.0.
Fixes: 656441478e ("mtd: nand: ifc: Fix location of eccstat registers for IFC V1.0")
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Jagdish Gediya <jagdish.gediya@nxp.com>
Reviewed-by: Prabhakar Kushwaha <prabhakar.kushwaha@nxp.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
In order to make sure compiler flag detection for ARM works
correctly the no-integrated-as flags need to be set before
including the arch specific Makefile.
Fixes: cfe17c9bbe ("kbuild: move cc-option and cc-disable-warning after incl. arch Makefile")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Number of ECC status registers i.e. (ECCSTATx) has been increased in IFC
version 2.0.0 due to increase in SRAM size. This is causing eccstat
array to over flow.
So, replace eccstat array with u32 variable to make it fail-safe and
independent of number of ECC status registers or SRAM size.
Fixes: bccb06c353 ("mtd: nand: ifc: update bufnum mask for ver >= 2.0.0")
Cc: stable@vger.kernel.org # 3.18+
Signed-off-by: Prabhakar Kushwaha <prabhakar.kushwaha@nxp.com>
Signed-off-by: Jagdish Gediya <jagdish.gediya@nxp.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Some filesystems have timestamps with coarse precision that may allow
for a recently built object file to have the same timestamp as the
updated time on one of its dependency files. When that happens, the
object file doesn't get rebuilt as it should.
This is especially the case on filesystems that don't have sub-second
time precision, such as ext3 or Ext4 with 128B inodes.
Let's prevent that by making sure updated dependency files have a newer
timestamp than the first file we created (i.e. autoksyms.h.tmpnew).
Reported-by: Thomas Lindroth <thomas.lindroth@gmail.com>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Thomas Lindroth <thomas.lindroth@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
As per the IFC hardware manual, Most significant 2 bytes in
nand_fsr register are the outcome of NAND READ STATUS command.
So status value need to be shifted and aligned as per the nand
framework requirement.
Fixes: 82771882d9 ("NAND Machine support for Integrated Flash Controller")
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Jagdish Gediya <jagdish.gediya@nxp.com>
Reviewed-by: Prabhakar Kushwaha <prabhakar.kushwaha@nxp.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
The truncation isn't being programmed if the truncation
depth is set to 2, it causes an issue with dce11.2 asic
using 6bit eDP panel. It required to truncate 12:10 in order to
perform spatial dither 10:6.
This change will allow 12:10 truncation to be enabled.
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a device tree property description for hdmi device node
. '#sound-dai-cells' property is required to describe link between
the HDMI IP block and the SoC's audio subsystem and Exynos SoC device
tree files already have this property but we missed its description.
* tag 'exynos-drm-fixes-for-v4.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
dt-bindings: exynos: Document #sound-dai-cells property of the HDMI node
drm/tegra: Fixes for v4.16-rc7
This contains two small fixes for the alpha blending support that was
merged into v4.16-rc1 and a fix for connector reference leaks caused by
the fact that display pipelines are no longer automatically disabled if
the framebuffer is removed.
Furthermore this contains a fix for a crash on IOMMU detach at driver
unbind time and a regulator enable/disable unbalance fix.
* tag 'drm/tegra/for-4.16-rc7-fixes' of git://anongit.freedesktop.org/tegra/linux:
drm/tegra: Shutdown on driver unbind
drm/tegra: dsi: Don't disable regulator on ->exit()
drm/tegra: dc: Detach IOMMU group from domain only once
drm/tegra: plane: Correct legacy blending
drm/tegra: plane: Fix RGB565 format on older Tegra
Pull clk fixes from Stephen Boyd:
"A late collection of fixes for regressions seen this release cycle.
Normally I send this earlier than now but real life got in the way.
Things are back to normal now.
There's the normal set of SoC driver fixes: i.MX boot warning, TI
display clks, allwinner clk ops being wrong (fun), driver probe
badness on error paths, correctness fix for the new aspeed driver, and
even a fix for a race condition in the bcm2835 clk driver.
At the core framework level we also got some fixes for the clk phase
API caching at the wrong time, better handling of the enabled state of
orphan clks, and a fix for a newly introduced bug in how we handle
rate calculations for pass-through clks"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: bcm2835: Protect sections updating shared registers
clk: bcm2835: Fix ana->maskX definitions
clk: aspeed: Prevent reset if clock is enabled
clk: aspeed: Fix is_enabled for certain clocks
clk: qcom: msm8916: Fix return value check in qcom_apcs_msm8916_clk_probe()
clk: hisilicon: hi3660:Fix potential NULL dereference in hi3660_stub_clk_probe()
clk: fix determine rate error with pass-through clock
clk: migrate the count of orphaned clocks at init
clk: update cached phase to respect the fact when setting phase
clk: ti: am43xx: add set-rate-parent support for display clkctrl clock
clk: ti: am33xx: add set-rate-parent support for display clkctrl clock
clk: ti: clkctrl: add support for CLK_SET_RATE_PARENT flag
clk: imx51-imx53: Fix UART4/5 registration on i.MX50 and i.MX53
clk: sunxi-ng: a31: Fix CLK_OUT_* clock ops
Prasad reported that he has seen crashes in BPF subsystem with netd
on Android with arm64 in the form of (note, the taint is unrelated):
[ 4134.721483] Unable to handle kernel paging request at virtual address 800000001
[ 4134.820925] Mem abort info:
[ 4134.901283] Exception class = DABT (current EL), IL = 32 bits
[ 4135.016736] SET = 0, FnV = 0
[ 4135.119820] EA = 0, S1PTW = 0
[ 4135.201431] Data abort info:
[ 4135.301388] ISV = 0, ISS = 0x00000021
[ 4135.359599] CM = 0, WnR = 0
[ 4135.470873] user pgtable: 4k pages, 39-bit VAs, pgd = ffffffe39b946000
[ 4135.499757] [0000000800000001] *pgd=0000000000000000, *pud=0000000000000000
[ 4135.660725] Internal error: Oops: 96000021 [#1] PREEMPT SMP
[ 4135.674610] Modules linked in:
[ 4135.682883] CPU: 5 PID: 1260 Comm: netd Tainted: G S W 4.14.19+ #1
[ 4135.716188] task: ffffffe39f4aa380 task.stack: ffffff801d4e0000
[ 4135.731599] PC is at bpf_prog_add+0x20/0x68
[ 4135.741746] LR is at bpf_prog_inc+0x20/0x2c
[ 4135.751788] pc : [<ffffff94ab7ad584>] lr : [<ffffff94ab7ad638>] pstate: 60400145
[ 4135.769062] sp : ffffff801d4e3ce0
[...]
[ 4136.258315] Process netd (pid: 1260, stack limit = 0xffffff801d4e0000)
[ 4136.273746] Call trace:
[...]
[ 4136.442494] 3ca0: ffffff94ab7ad584 0000000060400145 ffffffe3a01bf8f8 0000000000000006
[ 4136.460936] 3cc0: 0000008000000000 ffffff94ab844204 ffffff801d4e3cf0 ffffff94ab7ad584
[ 4136.479241] [<ffffff94ab7ad584>] bpf_prog_add+0x20/0x68
[ 4136.491767] [<ffffff94ab7ad638>] bpf_prog_inc+0x20/0x2c
[ 4136.504536] [<ffffff94ab7b5d08>] bpf_obj_get_user+0x204/0x22c
[ 4136.518746] [<ffffff94ab7ade68>] SyS_bpf+0x5a8/0x1a88
Android's netd was basically pinning the uid cookie BPF map in BPF
fs (/sys/fs/bpf/traffic_cookie_uid_map) and later on retrieving it
again resulting in above panic. Issue is that the map was wrongly
identified as a prog! Above kernel was compiled with clang 4.0,
and it turns out that clang decided to merge the bpf_prog_iops and
bpf_map_iops into a single memory location, such that the two i_ops
could then not be distinguished anymore.
Reason for this miscompilation is that clang has the more aggressive
-fmerge-all-constants enabled by default. In fact, clang source code
has a comment about it in lib/AST/ExprConstant.cpp on why it is okay
to do so:
Pointers with different bases cannot represent the same object.
(Note that clang defaults to -fmerge-all-constants, which can
lead to inconsistent results for comparisons involving the address
of a constant; this generally doesn't matter in practice.)
The issue never appeared with gcc however, since gcc does not enable
-fmerge-all-constants by default and even *explicitly* states in
it's option description that using this flag results in non-conforming
behavior, quote from man gcc:
Languages like C or C++ require each variable, including multiple
instances of the same variable in recursive calls, to have distinct
locations, so using this option results in non-conforming behavior.
There are also various clang bug reports open on that matter [1],
where clang developers acknowledge the non-conforming behavior,
and refer to disabling it with -fno-merge-all-constants. But even
if this gets fixed in clang today, there are already users out there
that triggered this. Thus, fix this issue by explicitly adding
-fno-merge-all-constants to the kernel's Makefile to generically
disable this optimization, since potentially other places in the
kernel could subtly break as well.
Note, there is also a flag called -fmerge-constants (not supported
by clang), which is more conservative and only applies to strings
and it's enabled in gcc's -O/-O2/-O3/-Os optimization levels. In
gcc's code, the two flags -fmerge-{all-,}constants share the same
variable internally, so when disabling it via -fno-merge-all-constants,
then we really don't merge any const data (e.g. strings), and text
size increases with gcc (14,927,214 -> 14,942,646 for vmlinux.o).
$ gcc -fverbose-asm -O2 foo.c -S -o foo.S
-> foo.S lists -fmerge-constants under options enabled
$ gcc -fverbose-asm -O2 -fno-merge-all-constants foo.c -S -o foo.S
-> foo.S doesn't list -fmerge-constants under options enabled
$ gcc -fverbose-asm -O2 -fno-merge-all-constants -fmerge-constants foo.c -S -o foo.S
-> foo.S lists -fmerge-constants under options enabled
Thus, as a workaround we need to set both -fno-merge-all-constants
*and* -fmerge-constants in the Makefile in order for text size to
stay as is.
[1] https://bugs.llvm.org/show_bug.cgi?id=18538
Reported-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chenbo Feng <fengc@google.com>
Cc: Richard Smith <richard-llvm@metafoo.co.uk>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: linux-kernel@vger.kernel.org
Tested-by: Prasad Sodagudi <psodagud@codeaurora.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Pull rdma fixes from Jason Gunthorpe:
"Not much exciting here, almost entirely syzkaller fixes.
This is going to be on ongoing theme for some time, I think. Both
Google and Mellanox are now running syzkaller on different parts of
the user API.
Summary:
- Many bug fixes related to syzkaller from Leon Romanovsky. These are
still for the mlx driver and ucma interface.
- Fix a situation with port reuse for iWarp, discovered during
scale-up testing
- Bug fixes for the profile and restrack patches accepted during this
merge window
- Compile warning cleanups from Arnd, this is apparently the last
warning to make 32 bit builds quiet"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/ucma: Ensure that CM_ID exists prior to access it
RDMA/verbs: Remove restrack entry from XRCD structure
RDMA/ucma: Fix use-after-free access in ucma_close
RDMA/ucma: Check AF family prior resolving address
infiniband: bnxt_re: use BIT_ULL() for 64-bit bit masks
infiniband: qplib_fp: fix pointer cast
IB/mlx5: Fix cleanup order on unload
RDMA/ucma: Don't allow join attempts for unsupported AF family
RDMA/ucma: Fix access to non-initialized CM_ID object
RDMA/core: Do not use invalid destination in determining port reuse
RDMA/mlx5: Fix crash while accessing garbage pointer and freed memory
IB/mlx5: Fix integer overflows in mlx5_ib_create_srq
IB/mlx5: Fix out-of-bounds read in create_raw_packet_qp_rq
Pull SCSI fixes from James Bottomley:
- one driver patch (qla2xxx) which fixes a problem caused by an
existing regression fix (FCP discovery is failing)
- one generic fix to a longstanding bug in libsas that causes I/O
eventually to hang to the device in the face of ATA error recovery.
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla2xxx: Remove FC_NO_LOOP_ID for FCP and FC-NVMe Discovery
scsi: libsas: defer ata device eh commands to libata
Pull nfsd fix from Bruce Fields:
"Just one fix for an occasional panic from Jeff Layton"
* tag 'nfsd-4.16-1' of git://linux-nfs.org/~bfields/linux:
nfsd: remove blocked locks on client teardown
The current check statement in BPF syscall will do a capability check
for CAP_SYS_ADMIN before checking sysctl_unprivileged_bpf_disabled. This
code path will trigger unnecessary security hooks on capability checking
and cause false alarms on unprivileged process trying to get CAP_SYS_ADMIN
access. This can be resolved by simply switch the order of the statement
and CAP_SYS_ADMIN is not required anyway if unprivileged bpf syscall is
allowed.
Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Commit 4bebdc7a85 ("bpf: add helper bpf_perf_prog_read_value")
added helper bpf_perf_prog_read_value so that perf_event type program
can read event counter and enabled/running time.
This commit, however, introduced a bug which allows this helper
for tracepoint type programs. This is incorrect as bpf_perf_prog_read_value
needs to access perf_event through its bpf_perf_event_data_kern type context,
which is not available for tracepoint type program.
This patch fixed the issue by separating bpf_func_proto between tracepoint
and perf_event type programs and removed bpf_perf_prog_read_value
from tracepoint func prototype.
Fixes: 4bebdc7a85 ("bpf: add helper bpf_perf_prog_read_value")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Function bpf_fill_maxinsns11 is designed to not be able to be JITed on
x86_64. So, it fails when CONFIG_BPF_JIT_ALWAYS_ON=y, and
commit 09584b4067 ("bpf: fix selftests/bpf test_kmod.sh failure when
CONFIG_BPF_JIT_ALWAYS_ON=y") makes sure that failure is detected on that
case.
However, it does not fail on other architectures, which have a different
JIT compiler design. So, test_bpf has started to fail to load on those.
After this fix, test_bpf loads fine on both x86_64 and ppc64el.
Fixes: 09584b4067 ("bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The undocumented 'icebp' instruction (aka 'int1') works pretty much like
'int3' in the absense of in-circuit probing equipment (except,
obviously, that it raises #DB instead of raising #BP), and is used by
some validation test-suites as such.
But Andy Lutomirski noticed that his test suite acted differently in kvm
than on bare hardware.
The reason is that kvm used an inexact test for the icebp instruction:
it just assumed that an all-zero VM exit qualification value meant that
the VM exit was due to icebp.
That is not unlike the guess that do_debug() does for the actual
exception handling case, but it's purely a heuristic, not an absolute
rule. do_debug() does it because it wants to ascribe _some_ reasons to
the #DB that happened, and an empty %dr6 value means that 'icebp' is the
most likely casue and we have no better information.
But kvm can just do it right, because unlike the do_debug() case, kvm
actually sees the real reason for the #DB in the VM-exit interruption
information field.
So instead of relying on an inexact heuristic, just use the actual VM
exit information that says "it was 'icebp'".
Right now the 'icebp' instruction isn't technically documented by Intel,
but that will hopefully change. The special "privileged software
exception" information _is_ actually mentioned in the Intel SDM, even
though the cause of it isn't enumerated.
Reported-by: Andy Lutomirski <luto@kernel.org>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Marc Kleine-Budde says:
====================
pull-request: can 2018-03-19
this is a pull reqeust of one patch for net/master.
The patch is by Andri Yngvason and fixes a potential use-after-free bug
in the cc770 driver introduced in the previous pull-request.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If the optional regulator is deferred, we must release some resources.
They will be re-allocated when the probe function will be called again.
Fixes: 6eacf31139 ("ethernet: arc: Add support for Rockchip SoC layer device tree bindings")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code performs unneeded free. Remove the redundant skb freeing
during the error path.
Fixes: 1555d204e7 ("devlink: Support for pipeline debug (dpipe)")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hedberg says:
====================
Here are a few more important Bluetooth driver fixes for the 4.16
kernel.
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Trofimovich reported that restoring an nft ruleset doesn't work
anymore unless old rule content is flushed first.
The problem stems from a recent change designed to prevent multiple nat
hooks at the same hook point locations and nftables transaction model.
A 'flush ruleset' won't take effect until the entire transaction has
completed.
So, if one has a nft.rules file that contains a 'flush ruleset',
followed by a nat hook register request, then 'nft -f file' will work,
but running 'nft -f file' again will fail with -EBUSY.
Reason is that nftables will place the flush/removal requests in the
transaction list, but it will not act on the removal until after all new
rules are in place.
The netfilter core will therefore get request to register a new nat
hook before the old one is removed -- this now fails as the netfilter
core can't know the existing hook is staged for removal.
To fix this, we can search the transaction log when a hook collision
is detected. The collision is okay if
1. there is a delete request pending for the nat hook that is already
registered.
2. there is no second add request for a matching nat hook.
This is required to only apply the exception once.
Fixes: f92b40a8b2 ("netfilter: core: only allow one nat hook per hook point")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
in nftables, 'meter' can be used to instantiate a hash-table at run
time:
rule add filter forward iif "internal" meter hostacct { ip saddr counter}
nft list meter ip filter hostacct
table ip filter {
meter hostacct {
type ipv4_addr
elements = { 192.168.0.1 : counter packets 8 bytes 2672, ..
because elemets get added on the fly, the kernel must chose a set
backend type that implements the ->update() function, otherwise
rule insertion fails with EOPNOTSUPP.
Therefore, skip set types that lack ->update, and also
make sure we do not discard a (bad) candidate when we did yet
find any candidate at all. This could happen when userspace prefers
low memory footprint -- the set implementation currently checked might
not be a fit at all. Make sure we pick it anyway (!bops). In
case next candidate is a better fix, it will be chosen instead.
But in case nothing else is found we at least have a non-ideal
match rather than no match at all.
Fixes: 6c03ae210c ("netfilter: nft_set_hash: add non-resizable hashtable implementation")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The commit "regulatory: add NUL to request alpha2" increases the length of
alpha2 to 3. This causes a regression on brcmfmac, because
brcmf_cfg80211_reg_notifier() expect valid ISO3166 codes in the complete
array. So fix this accordingly.
Fixes: 657308f73e ("regulatory: add NUL to request alpha2")
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Acked-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Scheduler debug stats include newlines that display out of alignment
when prefixed by timestamps. For example, the dmesg utility:
% echo t > /proc/sysrq-trigger
% dmesg
...
[ 83.124251]
runnable tasks:
S task PID tree-key switches prio wait-time
sum-exec sum-sleep
-----------------------------------------------------------------------------------------------------------
At the same time, some syslog utilities (like rsyslog by default) don't
like the additional newlines control characters, saving lines like this
to /var/log/messages:
Mar 16 16:02:29 localhost kernel: #012runnable tasks:#012 S task PID tree-key ...
^^^^ ^^^^
Clean these up by moving newline characters to their own SEQ_printf
invocation. This leaves the /proc/sched_debug unchanged, but brings the
entire output into alignment when prefixed:
% echo t > /proc/sysrq-trigger
% dmesg
...
[ 62.410368] runnable tasks:
[ 62.410368] S task PID tree-key switches prio wait-time sum-exec sum-sleep
[ 62.410369] -----------------------------------------------------------------------------------------------------------
[ 62.410369] I kworker/u12:0 5 1932.215593 332 120 0.000000 3.621252 0.000000 0 0 /
and no escaped control characters from rsyslog in /var/log/messages:
Mar 16 16:15:06 localhost kernel: runnable tasks:
Mar 16 16:15:06 localhost kernel: S task PID tree-key ...
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1521484555-8620-3-git-send-email-joe.lawrence@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When the SEQ_printf() macro prints to the console, it runs a simple
printk() without KERN_CONT "continued" line printing. The result of
this is oddly wrapped task info, for example:
% echo t > /proc/sysrq-trigger
% dmesg
...
runnable tasks:
...
[ 29.608611] I
[ 29.608613] rcu_sched 8 3252.013846 4087 120
[ 29.608614] 0.000000 29.090111 0.000000
[ 29.608615] 0 0
[ 29.608616] /
Modify SEQ_printf to use pr_cont() for expected one-line results:
% echo t > /proc/sysrq-trigger
% dmesg
...
runnable tasks:
...
[ 106.716329] S cpuhp/5 37 2006.315026 14 120 0.000000 0.496893 0.000000 0 0 /
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1521484555-8620-2-git-send-email-joe.lawrence@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When a perf_event is attached to parent cgroup, it should count events
for all children cgroups:
parent_group <---- perf_event
\
- child_group <---- process(es)
However, in our tests, we found this perf_event cannot report reliable
results. Here is an example case:
# create cgroups
mkdir -p /sys/fs/cgroup/p/c
# start perf for parent group
perf stat -e instructions -G "p"
# on another console, run test process in child cgroup:
stressapptest -s 2 -M 1000 & echo $! > /sys/fs/cgroup/p/c/cgroup.procs
# after the test process is done, stop perf in the first console shows
<not counted> instructions p
The instruction should not be "not counted" as the process runs in the
child cgroup.
We found this is because perf_event->cgrp and cpuctx->cgrp are not
identical, thus perf_event->cgrp are not updated properly.
This patch fixes this by updating perf_cgroup properly for ancestor
cgroup(s).
Reported-by: Ephraim Park <ephiepark@fb.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <jolsa@redhat.com>
Cc: <kernel-team@fb.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20180312165943.1057894-1-songliubraving@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With the following commit:
3335224470 ("jump_label: Explicitly disable jump labels in __init code")
... we explicitly disabled jump labels in __init code, so they could be
detected and not warned about in the following commit:
dc1dd184c2 ("jump_label: Warn on failed jump_label patching attempt")
In-kernel __exit code has the same issue. It's never used, so it's
freed along with the rest of initmem. But jump label entries in __exit
code aren't explicitly disabled, so we get the following warning when
enabling pr_debug() in __exit code:
can't patch jump_label at dmi_sysfs_exit+0x0/0x2d
WARNING: CPU: 0 PID: 22572 at kernel/jump_label.c:376 __jump_label_update+0x9d/0xb0
Fix the warning by disabling all jump labels in initmem (which includes
both __init and __exit code).
Reported-and-tested-by: Li Wang <liwang@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: dc1dd184c2 ("jump_label: Warn on failed jump_label patching attempt")
Link: http://lkml.kernel.org/r/7121e6e595374f06616c505b6e690e275c0054d1.1521483452.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
iscsi tcp will first send out data, then calculate and send data
digest. If we don't have BDI_CAP_STABLE_WRITES, the page cache will be
written in spite of the on going writeback. Consequently, wrong digest
will be got and sent to target.
To fix this, set BDI_CAP_STABLE_WRITES when data digest is enabled
in iscsi_tcp .slave_configure callback.
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Acked-by: Chris Leech <cleech@redhat.com>
Acked-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The USB storage glue sets the try_rc_10_first flag in an attempt to
avoid wedging poorly implemented legacy USB devices.
If the device capacity is too large to be expressed in the provided
response buffer field of READ CAPACITY(10), a well-behaved device will
set the reported capacity to 0xFFFFFFFF. We will then attempt to issue a
READ CAPACITY(16) to obtain the real capacity.
Since this part of the discovery logic is not covered by the first_scan
flag, a warning will be printed a couple of times times per revalidate
attempt if we upgrade from READ CAPACITY(10) to READ CAPACITY(16).
Remember that we have successfully issued READ CAPACITY(16) so we can
take the fast path on subsequent revalidate attempts.
Reported-by: Menion <menion@gmail.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Grygorii Strashko says:
====================
net: phy: relax error checking when creating sysfs link netdev->phydev
Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per
one netdevice, as result such drivers will produce warning during system
boot and fail to connect second phy to netdevice when PHYLIB framework
will try to create sysfs link netdev->phydev for second PHY
in phy_attach_direct(), because sysfs link with the same name has been
created already for the first PHY.
As result, second CPSW external port will became unusable.
This regression was introduced by commits:
5568363f0c ("net: phy: Create sysfs reciprocal links for attached_dev/phydev"
a399546049 ("net: phy: Relax error checking on sysfs_create_link()"
Patch 1: exports sysfs_create_link_nowarn() function as preparation for Patch 2.
Patch 2: relaxes error checking when PHYLIB framework is creating sysfs
link netdev->phydev in phy_attach_direct(), suppresses warning by using
sysfs_create_link_nowarn() and adds error message instead, so links creation
failure is not fatal any more and system can continue working,
which fixes TI CPSW issue and makes boot logs accessible
in case of NFS boot, for example.
This can be stable material 4.13+.
Changes in v2:
- commit messages updated.
v1:
https://patchwork.ozlabs.org/cover/886058/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per
one netdevice, as result such drivers will produce warning during system
boot and fail to connect second phy to netdevice when PHYLIB framework
will try to create sysfs link netdev->phydev for second PHY
in phy_attach_direct(), because sysfs link with the same name has been
created already for the first PHY. As result, second CPSW external
port will became unusable.
Fix it by relaxing error checking when PHYLIB framework is creating sysfs
link netdev->phydev in phy_attach_direct(), suppressing warning by using
sysfs_create_link_nowarn() and adding error message instead.
After this change links (phy->netdev and netdev->phy) creation failure is not
fatal any more and system can continue working, which fixes TI CPSW issue.
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Fixes: a399546049 ("net: phy: Relax error checking on sysfs_create_link()")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sysfs_create_link_nowarn() is going to be used in phylib framework in
subsequent patch which can be built as module. Hence, export
sysfs_create_link_nowarn() to avoid build errors.
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Fixes: a399546049 ("net: phy: Relax error checking on sysfs_create_link()")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull cgroup fixes from Tejun Heo:
"Two commits to fix the following subtle cgroup2 behavior bugs:
- cpu.max was rejecting config when it shouldn't
- thread mode enable was allowed when it shouldn't"
* 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: fix rule checking for threaded mode switching
sched, cgroup: Don't reject lower cpu.max on ancestors
The resource allocation in WDAT watchdog has off-one-by error, it sets
one byte more than the actual end address. This may eventually lead
to unexpected resource conflicts.
Fixes: 058dfc7670 (ACPI / watchdog: Add support for WDAT hardware watchdog)
Cc: 4.9+ <stable@vger.kernel.org> # 4.9+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull workqueue fixes from Tejun Heo:
"Two low-impact workqueue commits.
One fixes workqueue creation error path and the other removes the
unused cancel_work()"
* 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: remove unused cancel_work()
workqueue: use put_device() instead of kfree()
Pull percpu fixes from Tejun Heo:
"Late percpu pull request for v4.16-rc6.
- percpu allocator pool replenishing no longer triggers OOM or
warning messages.
Also, the alloc interface now understands __GFP_NORETRY and
__GFP_NOWARN. This is to allow avoiding OOMs from userland
triggered actions like bpf map creation.
Also added cond_resched() in alloc loop.
- perpcu allocation now can be interrupted by kill sigs to avoid
deadlocking OOM killer.
- Added Dennis Zhou as a co-maintainer.
He has rewritten the area map allocator, understands most of the
code base and has been responsive for all bug reports"
* 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods
mm: Allow to kill tasks doing pcpu_alloc() and waiting for pcpu_balance_workfn()
percpu: include linux/sched.h for cond_resched()
percpu: add a schedule point in pcpu_balance_workfn()
percpu: allow select gfp to be passed to underlying allocators
percpu: add __GFP_NORETRY semantics to the percpu balancing path
percpu: match chunk allocator declarations with definitions
percpu: add Dennis Zhou as a percpu co-maintainer
Pull libata fixes from Tejun Heo:
"I sat on them too long and it's quite a few this late, but nothing has
a wide blast area. The changes are...
- Fix corner cases in SG command handling.
- Recent introduction of default powersaving mode config option
exposed several devices with broken powersaving behaviors. A number
of patches to update the blacklist accordingly.
- Fix a kernel panic on SAS hotplug.
- Other misc and device specific updates"
* 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
libata: Modify quirks for MX100 to limit NCQ_TRIM quirk to MU01 version
libata: Make Crucial BX100 500GB LPM quirk apply to all firmware versions
libata: Apply NOLPM quirk to Crucial M500 480 and 960GB SSDs
libata: Enable queued TRIM for Samsung SSD 860
PCI: Add function 1 DMA alias quirk for Highpoint RocketRAID 644L
ahci: Add PCI-id for the Highpoint Rocketraid 644L card
ata: do not schedule hot plug if it is a sas host
libata: disable LPM for Crucial BX100 SSD 500GB drive
libata: Apply NOLPM quirk to Crucial MX100 512GB SSDs
libata: update documentation for sysfs interfaces
ata: sata_rcar: Remove unused variable in sata_rcar_init_controller()
libata: transport: cleanup documentation of sysfs interface
sata_rcar: Reset SATA PHY when Salvator-X board resumes
libata: don't try to pass through NCQ commands to non-NCQ devices
libata: remove WARN() for DMA or PIO command without data
libata: fix length validation of ATAPI-relayed SCSI commands
ata: libahci: fix comment indentation
ahci: Add check for device presence (PCIe hot unplug) in ahci_stop_engine()
libata: Fix compile warning with ATA_DEBUG enabled
We had some reports of panics in nfsd4_lm_notify, and that showed a
nfs4_lockowner that had outlived its so_client.
Ensure that we walk any leftover lockowners after tearing down all of
the stateids, and remove any blocked locks that they hold.
With this change, we also don't need to walk the nbl_lru on nfsd_net
shutdown, as that will happen naturally when we tear down the clients.
Fixes: 76d348fadf (nfsd: have nfsd4_lock use blocking locks for v4.1+ locks)
Reported-by: Frank Sorenson <fsorenso@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: stable@vger.kernel.org # 4.9
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
XRCD object is not implemented in the restrack, so lets remove it.
Fixes: 02d8883f52 ("RDMA/restrack: Add general infrastructure to track RDMA resources")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The error in ucma_create_id() left ctx in the list of contexts belong
to ucma file descriptor. The attempt to close this file descriptor causes
to use-after-free accesses while iterating over such list.
Fixes: 7521663857 ("RDMA/cma: Export rdma cm interface to userspace")
Reported-by: <syzbot+dcfd344365a56fbebd0f@syzkaller.appspotmail.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
percpu_ref internally uses sched-RCU to implement the percpu -> atomic
mode switching and the documentation suggested that this could be
depended upon. This doesn't seem like a good idea.
* percpu_ref uses sched-RCU which has different grace periods regular
RCU. Users may combine percpu_ref with regular RCU usage and
incorrectly believe that regular RCU grace periods are performed by
percpu_ref. This can lead to, for example, use-after-free due to
premature freeing.
* percpu_ref has a grace period when switching from percpu to atomic
mode. It doesn't have one between the last put and release. This
distinction is subtle and can lead to surprising bugs.
* percpu_ref allows starting in and switching to atomic mode manually
for debugging and other purposes. This means that there may not be
any grace periods from kill to release.
This patch makes it clear that the grace periods are percpu_ref's
internal implementation detail and can't be depended upon by the
users.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
In case of memory deficit and low percpu memory pages,
pcpu_balance_workfn() takes pcpu_alloc_mutex for a long
time (as it makes memory allocations itself and waits
for memory reclaim). If tasks doing pcpu_alloc() are
choosen by OOM killer, they can't exit, because they
are waiting for the mutex.
The patch makes pcpu_alloc() to care about killing signal
and use mutex_lock_killable(), when it's allowed by GFP
flags. This guarantees, a task does not miss SIGKILL
from OOM killer.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
microblaze build broke due to missing declaration of the
cond_resched() invocation added recently. Let's include linux/sched.h
explicitly.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
CM_PLLx and A2W_XOSC_CTRL registers are accessed by different clock
handlers and must be accessed with ->regs_lock held.
Update the sections where this protection is missing.
Fixes: 41691b8862 ("clk: bcm2835: Add support for programming the audio domain clocks")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
ana->maskX values are already '~'-ed in bcm2835_pll_set_rate(). Remove
the '~' in the definition to fix ANA setup.
Note that this commit fixes a long standing bug preventing one from
using an HDMI display if it's plugged after the FW has booted Linux.
This is because PLLH is used by the HDMI encoder to generate the pixel
clock.
Fixes: 41691b8862 ("clk: bcm2835: Add support for programming the audio domain clocks")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Currently, the offsets in the UAC2 processing unit descriptor are
calculated incorrectly. It causes an issue when connecting the device which
provides such a feature:
~~~~
[84126.724420] usb 1-1.3.1: invalid Processing Unit descriptor (id 18)
~~~~
After this patch is applied, the UAC2 processing unit inits w/o this error.
Fixes: 23caaf19b1 ("ALSA: usb-mixer: Add support for Audio Class v2.0")
Signed-off-by: Kirill Marinushkin <k.marinushkin@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
When commit 9c7be59fc5 ("libata: Apply NOLPM quirk to Crucial MX100
512GB SSDs") was added it inherited the ATA_HORKAGE_NO_NCQ_TRIM quirk
from the existing "Crucial_CT*MX100*" entry, but that entry sets model_rev
to "MU01", where as the entry adding the NOLPM quirk sets it to NULL.
This means that after this commit we no apply the NO_NCQ_TRIM quirk to
all "Crucial_CT512MX100*" SSDs even if they have the fixed "MU02"
firmware. This commit splits the "Crucial_CT512MX100*" quirk into 2
quirks, one for the "MU01" firmware and one for all other firmware
versions, so that we once again only apply the NO_NCQ_TRIM quirk to the
"MU01" firmware version.
Fixes: 9c7be59fc5 ("libata: Apply NOLPM quirk to ... MX100 512GB SSDs")
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Commit b17e5729a6 ("libata: disable LPM for Crucial BX100 SSD 500GB
drive"), introduced a ATA_HORKAGE_NOLPM quirk for Crucial BX100 500GB SSDs
but limited this to the MU02 firmware version, according to:
http://www.crucial.com/usa/en/support-ssd-firmware
MU02 is the last version, so there are no newer possibly fixed versions
and if the MU02 version has broken LPM then the MU01 almost certainly
also has broken LPM, so this commit changes the quirk to apply to all
firmware versions.
Fixes: b17e5729a6 ("libata: disable LPM for Crucial BX100 SSD 500GB...")
Cc: stable@vger.kernel.org
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
There have been reports of the Crucial M500 480GB model not working
with LPM set to min_power / med_power_with_dipm level.
It has not been tested with medium_power, but that typically has no
measurable power-savings.
Note the reporters Crucial_CT480M500SSD3 has a firmware version of MU03
and there is a MU05 update available, but that update does not mention any
LPM fixes in its changelog, so the quirk matches all firmware versions.
In my experience the LPM problems with (older) Crucial SSDs seem to be
limited to higher capacity versions of the SSDs (different firmware?),
so this commit adds a NOLPM quirk for the 480 and 960GB versions of the
M500, to avoid LPM causing issues with these SSDs.
Cc: stable@vger.kernel.org
Reported-and-tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
If the server is malicious then *bytes_read could be larger than the
size of the "target" buffer. It would lead to memory corruption when we
do the memcpy().
Reported-by: Dr Silvio Cesare of InfoSect <Silvio Cesare <silvio.cesare@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jonathan writes:
Second set of IIO fixes for the 4.16 cycle.
A slightly fiddly revert then fix pair in here as the bug lead to
an unused local variable that was then removed without us noticing
the bug. The revert should only be needed on 4.16 - the fix
goes back futher.
* ccs811
- Fix the transition from 'boot' to 'application' mode. Fixes the case
where the power is not cut between boot cycles.
* meson-saradc
- Fix missing mutex_unlock in an error path.
* sd-modulator
- Fix bindings doc to have the right value of io-channel-cells to reflect
that this device type only ever outputs one channel.
* st-accel
- Revert drop of redundant pointer patch.
- Use the now available pointer to avoid overwriting the platform data
pointer and causing trouble on reprobing the driver.
* st-pressure
- Use local copy of the platform data pointer to avoid overwriting the
one associated with the device, which would cause issues on reprobing
the driver.
* stm32-dfsdm
- Use the right regmap_cfg for the type of device.
- Correct the ID passed to stop channel to be the channel one.
- Correct which clock is used to allow for the 'audio' clock.
- Fix allocation of channels when more than one is enabled.
Section was not properly computed. The value of OOB region definition is
always ECC section 0 information in the OOB area, but we want to get all
the ECC bytes information, so we should call
mtd_ooblayout_ecc(mtd, section++, &oobregion) until it returns -ERANGE.
Fixes: c2b78452a9 ("mtd: use mtd_ooblayout_xxx() helpers where appropriate")
Cc: <stable@vger.kernel.org>
Signed-off-by: OuYang ZhiZhong <ouyzz@yealink.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Revert commit c68f0676ef ("ACPI / battery: Add quirk for Asus
GL502VSK and UX305LA") and commit 4446823e25 ("ACPI / battery: Add
quirk for Asus UX360UA and UX410UAK").
On many many Asus products, the battery is sometimes reported as
charging or discharging even when it is full and you are on AC power.
This change quirked the kernel to avoid advertising the discharging
state when this happens on 4 laptop models, under the belief that
this was incorrect information. I presume it originates from user
reports who are confused that their battery status icon says that it
is discharging.
However, the reported information is indeed correct, and the quirk
approach taken is inadequate and more thought is needed first.
Specifically:
1. It only quirks discharging state, not charging
2. There are so many different Asus products and DMI naming variants
within those product families that behave this way; Linux could
grow to quirk hundreds of products and still not even be close at
"winning" this battle.
3. Asus previously clarified that this behaviour is intentional. The
platform will periodically do a partial discharge/charge cycle
when the battery is full, because this is one way to extend the
lifetime of the battery (leaving a battery at 100% charge and
unused will decrease its usable capacity over time).
My understanding is that any decent consumer product will have
this behaviour, but it appears that Asus is different in that
they expose this info through ACPI.
However, the behaviour seems correct. The ACPI spec does not
suggest in that the platform should hide the truth. It lets you
report that the battery is full of charge, and discharging, and
with external power connected; and Asus does this.
4. In terms of not confusing the user, this seems like something that
could/should be handled by userspace, which can also detect these
same (accurate) conditions in the general case.
Revert this quirk before it gets included in a release, while we look
for better approaches.
Signed-off-by: Daniel Drake <drake@endlessm.com>
Acked-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Since commit 846c7dfc11 ("drm/atomic: Try to preserve the crtc enabled
state in drm_atomic_remove_fb, v2."), removing the last framebuffer will
no longer disable the corresponding pipeline, which causes the KMS core
to complain about leaked connectors on driver unbind.
Fix this by calling drm_atomic_helper_shutdown() on driver unbind, which
will cause all display pipelines to be shut down and therefore drop the
extra references on the connectors.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The regulator is controlled as part of runtime PM, so it should not be
additionally disabled from the ->exit() callback.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Detaching from an IOMMU group multiple times can lead to a crash. This
could potentially be fixed in the IOMMU driver, but it's easy to avoid
the subsequent detach operations in this driver, so do that as well.
Signed-off-by: Thierry Reding <treding@nvidia.com>
When immediate quiet bit is set in CSA, the entire channel is blocked
by the firmware. It is expected that all the MACs will evacuate the
channel and the phy will be eventually either moved or removed.
Currently, the phy context is just unreferenced and thus, the quiet
bit is kept set and it will be impossible to TX on this phy, if we
will need to reuse it in the future. This can be seen when doing a
channel switch with mode=1 (quiet) twice from channel X to Y and then
back to channel X.
Fix that, by moving the phy context to a default channel when not
referenced anymore.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
When starting aggregation, the code checks the status of the queue
allocated to the aggregation tid, which might not yet be allocated
and thus the queue index may be invalid.
Fix this by reserving a new queue in case the queue id is invalid.
While at it, clean up some unreachable code (a condition that is
already handled earlier) and remove all the non-DQA comments since
non-DQA mode is no longer supported.
Fixes: cf961e1662 ("iwlwifi: mvm: support dqa-mode agg on non-shared queue")
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
If the driver failed to resume from D3, it is possible that it has
no valid aux station. In such case, fw restart will end up in sending
station related commands with an invalid station id, which will
result in an assert.
Fix this by allocating a new station id for the aux station if it
does not have a valid id even in the case of fw restart.
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
When a queue is reserved for aggregation, the queue id is assigned
to the tid_data. This is fine since iwl_mvm_sta_tx_agg_oper()
takes care of allocating the queue before actual tx starts.
When the reservation is cancelled (e.g. when the AP declined the
aggregation request) the tid_data is not cleared. As a result,
following tx for this tid was trying to use an unallocated queue.
Fix this by setting the txq_id for the tid to invalid when unreserving
the queue.
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
After switching to a new channel, driver schedules session protection
time event in order to hear the beacon on the new channel.
The duration of the protection is two beacon intervals.
However, since we start to switch slightly before beacon with count 1, in
case we don't hear (or AP doesn't transmit) the very first beacon on the
new channel the protection ends without hearing any beacon at all.
At this stage the switch is not complete, the queues are closed and the
interface doesn't have quota yet or TBTT events. As the result, we are
stuck forever waiting for iwl_mvm_post_channel_switch() to be called.
Fix this by increasing the protection time to be 3 beacon intervals and
in addition drop the connection if the time event ends before we got any
beacon.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We shouldn't allow a tunnel to have IP_MAX_MTU as MTU, because
another IPv6 header is going on top of our packets. Without this
patch, we might end up building packets bigger than IP_MAX_MTU.
Fixes: b96f9afee4 ("ipv4/6: use core net MTU range checking")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
In vti6_link_config(), if MTU is already given on link creation
or change, validate and use it instead of recomputing it. To do
that, we need to propagate the knowledge that MTU was set by
userspace all the way down to vti6_link_config().
To keep this simple, vti6_dev_init() sets the new 'keep_mtu'
argument of vti6_link_config() to true: on initialization, we
don't have convenient access to netlink attributes there, but we
will anyway check whether dev->mtu is set in vti6_link_config().
If it's non-zero, it was set to the value of the IFLA_MTU
attribute during creation. Otherwise, determine a reasonable
value.
Fixes: ed1efb2aef ("ipv6: Add support for IPsec virtual tunnel interfaces")
Fixes: 53c81e95df ("ip6_vti: adjust vti mtu according to mtu of lower device")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
If a lower device is found, we don't need to subtract
LL_MAX_HEADER to calculate our MTU: just use its MTU, the link
layer headers are already taken into account by it.
If the lower device is not found, start from ETH_DATA_LEN
instead, and only in this case subtract a worst-case
LL_MAX_HEADER.
We then need to subtract our additional IPv6 header from the
calculation.
While at it, note that vti6 doesn't have a hardware header, so
it doesn't need to set dev->hard_header_len. And as
vti6_link_config() now always sets the MTU, there's no need to
set a default value in vti6_dev_setup().
This makes the behaviour consistent with IPv4 vti, after
commit a32452366b ("vti4: Don't count header length twice."),
which was accidentally reverted by merge commit f895f0cfbb
("Merge branch 'master' of
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec").
While commit 53c81e95df ("ip6_vti: adjust vti mtu according to
mtu of lower device") improved on the original situation, this
was still not ideal. As reported in that commit message itself,
if we start from an underlying veth MTU of 9000, we end up with
an MTU of 8832, that is, 9000 - LL_MAX_HEADER - sizeof(ipv6hdr).
This should simply be 8880, or 9000 - sizeof(ipv6hdr) instead:
we found the lower device (veth) and we know we don't have any
additional link layer header, so there's no need to subtract an
hypothetical worst-case number.
Fixes: 53c81e95df ("ip6_vti: adjust vti mtu according to mtu of lower device")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Don't hardcode a MTU value on vti tunnel initialization,
ip_tunnel_newlink() is able to deal with this already. See also
commit ffc2b6ee41 ("ip_gre: fix IFLA_MTU ignored on NEWLINK").
Fixes: 1181412c1a ("net/ipv4: VTI support new module for ip_vti.")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Otherwise, it's possible to specify invalid MTU values directly
on creation of a link (via 'ip link add'). This is already
prevented on subsequent MTU changes by commit b96f9afee4
("ipv4/6: use core net MTU range checking").
Fixes: c544193214 ("GRE: Refactor GRE tunneling code.")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
This re-introduces the effect of commit a32452366b ("vti4:
Don't count header length twice.") which was accidentally
reverted by merge commit f895f0cfbb ("Merge branch 'master' of
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec").
The commit message from Steffen Klassert said:
We currently count the size of LL_MAX_HEADER and struct iphdr
twice for vti4 devices, this leads to a wrong device mtu.
The size of LL_MAX_HEADER and struct iphdr is already counted in
ip_tunnel_bind_dev(), so don't do it again in vti_tunnel_init().
And this is still the case now: ip_tunnel_bind_dev() already
accounts for the header length of the link layer (not
necessarily LL_MAX_HEADER, if the output device is found), plus
one IP header.
For example, with a vti device on top of veth, with MTU of 1500,
the existing implementation would set the initial vti MTU to
1332, accounting once for LL_MAX_HEADER (128, included in
hard_header_len by vti) and twice for the same IP header (once
from hard_header_len, once from ip_tunnel_bind_dev()).
It should instead be 1480, because ip_tunnel_bind_dev() is able
to figure out that the output device is veth, so no additional
link layer header is attached, and will properly count one
single IP header.
The existing issue had the side effect of avoiding PMTUD for
most xfrm policies, by arbitrarily lowering the initial MTU.
However, the only way to get a consistent PMTU value is to let
the xfrm PMTU discovery do its course, and commit d6af1a31cc
("vti: Add pmtu handling to vti_xmit.") now takes care of local
delivery cases where the application ignores local socket
notifications.
Fixes: b9959fd3b0 ("vti: switch to new ip tunnel code")
Fixes: f895f0cfbb ("Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
The #sound-dai-cells DT property is required to describe link between
the HDMI IP block and the SoC's audio subsystem.
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
When unbinding/removing the driver, we will run into the following warnings:
[ 259.655198] fec 400d1000.ethernet: 400d1000.ethernet supply phy not found, using dummy regulator
[ 259.665065] fec 400d1000.ethernet: Unbalanced pm_runtime_enable!
[ 259.672770] fec 400d1000.ethernet (unnamed net_device) (uninitialized): Invalid MAC address: 00:00:00:00:00:00
[ 259.683062] fec 400d1000.ethernet (unnamed net_device) (uninitialized): Using random MAC address: f2:3e:93:b7:29:c1
[ 259.696239] libphy: fec_enet_mii_bus: probed
Avoid these warnings by balancing the runtime PM calls during fec_drv_remove().
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull x86/pti updates from Thomas Gleixner:
"Another set of melted spectrum updates:
- Iron out the last late microcode loading issues by actually
checking whether new microcode is present and preventing the CPU
synchronization to run into a timeout induced hang.
- Remove Skylake C2 from the microcode blacklist according to the
latest Intel documentation
- Fix the VM86 POPF emulation which traps if VIP is set, but VIF is
not. Enhance the selftests to catch that kind of issue
- Annotate indirect calls/jumps for objtool on 32bit. This is not a
functional issue, but for consistency sake its the right thing to
do.
- Fix a jump label build warning observed on SPARC64 which uses 32bit
storage for the code location which is casted to 64 bit pointer w/o
extending it to 64bit first.
- Add two new cpufeature bits. Not really an urgent issue, but
provides them for both x86 and x86/kvm work. No impact on the
current kernel"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode: Fix CPU synchronization routine
x86/microcode: Attempt late loading only when new microcode is present
x86/speculation: Remove Skylake C2 from Speculation Control microcode blacklist
jump_label: Fix sparc64 warning
x86/speculation, objtool: Annotate indirect calls/jumps for objtool on 32-bit kernels
x86/vm86/32: Fix POPF emulation
selftests/x86/entry_from_vm86: Add test cases for POPF
selftests/x86/entry_from_vm86: Exit with 1 if we fail
x86/cpufeatures: Add Intel PCONFIG cpufeature
x86/cpufeatures: Add Intel Total Memory Encryption cpufeature
Pull x86 fix from Thomas Gleixner:
"A single fix for vmalloc_fault() which uses p*d_huge() unconditionally
whether CONFIG_HUGETLBFS is set or not. In case of CONFIG_HUGETLBFS=n
this results in a crash as p*d_huge() returns 0 in that case"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix vmalloc_fault to use pXd_large
Pull irq fixes from Thomas Gleixner:
"Three fixes for irq chip drivers:
- Make sure the allocations in the GIC-V3 ITS driver are large enough
to accomodate the interrupt space
- Fix a misplaced __iomem annotation which causes a splat of 26
sparse warnings
- Remove an unused function in the IMX GPCV2 driver which causes
build warnings"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/irq-imx-gpcv2: Remove unused function
irqchip/gic-v3-its: Ensure nr_ites >= nr_lpis
irqchip/gic-v3-its: Fix misplaced __iomem annotations
Pull EFI fix from Thomas Gleixner:
"A single fix to prevent partially initialized pointers in mixed mode
(64bit kernel on 32bit UEFI)"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/libstub/tpm: Initialize pointer variables to zero for mixed mode
Pull KVM fixes from Paolo Bonzini:
"PPC:
- fix bug leading to lost IPIs and smp_call_function_many() lockups
on POWER9
ARM:
- locking fix
- reset fix
- GICv2 multi-source SGI injection fix
- GICv2-on-v3 MMIO synchronization fix
- make the console less verbose.
x86:
- fix device passthrough on AMD SME"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Fix device passthrough when SME is active
kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3
KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
KVM: arm/arm64: Reduce verbosity of KVM init log
KVM: arm/arm64: Reset mapped IRQs on VM reset
KVM: arm/arm64: Avoid vcpu_load for other vcpu ioctls than KVM_RUN
KVM: arm/arm64: vgic: Add missing irq_lock to vgic_mmio_read_pending
KVM: PPC: Book3S HV: Fix trap number return from __kvmppc_vcore_entry
batadv_check_unicast_ttvn may redirect a packet to itself or another
originator. This involves rewriting the ttvn and the destination address in
the batadv unicast header. These field were not yet pulled (with skb rcsum
update) and thus any change to them also requires a change in the receive
checksum.
Reported-by: Matthias Schiffer <mschiffer@universe-factory.net>
Fixes: a73105b8d4 ("batman-adv: improved client announcement mechanism")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
'Commit 45dac1d6ea ("vmxnet3: Changes for vmxnet3 adapter version 2
(fwd)")' introduced a flag "lro" in structure vmxnet3_adapter which is
used to indicate whether LRO is enabled or not. However, the patch
did not set the flag and hence it was never exercised.
So, when LRO is enabled, it resulted in poor TCP performance due to
delayed acks. This issue is seen with packets which are larger than
the mss getting a delayed ack rather than an immediate ack, thus
resulting in high latency.
This patch removes the lro flag and directly uses device features
against NETIF_F_LRO to check if lro is enabled.
Fixes: 45dac1d6ea ("vmxnet3: Changes for vmxnet3 adapter version 2 (fwd)")
Reported-by: Rachel Lunnon <rachel_lunnon@stormagic.com>
Signed-off-by: Ronak Doshi <doshir@vmware.com>
Acked-by: Shrikrishna Khare <skhare@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The field txNumDeferred is used by the driver to keep track of the number
of packets it has pushed to the emulation. The driver increments it on
pushing the packet to the emulation and the emulation resets it to 0 at
the end of the transmit.
There is a possibility of a race either when (a) ESX is under heavy load or
(b) workload inside VM is of low packet rate.
This race results in xmit hangs when network coalescing is disabled. This
change creates a local copy of txNumDeferred and uses it to perform ring
arithmetic.
Reported-by: Noriho Tanaka <ntanaka@vmware.com>
Signed-off-by: Ronak Doshi <doshir@vmware.com>
Acked-by: Shrikrishna Khare <skhare@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti says:
====================
net/sched: fix NULL dereference in the error path of .init()
with several TC actions it's possible to see NULL pointer dereference,
when the .init() function calls tcf_idr_alloc(), fails at some point and
then calls tcf_idr_release(): this series fixes all them introducing
non-NULL tests in the .cleanup() function.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver implementation returns support for private flags, while
no private flags are present. When asked for the number of private
flags it returns the number of statistic flag names.
Fix this by returning EOPNOTSUPP for not implemented ethtool flags.
Signed-off-by: Matthias Brugger <mbrugger@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some HP laptops have a mute mute LED controlled by a pin VREF. The
Realtek codec driver updates the VREF via vmaster hook by calling
snd_hda_set_pin_ctl_cache().
This works fine as long as the driver is running in a normal mode.
However, when the VREF change happens during the codec being in
runtime PM suspend, the regmap access will skip and postpone the
actual register change. This ends up with the unchanged LED status
until the next runtime PM resume even if you change the Master mute
switch. (Interestingly, the machine keeps the LED status even after
the codec goes into D3 -- but it's another story.)
For improving this usability, let the driver temporarily powering up /
down only during the pin VREF change. This can be achieved easily by
wrapping the call with snd_hda_power_up_pm() / *_down_pm().
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199073
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
In commit 9ffcc3725f ("mlxsw: spectrum: Allow packets to be trapped
from any PG") I fixed a problem where packets could not be trapped to
the CPU due to exceeded shared buffer quotas. The mentioned commit
explains the problem in detail.
The problem was fixed by assigning a minimum quota for the CPU port and
the traffic class used for scheduling traffic to the CPU.
However, commit 117b0dad2d ("mlxsw: Create a different trap group list
for each device") assigned different traffic classes to different
packet types and rendered the fix useless.
Fix the problem by assigning a minimum quota for the CPU port and all
the traffic classes that are currently in use.
Fixes: 117b0dad2d ("mlxsw: Create a different trap group list for each device")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Eddie Shklaer <eddies@mellanox.com>
Tested-by: Eddie Shklaer <eddies@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
syzbot reported one use-after-free in pfifo_fast_enqueue() [1]
Issue here is that we can not reuse skb after a successful skb_array_produce()
since another cpu might have consumed it already.
I believe a similar problem exists in try_bulk_dequeue_skb_slow()
in case we put an skb into qdisc_enqueue_skb_bad_txq() for lockless qdisc.
[1]
BUG: KASAN: use-after-free in qdisc_pkt_len include/net/sch_generic.h:610 [inline]
BUG: KASAN: use-after-free in qdisc_qstats_cpu_backlog_inc include/net/sch_generic.h:712 [inline]
BUG: KASAN: use-after-free in pfifo_fast_enqueue+0x4bc/0x5e0 net/sched/sch_generic.c:639
Read of size 4 at addr ffff8801cede37e8 by task syzkaller717588/5543
CPU: 1 PID: 5543 Comm: syzkaller717588 Not tainted 4.16.0-rc4+ #265
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report+0x23c/0x360 mm/kasan/report.c:412
__asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432
qdisc_pkt_len include/net/sch_generic.h:610 [inline]
qdisc_qstats_cpu_backlog_inc include/net/sch_generic.h:712 [inline]
pfifo_fast_enqueue+0x4bc/0x5e0 net/sched/sch_generic.c:639
__dev_xmit_skb net/core/dev.c:3216 [inline]
Fixes: c5ad119fb6 ("net: sched: pfifo_fast use skb_array")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+ed43b6903ab968b16f54@syzkaller.appspotmail.com
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just when I had decided that flush_cache_range() was always called with
a valid context, Helge reported two cases where the
"BUG_ON(!vma->vm_mm->context);" was hit on the phantom buildd:
kernel BUG at /mnt/sdb6/linux/linux-4.15.4/arch/parisc/kernel/cache.c:587!
CPU: 1 PID: 3254 Comm: kworker/1:2 Tainted: G D 4.15.0-1-parisc64-smp #1 Debian 4.15.4-1+b1
Workqueue: events free_ioctx
IAOQ[0]: flush_cache_range+0x164/0x168
IAOQ[1]: flush_cache_page+0x0/0x1c8
RP(r2): unmap_page_range+0xae8/0xb88
Backtrace:
[<00000000404a6980>] unmap_page_range+0xae8/0xb88
[<00000000404a6ae0>] unmap_single_vma+0xc0/0x188
[<00000000404a6cdc>] zap_page_range_single+0x134/0x1f8
[<00000000404a702c>] unmap_mapping_range+0x1cc/0x208
[<0000000040461518>] truncate_pagecache+0x98/0x108
[<0000000040461624>] truncate_setsize+0x9c/0xb8
[<00000000405d7f30>] put_aio_ring_file+0x80/0x100
[<00000000405d803c>] aio_free_ring+0x8c/0x290
[<00000000405d82c0>] free_ioctx+0x80/0x180
[<0000000040284e6c>] process_one_work+0x21c/0x668
[<00000000402854c4>] worker_thread+0x20c/0x778
[<0000000040291d44>] kthread+0x2d4/0x2e0
[<0000000040204020>] end_fault_vector+0x20/0xc0
This indicates that we need to handle the no context case in
flush_cache_range() as we do in flush_cache_mm().
In thinking about this, I realized that we don't need to flush the TLB
when there is no context. So, I added context checks to the large flush
cases in flush_cache_mm() and flush_cache_range(). The large flush case
occurs frequently in flush_cache_mm() and the change should improve fork
performance.
The v2 version of this change removes the BUG_ON from flush_cache_page()
by skipping the TLB flush when there is no context. I also added code
to flush the TLB in flush_cache_mm() and flush_cache_range() when we
have a context that's not current. Now all three routines handle TLB
flushes in a similar manner.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
Pull btrfs fixes from David Sterba:
"There's an important revert in this pull request that needs to go to
stable as it causes a corruption on big endian machines.
The other fix is for FIEMAP incorrectly reporting shared extents
before a sync and one fix for a crash in raid56.
So far we got only one report about the BE corruption, the stable
kernels were out for like a week, so hopefully the scope of the damage
is low"
* tag 'for-4.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Revert "btrfs: use proper endianness accessors for super_copy"
btrfs: add missing initialization in btrfs_check_shared
btrfs: Fix NULL pointer exception in find_bio_stripe
Pull microblaze fixes from Michal Simek:
- Use NO_BOOTMEM to fix boot issue
- Fix opt lib endian dependencies
* tag 'microblaze-4.16-rc6' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: switch to NO_BOOTMEM
microblaze: remove unused alloc_maybe_bootmem
microblaze: Setup dependencies for ASM optimized lib functions
Emanuel reported an issue with a hang during microcode update because my
dumb idea to use one atomic synchronization variable for both rendezvous
- before and after update - was simply bollocks:
microcode: microcode_reload_late: late_cpus: 4
microcode: __reload_late: cpu 2 entered
microcode: __reload_late: cpu 1 entered
microcode: __reload_late: cpu 3 entered
microcode: __reload_late: cpu 0 entered
microcode: __reload_late: cpu 1 left
microcode: Timeout while waiting for CPUs rendezvous, remaining: 1
CPU1 above would finish, leave and the others will still spin waiting for
it to join.
So do two synchronization atomics instead, which makes the code a lot more
straightforward.
Also, since the update is serialized and it also takes quite some time per
microcode engine, increase the exit timeout by the number of CPUs on the
system.
That's ok because the moment all CPUs are done, that timeout will be cut
short.
Furthermore, panic when some of the CPUs timeout when returning from a
microcode update: we can't allow a system with not all cores updated.
Also, as an optimization, do not do the exit sync if microcode wasn't
updated.
Reported-by: Emanuel Czirai <xftroxgpx@protonmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Emanuel Czirai <xftroxgpx@protonmail.com>
Tested-by: Ashok Raj <ashok.raj@intel.com>
Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lkml.kernel.org/r/20180314183615.17629-2-bp@alien8.de
Pull drm fixes from Dave Airlie:
"i915, amd and nouveau fixes.
i915:
- backlight fix for some panels
- pm fix
- fencing fix
- some GVT fixes
amdgpu:
- backlight fix across suspend/resume
- object destruction ordering issue fix
- displayport fix
nouveau:
- two backlight fixes
- fix for some lockups
Pretty quiet week, seems like everyone was fixing backlights"
* tag 'drm-fixes-for-v4.16-rc6' of git://people.freedesktop.org/~airlied/linux:
drm/nouveau/bl: fix backlight regression
drm/nouveau/bl: Fix oops on driver unbind
drm/nouveau/mmu: ALIGN_DOWN correct variable
drm/i915/gvt: fix user copy warning by whitelist workload rb_tail field
drm/i915/gvt: Correct the privilege shadow batch buffer address
drm/amdgpu/dce: Don't turn off DP sink when disconnected
drm/amdgpu: save/restore backlight level in legacy dce code
drm/radeon: fix prime teardown order
drm/amdgpu: fix prime teardown order
drm/i915: Kick the rps worker when changing the boost frequency
drm/i915: Only prune fences after wait-for-all
drm/i915: Enable VBT based BL control for DP
drm/i915/gvt: keep oa config in shadow ctx
drm/i915/gvt: Add runtime_pm_get/put into gvt_switch_mmio
Checking for 0 is insufficient: when an SKB without a batadv header, but
with a VLAN header is received, hdr_size will be 4, making the following
code interpret the Ethernet header as a batadv header.
Fixes: be1db4f661 ("batman-adv: make the Distributed ARP Table vlan aware")
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
batadv_check_unicast_ttvn() calls skb_cow(), so pointers into the SKB data
must be (re)set after calling it. The ethhdr variable is dropped
altogether.
Fixes: 7cdcf6dddc ("batman-adv: add UNICAST_4ADDR packet type")
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The MDIO busses are switch properties and so should be inside the
switch node. Fix the examples in the binding document.
Reported-by: 尤晓杰 <yxj790222@163.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Fixes: a3c53be55c ("net: dsa: mv88e6xxx: Support multiple MDIO busses")
Signed-off-by: David S. Miller <davem@davemloft.net>
When errors are enqueued to the error queue via sock_queue_err_skb()
function, it is possible that the waiting application is not notified.
Calling 'sk->sk_data_ready()' would not notify applications that
selected only POLLERR events in poll() (for example).
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Randy E. Witt <randy.e.witt@intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nlmsg_multicast() consumes always the skb, thus the original skb must be
freed only when this function is called with a clone.
Fixes: cb9f7a9a5c ("netlink: ensure to loop over all netns in genlmsg_multicast_allns()")
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Link updates were not reported to qedr correctly.
Leading to cases where a link could be down, but qedr
would see it as up.
In addition, once qede was loaded, link state would be up,
regardless of the actual link state.
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Kalderon says:
====================
qed: iWARP related fixes
This series contains two fixes related to iWARP flow.
====================
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
FW workaround. The iWARP LL2 connection did not expect TCP packets
to arrive on it's connection. The fix drops any non-tcp packets
Fixes b5c29ca ("qed: iWARP CM - setup a ll2 connection for handling
SYN packets")
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a corner case in the MPA unalign flow where a FPDU header is
split over two tcp segments. The length of the first fragment in this
case was not initialized properly and should be '1'
Fixes: c7d1d839 ("qed: Add support for MPA header being split over two tcp packets")
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need for complex checking between the last consumed index
and current consumed index, a simple subtraction will do.
This also eliminates the possibility of a permanent transmit queue stall
under the following conditions:
- one CPU bursts ring->size worth of traffic (up to 256 buffers), to the
point where we run out of free descriptors, so we stop the transmit
queue at the end of bcm_sysport_xmit()
- because of our locking, we have the transmit process disable
interrupts which means we can be blocking the TX reclamation process
- when TX reclamation finally runs, we will be computing the difference
between ring->c_index (last consumed index by SW) and what the HW
reports through its register
- this register is masked with (ring->size - 1) = 0xff, which will lead
to stripping the upper bits of the index (register is 16-bits wide)
- we will be computing last_tx_cn as 0, which means there is no work to
be done, and we never wake-up the transmit queue, leaving it
permanently disabled
A practical example is e.g: ring->c_index aka last_c_index = 12, we
pushed 256 entries, HW consumer index = 268, we mask it with 0xff = 12,
so last_tx_cn == 0, nothing happens.
Fixes: 80105befdb ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Toshiaki Makita says:
====================
Fix vlan untag and insertion for bridge and vlan with reorder_hdr off
As Brandon Carpenter reported[1], sending non-vlan-offloaded packets from
bridge devices ends up with corrupted packets. He narrowed down this problem
and found that the root cause is in skb_reorder_vlan_header().
While I was working on fixing this problem, I found that the function does
not work properly for double tagged packets with reorder_hdr off as well.
Patch 1 fixes these 2 problems in skb_reorder_vlan_header().
And it turned out that fixing skb_reorder_vlan_header() is not sufficient
to receive double tagged packets with reorder_hdr off while I was testing the
fix. Vlan tags got out of order when vlan devices with reorder_hdr disabled
were stacked. Patch 2 fixes this problem.
[1] https://www.spinics.net/lists/linux-ethernet-bridging/msg07039.html
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
With reorder header off, received packets are untagged in skb_vlan_untag()
called from within __netif_receive_skb_core(), and later the tag will be
inserted back in vlan_do_receive().
This caused out of order vlan headers when we create a vlan device on top
of another vlan device, because vlan_do_receive() inserts a tag as the
outermost vlan tag. E.g. the outer tag is first removed in skb_vlan_untag()
and inserted back in vlan_do_receive(), then the inner tag is next removed
and inserted back as the outermost tag.
This patch fixes the behaviour by inserting the inner tag at the right
position.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we have a bridge with vlan_filtering on and a vlan device on top of
it, packets would be corrupted in skb_vlan_untag() called from
br_dev_xmit().
The problem sits in skb_reorder_vlan_header() used in skb_vlan_untag(),
which makes use of skb->mac_len. In this function mac_len is meant for
handling rx path with vlan devices with reorder_header disabled, but in
tx path mac_len is typically 0 and cannot be used, which is the problem
in this case.
The current code even does not properly handle rx path (skb_vlan_untag()
called from __netif_receive_skb_core()) with reorder_header off actually.
In rx path single tag case, it works as follows:
- Before skb_reorder_vlan_header()
mac_header data
v v
+-------------------+-------------+------+----
| ETH | VLAN | ETH |
| ADDRS | TPID | TCI | TYPE |
+-------------------+-------------+------+----
<-------- mac_len --------->
<------------->
to be removed
- After skb_reorder_vlan_header()
mac_header data
v v
+-------------------+------+----
| ETH | ETH |
| ADDRS | TYPE |
+-------------------+------+----
<-------- mac_len --------->
This is ok, but in rx double tag case, it corrupts packets:
- Before skb_reorder_vlan_header()
mac_header data
v v
+-------------------+-------------+-------------+------+----
| ETH | VLAN | VLAN | ETH |
| ADDRS | TPID | TCI | TPID | TCI | TYPE |
+-------------------+-------------+-------------+------+----
<--------------- mac_len ---------------->
<------------->
should be removed
<--------------------------->
actually will be removed
- After skb_reorder_vlan_header()
mac_header data
v v
+-------------------+------+----
| ETH | ETH |
| ADDRS | TYPE |
+-------------------+------+----
<--------------- mac_len ---------------->
So, two of vlan tags are both removed while only inner one should be
removed and mac_header (and mac_len) is broken.
skb_vlan_untag() is meant for removing the vlan header at (skb->data - 2),
so use skb->data and skb->mac_header to calculate the right offset.
Reported-by: Brandon Carpenter <brandon.carpenter@cypherpath.com>
Fixes: a6e18ff111 ("vlan: Fix untag operations of stacked vlans with REORDER_HEADER off")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 3c181c12c4.
The offending patch was merged in 4.16-rc4 and was promptly applied to
stable kernels 4.14.25 and 4.15.8.
The patch causes a corruption in several superblock items on big-endian
machines because of messed up endianity conversions. The damage is
manually repairable. A filesystem cannot be mounted again after it has
been unmounted once.
We do a full revert and not a fixup so stable can pick that patch ASAP.
Fixes: 3c181c12c4 ("btrfs: use proper endianness accessors for super_copy")
Link: https://lkml.kernel.org/r/1521139304@msgid.manchmal.in-ulm.de
CC: stable@vger.kernel.org # 4.14+
Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Signed-off-by: David Sterba <dsterba@suse.com>
When using device passthrough with SME active, the MMIO range that is
mapped for the device should not be mapped encrypted. Add a check in
set_spte() to insure that a page is not mapped encrypted if that page
is a device MMIO page as indicated by kvm_is_mmio_pfn().
Cc: <stable@vger.kernel.org> # 4.14.x-
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Testing brcmfmac with more recent firmwares resulted in AP interfaces
not working in some specific setups. Debugging resulted in discovering
support for IAPP in Broadcom's firmwares.
Older firmwares were only generating 802.11f frames. Newer ones like:
1) 10.10 (TOB) (r663589)
2) 10.10.122.20 (r683106)
for 4366b1 and 4366c0 respectively seem to also /respect/ 802.11f frames
in the Tx path by performing a STA disassociation.
This obsoleted standard and its implementation is something that:
1) Most people don't need / want to use
2) Can allow local DoS attacks
3) Breaks AP interfaces in some specific bridge setups
To solve issues it can cause this commit modifies brcmfmac to drop IAPP
packets. If affects:
1) Rx path: driver won't be sending these unwanted packets up.
2) Tx path: driver will reject packets that would trigger STA
disassociation perfromed by a firmware (possible local DoS attack).
It appears there are some Broadcom's clients/users who care about this
feature despite the drawbacks. They can switch it on using a new module
param.
This change results in only two more comparisons (check for module param
and check for Ethernet packet length) for 99.9% of packets. Its overhead
should be very minimal.
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Third batch of iwlwifi fixes intended for 4.16:
* Fix an issue with the multicast queue;
* Fix IGTK handling;
* Fix some missing return value checks;
* Add support for a HW workaround for issues on some platforms;
Microblaze doesn't set CONFIG_NO_BOOTMEM and so memblock_virt_alloc()
doesn't work for CONFIG_HAVE_MEMBLOCK && !CONFIG_NO_BOOTMEM.
Similar change was already done by others architectures
"ARM: mm: Remove bootmem code and switch to NO_BOOTMEM"
(sha1: 84f452b1e8)
or
"openrisc: Consolidate setup to use memblock instead of bootmem"
(sha1: 266c7fad15)
or
"parisc: Drop bootmem and switch to memblock"
(sha1: 4fe9e1d957)
or
"powerpc: Remove bootmem allocator"
(sha1: 10239733ee)
or
"s390/mm: Convert bootmem to memblock"
(sha1: 50be634507)
or
"sparc64: Convert over to NO_BOOTMEM."
(sha1: 625d693e97)
or
"xtensa: drop sysmem and switch to memblock"
(sha1: 0e46c1115f)
Issue was introduced by:
"of/fdt: use memblock_virt_alloc for early alloc"
(sha1: 0fa1c57934)
Signed-off-by: Rob Herring <robh@kernel.org>
Tested-by: Alvaro Gamez Machado <alvaro.gamez@hazent.com>
Tested-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
The patch:
"microblaze: Setup proper dependency for optimized lib functions"
(sha1: 7b6ce52be3)
didn't setup all dependencies properly.
Optimized lib functions in C are also present for little endian
and optimized library functions in assembler are implemented only for
big endian version.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Some devices use a shared clock which is very sensitive to variations
and cause trouble in some situations. We need to set a bit in the phy
configuration to indicate that to the FW. To make this generic, add a
extra_phy_config_flags element to the device configuration and OR it
into the phy_cfg before sending it to the firmware. And also create a
set of configurations for devices that use shared clocks and need this
extra bit to be set.
Fixes: c62446d2b0 ("iwlwifi: add new 9460 series PCI IDs")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The earlier patch called the station add functions but didn't
assign their return value to the ret variable, so that the
checks for it were meaningless. Fix that.
Found by smatch:
.../mac80211.c:2560 iwl_mvm_start_ap_ibss() warn: we tested 'ret' before and it was 'false'
.../mac80211.c:2563 iwl_mvm_start_ap_ibss() warn: we tested 'ret' before and it was 'false'
Fixes: 3a89411cd31c ("iwlwifi: mvm: fix assert 0x2B00 on older FWs")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Currently when an IGTK is set for an AP, it is set as a regular key.
Since the cipher is set to CMAC, the STA_KEY_FLG_EXT flag is added to
the host command, which causes assert 0x253D on NICs that do not support
this.
Fixes: 85aeb58cec ("iwlwifi: mvm: Enable security on new TX API")
Signed-off-by: Beni Lev <beni.lev@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The tid being used for the queue (cab_queue) for the MCAST
station has been changed recently to be 0 (for BE).
The flush path still flushed only the special tid (15)
which means that the firmware wasn't flushing the right
queue and we could get a firmware crash upon remove
station if we had an MCAST packet on the ring.
The current code that flushes queues for a station only
differentiates between internal stations (stations that
aren't instantiated in mac80211, like the MCAST station)
and the non-internal ones.
Internal stations can be either: BCAST (beacons), MCAST
(for cab_queue), GENERAL_PURPOSE (p2p dev, and sniffer
injection). The internal stations can use different tids.
To make the code simpler, just flush all the tids always
and add the special internal tid (15) for internal
stations. The firmware will know how to handle this even
if we hadn't any queue mapped that that tid.
Fixes: e340c1a6ef4b ("iwlwifi: mvm: Correctly set the tid for mcast queue")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In the xfrm_local_error, rcu_read_unlock should be called when afinfo
is not NULL. because xfrm_state_get_afinfo calls rcu_read_unlock
if afinfo is NULL.
Fixes: af5d27c4e1 ("xfrm: remove xfrm_state_put_afinfo")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Only GVT fixes:
- Two warnings fix for runtime pm and usr copy (Xiong, Zhenyu)
- OA context fix for vGPU profiling (Min)
- privilege batch buffer reloc fix (Fred)
* tag 'drm-intel-fixes-2018-03-15' of git://anongit.freedesktop.org/drm/drm-intel:
drm/i915/gvt: fix user copy warning by whitelist workload rb_tail field
drm/i915/gvt: Correct the privilege shadow batch buffer address
drm/i915/gvt: keep oa config in shadow ctx
drm/i915/gvt: Add runtime_pm_get/put into gvt_switch_mmio
Commit 99759869fa "acpi: Add acpi_map_pxm_to_online_node()" added
support for mapping a given proximity to its nearest, by SLIT distance,
online node. However, it sometimes returns unexpected results due to the
fact that it switches from comparing the PXM node to the last node that
was closer than the current max.
for_each_online_node(n) {
dist = node_distance(node, n);
if (dist < min_dist) {
min_dist = dist;
node = n; <---- from this point we're using the
wrong node for node_distance()
Fixes: 99759869fa ("acpi: Add acpi_map_pxm_to_online_node()")
Cc: <stable@vger.kernel.org>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Pull vfs fixes from Al Viro:
- backport-friendly part of lock_parent() race fix
- a fix for an assumption in the heurisic used by path_connected() that
is not true on NFS
- livelock fixes for d_alloc_parallel()
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: Teach path_connected to handle nfs filesystems with multiple roots.
fs: dcache: Use READ_ONCE when accessing i_dir_seq
fs: dcache: Avoid livelock between d_alloc_parallel and __d_add
lock_parent() needs to recheck if dentry got __dentry_kill'ed under it
Unbinding nouveau on a dual GPU MacBook Pro oopses because we iterate
over the bl_connectors list in nouveau_backlight_exit() but skipped
initializing it in nouveau_backlight_init(). Stacktrace for posterity:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: nouveau_backlight_exit+0x2b/0x70 [nouveau]
nouveau_display_destroy+0x29/0x80 [nouveau]
nouveau_drm_unload+0x65/0xe0 [nouveau]
drm_dev_unregister+0x3c/0xe0 [drm]
drm_put_dev+0x2e/0x60 [drm]
nouveau_drm_device_remove+0x47/0x70 [nouveau]
pci_device_remove+0x36/0xb0
device_release_driver_internal+0x157/0x220
driver_detach+0x39/0x70
bus_remove_driver+0x51/0xd0
pci_unregister_driver+0x2a/0xa0
nouveau_drm_exit+0x15/0xfb0 [nouveau]
SyS_delete_module+0x18c/0x290
system_call_fast_compare_end+0xc/0x6f
Fixes: b53ac1ee12 ("drm/nouveau/bl: Do not register interface if Apple GMUX detected")
Cc: stable@vger.kernel.org # v4.10+
Cc: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Commit 7110c89bb8852ff8b0f88ce05b332b3fe22bd11e ("mmu: swap out round
for ALIGN") replaced two calls to round/rounddown with ALIGN/ALIGN_DOWN,
but erroneously applied ALIGN_DOWN to a different variable (addr) and left
intended variable (tail) not rounded/ALIGNed.
As a result screen corruption, X lockups are observable. An example of kernel
log of affected system with NV98 card where it was bisected:
nouveau 0000:01:00.0: gr: TRAP_M2MF 00000002 [IN]
nouveau 0000:01:00.0: gr: TRAP_M2MF 00320951 400007c0 00000000 04000000
nouveau 0000:01:00.0: gr: 00200000 [] ch 1 [000fbbe000 DRM] subc 4 class 5039
mthd 0100 data 00000000
nouveau 0000:01:00.0: fb: trapped read at 0040000000 on channel 1
[0fbbe000 DRM]
engine 00 [PGRAPH] client 03 [DISPATCH] subclient 04 [M2M_IN] reason 00000006
[NULL_DMAOBJ]
Fixes bug 105173 ("[MCP79][Regression] Unhandled NULL pointer dereference in
nvkm_object_unmap since kernel 4.15")
https://bugs.freedesktop.org/show_bug.cgi?id=105173
Fixes: 7110c89bb885 ("mmu: swap out round for ALIGN ")
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Maris Nartiss <maris.nartiss@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org # v4.15+
On nfsv2 and nfsv3 the nfs server can export subsets of the same
filesystem and report the same filesystem identifier, so that the nfs
client can know they are the same filesystem. The subsets can be from
disjoint directory trees. The nfsv2 and nfsv3 filesystems provides no
way to find the common root of all directory trees exported form the
server with the same filesystem identifier.
The practical result is that in struct super s_root for nfs s_root is
not necessarily the root of the filesystem. The nfs mount code sets
s_root to the root of the first subset of the nfs filesystem that the
kernel mounts.
This effects the dcache invalidation code in generic_shutdown_super
currently called shrunk_dcache_for_umount and that code for years
has gone through an additional list of dentries that might be dentry
trees that need to be freed to accomodate nfs.
When I wrote path_connected I did not realize nfs was so special, and
it's hueristic for avoiding calling is_subdir can fail.
The practical case where this fails is when there is a move of a
directory from the subtree exposed by one nfs mount to the subtree
exposed by another nfs mount. This move can happen either locally or
remotely. With the remote case requiring that the move directory be cached
before the move and that after the move someone walks the path
to where the move directory now exists and in so doing causes the
already cached directory to be moved in the dcache through the magic
of d_splice_alias.
If someone whose working directory is in the move directory or a
subdirectory and now starts calling .. from the initial mount of nfs
(where s_root == mnt_root), then path_connected as a heuristic will
not bother with the is_subdir check. As s_root really is not the root
of the nfs filesystem this heuristic is wrong, and the path may
actually not be connected and path_connected can fail.
The is_subdir function might be cheap enough that we can call it
unconditionally. Verifying that will take some benchmarking and
the result may not be the same on all kernels this fix needs
to be backported to. So I am avoiding that for now.
Filesystems with snapshots such as nilfs and btrfs do something
similar. But as the directory tree of the snapshots are disjoint
from one another and from the main directory tree rename won't move
things between them and this problem will not occur.
Cc: stable@vger.kernel.org
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Fixes: 397d425dc2 ("vfs: Test for and handle paths that are unreachable from their mnt_root")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
pmdp_invalidate() was changed to update the pmd atomically
(to not lose dirty/access bits) and return the original pmd
value.
However, in doing so, we lost a lot of the essential work that
set_pmd_at() does, namely to update hugepage mapping counts and
queuing up the batched TLB flush entry.
Thus we were not flushing entries out of the TLB when making
such PMD changes.
Fix this by abstracting the accounting work of set_pmd_at() out into a
separate function, and call it from pmdp_establish().
Fixes: a8e654f01c ("sparc64: update pmdp_invalidate() to return old pmd value")
Signed-off-by: David S. Miller <davem@davemloft.net>
kvm/arm fixes for 4.16, take 2
- Peace of mind locking fix in vgic_mmio_read_pending
- Allow hw-mapped interrupts to be reset when the VM resets
- Fix GICv2 multi-source SGI injection
- Fix MMIO synchronization for GICv2 on v3 emulation
- Remove excess verbosity on the console
The IRQ output of the bcm bt-device is really a level IRQ signal, which
signals a logical high as long as the device's buffer contains data. Since
the draining in the buffer is done in the tty driver, we cannot (easily)
wait in a threaded interrupt handler for the draining, after which the
IRQ should go low again.
So instead we treat the IRQ as an edge interrupt. This opens the window
for a theoretical race where we wakeup, read some data and then autosuspend
*before* the IRQ has gone (logical) low, followed by the device just at
that moment receiving more data, causing the IRQ to stay high and we never
see an edge.
Since we call pm_runtime_mark_last_busy() on every received byte, there
should be plenty time for the IRQ to go (logical) low before we ever
suspend, so this should never happen, but after commit 43fff76834
("Bluetooth: hci_bcm: Streamline runtime PM code"), which has been reverted
since, this was actually happening causing the device to get stuck in
runtime suspend.
The bcm bt-device actually has a workaround for this, if we set the
pulsed_host_wake flag in the sleep parameters, then the device monitors
if the host is draining the buffer and if not then after a timeout the
device will pulse the IRQ line, causing us to see an edge, fixing the
stuck in suspend condition.
This commit sets the pulsed_host_wake flag to fix the (mostly theoretical)
race caused by us treating the IRQ as an edge IRQ.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This reverts commit 43fff76834 ("Bluetooth: hci_bcm: Streamline runtime
PM code"). The commit msg for this commit states "No functional change
intended.", but replacing:
pm_runtime_get();
pm_runtime_mark_last_busy();
pm_runtime_put_autosuspend();
with:
pm_request_resume();
Does result in a functional change, pm_request_resume() only calls
pm_runtime_mark_last_busy() if the device was suspended before the call.
This results in the following happening:
1) Device is runtime suspended
2) Device drives host_wake IRQ logically high as it starts receiving data
3) bcm_host_wake() gets called, causes the device to runtime-resume,
current time gets marked as last_busy time
4) After 5 seconds the autosuspend timer expires and the dev autosuspends
as no one has been calling pm_runtime_mark_last_busy(), the device was
resumed during those 5 seconds, so all the pm_request_resume() calls
while receiving data and/or bcm_host_wake() calls were nops
5) If 4) happens while the device has (just received) data in its buffer to
be read by the host the IRQ line is *already* / still logically high
when we autosuspend and since we use an edge triggered IRQ, the IRQ
will never trigger, causing the device to get stuck in suspend
Therefor this commit has to be reverted, so that we avoid the device
getting stuck in suspend.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
According to the Aspeed specification, the reset and enable sequence
should be done when the clock is stopped. The specification doesn't
define behavior if the reset is done while the clock is enabled.
From testing on the AST2500, the LPC Controller has problems if the
clock is reset while enabled.
Therefore, check whether the clock is enabled or not before performing
the reset and enable sequence in the Aspeed clock driver.
Reported-by: Lei Yu <mine260309@gmail.com>
Signed-off-by: Eddie James <eajames@linux.vnet.ibm.com>
Fixes: 15ed8ce5f8 ("clk: aspeed: Register gated clocks")
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Some of the Aspeed clocks are disabled by setting the relevant bit in
the "clock stop control" register to one, while others are disabled by
setting their bit to zero. The driver already uses a flag per gate to
identify this behavior, but doesn't apply it in the clock is_enabled
function.
Use the existing gate flag to correctly return whether or not a clock
is enabled in the aspeed_clk_is_enabled function.
Signed-off-by: Eddie James <eajames@linux.vnet.ibm.com>
Fixes: 6671507f0f ("clk: aspeed: Handle inverse polarity of USB port 1 clock gate")
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Pull sound fixes from Takashi Iwai:
"A series of small fixes in ASoC, HD-audio and core stuff:
- a UAF fix in ALSA PCM core
- yet more hardening for ALSA sequencer
- a regression fix for the previous HD-audio power_save option change
- various ASoC codec fixes (sgtl5000, rt5651, hdmi-codec, wm_adsp)
- minor ASoC platform fixes (AMD ACP, sun4i)"
* tag 'sound-4.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Revert power_save option default value
ALSA: pcm: Fix UAF in snd_pcm_oss_get_formats()
ALSA: seq: Clear client entry before deleting else at closing
ALSA: seq: Fix possible UAF in snd_seq_check_queue()
ASoC: amd: 16bit resolution support for i2s sp instance
ASoC: wm_adsp: For TLV controls only register TLV get/set
ASoC: sun4i-i2s: Fix RX slot number of SUN8I
ASoC: hdmi-codec: Fix module unloading caused kernel crash
ASoC: rt5651: Fix regcache sync errors on resume
ASoC: sgtl5000: Fix suspend/resume
MAINTAINERS: Add myself as sgtl5000 maintainer
ASoC: samsung: Add the DT binding files entry to MAINTAINERS
sgtl5000: change digital_mute policy
Pull device mapper fixes from Mike Snitzer:
- a stable DM multipath fix to restore ability to pass integrity data
- two DM multipath fixes for a fix that was merged into 4.16-rc5
* tag 'for-4.16/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm mpath: fix passing integrity data
dm mpath: eliminate need to use scsi_device_from_queue
dm mpath: fix uninitialized 'pg_init_wait' waitqueue_head NULL pointer
Right now the vblank event completion is racing with the atomic update,
which is especially bad when the PRE is in use, as one of the hardware
issue workaround might extend the atomic commit for quite some time.
If the vblank IRQ happens to trigger during that time, we will prematurely
signal the atomic commit completion to userspace, which causes tearing
when userspace re-uses a framebuffer we haven't managed to flip away from
yet.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
ipu_planes_assign_pre() prototype is in "imx-drm.h" header file, so
include it to fix the following sparse warning:
drivers/gpu/drm/imx/ipuv3-plane.c:729:5: warning: symbol 'ipu_planes_assign_pre' was not declared. Should it be static?
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
ipu_plane_state_reset(), ipu_plane_duplicate_state() and
ipu_plane_destroy_state() are only used in this file, so make them static.
This fixes the following sparse warnings:
drivers/gpu/drm/imx/ipuv3-plane.c:275:6: warning: symbol 'ipu_plane_state_reset' was not declared. Should it be static?
drivers/gpu/drm/imx/ipuv3-plane.c:295:24: warning: symbol 'ipu_plane_duplicate_state' was not declared. Should it be static?
drivers/gpu/drm/imx/ipuv3-plane.c:309:6: warning: symbol 'ipu_plane_destroy_state' was not declared. Should it be static?
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
gcc-8 reports that we access an array with a negative index
in an error case:
drivers/gpu/ipu-v3/ipu-prg.c: In function 'ipu_prg_channel_disable':
drivers/gpu/ipu-v3/ipu-prg.c:252:43: error: array subscript -22 is below array bounds of 'struct ipu_prg_channel[3]' [-Werror=array-bounds]
This moves the range check in front of the first time that
variable gets used.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Keep old 'dependent' state of unaffected planes, this way new state takes
into account current state of unaffected planes.
Fixes: ebae8d0743 ("drm/tegra: dc: Implement legacy blending")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Closing of a listen socket wakes up kernel_accept() of
smc_tcp_listen_worker(), and then has to wait till smc_tcp_listen_worker()
gives up the internal clcsock. The wait logic introduced with
commit 127f497058 ("net/smc: release clcsock from tcp_listen_worker")
might wait longer than necessary. This patch implements the idea to
implement the wait just with flush_work(), and gets rid of the extra
smc_close_wait_listen_clcsock() function.
Fixes: 127f497058 ("net/smc: release clcsock from tcp_listen_worker")
Reported-by: Hans Wippel <hwippel@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The opaque/alpha format conversion code is currently only looking at
XRGB formats because they have an equivalent ARGB format. The opaque
format for RGB565 is RGB565 itself, much like the YUV formats map to
themselves.
Reported-by: Dmitry Osipenko <digetx@gmail.com>
Fixes: ebae8d0743 ("drm/tegra: dc: Implement legacy blending")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Swap the positions of blk_addr and blksz in the tracepoint print arguments
so that they match the print format.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: d2f82254e4 ("mmc: core: Add members to mmc_request and mmc_data for CQE's")
Cc: <stable@vger.kernel.org> # 4.14+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Certain Micron eMMC v4.5 cards might get broken when HPI feature is used
and hence this patch disables the HPI feature for such buggy cards.
In U-Boot, these cards are reported as
Manufacturer: Micron (ID: 0xFE)
OEM: 0x4E
Name: MMC32G
Revision: 19 (0x13)
Serial: 959241022 Manufact. date: 8/2015 (0x82) CRC: 0x00
Tran Speed: 52000000
Rd Block Len: 512
MMC version 4.5
High Capacity: Yes
Capacity: 29.1 GiB
Boot Partition Size: 16 MiB
Bus Width: 8-bit
According to JEDEC JEP106 manufacturer 0xFE is Numonyx, which was bought by
Micron.
Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Mark Craske <Mark_Craske@mentor.com>
Cc: <stable@vger.kernel.org> # 4.8+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
PARTITION_CONFIG is cached in mmc_card->ext_csd.part_config and the
currently active partition in mmc_blk_data->part_curr. These caches do
not always reflect changes if the ioctl call modifies the
PARTITION_CONFIG registers, e.g. by changing BOOT_PARTITION_ENABLE.
Write the PARTITION_CONFIG value extracted from the ioctl call to the
cache and update the currently active partition accordingly. This
ensures that the user space cannot change the values behind the
kernel's back. The next call to mmc_blk_part_switch() will operate on
the data set by the ioctl and reflect the changes appropriately.
Signed-off-by: Bastian Stender <bst@pengutronix.de>
Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Once the ring buffer is copied to ring_scan_buffer and scanned,
the shadow batch buffer start address is only updated into
ring_scan_buffer, not the real ring address allocated through
intel_ring_begin in later copy_workload_to_ring_buffer.
This patch is only to set the right shadow batch buffer address
from Ring buffer, not include the shadow_wa_ctx.
v2:
- refine some comments. (Zhenyu)
v3:
- fix typo in title. (Zhenyu)
v4:
- remove the unnecessary comments. (Zhenyu)
- add comments in bb_start_cmd_va update. (Zhenyu)
Fixes: 0a53bc07f0 ("drm/i915/gvt: Separate cmd scan from request allocation")
Cc: stable@vger.kernel.org # v4.15
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yulei Zhang <yulei.zhang@intel.com>
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Following an RSCN, ibmvfc will issue an ADISC to determine if the
underlying target has changed, comparing the SCSI ID, WWPN, and WWNN to
determine how to handle the rport in discovery. However, the comparison
of the WWPN and WWNN was performing a memcmp between a big endian field
against a CPU endian field, which resulted in the wrong answer on LE
systems. This was observed as unexpected errors getting logged at boot
time as targets were getting relogins when not needed.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull SCSI fixes from James Bottomley:
"This is four patches, consisting of one regression from the merge
window (qla2xxx), one long-standing memory leak (sd_zbc), one event
queue mislabelling which we want to eliminate to discourage the
pattern (mpt3sas), and one behaviour change because re-reading the
partition table shouldn't clear the ro flag"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd: Keep disk read-only when re-reading partition
scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure
scsi: sd_zbc: Fix potential memory leak
scsi: mpt3sas: Do not mark fw_event workqueue as WQ_MEM_RECLAIM
geo->keylen cannot be larger than 4. So we might as well make
fixed-size allocations.
Given the one remaining user, geo->keylen cannot even be larger than 1.
Logfs used to have 64bit and 128bit keys, tcm_qla2xxx only has 32bit
keys. But let's not break the code if we don't have to.
Signed-off-by: Joern Engel <joern@purestorage.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull percpu_ref rcu fixes from Tejun Heo:
"Jann Horn found that aio was depending on the internal RCU grace
periods of percpu-ref and that it's broken because aio uses regular
RCU while percpu_ref uses sched-RCU.
Depending on percpu_ref's internal grace periods isn't a good idea
because
- The RCU type might not match.
- percpu_ref's grace periods are used to switch to atomic mode. They
aren't between the last put and the invocation of the last release.
This is easy to get confused about and can lead to subtle bugs.
- percpu_ref might not have grace periods at all depending on its
current operation mode.
This patchset audits and fixes percpu_ref users for their RCU usages"
[ There's a continuation of this series that clarifies percpu_ref
documentation that the internal grace periods must not be depended
upon, and introduces rcu_work to simplify bouncing to a workqueue
after an RCU grace period.
That will go in for 4.17 - this is just the minimal set with the fixes
that are tagged for -stable ]
* 'percpu_ref-rcu-audit-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc:
RDMAVT: Fix synchronization around percpu_ref
fs/aio: Use RCU accessors for kioctx_table->table[]
fs/aio: Add explicit RCU grace period when freeing kioctx
This reverts commit 864b75f9d6.
Commit 864b75f9d6 ("mm/page_alloc: fix memmap_init_zone pageblock
alignment") modified the logic in memmap_init_zone() to initialize
struct pages associated with invalid PFNs, to appease a VM_BUG_ON()
in move_freepages(), which is redundant by its own admission, and
dereferences struct page fields to obtain the zone without checking
whether the struct pages in question are valid to begin with.
Commit 864b75f9d6 only makes it worse, since the rounding it does
may cause pfn assume the same value it had in a prior iteration of
the loop, resulting in an infinite loop and a hang very early in the
boot. Also, since it doesn't perform the same rounding on start_pfn
itself but only on intermediate values following an invalid PFN, we
may still hit the same VM_BUG_ON() as before.
So instead, let's fix this at the core, and ensure that the BUG
check doesn't dereference struct page fields of invalid pages.
Fixes: 864b75f9d6 ("mm/page_alloc: fix memmap_init_zone pageblock alignment")
Tested-by: Jan Glauber <jglauber@cavium.com>
Tested-by: Shanker Donthineni <shankerd@codeaurora.org>
Cc: Daniel Vacek <neelx@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- 1 display fix for bxt
- 1 gem fix for fences
- 1 gem/pm fix for rps freq
* tag 'drm-intel-fixes-2018-03-14' of git://anongit.freedesktop.org/drm/drm-intel:
drm/i915: Kick the rps worker when changing the boost frequency
drm/i915: Only prune fences after wait-for-all
drm/i915: Enable VBT based BL control for DP
A few fixes for 4.16:
- Fix a backlight S/R regression on amdgpu
- Fix prime teardown on radeon and amdgpu
- DP fix for amdgpu
* 'drm-fixes-4.16' of git://people.freedesktop.org/~agd5f/linux:
drm/amdgpu/dce: Don't turn off DP sink when disconnected
drm/amdgpu: save/restore backlight level in legacy dce code
drm/radeon: fix prime teardown order
drm/amdgpu: fix prime teardown order
On 32-bit targets, we otherwise get a warning about an impossible constant
integer expression:
In file included from include/linux/kernel.h:11,
from include/linux/interrupt.h:6,
from drivers/infiniband/hw/bnxt_re/ib_verbs.c:39:
drivers/infiniband/hw/bnxt_re/ib_verbs.c: In function 'bnxt_re_query_device':
include/linux/bitops.h:7:24: error: left shift count >= width of type [-Werror=shift-count-overflow]
#define BIT(nr) (1UL << (nr))
^~
drivers/infiniband/hw/bnxt_re/bnxt_re.h:61:34: note: in expansion of macro 'BIT'
#define BNXT_RE_MAX_MR_SIZE_HIGH BIT(39)
^~~
drivers/infiniband/hw/bnxt_re/bnxt_re.h:62:30: note: in expansion of macro 'BNXT_RE_MAX_MR_SIZE_HIGH'
#define BNXT_RE_MAX_MR_SIZE BNXT_RE_MAX_MR_SIZE_HIGH
^~~~~~~~~~~~~~~~~~~~~~~~
drivers/infiniband/hw/bnxt_re/ib_verbs.c:149:25: note: in expansion of macro 'BNXT_RE_MAX_MR_SIZE'
ib_attr->max_mr_size = BNXT_RE_MAX_MR_SIZE;
^~~~~~~~~~~~~~~~~~~
Fixes: 872f357824 ("RDMA/bnxt_re: Add support for MRs with Huge pages")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Building for a 32-bit target results in a couple of warnings from casting
between a 32-bit pointer and a 64-bit integer:
drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_service_nq':
drivers/infiniband/hw/bnxt_re/qplib_fp.c:333:23: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
bnxt_qplib_arm_srq((struct bnxt_qplib_srq *)q_handle,
^
drivers/infiniband/hw/bnxt_re/qplib_fp.c:336:12: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
(struct bnxt_qplib_srq *)q_handle,
^
In file included from include/linux/byteorder/little_endian.h:5,
from arch/arm/include/uapi/asm/byteorder.h:22,
from include/asm-generic/bitops/le.h:6,
from arch/arm/include/asm/bitops.h:342,
from include/linux/bitops.h:38,
from include/linux/kernel.h:11,
from include/linux/interrupt.h:6,
from drivers/infiniband/hw/bnxt_re/qplib_fp.c:39:
drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_create_srq':
include/uapi/linux/byteorder/little_endian.h:31:43: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
#define __cpu_to_le64(x) ((__force __le64)(__u64)(x))
^
include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
#define cpu_to_le64 __cpu_to_le64
^~~~~~~~~~~~~
drivers/infiniband/hw/bnxt_re/qplib_fp.c:569:19: note: in expansion of macro 'cpu_to_le64'
req.srq_handle = cpu_to_le64(srq);
Using a uintptr_t as an intermediate works on all architectures.
Fixes: 37cb11acf1 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Commit 24b6d41643 "mm: pass the vmem_altmap to vmemmap_free" converted
the vmemmap_free() path to pass the altmap argument all the way through
the call chain rather than looking it up based on the page.
Unfortunately that ends up over freeing altmap allocated pages in some
cases since free_pagetable() is used to free both memmap space and pte
space, where only the memmap stored in huge pages uses altmap
allocations.
Given that altmap allocations for memmap space are special cased in
vmemmap_populate_hugepages() add a symmetric / special case
free_hugepage_table() to handle altmap freeing, and cleanup the unneeded
passing of altmap to leaf functions that do not require it.
Without this change the sanity check accounting in
devm_memremap_pages_release() will throw a warning with the following
signature.
nd_pmem pfn10.1: devm_memremap_pages_release: failed to free all reserved pages
WARNING: CPU: 44 PID: 3539 at kernel/memremap.c:310 devm_memremap_pages_release+0x1c7/0x220
CPU: 44 PID: 3539 Comm: ndctl Tainted: G L 4.16.0-rc1-linux-stable #7
RIP: 0010:devm_memremap_pages_release+0x1c7/0x220
[..]
Call Trace:
release_nodes+0x225/0x270
device_release_driver_internal+0x15d/0x210
bus_remove_device+0xe2/0x160
device_del+0x130/0x310
? klist_release+0x56/0x100
? nd_region_notify+0xc0/0xc0 [libnvdimm]
device_unregister+0x16/0x60
This was missed in testing since not all configurations will trigger
this warning.
Fixes: 24b6d41643 ("mm: pass the vmem_altmap to vmemmap_free")
Reported-by: Jane Chu <jane.chu@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This patch addresses an issue that causes fiemap to falsely
report a shared extent. The test case is as follows:
xfs_io -f -d -c "pwrite -b 16k 0 64k" -c "fiemap -v" /media/scratch/file5
sync
xfs_io -c "fiemap -v" /media/scratch/file5
which gives the resulting output:
wrote 65536/65536 bytes at offset 0
64 KiB, 4 ops; 0.0000 sec (121.359 MiB/sec and 7766.9903 ops/sec)
/media/scratch/file5:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 24576..24703 128 0x2001
/media/scratch/file5:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 24576..24703 128 0x1
This is because btrfs_check_shared calls find_parent_nodes
repeatedly in a loop, passing a share_check struct to report
the count of shared extent. But btrfs_check_shared does not
re-initialize the count value to zero for subsequent calls
from the loop, resulting in a false share count value. This
is a regressive behavior from 4.13.
With proper re-initialization the test result is as follows:
wrote 65536/65536 bytes at offset 0
64 KiB, 4 ops; 0.0000 sec (110.035 MiB/sec and 7042.2535 ops/sec)
/media/scratch/file5:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 24576..24703 128 0x1
/media/scratch/file5:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 24576..24703 128 0x1
which corrects the regression.
Fixes: 3ec4d3238a ("btrfs: allow backref search checks for shared extents")
Signed-off-by: Edmund Nadolski <enadolski@suse.com>
[ add text from cover letter to changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
On load we create private CQ/QP/PD in order to be used by UMR, we create
those resources after we register ourself as an IB device, and we destroy
them after we unregister as an IB device. This was changed by commit
16c1975f10 ("IB/mlx5: Create profile infrastructure to add and remove
stages") which moved the destruction before we unregistration. This
allowed to trigger an invalid memory access when unloading mlx5_ib while
there are open resources:
BUG: unable to handle kernel paging request at 00000001002c012c
...
Call Trace:
mlx5_ib_post_send_wait+0x75/0x110 [mlx5_ib]
__slab_free+0x9a/0x2d0
delay_time_func+0x10/0x10 [mlx5_ib]
unreg_umr.isra.15+0x4b/0x50 [mlx5_ib]
mlx5_mr_cache_free+0x46/0x150 [mlx5_ib]
clean_mr+0xc9/0x190 [mlx5_ib]
dereg_mr+0xba/0xf0 [mlx5_ib]
ib_dereg_mr+0x13/0x20 [ib_core]
remove_commit_idr_uobject+0x16/0x70 [ib_uverbs]
uverbs_cleanup_ucontext+0xe8/0x1a0 [ib_uverbs]
ib_uverbs_cleanup_ucontext.isra.9+0x19/0x40 [ib_uverbs]
ib_uverbs_remove_one+0x162/0x2e0 [ib_uverbs]
ib_unregister_device+0xd4/0x190 [ib_core]
__mlx5_ib_remove+0x2e/0x40 [mlx5_ib]
mlx5_remove_device+0xf5/0x120 [mlx5_core]
mlx5_unregister_interface+0x37/0x90 [mlx5_core]
mlx5_ib_cleanup+0xc/0x225 [mlx5_ib]
SyS_delete_module+0x153/0x230
do_syscall_64+0x62/0x110
entry_SYSCALL_64_after_hwframe+0x21/0x86
...
We restore the original behavior by breaking the UMR stage into two parts,
pre and post IB registration stages, this way we can restore the original
functionality and maintain clean separation of logic between stages.
Fixes: 16c1975f10 ("IB/mlx5: Create profile infrastructure to add and remove stages")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Pull x86 platform drives fixes from Darren Hart:
- DELL_SMBIOS conditionally depends on ACPI_WMI in the same way it
depends on DCDBAS, update the Kconfig accordingly.
- fix the dell driver init order to ensure that the driver dependencies
are met, avoiding race conditions resulting in boot failure on
certain systems when the drivers are built-in.
* tag 'platform-drivers-x86-v4.16-7' of git://git.infradead.org/linux-platform-drivers-x86:
platform/x86: Fix dell driver init order
platform/x86: dell-smbios: Resolve dependency error on ACPI_WMI
cma_port_is_unique() allows local port reuse if the quad (source
address and port, destination address and port) for this connection
is unique. However, if the destination info is zero or unspecified, it
can't make a correct decision but still allows port reuse. For example,
sometimes rdma_bind_addr() is called with unspecified destination and
reusing the port can lead to creating a connection with a duplicate quad,
after the destination is resolved. The issue manifests when MPI scale-up
tests hang after the duplicate quad is used.
Set the destination address family and add checks for zero destination
address and port to prevent source port reuse based on invalid destination.
Fixes: 19b752a19d ("IB/cma: Allow port reuse for rdma_id")
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
After v4.12 commit e2460f2a4b ("dm: mark targets that pass integrity
data"), dm-multipath, e.g. on DIF+DIX SCSI disk paths, does not support
block integrity any more. So add it to the whitelist.
This is also a pre-requisite to use block integrity with other dm layer(s)
on top of multipath, such as kpartx partitions (dm-linear) or LVM.
Also, bump target version to reflect this fix.
Fixes: e2460f2a4b ("dm: mark targets that pass integrity data")
Cc: <stable@vger.kernel.org> #4.12+
Bisected-by: Fedor Loshakov <loshakov@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
rvt_mregion uses percpu_ref for reference counting and RCU to protect
accesses from lkey_table. When a rvt_mregion needs to be freed, it
first gets unregistered from lkey_table and then rvt_check_refs() is
called to wait for in-flight usages before the rvt_mregion is freed.
rvt_check_refs() seems to have a couple issues.
* It has a fast exit path which tests percpu_ref_is_zero(). However,
a percpu_ref reading zero doesn't mean that the object can be
released. In fact, the ->release() callback might not even have
started executing yet. Proceeding with freeing can lead to
use-after-free.
* lkey_table is RCU protected but there is no RCU grace period in the
free path. percpu_ref uses RCU internally but it's sched-RCU whose
grace periods are different from regular RCU. Also, it generally
isn't a good idea to depend on internal behaviors like this.
To address the above issues, this patch removes the fast exit and adds
an explicit synchronize_rcu().
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: linux-rdma@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
While converting ioctx index from a list to a table, db446a08c2
("aio: convert the ioctx list to table lookup v3") missed tagging
kioctx_table->table[] as an array of RCU pointers and using the
appropriate RCU accessors. This introduces a small window in the
lookup path where init and access may race.
Mark kioctx_table->table[] with __rcu and use the approriate RCU
accessors when using the field.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jann Horn <jannh@google.com>
Fixes: db446a08c2 ("aio: convert the ioctx list to table lookup v3")
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org # v3.12+
While fixing refcounting, e34ecee2ae ("aio: Fix a trinity splat")
incorrectly removed explicit RCU grace period before freeing kioctx.
The intention seems to be depending on the internal RCU grace periods
of percpu_ref; however, percpu_ref uses a different flavor of RCU,
sched-RCU. This can lead to kioctx being freed while RCU read
protected dereferences are still in progress.
Fix it by updating free_ioctx() to go through call_rcu() explicitly.
v2: Comment added to explain double bouncing.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jann Horn <jannh@google.com>
Fixes: e34ecee2ae ("aio: Fix a trinity splat")
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org # v3.13+
On guest exit, and when using GICv2 on GICv3, we use a dsb(st) to
force synchronization between the memory-mapped guest view and
the system-register view that the hypervisor uses.
This is incorrect, as the spec calls out the need for "a DSB whose
required access type is both loads and stores with any Shareability
attribute", while we're only synchronizing stores.
We also lack an isb after the dsb to ensure that the latter has
actually been executed before we start reading stuff from the sysregs.
The fix is pretty easy: turn dsb(st) into dsb(sy), and slap an isb()
just after.
Cc: stable@vger.kernel.org
Fixes: f68d2b1b73 ("arm64: KVM: Implement vgic-v3 save/restore")
Acked-by: Christoffer Dall <cdall@kernel.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The vgic code is trying to be clever when injecting GICv2 SGIs,
and will happily populate LRs with the same interrupt number if
they come from multiple vcpus (after all, they are distinct
interrupt sources).
Unfortunately, this is against the letter of the architecture,
and the GICv2 architecture spec says "Each valid interrupt stored
in the List registers must have a unique VirtualID for that
virtual CPU interface.". GICv3 has similar (although slightly
ambiguous) restrictions.
This results in guests locking up when using GICv2-on-GICv3, for
example. The obvious fix is to stop trying so hard, and inject
a single vcpu per SGI per guest entry. After all, pending SGIs
with multiple source vcpus are pretty rare, and are mostly seen
in scenario where the physical CPUs are severely overcomitted.
But as we now only inject a single instance of a multi-source SGI per
vcpu entry, we may delay those interrupts for longer than strictly
necessary, and run the risk of injecting lower priority interrupts
in the meantime.
In order to address this, we adopt a three stage strategy:
- If we encounter a multi-source SGI in the AP list while computing
its depth, we force the list to be sorted
- When populating the LRs, we prevent the injection of any interrupt
of lower priority than that of the first multi-source SGI we've
injected.
- Finally, the injection of a multi-source SGI triggers the request
of a maintenance interrupt when there will be no pending interrupt
in the LRs (HCR_NPIE).
At the point where the last pending interrupt in the LRs switches
from Pending to Active, the maintenance interrupt will be delivered,
allowing us to add the remaining SGIs using the same process.
Cc: stable@vger.kernel.org
Fixes: 0919e84c0f ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
Acked-by: Christoffer Dall <cdall@kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
On my GICv3 system, the following is printed to the kernel log at boot:
kvm [1]: 8-bit VMID
kvm [1]: IDMAP page: d20e35000
kvm [1]: HYP VA range: 800000000000:ffffffffffff
kvm [1]: vgic-v2@2c020000
kvm [1]: GIC system register CPU interface enabled
kvm [1]: vgic interrupt IRQ1
kvm [1]: virtual timer IRQ4
kvm [1]: Hyp mode initialized successfully
The KVM IDMAP is a mapping of a statically allocated kernel structure,
and so printing its physical address leaks the physical placement of
the kernel when physical KASLR in effect. So change the kvm_info() to
kvm_debug() to remove it from the log output.
While at it, trim the output a bit more: IRQ numbers can be found in
/proc/interrupts, and the HYP VA and vgic-v2 lines are not highly
informational either.
Cc: <stable@vger.kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Christoffer Dall <cdall@kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We currently don't allow resetting mapped IRQs from userspace, because
their state is controlled by the hardware. But we do need to reset the
state when the VM is reset, so we provide a function for the 'owner' of
the mapped interrupt to reset the interrupt state.
Currently only the timer uses mapped interrupts, so we call this
function from the timer reset logic.
Cc: stable@vger.kernel.org
Fixes: 4c60e360d6 ("KVM: arm/arm64: Provide a get_input_level for the arch timer")
Signed-off-by: Christoffer Dall <cdall@kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Calling vcpu_load() registers preempt notifiers for this vcpu and calls
kvm_arch_vcpu_load(). The latter will soon be doing a lot of heavy
lifting on arm/arm64 and will try to do things such as enabling the
virtual timer and setting us up to handle interrupts from the timer
hardware.
Loading state onto hardware registers and enabling hardware to signal
interrupts can be problematic when we're not actually about to run the
VCPU, because it makes it difficult to establish the right context when
handling interrupts from the timer, and it makes the register access
code difficult to reason about.
Luckily, now when we call vcpu_load in each ioctl implementation, we can
simply remove the call from the non-KVM_RUN vcpu ioctls, and our
kvm_arch_vcpu_load() is only used for loading vcpu content to the
physical CPU when we're actually going to run the vcpu.
Cc: stable@vger.kernel.org
Fixes: 9b062471e5 ("KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl")
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Our irq_is_pending() helper function accesses multiple members of the
vgic_irq struct, so we need to hold the lock when calling it.
Add that requirement as a comment to the definition and take the lock
around the call in vgic_mmio_read_pending(), where we were missing it
before.
Fixes: 96b298000d ("KVM: arm/arm64: vgic-new: Add PENDING registers handlers")
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Update the initcall ordering to satisfy the following dependency
ordering:
1. DCDBAS, ACPI_WMI
2. DELL_SMBIOS, DELL_RBTN
3. DELL_LAPTOP, DELL_WMI
By assigning them to the following initcall levels:
subsys_initcall: DCDBAS, ACPI_WMI
module_init: DELL_SMBIOS, DELL_RBTN
late_initcall: DELL_LAPTOP, DELL_WMI
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Mario.Limonciello@dell.com
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Similarly to DCDBAS for DELL_SMBIOS_SMM, if DELL_SMBIOS_WMI is enabled,
DELL_SMBIOS becomes dependent on ACPI_WMI. Update the depends lines to
prevent a configuration where DELL_SMBIOS=y and either backend
dependency =m. Update the comment accordingly.
Cc: Mario Limonciello <mario.limonciello@dell.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
The NETIF_F_GSO_SOFTWARE implies support for GSO on SCTP, but the
sunvnet driver does not support GSO for sctp. Here we remove the
NETIF_F_GSO_SOFTWARE feature flag and only report NETIF_F_ALL_TSO
instead.
Signed-off-by: Cathy Zhou <Cathy.Zhou@Oracle.COM>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marc Kleine-Budde says:
====================
pull-request: can 2018-03-14
this is a pull request of two patches for net/master.
Both patches are by Andri Yngvason and fix problems in the cc770 driver,
that show up quite fast on RT systems, but also on non RT setups.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The problem was introduced in commit
506b0a395f ("[netdrv] tg3: APE heartbeat changes"). The bug occurs
because tp->lock spinlock is held which is obtained in tg3_start
by way of tg3_full_lock(), line 11571. The documentation for usleep_range()
specifically states it cannot be used inside a spinlock.
Fixes: 506b0a395f ("[netdrv] tg3: APE heartbeat changes")
Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prior to the rework of PMTU information storage in commit
2c8cec5c10 ("ipv4: Cache learned PMTU information in inetpeer."),
when a PMTU event advertising a PMTU smaller than
net.ipv4.route.min_pmtu was received, we would disable setting the DF
flag on packets by locking the MTU metric, and set the PMTU to
net.ipv4.route.min_pmtu.
Since then, we don't disable DF, and set PMTU to
net.ipv4.route.min_pmtu, so the intermediate router that has this link
with a small MTU will have to drop the packets.
This patch reestablishes pre-2.6.39 behavior by splitting
rtable->rt_pmtu into a bitfield with rt_mtu_locked and rt_pmtu.
rt_mtu_locked indicates that we shouldn't set the DF bit on that path,
and is checked in ip_dont_fragment().
One possible workaround is to set net.ipv4.route.min_pmtu to a value low
enough to accommodate the lowest MTU encountered.
Fixes: 2c8cec5c10 ("ipv4: Cache learned PMTU information in inetpeer.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur says:
====================
DPAA Ethernet fixes
This patch set is addressing several issues in the DPAA Ethernet
driver suite:
- module unload crash caused by wrong reference to device being left
in the cleanup code after the DSA related changes
- scheduling wile atomic bug in QMan code revealed during dpaa_eth
module unload
- a couple of error counter fixes, a duplicated init in dpaa_eth.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The fd_format has already been initialized at this point.
Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent changes that make the driver probing compatible with DSA
were not propagated in the dpa_remove() function, breaking the
module unload function. Using the proper device to address the issue.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The wait_for_completion() call in qman_delete_cgr_safe()
was triggering a scheduling while atomic bug, replacing the
kthread with a smp_call_function_single() call to fix it.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull USB fixes from Greg KH:
"Here are a small clump of USB fixes for 4.16-rc6.
Nothing major, just a number of fixes in lots of different drivers, as
well as a PHY driver fix that snuck into this tree. Full details are
in the shortlog.
All of these have been in linux-next with no reported issues"
* tag 'usb-4.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (22 commits)
usb: musb: Fix external abort in musb_remove on omap2430
phy: qcom-ufs: add MODULE_LICENSE tag
usb: typec: tcpm: fusb302: Do not log an error on -EPROBE_DEFER
USB: OHCI: Fix NULL dereference in HCDs using HCD_LOCAL_MEM
usbip: vudc: fix null pointer dereference on udc->lock
xhci: Fix front USB ports on ASUS PRIME B350M-A
usb: host: xhci-plat: revert "usb: host: xhci-plat: enable clk in resume timing"
usb: usbmon: Read text within supplied buffer size
usb: host: xhci-rcar: add support for r8a77965
USB: storage: Add JMicron bridge 152d:2567 to unusual_devs.h
usb: xhci: dbc: Fix lockdep warning
xhci: fix endpoint context tracer output
Revert "typec: tcpm: Only request matching pdos"
usb: musb: call pm_runtime_{get,put}_sync before reading vbus registers
usb: quirks: add control message delay for 1b1c:1b20
uas: fix comparison for error code
usb: gadget: udc: renesas_usb3: add binging for r8a77965
usb: renesas_usbhs: add binding for r8a77965
usb: dwc2: fix STM32F7 USB OTG HS compatible
dt-bindings: usb: fix the STM32F7 DWC2 OTG HS core binding
...
Pull tty/serial driver fixes from Greg KH:
"Here are some small tty core and serial driver fixes for 4.16-rc6.
They resolve some newly reported bugs, as well as some very old ones,
which is always nice to see. There is also a new device id added in
here for good measure.
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-4.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: imx: fix bogus dev_err
serial: sh-sci: prevent lockup on full TTY buffers
serial: 8250_pci: Add Brainboxes UC-260 4 port serial device
earlycon: add reg-offset to physical address before mapping
serial: core: mark port as initialized in autoconfig
serial: 8250_pci: Don't fail on multiport card class
tty/serial: atmel: add new version check for usart
tty: make n_tty_read() always abort if hangup is in progress
Pull staging fixes from Greg KH:
"Here are three staging driver fixes for 4.16-rc6
Two of them are lockdep fixes for the ashmem driver that have been
reported by a number of people recently. The last one is a fix for the
comedi driver core.
All of these have been in linux-next for a while with no reported
issues"
* tag 'staging-4.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: android: ashmem: Fix possible deadlock in ashmem_ioctl
staging: comedi: fix comedi_nsamples_left.
staging: android: ashmem: Fix lockdep issue during llseek
Andrei Vagin reported a KASAN: slab-out-of-bounds error in
skb_update_prio()
Since SYNACK might be attached to a request socket, we need to
get back to the listener socket.
Since this listener is manipulated without locks, add const
qualifiers to sock_cgroup_prioidx() so that the const can also
be used in skb_update_prio()
Also add the const qualifier to sock_cgroup_classid() for consistency.
Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull auxdisplay fixes from Miguel Ojeda:
"Silence a few warnings in auxdisplay.
- a couple of uninitialized warnings reported by the build service
- a doc comment warning under W=1
- three fall-through comments not recognized under W=1"
* tag 'auxdisplay-for-linus-v4.16-rc6' of git://github.com/ojeda/linux:
auxdisplay: img-ascii-lcd: Silence 2 uninitialized warnings
auxdisplay: img-ascii-lcd: Fix doc comment to silence warnings
auxdisplay: panel: Change comments to silence fallthrough warnings
The kbuild test robot reported the following warning on sparc64:
kernel/jump_label.c: In function '__jump_label_update':
kernel/jump_label.c:376:51: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
WARN_ONCE(1, "can't patch jump_label at %pS", (void *)entry->code);
On sparc64, the jump_label entry->code field is of type u32, but
pointers are 64-bit. Silence the warning by casting entry->code to an
unsigned long before casting it to a pointer. This is also what the
sparc jump label code does.
Fixes: dc1dd184c2 ("jump_label: Warn on failed jump_label patching attempt")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: "David S . Miller" <davem@davemloft.net>
Link: https://lkml.kernel.org/r/c966fed42be6611254a62d46579ec7416548d572.1521041026.git.jpoimboe@redhat.com
Samsung explicitly states that queued TRIM is supported for Linux with
860 PRO and 860 EVO.
Make the previous blacklist to cover only 840 and 850 series.
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
While waiting for the TX object to send an RTR, an external message with a
matching id can overwrite the TX data. In this case we must call the rx
routine and then try transmitting the message that was overwritten again.
The queue was being stalled because the RX event did not generate an
interrupt to wake up the queue again and the TX event did not happen
because the TXRQST flag is reset by the chip when new data is received.
According to the CC770 datasheet the id of a message object should not be
changed while the MSGVAL bit is set. This has been fixed by resetting the
MSGVAL bit before modifying the object in the transmit function and setting
it after. It is not enough to set & reset CPUUPD.
It is important to keep the MSGVAL bit reset while the message object is
being modified. Otherwise, during RTR transmission, a frame with matching
id could trigger an rx-interrupt, which would cause a race condition
between the interrupt routine and the transmit function.
Signed-off-by: Andri Yngvason <andri.yngvason@marel.com>
Tested-by: Richard Weinberger <richard@nod.at>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
When running virtualised the powerpc kernel is able to run the system
in "compat mode" - which means the kernel and hardware are pretending
to userspace that the CPU is an older version than it actually is.
AT_BASE_PLATFORM is an AUXV entry that we export to userspace for use
when we're running in that mode, which tells userspace the "platform"
string for the real CPU version, as opposed to the faked version.
Although we don't support compat mode when using DT CPU features, and
arguably don't need to set AT_BASE_PLATFORM, the existing cputable
based code always sets it even when we're running bare metal. That
means the lack of AT_BASE_PLATFORM is a user-visible artifact of the
fact that the kernel is using DT CPU features, which we don't want.
So set it in the DT CPU features code also.
This results in eg:
$ LD_SHOW_AUXV=1 /bin/true | grep "AT_.*PLATFORM"
AT_PLATFORM: power9
AT_BASE_PLATFORM:power9
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
When we removed the inclusion of skeleton.dtsi from the device trees, we
broke booting for systems with bootloaders that aren't device tre aware.
This can be seen, for example, when appending the device tree blob to
the kernel image.
The reason booting broke was that the kernel lacked the device_type
label in the memory node. Add in a default memory node wth the
device_type. It can contain the memory address as the location is fixed
for each SoC generation, but the size needs to be added by the
bootloader or the board specific dts.
Fixes: 73102d6fdc ("ARM: dts: aspeed: Remove skeleton.dtsi")
Cc: <stable@vger.kernel.org>
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pointer dev is being assigned a value that is never read, it is being
re-assigned the same value later on, hence the initialization is redundant
and can be removed.
Cleans up clang warning:
drivers/nvdimm/pfn_devs.c:307:17: warning: Value stored to 'dev' during
its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This fixes a bug where the trap number that is returned by
__kvmppc_vcore_entry gets corrupted. The effect of the corruption
is that IPIs get ignored on POWER9 systems when the IPI is sent via
a doorbell interrupt to a CPU which is executing in a KVM guest.
The effect of the IPI being ignored is often that another CPU locks
up inside smp_call_function_many() (and if that CPU is holding a
spinlock, other CPUs then lock up inside raw_spin_lock()).
The trap number is currently held in register r12 for most of the
assembly-language part of the guest exit path. In that path, we
call kvmppc_subcore_exit_guest(), which is a C function, without
restoring r12 afterwards. Depending on the kernel config and the
compiler, it may modify r12 or it may not, so some config/compiler
combinations see the bug and others don't.
To fix this, we arrange for the trap number to be stored on the
stack from the 'guest_bypass:' label until the end of the function,
then the trap number is loaded and returned in r12 as before.
Cc: stable@vger.kernel.org # v4.8+
Fixes: fd7bacbca4 ("KVM: PPC: Book3S HV: Fix TB corruption in guest exit path on HMI interrupt")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Found this by accident.
There are no usages of bare cancel_work() in current kernel source.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
This patch validates user provided input to prevent integer overflow due
to integer manipulation in the mlx5_ib_create_srq function.
Cc: syzkaller <syzkaller@googlegroups.com>
Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add a check for the length of the qpin structure to prevent out-of-bounds reads
BUG: KASAN: slab-out-of-bounds in create_raw_packet_qp+0x114c/0x15e2
Read of size 8192 at addr ffff880066b99290 by task syz-executor3/549
CPU: 3 PID: 549 Comm: syz-executor3 Not tainted 4.15.0-rc2+ #27 Hardware
name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xd4
print_address_description+0x73/0x290
kasan_report+0x25c/0x370
? create_raw_packet_qp+0x114c/0x15e2
memcpy+0x1f/0x50
create_raw_packet_qp+0x114c/0x15e2
? create_raw_packet_qp_tis.isra.28+0x13d/0x13d
? lock_acquire+0x370/0x370
create_qp_common+0x2245/0x3b50
? destroy_qp_user.isra.47+0x100/0x100
? kasan_kmalloc+0x13d/0x170
? sched_clock_cpu+0x18/0x180
? fs_reclaim_acquire.part.15+0x5/0x30
? __lock_acquire+0xa11/0x1da0
? sched_clock_cpu+0x18/0x180
? kmem_cache_alloc_trace+0x17e/0x310
? mlx5_ib_create_qp+0x30e/0x17b0
mlx5_ib_create_qp+0x33d/0x17b0
? sched_clock_cpu+0x18/0x180
? create_qp_common+0x3b50/0x3b50
? lock_acquire+0x370/0x370
? __radix_tree_lookup+0x180/0x220
? uverbs_try_lock_object+0x68/0xc0
? rdma_lookup_get_uobject+0x114/0x240
create_qp.isra.5+0xce4/0x1e20
? ib_uverbs_ex_create_cq_cb+0xa0/0xa0
? copy_ah_attr_from_uverbs.isra.2+0xa00/0xa00
? ib_uverbs_cq_event_handler+0x160/0x160
? __might_fault+0x17c/0x1c0
ib_uverbs_create_qp+0x21b/0x2a0
? ib_uverbs_destroy_cq+0x2e0/0x2e0
ib_uverbs_write+0x55a/0xad0
? ib_uverbs_destroy_cq+0x2e0/0x2e0
? ib_uverbs_destroy_cq+0x2e0/0x2e0
? ib_uverbs_open+0x760/0x760
? futex_wake+0x147/0x410
? check_prev_add+0x1680/0x1680
? do_futex+0x3d3/0xa60
? sched_clock_cpu+0x18/0x180
__vfs_write+0xf7/0x5c0
? ib_uverbs_open+0x760/0x760
? kernel_read+0x110/0x110
? lock_acquire+0x370/0x370
? __fget+0x264/0x3b0
vfs_write+0x18a/0x460
SyS_write+0xc7/0x1a0
? SyS_read+0x1a0/0x1a0
? trace_hardirqs_on_thunk+0x1a/0x1c
entry_SYSCALL_64_fastpath+0x18/0x85
RIP: 0033:0x4477b9
RSP: 002b:00007f1822cadc18 EFLAGS: 00000292 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00000000004477b9
RDX: 0000000000000070 RSI: 000000002000a000 RDI: 0000000000000005
RBP: 0000000000708000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000292 R12: 00000000ffffffff
R13: 0000000000005d70 R14: 00000000006e6e30 R15: 0000000020010ff0
Allocated by task 549:
__kmalloc+0x15e/0x340
kvmalloc_node+0xa1/0xd0
create_user_qp.isra.46+0xd42/0x1610
create_qp_common+0x2e63/0x3b50
mlx5_ib_create_qp+0x33d/0x17b0
create_qp.isra.5+0xce4/0x1e20
ib_uverbs_create_qp+0x21b/0x2a0
ib_uverbs_write+0x55a/0xad0
__vfs_write+0xf7/0x5c0
vfs_write+0x18a/0x460
SyS_write+0xc7/0x1a0
entry_SYSCALL_64_fastpath+0x18/0x85
Freed by task 368:
kfree+0xeb/0x2f0
kernfs_fop_release+0x140/0x180
__fput+0x266/0x700
task_work_run+0x104/0x180
exit_to_usermode_loop+0xf7/0x110
syscall_return_slowpath+0x298/0x370
entry_SYSCALL_64_fastpath+0x83/0x85
The buggy address belongs to the object at ffff880066b99180 which
belongs to the cache kmalloc-512 of size 512 The buggy address is
located 272 bytes inside of 512-byte region [ffff880066b99180,
ffff880066b99380) The buggy address belongs to the page:
page:000000006040eedd count:1 mapcount:0 mapping: (null)
index:0x0 compound_mapcount: 0
flags: 0x4000000000008100(slab|head)
raw: 4000000000008100 0000000000000000 0000000000000000 0000000180190019
raw: ffffea00019a7500 0000000b0000000b ffff88006c403080 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff880066b99180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff880066b99200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff880066b99280: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff880066b99300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff880066b99380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Cc: syzkaller <syzkaller@googlegroups.com>
Fixes: 0fb2ed66a1 ("IB/mlx5: Add create and destroy functionality for Raw Packet QP")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Never directly free @dev after calling device_register(), even
if it returned an error! Always use put_device() to give up the
reference initialized in this function instead.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Initialize all the scsi_dh related 'struct multipath' members regardless
of whether a scsi_dh is in use or not.
The subtle (and fragile) SCSI-assuming legacy code clearly needs further
decoupling from non-SCSI (and/or developer understanding).
Fixes: 8d47e65948 ("dm mpath: remove unnecessary NVMe branching in favor of scsi_dh checks")
Reported-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
The warnings are:
drivers/auxdisplay/img-ascii-lcd.c: warning: 'err' may be used
uninitialized in this function [-Wuninitialized]
At lines 109 and 207. Reported by Geert using the build service
several times, e.g.:
https://lkml.org/lkml/2018/2/19/303
They are two false positives, since num_chars > 0 in the three present
configurations (boston, malta, sead3). Initialize to 0 in order to
silence the warning.
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Paul Burton <paul.burton@mips.com>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Compiling with W=1 with gcc 7.2.0 gives 2 warnings:
drivers/auxdisplay/img-ascii-lcd.c:233: warning: Function parameter or
member 't' not described in 'img_ascii_lcd_scroll'
drivers/auxdisplay/img-ascii-lcd.c:233: warning: Excess function
parameter 'arg' description in 'img_ascii_lcd_scroll'
Cc: Paul Burton <paul.burton@mips.com>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Compiling with W=1 with gcc 7.2.0 gives 3 warnings like:
drivers/auxdisplay/panel.c: In function ‘panel_process_inputs’:
drivers/auxdisplay/panel.c:1374:17: warning: this statement may fall
through [-Wimplicit-fallthrough=]
Cc: Willy Tarreau <w@1wt.eu>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
This fixes an oops on unbind / module unload (on the musb omap2430
platform).
musb_remove function now calls musb_platform_exit before disabling
runtime pm.
Signed-off-by: Merlijn Wajer <merlijn@wizzup.org>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We're dereferencing "p_hwfn->p_rdma_info" but that is freed on the line
before in qed_rdma_resc_free(p_hwfn).
Fixes: 9de506a547 ("qed: Free RoCE ILT Memory on rmmod qedr")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2018-03-13
1) Refuse to insert 32 bit userspace socket policies on 64
bit systems like we do it for standard policies. We don't
have a compat layer, so inserting socket policies from
32 bit userspace will lead to a broken configuration.
2) Make the policy hold queue work without the flowcache.
Dummy bundles are not chached anymore, so we need to
generate a new one on each lookup as long as the SAs
are not yet in place.
3) Fix the validation of the esn replay attribute. The
The sanity check in verify_replay() is bypassed if
the XFRM_STATE_ESN flag is not set. Fix this by doing
the sanity check uncoditionally.
From Florian Westphal.
4) After most of the dst_entry garbage collection code
is removed, we may leak xfrm_dst entries as they are
neither cached nor tracked somewhere. Fix this by
reusing the 'uncached_list' to track xfrm_dst entries
too. From Xin Long.
5) Fix a rcu_read_lock/rcu_read_unlock imbalance in
xfrm_get_tos() From Xin Long.
6) Fix an infinite loop in xfrm_get_dst_nexthop. On
transport mode we fetch the child dst_entry after
we continue, so this pointer is never updated.
Fix this by fetching it before we continue.
7) Fix ESN sequence number gap after IPsec GSO packets.
We accidentally increment the sequence number counter
on the xfrm_state by one packet too much in the ESN
case. Fix this by setting the sequence number to the
correct value.
8) Reset the ethernet protocol after decapsulation only if a
mac header was set. Otherwise it breaks configurations
with TUN devices. From Yossi Kuperman.
9) Fix __this_cpu_read() usage in preemptible code. Use
this_cpu_read() instead in ipcomp_alloc_tfms().
From Greg Hackmann.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 7d64c39e64310 fixed regression of FCP discovery when Nport Handle
is in-use and relogin is triggered. However, during FCP and FC-NVMe
discovery this resulted into only discovering NVMe LUNs.
This patch fixes issue where FCP and FC-NVMe protocol is used on same
port where assigning FC_NO_LOOP_ID will result into discovery failure
for FCP LUNs.
Fixes: a084fd68e1 ("scsi: qla2xxx: Fix re-login for Nport Handle in use")
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When ata device doing EH, some commands still attached with tasks are
not passed to libata when abort failed or recover failed, so libata did
not handle these commands. After these commands done, sas task is freed,
but ata qc is not freed. This will cause ata qc leak and trigger a
warning like below:
WARNING: CPU: 0 PID: 28512 at drivers/ata/libata-eh.c:4037
ata_eh_finish+0xb4/0xcc
CPU: 0 PID: 28512 Comm: kworker/u32:2 Tainted: G W OE 4.14.0#1
......
Call trace:
[<ffff0000088b7bd0>] ata_eh_finish+0xb4/0xcc
[<ffff0000088b8420>] ata_do_eh+0xc4/0xd8
[<ffff0000088b8478>] ata_std_error_handler+0x44/0x8c
[<ffff0000088b8068>] ata_scsi_port_error_handler+0x480/0x694
[<ffff000008875fc4>] async_sas_ata_eh+0x4c/0x80
[<ffff0000080f6be8>] async_run_entry_fn+0x4c/0x170
[<ffff0000080ebd70>] process_one_work+0x144/0x390
[<ffff0000080ec100>] worker_thread+0x144/0x418
[<ffff0000080f2c98>] kthread+0x10c/0x138
[<ffff0000080855dc>] ret_from_fork+0x10/0x18
If ata qc leaked too many, ata tag allocation will fail and io blocked
for ever.
As suggested by Dan Williams, defer ata device commands to libata and
merge sas_eh_finish_cmd() with sas_eh_defer_cmd(). libata will handle
ata qcs correctly after this.
Signed-off-by: Jason Yan <yanaijie@huawei.com>
CC: Xiaofei Tan <tanxiaofei@huawei.com>
CC: John Garry <john.garry@huawei.com>
CC: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During the conversion to dsa_is_user_port(), a condition ended up being
reversed, which would prevent the creation of any user port when using
the legacy binding and/or platform data, fix that.
Fixes: 4a5b85ffe2 ("net: dsa: use dsa_is_user_port everywhere")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of error, the function dev_get_regmap() returns NULL pointer
not ERR_PTR(). The IS_ERR() test in the return value check should be
replaced with NULL test.
Fixes: 81ac38847a ("clk: qcom: Add APCS clock controller support")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
platform_get_resource() may return NULL, add proper check to
avoid potential NULL dereferencing.
This is detected by Coccinelle semantic patch.
@@
expression pdev, res, n, t, e, e1, e2;
@@
res = platform_get_resource(pdev, t, n);
+ if (!res)
+ return -EINVAL;
... when != res == NULL
e = devm_ioremap(e1, res->start, e2);
Fixes: 4f16f7ff3b ("clk: hisilicon: Add support for Hi3660 stub clocks")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
If we try to determine the rate of a pass-through clock (a clock which
does not implement .round_rate() nor .determine_rate()),
clk_core_round_rate_nolock() will directly forward the call to the
parent clock. In the particular case where the pass-through actually
does not have a parent, clk_core_round_rate_nolock() will directly
return 0 with the requested rate still set to the initial request
structure. This is interpreted as if the rate could be exactly achieved
while it actually cannot be adjusted.
This become a real problem when this particular pass-through clock is
the parent of a mux with the flag CLK_SET_RATE_PARENT set. The
pass-through clock will always report an exact match, get picked and
finally error when the rate is actually getting set.
This is fixed by setting the rate inside the req to 0 when core is NULL
in clk_core_round_rate_nolock() (same as in __clk_determine_rate() when
hw is NULL)
Fixes: 0f6cc2b8e9 ("clk: rework calls to round and determine rate callbacks")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Michael Turquette <mturquette@baylibre.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Pull TI SoC clock fixes for 4.16 from Tero Kristo:
* tag 'ti-clk-fixes-4.16' of https://github.com/t-kristo/linux-pm:
clk: ti: am43xx: add set-rate-parent support for display clkctrl clock
clk: ti: am33xx: add set-rate-parent support for display clkctrl clock
clk: ti: clkctrl: add support for CLK_SET_RATE_PARENT flag
Pull i.MX clock fixes for 4.16 from Shawn Guo:
- Update i.MX5 clock driver to register UART4/5 clock only on i.MX50
and i.MX53. It fixes a kernel warning seen on i.MX53, caused by
commit 59dc3d8c86 ("clk: imx51: uart4, uart5 gates only exist on
imx50, imx53").
* tag 'clk-imx-fixes-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
clk: imx51-imx53: Fix UART4/5 registration on i.MX50 and i.MX53
Pull Allwinner clock fixes for 4.16 from Chen-Yu Tsai:
A critical fix for the A31 sunxi-ng clock driver. The CLK_OUT clocks
had definitions paired with the incorrect type of clk ops. This results
in a serious oops starting with commit 946797aa3f ("clk: sunxi-ng:
Support fixed post-dividers on MP style clocks"), which exposed the
incorrect clk ops when it added a new field to the data structures,
which then nudged the underlying (compatible but incorrect) data
structures out of alignment.
* tag 'sunxi-clk-fixes-for-4.16' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
clk: sunxi-ng: a31: Fix CLK_OUT_* clock ops
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2018-03-12
This series contains fixes to e1000e only.
Benjamin Poirier provides two fixes, first reverts commits that changed
what happens to the link status when there is an error. These commits
were to resolve a race condition, but in the process of fixing the race
condition, they changed the behavior when an error occurred. Second fix
resolves a race condition by not setting "get_link_status" to false
after checking the link.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni says:
====================
l2tp: fix races with ipv4-mapped ipv6 addresses
The syzbot reported an l2tp oops that uncovered some races in the l2tp xmit
path and a partially related issue in the generic ipv6 code.
We need to address them separately.
v1 -> v2:
- add missing fixes tag in patch 1
- fix several issues in patch 2
v2 -> v3:
- dropped some unneeded chunks in patch 2
====================
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
The l2tp_tunnel_create() function checks for v4mapped ipv6
sockets and cache that flag, so that l2tp core code can
reusing it at xmit time.
If the socket is provided by the userspace, the connection
status of the tunnel sockets can change between the tunnel
creation and the xmit call, so that syzbot is able to
trigger the following splat:
BUG: KASAN: use-after-free in ip6_dst_idev include/net/ip6_fib.h:192
[inline]
BUG: KASAN: use-after-free in ip6_xmit+0x1f76/0x2260
net/ipv6/ip6_output.c:264
Read of size 8 at addr ffff8801bd949318 by task syz-executor4/23448
CPU: 0 PID: 23448 Comm: syz-executor4 Not tainted 4.16.0-rc4+ #65
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report+0x23c/0x360 mm/kasan/report.c:412
__asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
ip6_dst_idev include/net/ip6_fib.h:192 [inline]
ip6_xmit+0x1f76/0x2260 net/ipv6/ip6_output.c:264
inet6_csk_xmit+0x2fc/0x580 net/ipv6/inet6_connection_sock.c:139
l2tp_xmit_core net/l2tp/l2tp_core.c:1053 [inline]
l2tp_xmit_skb+0x105f/0x1410 net/l2tp/l2tp_core.c:1148
pppol2tp_sendmsg+0x470/0x670 net/l2tp/l2tp_ppp.c:341
sock_sendmsg_nosec net/socket.c:630 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:640
___sys_sendmsg+0x767/0x8b0 net/socket.c:2046
__sys_sendmsg+0xe5/0x210 net/socket.c:2080
SYSC_sendmsg net/socket.c:2091 [inline]
SyS_sendmsg+0x2d/0x50 net/socket.c:2087
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x453e69
RSP: 002b:00007f819593cc68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f819593d6d4 RCX: 0000000000453e69
RDX: 0000000000000081 RSI: 000000002037ffc8 RDI: 0000000000000004
RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000000004c3 R14: 00000000006f72e8 R15: 0000000000000000
This change addresses the issues:
* explicitly checking for TCP_ESTABLISHED for user space provided sockets
* dropping the v4mapped flag usage - it can become outdated - and
explicitly invoking ipv6_addr_v4mapped() instead
The issue is apparently there since ancient times.
v1 -> v2: (many thanks to Guillaume)
- with csum issue introduced in v1
- replace pr_err with pr_debug
- fix build issue with IPV6 disabled
- move l2tp_sk_is_v4mapped in l2tp_core.c
v2 -> v3:
- don't update inet_daddr for v4mapped address, unneeded
- drop rendundant check at creation time
Reported-and-tested-by: syzbot+92fa328176eb07e4ac1a@syzkaller.appspotmail.com
Fixes: 3557baabf2 ("[L2TP]: PPP over L2TP driver core")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On unsuccesful ip6_datagram_connect(), if the failure is caused by
ip6_datagram_dst_update(), the sk peer information are cleared, but
the sk->sk_state is preserved.
If the socket was already in an established status, the overall sk
status is inconsistent and fouls later checks in datagram code.
Fix this saving the old peer information and restoring them in
case of failure. This also aligns ipv6 datagram connect() behavior
with ipv4.
v1 -> v2:
- added missing Fixes tag
Fixes: 85cb73ff9b ("net: ipv6: reset daddr and dport in sk if connect() fails")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex reported the following race condition:
/* link goes up... interrupt... schedule watchdog */
\ e1000_watchdog_task
\ e1000e_has_link
\ hw->mac.ops.check_for_link() === e1000e_check_for_copper_link
\ e1000e_phy_has_link_generic(..., &link)
link = true
/* link goes down... interrupt */
\ e1000_msix_other
hw->mac.get_link_status = true
/* link is up */
mac->get_link_status = false
link_active = true
/* link_active is true, wrongly, and stays so because
* get_link_status is false */
Avoid this problem by making sure that we don't set get_link_status = false
after having checked the link.
It seems this problem has been present since the introduction of e1000e.
Link: https://lkml.org/lkml/2018/1/29/338
Reported-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This reverts commit 19110cfbb3.
This reverts commit 4110e02eb4.
This reverts commit d3604515c9eda464a92e8e67aae82dfe07fe3c98.
Commit 19110cfbb3 ("e1000e: Separate signaling for link check/link up")
changed what happens to the link status when there is an error which
happens after "get_link_status = false" in the copper check_for_link
callbacks. Previously, such an error would be ignored and the link
considered up. After that commit, any error implies that the link is down.
Revert commit 19110cfbb3 ("e1000e: Separate signaling for link check/link
up") and its followups. After reverting, the race condition described in
the log of commit 19110cfbb3 is reintroduced. It may still be triggered
by LSC events but this should keep the link down in case the link is
electrically unstable, as discussed. The race may no longer be
triggered by RXO events because commit 4aea7a5c5e ("e1000e: Avoid
receiver overrun interrupt bursts") restored reading icr in the Other
handler.
Link: https://lkml.org/lkml/2018/3/1/789
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently, we only allow ourselves to prune the fences so long as
all the waits completed (i.e. all the fences we checked were signaled),
and that the reservation snapshot did not change across the wait.
However, if we only waited for a subset of the reservation object, i.e.
just waiting for the last writer to complete as opposed to all readers
as well, then we would erroneously conclude we could prune the fences as
indeed although all of our waits were successful, they did not represent
the totality of the reservation object.
v2: We only need to check the shared fences due to construction (i.e.
all of the shared fences will be later than the exclusive fence, if
any).
Fixes: e54ca97747 ("drm/i915: Remove completed fences after a wait")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307171303.29466-1-chris@chris-wilson.co.uk
(cherry picked from commit fa73055b84)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Currently, BXT_PP is hardcoded with value '0'.
It practically disabled eDP backlight on MRB (BXT) platform.
This patch will tell which BXT_PP registers (there are two set of
PP_CONTROL in the spec) to be used as defined in VBT (Video Bios Timing
table) and this will enabled eDP backlight controller on MRB (BXT)
platform.
v2:
- Remove unnecessary information in commit message.
- Assign vbt.backlight.controller to a backlight_controller variable and
return the variable value.
v3:
- Rebased to latest code base.
- updated commit title.
Signed-off-by: Mustamin B Mustaffa <mustamin.b.mustaffa@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180227030734.37901-1-mustamin.b.mustaffa@intel.com
(cherry picked from commit 73c0fcac97)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The orphan clocks reparents should migrate any existing count from the
orphan clock to its new acestor clocks, otherwise we may have
inconsistent counts in the tree and end-up with gated critical clocks
Assuming we have two clocks, A and B.
* Clock A has CLK_IS_CRITICAL flag set.
* Clock B is an ancestor of A which can gate. Clock B gate is left
enabled by the bootloader.
Step 1: Clock A is registered. Since it is a critical clock, it is
enabled. The clock being still an orphan, no parent are enabled.
Step 2: Clock B is registered and reparented to clock A (potentially
through several other clocks). We are now in situation where the enable
count of clock A is 1 while the enable count of its ancestors is 0, which
is not good.
Step 3: in lateinit, clk_disable_unused() is called, the enable_count of
clock B being 0, clock B is gated and and critical clock A actually gets
disabled.
This situation was found while adding fdiv_clk gates to the meson8b
platform. These clocks parent clk81 critical clock, which is the mother
of all peripheral clocks in this system. Because of the issue described
here, the system is crashing when clk_disable_unused() is called.
The situation is solved by reverting
commit f8f8f1d044 ("clk: Don't touch hardware when reparenting during registration").
To avoid breaking again the situation described in this commit
description, enabling critical clock should be done before walking the
orphan list. This way, a parent critical clock may not be accidentally
disabled due to the CLK_OPS_PARENT_ENABLE mechanism.
Fixes: f8f8f1d044 ("clk: Don't touch hardware when reparenting during registration")
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Michael Turquette <mturquette@baylibre.com>
Pull NFS client bugfixes from Trond Myklebust:
"Hightlights include the following stable fixes:
- NFS: Fix an incorrect type in struct nfs_direct_req
- pNFS: Prevent the layout header refcount going to zero in
pnfs_roc()
- NFS: Fix unstable write completion"
* tag 'nfs-for-4.16-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Fix unstable write completion
pNFS: Prevent the layout header refcount going to zero in pnfs_roc()
NFS: Fix an incorrect type in struct nfs_direct_req
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree, they are:
1) Fixed hashtable representation doesn't support timeout flag, skip it
otherwise rules to add elements from the packet fail bogusly fail with
EOPNOTSUPP.
2) Fix bogus error with 32-bits ebtables userspace and 64-bits kernel,
patch from Florian Westphal.
3) Sanitize proc names in several x_tables extensions, also from Florian.
4) Add sanitization to ebt_among wormhash logic, from Florian.
5) Missing release of hook array in flowtable.
====================
ASoC: Fixes for v4.16
This is a fairly standard collection of fixes, there's no changes to the
core here just a bunch of small device specific changes for single
drivers plus an update to the MAINTAINERS file for the sgl5000.
Marc Kleine-Budde says:
====================
pull-request: can 2018-03-12
this is a pull reqeust of 6 patches for net/master.
The first patch is by Wolfram Sang and fixes a bitshift vs. comparison mistake
in the m_can driver. Two patches of Marek Vasut repair the error handling in
the ifi driver. The two patches by Stephane Grosjean fix a "echo_skb is
occupied!" bug in the peak/pcie_fd driver. Bich HEMON's patch adds pinctrl
select state calls to the m_can's driver to further improve power saving during
suspend.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the kprobe which was optimized by jump can not change
the execution path, the kprobe for error-injection must not
be optimized. To prohibit it, set a dummy post-handler as
officially stated in Documentation/kprobes.txt.
Fixes: 4b1a29a7f5 ("error-injection: Support fault injection framework")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Now when using 'ss' in iproute, kernel would try to load all _diag
modules, which also causes corresponding family and proto modules
to be loaded as well due to module dependencies.
Like after running 'ss', sctp, dccp, af_packet (if it works as a module)
would be loaded.
For example:
$ lsmod|grep sctp
$ ss
$ lsmod|grep sctp
sctp_diag 16384 0
sctp 323584 5 sctp_diag
inet_diag 24576 4 raw_diag,tcp_diag,sctp_diag,udp_diag
libcrc32c 16384 3 nf_conntrack,nf_nat,sctp
As these family and proto modules are loaded unintentionally, it
could cause some problems, like:
- Some debug tools use 'ss' to collect the socket info, which loads all
those diag and family and protocol modules. It's noisy for identifying
issues.
- Users usually expect to drop sctp init packet silently when they
have no sense of sctp protocol instead of sending abort back.
- It wastes resources (especially with multiple netns), and SCTP module
can't be unloaded once it's loaded.
...
In short, it's really inappropriate to have these family and proto
modules loaded unexpectedly when just doing debugging with inet_diag.
This patch is to introduce sock_load_diag_module() where it loads
the _diag module only when it's corresponding family or proto has
been already registered.
Note that we can't just load _diag module without the family or
proto loaded, as some symbols used in _diag module are from the
family or proto module.
v1->v2:
- move inet proto check to inet_diag to avoid a compiling err.
v2->v3:
- define sock_load_diag_module in sock.c and export one symbol
only.
- improve the changelog.
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: Bug fixes.
There are 3 bug fixes in this series to fix regressions recently
introduced when adding the new ring reservations scheme. 2 minor
fixes in the TC Flower code to return standard errno values and
to elide some unnecessary warning dmesg. One Fixes the VLAN TCI
value passed to the stack by including the entire 16-bit VLAN TCI,
and the last fix is to check for valid VNIC ID before setting up or
shutting down LRO/GRO.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
During initialization, if we encounter errors, there is a code path that
calls bnxt_hwrm_vnic_set_tpa() with invalid VNIC ID. This may cause a
warning in firmware logs.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_restore_pf_fw_resources routine frees PF resources by calling
close_nic and allocates the resources back, by doing open_nic. However,
this is not needed, if the PF is already in closed state.
This bug causes the driver to call open the device and call request_irq()
when it is not needed. Ultimately, pci_disable_msix() will crash
when bnxt_en is unloaded.
This patch fixes the problem by skipping __bnxt_close_nic and
__bnxt_open_nic inside bnxt_restore_pf_fw_resources routine, if the
interface is not running.
Fixes: 80fcaf46c0 ("bnxt_en: Restore MSIX after disabling SRIOV.")
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, internal error value is returned by the driver, when
hwrm_cfa_flow_alloc() fails due lack of resources. We should be returning
Linux errno value -ENOSPC instead.
This patch also converts other similar command errors to standard Linux errno
code (-EIO) in bnxt_tc.c
Fixes: db1d36a273 ("bnxt_en: add TC flower offload flow_alloc/free FW cmds")
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent changes added the bnxt_init_int_mode() call in the driver's open
path whenever ring reservations are changed. This call was previously
only called in the probe path. In the open path, if MQPRIO TC has been
setup, the bnxt_init_int_mode() call would reset and mess up the MQPRIO
per TC rings.
Fix it by not re-initilizing bp->tx_nr_rings_per_tc in
bnxt_init_int_mode(). Instead, initialize it in the probe path only
after the bnxt_init_int_mode() call.
Fixes: 674f50a5b0 ("bnxt_en: Implement new method to reserve rings.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When receiving a packet with VLAN tag, pass the entire 16-bit TCI to the
stack when calling __vlan_hwaccel_put_tag(). The current code is only
passing the 12-bit tag and it is missing the priority bits.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In some conditions when the driver fails to add a flow in HW and returns
an error back to the stack, the stack continues to invoke get_flow_stats()
and/or del_flow() on it. The driver fails these APIs with an error message
"no flow_node for cookie". The message gets logged repeatedly as long as
the stack keeps invoking these functions.
Fix this by removing the corresponding netdev_info() calls from these
functions.
Fixes: d7bc730530 ("bnxt_en: add code to query TC flower offload stats")
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The number of vnics to check must be determined ahead of time because
only standard RX rings require vnics to support RFS. The logic is
similar to the ring reservation logic and we can now use the
refactored common functions to do most of the work in setting up
the firmware message.
Fixes: 8f23d638b3 ("bnxt_en: Expand bnxt_check_rings() to check all resources.")
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bnxt_hwrm_reserve_{pf|vf}_rings() functions are very similar to
the bnxt_hwrm_check_{pf|vf}_rings() functions. Refactor the former
so that the latter can make use of common code in the next patch.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull "Rockchip dts64 fixes for 4.16" from Heiko Stübner:
Pinctrl got a fix in 4.16-rc1, that exposed an issue with wifi-related
pinctrl hogs on rk3399-gru-kevin that broke suspend. This gets fixed
by moving the wifi pinctrl to the correct node.
Also revert the usb3 phy-port enablement, as a missing feature in the
type-c phy breaks usb on all non-gru rk3399 boards.
* tag 'v4.16-rockchip-dts64fixes-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
Revert "arm64: dts: rockchip: add usb3-phy otg-port support for rk3399"
arm64: dts: rockchip: Fix rk3399-gru-* s2r (pinctrl hogs, wifi reset)
In 664fcf123a (net: phy: Threaded interrupts allow some simplification)
the phy_interrupt system was changed to use a traditional threaded
interrupt scheme instead of a workqueue approach.
With this change, the phy status check moved into phy_change, which
did not report back to the caller whether or not the interrupt was
handled. This means that, in the case of a shared phy interrupt,
only the first phydev's interrupt registers are checked (since
phy_interrupt() would always return IRQ_HANDLED). This leads to
interrupt storms when it is a secondary device that's actually the
interrupt source.
Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull "Rockchip dts32 fixes for 4.16" from Heiko Stübner:
Fix for a new dtc warning.
* tag 'v4.16-rockchip-dts32fixes-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
ARM: dts: rockchip: Add missing #sound-dai-cells on rk3288
Pull "mvebu fixes for 4.16 (part 2)" from Gregory CLEMENT:
- Updating my emails address but this time in mailmap (from
free-electrons to bootlin)
* tag 'mvebu-fixes-4.16-2' of git://git.infradead.org/linux-mvebu:
mailmap: Update email address for Gregory CLEMENT
Pull "DaVinci fixes for v4.16" from Sekhar Nori:
A patch fixing GPIO look-up for MMC/SD card detect and write-protect
pins on OMAP-L138 Hawkboard.
* tag 'davinci-fixes-for-v4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci:
ARM: davinci: fix the GPIO lookup for omapl138-hawk
Pull "i.MX fixes for 4.16, 2nd round" from Shawn Guo:
- Fix a copy-paste error in imx7d-sdb device tree, which results in two
usb-otg regulators are both named "regulator-usb-otg1-vbus" and
override each other.
* tag 'imx-fixes-4.16-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx7d-sdb: Fix regulator-usb-otg2-vbus node name
With the commit 1ba8f9d308 ("ALSA: hda: Add a power_save
blacklist"), we changed the default value of power_save option to -1
for processing the power-save blacklist.
Unfortunately, this seems breaking user-space applications that
actually read the power_save parameter value via sysfs and judge /
adjust the power-saving status. They see the value -1 as if the
power-save is turned off, although the actual value is taken from
CONFIG_SND_HDA_POWER_SAVE_DEFAULT and it can be a positive.
So, overall, passing -1 there was no good idea. Let's partially
revert it -- at least for power_save option default value is restored
again to CONFIG_SND_HDA_POWER_SAVE_DEFAULT. Meanwhile, in this patch,
we keep the blacklist behavior and make is adjustable via the new
option, pm_blacklist.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199073
Fixes: 1ba8f9d308 ("ALSA: hda: Add a power_save blacklist")
Acked-by: Hans de Goede <hdegoede@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
While the specific UFS PHY drivers (14nm and 20nm) have a module
license, the common base module does not, leading to a Kbuild
failure:
WARNING: modpost: missing MODULE_LICENSE() in drivers/phy/qualcomm/phy-qcom-ufs.o
FATAL: modpost: GPL-incompatible module phy-qcom-ufs.ko uses GPL-only symbol 'clk_enable'
This adds a module description and license tag to fix the build.
I added both Yaniv and Vivek as authors here, as Yaniv sent the initial
submission, while Vivek did most of the work since.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Make sure to apply the correct pin state in suspend/resume callbacks.
Putting pins in sleep state saves power.
Signed-off-by: Bich Hemon <bich.hemon@st.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Currently the exclusivity is enabled when the rate is set by
the mode setting functions. These functions are called by
mode_set_nofb callback of drm_crc_helper. Then exclusivity
is disabled when tcon is disabled by atomic_disable
callback.
What happens is that mode_set_nofb can be called once when
mode changes, and afterwards the system can call atomic_enable
and atomic_disable multiple times without further calls to
mode_set_nofb.
This happens:
mode_set_nofb - clk exclusivity is enabled
atomic_enable
atomic_disable - clk exclusivity is disabled
atomic_enable
atomic_disable - clk exclusivity is already disabled, leading to WARN
in clk_rate_exclusive_put
Solution is to enable exclusivity in sun4i_tcon_channel_set_status.
Signed-off-by: Ondrej Jirman <megous@megous.com>
Cc: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180310110511.14697-1-megous@megous.com
When an interface starts, the echo_skb array is empty and the network
queue should be started only. This patch replaces useless code and locks
when the internal RX_BARRIER message is received from the IP core, telling
the driver that tx may start.
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch makes atomic the handling of the linux-can echo_skb array and
the network tx queue. This prevents from the "BUG! echo_skb is occupied!"
message to be printed by the linux-can core, in SMP environments.
Reported-by: Diana Burgess <diana@peloton-tech.com>
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The new version of the IFI CANFD core has significantly less complex
error state indication logic. In particular, the warning/error state
bits are no longer all over the place, but are all present in the
STATUS register. Moreover, there is a new IRQ register bit indicating
transition between error states (active/warning/passive/busoff).
This patch makes use of this bit to weed out the obscure selective
INTERRUPT register clearing, which was used to carry over the error
state indication into the poll function. While at it, this patch
fixes the handling of the ACTIVE state, since the hardware provides
indication of the core being in ACTIVE state and that in turn fixes
the state transition indication toward userspace. Finally, register
reads in the poll function are moved to the matching subfunctions
since those are also no longer needed in the poll function.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Heiko Schocher <hs@denx.de>
Cc: Markus Marb <markus@marb.org>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Older versions of the core are not compatible with the driver due
to various intrusive fixes of the core. Read out the VER register,
check the core revision bitfield and verify if the core in use is
new enough (rev 2.1 or newer) to work correctly with this driver.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Heiko Schocher <hs@denx.de>
Cc: Markus Marb <markus@marb.org>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Max delat_t should be the full_bucket/rate instead of the full_bucket.
Also report EINVAL if the rate is zero.
Fixes: 96fbc13d7e ("openvswitch: Add meter infrastructure")
Cc: Andy Zhou <azhou@ovn.org>
Signed-off-by: zhangliping <zhangliping02@baidu.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding a macvlan device on top of a lowerdev that supports
the xfrm offloads fails with a new regression:
# ip link add link ens1f0 mv0 type macvlan
RTNETLINK answers: Operation not permitted
Tracing down the failure shows that the macvlan device inherits
the NETIF_F_HW_ESP and NETIF_F_HW_ESP_TX_CSUM feature flags
from the lowerdev, but with no dev->xfrmdev_ops API filled
in, it doesn't actually support xfrm. When the request is
made to add the new macvlan device, the XFRM listener for
NETDEV_REGISTER calls xfrm_api_check() which fails the new
registration because dev->xfrmdev_ops is NULL.
The macvlan creation succeeds when we filter out the ESP
feature flags in macvlan_fix_features(), so let's filter them
out like we're already filtering out ~NETIF_F_NETNS_LOCAL.
When XFRM support is added in the future, we can add the flags
into MACVLAN_FEATURES.
This same problem could crop up in the future with any other
new feature flags, so let's filter out any flags that aren't
defined as supported in macvlan.
Fixes: d77e38e612 ("xfrm: Add an IPsec hardware offloading API")
Reported-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's found that the final phase set by driver doesn't match that of
the output from clk_summary:
dwmmc_rockchip fe310000.dwmmc: Successfully tuned phase to 346
mmc0: new ultra high speed SDR104 SDIO card at address 0001
cat /sys/kernel/debug/clk/clk_summary | grep sdio_sample
sdio_sample 0 1 0 50000000 0 0
It seems the cached core->phase isn't updated after the clk was
registered. So fix this issue by updating the core->phase if setting
phase successfully.
Fixes: 9e4d04adeb ("clk: add clk_core_set_phase_nolock function")
Cc: Stable <stable@vger.kernel.org>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Jerome Brunet <jbrunet@baylibre.com>
Tested-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Michael Turquette <mturquette@baylibre.com>
Pull x86/pti updates from Thomas Gleixner:
"Yet another pile of melted spectrum related updates:
- Drop native vsyscall support finally as it causes more trouble than
benefit.
- Make microcode loading more robust. There were a few issues
especially related to late loading which are now surfacing because
late loading of the IB* microcodes addressing spectre issues has
become more widely used.
- Simplify and robustify the syscall handling in the entry code
- Prevent kprobes on the entry trampoline code which lead to kernel
crashes when the probe hits before CR3 is updated
- Don't check microcode versions when running on hypervisors as they
are considered as lying anyway.
- Fix the 32bit objtool build and a coment typo"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kprobes: Fix kernel crash when probing .entry_trampoline code
x86/pti: Fix a comment typo
x86/microcode: Synchronize late microcode loading
x86/microcode: Request microcode on the BSP
x86/microcode/intel: Look into the patch cache first
x86/microcode: Do not upload microcode if CPUs are offline
x86/microcode/intel: Writeback and invalidate caches before updating microcode
x86/microcode/intel: Check microcode revision before updating sibling threads
x86/microcode: Get rid of struct apply_microcode_ctx
x86/spectre_v2: Don't check microcode versions when running under hypervisors
x86/vsyscall/64: Drop "native" vsyscalls
x86/entry/64/compat: Save one instruction in entry_INT80_compat()
x86/entry: Do not special-case clone(2) in compat entry
x86/syscalls: Use COMPAT_SYSCALL_DEFINEx() macros for x86-only compat syscalls
x86/syscalls: Use proper syscall definition for sys_ioperm()
x86/entry: Remove stale syscall prototype
x86/syscalls/32: Simplify $entry == $compat entries
objtool: Fix 32-bit build
Pull timer fix from Thomas Gleixner:
"Just a single fix which adds a missing Kconfig dependency to avoid
unmet dependency warnings"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/atmel-st: Add 'depends on HAS_IOMEM' to fix unmet dependency
Pull RAS fixes from Thomas Gleixner:
"Two small fixes for RAS/MCE:
- Serialize sysfs changes to avoid concurrent modificaiton of
underlying data
- Add microcode revision to Machine Check records. This should have
been there forever, but now with the broken microcode versions in
the wild it has become important"
* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/MCE: Serialize sysfs changes
x86/MCE: Save microcode revision in machine check records
Pull perf updates from Thomas Gleixner:
"Another set of perf updates:
- Fix a Skylake Uncore event format declaration
- Prevent perf pipe mode from crahsing which was caused by a missing
buffer allocation
- Make the perf top popup message which tells the user that it uses
fallback mode on older kernels a debug message.
- Make perf context rescheduling work correcctly
- Robustify the jump error drawing in perf browser mode so it does
not try to create references to NULL initialized offset entries
- Make trigger_on() robust so it does not enable the trigger before
everything is set up correctly to handle it
- Make perf auxtrace respect the --no-itrace option so it does not
try to queue AUX data for decoding.
- Prevent having different number of field separators in CVS output
lines when a counter is not supported.
- Make the perf kallsyms man page usage behave like it does for all
other perf commands.
- Synchronize the kernel headers"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix ctx_event_type in ctx_resched()
perf tools: Fix trigger class trigger_on()
perf auxtrace: Prevent decoding when --no-itrace
perf stat: Fix CVS output format for non-supported counters
tools headers: Sync x86's cpufeatures.h
tools headers: Sync copy of kvm UAPI headers
perf record: Fix crash in pipe mode
perf annotate browser: Be more robust when drawing jump arrows
perf top: Fix annoying fallback message on older kernels
perf kallsyms: Fix the usage on the man page
perf/x86/intel/uncore: Fix Skylake UPI event format
Pull locking fix from Thomas Gleixner:
"rt_mutex_futex_unlock() grew a new irq-off call site, but the function
assumes that its always called from irq enabled context.
Use (un)lock_irqsafe() to handle the new call site correctly"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rtmutex: Make rt_mutex_futex_unlock() safe for irq-off callsites
Pull irqchip updates for 4.16-rc5 from Marc Zyngier
- IMX GPCv2 cleanup
- GICv3 iomem annontation fixes
- GICv3 ITS minimal ITE allocation now matching the LPIs'.
ebt_among is special, it has a dynamic match size and is exempt
from the central size checks.
commit c4585a2823 ("bridge: ebt_among: add missing match size checks")
added validation for pool size, but missed fact that the macros
ebt_among_wh_src/dst can already return out-of-bound result because
they do not check value of wh_src/dst_ofs (an offset) vs. the size
of the match that userspace gave to us.
v2:
check that offset has correct alignment.
Paolo Abeni points out that we should also check that src/dst
wormhash arrays do not overlap, and src + length lines up with
start of dst (or vice versa).
v3: compact wormhash_sizes_valid() part
NB: Fixes tag is intentionally wrong, this bug exists from day
one when match was added for 2.6 kernel. Tag is there so stable
maintainers will notice this one too.
Tested with same rules from the earlier patch.
Fixes: c4585a2823 ("bridge: ebt_among: add missing match size checks")
Reported-by: <syzbot+bdabab6f1983a03fc009@syzkaller.appspotmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The last rule in the blob has next_entry offset that is same as total size.
This made "ebtables32 -A OUTPUT -d de:ad:be:ef:01:02" fail on 64 bit kernel.
Fixes: b718121685 ("netfilter: ebtables: CONFIG_COMPAT: don't trust userland offsets")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull dmaengine fixes from Vinod Koul:
"Two small fixes are for this cycle:
- fix max_chunk_size for rcar-dmac for R-Car Gen3
- fix clock resource of mv_xor_v2"
* tag 'dmaengine-fix-4.16-rc5' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: mv_xor_v2: Fix clock resource by adding a register clock
dmaengine: rcar-dmac: fix max_chunk_size for R-Car Gen3
Pull GPIO fix from Linus Walleij:
"This is a single GPIO fix for the v4.16 series affecting the Renesas
driver, and fixes wakeup from external stuff"
* tag 'gpio-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: rcar: Use wakeup_path i.s.o. explicit clock handling
On the CP110 components which are present on the Armada 7K/8K SoC we need
to explicitly enable the clock for the registers. However it is not
needed for the AP8xx component, that's why this clock is optional.
With this patch both clock have now a name, but in order to be backward
compatible, the name of the first clock is not used. It allows to still
use this clock with a device tree using the old binding.
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
imx_gpcv2_get_wakeup_source() is not used anywhere, so remove it.
This fixes the following sparse warning:
drivers/irqchip/irq-imx-gpcv2.c:34:5: warning: symbol 'imx_gpcv2_get_wakeup_source' was not declared. Should it be static?
Fixes: e324c4dc4a ("irqchip/imx-gpcv2: IMX GPCv2 driver for wakeup sources")
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
When struct its_device instances are created, the nr_ites member
will be set to a power of 2 that equals or exceeds the requested
number of MSIs passed to the msi_prepare() callback. At the same
time, the LPI map is allocated to be some multiple of 32 in size,
where the allocated size may be less than the requested size
depending on whether a contiguous range of sufficient size is
available in the global LPI bitmap.
This may result in the situation where the nr_ites < nr_lpis, and
since nr_ites is what we program into the hardware when we map the
device, the additional LPIs will be non-functional.
For bog standard hardware, this does not really matter. However,
in cases where ITS device IDs are shared between different PCIe
devices, we may end up allocating these additional LPIs without
taking into account that they don't actually work.
So let's make nr_ites at least 32. This ensures that all allocated
LPIs are 'live', and that its_alloc_device_irq() will fail when
attempts are made to allocate MSIs beyond what was allocated in
the first place.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[maz: updated comment]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
snd_pcm_oss_get_formats() has an obvious use-after-free around
snd_mask_test() calls, as spotted by syzbot. The passed format_mask
argument is a pointer to the hw_params object that is freed before the
loop. What a surprise that it has been present since the original
code of decades ago...
Reported-by: syzbot+4090700a4f13fccaf648@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pull Kbuild fixes from Masahiro Yamada:
- make fixdep parse kconfig.h to fix missing rebuild
- replace hyphens with underscores in builtin DTB label names
- fix typos
* tag 'kbuild-fixes-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: Handle builtin dtb file names containing hyphens
scripts/bloat-o-meter: fix typos in help
fixdep: do not ignore kconfig.h
fixdep: remove some false CONFIG_ matches
fixdep: remove stale references to uml-config.h
Pull block fixes from Jens Axboe:
- a xen-blkfront fix from Bhavesh with a multiqueue fix when
detaching/re-attaching
- a few important NVMe fixes, including a revert for a sysfs fix that
caused some user space confusion
- two bcache fixes by way of Michael Lyle
- a loop regression fix, fixing an issue with lost writes on DAX.
* tag 'for-linus-20180309' of git://git.kernel.dk/linux-block:
loop: Fix lost writes caused by missing flag
nvme_fc: rework sqsize handling
nvme-fabrics: Ignore nr_io_queues option for discovery controllers
xen-blkfront: move negotiate_mq to cover all cases of new VBDs
Revert "nvme: create 'slaves' and 'holders' entries for hidden controllers"
bcache: don't attach backing with duplicate UUID
bcache: fix crashes in duplicate cache device register
nvme: pci: pass max vectors as num_possible_cpus() to pci_alloc_irq_vectors
nvme-pci: Fix EEH failure on ppc
Pull device mapper fixes from Mike Snitzer:
- Fix an uninitialized variable false warning in dm bufio
- Fix DM's passthrough ioctl support to be race free against an
underlying device being removed.
- Fix corner-case of DM raid resync reporting if/when the raid becomes
degraded during resync; otherwise automated raid repair will fail.
- A few DM multipath fixes to make non-SCSI optimizations, that were
introduced during the 4.16 merge, useful for all non-SCSI devices,
rather than narrowly define this non-SCSI mode in terms of "nvme".
This allows the removal of "queue_mode nvme" that really didn't need
to be introduced. Instead DM core will internalize whether
nvme-specific IO submission optimizations are doable and DM multipath
will only do SCSI-specific device handler operations if SCSI is in
use.
* tag 'for-4.16/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm table: allow upgrade from bio-based to specialized bio-based variant
dm mpath: remove unnecessary NVMe branching in favor of scsi_dh checks
dm table: fix "nvme" test
dm raid: fix incorrect sync_ratio when degraded
dm: use blkdev_get rather than bdgrab when issuing pass-through ioctl
dm bufio: avoid false-positive Wmaybe-uninitialized warning
Pull rdma fixes from Doug Ledford:
- Various driver bug fixes in mlx5, mlx4, bnxt_re and qedr, ranging
from bugs under load to bad error case handling
- There in one largish patch fixing the locking in bnxt_re to avoid a
machine hard lock situation
- A few core bugs on error paths
- A patch to reduce stack usage in the new CQ API
- One mlx5 regression introduced in this merge window
- There were new syzkaller scripts written for the RDMA subsystem and
we are fixing issues found by the bot
- One of the commits (aa0de36a40 “RDMA/mlx5: Fix integer overflow
while resizing CQ”) is missing part of the commit log message and one
of the SOB lines. The original patch was from Leon Romanovsky, and a
cut-n-paste separator in the commit message confused patchworks which
then put the end of message separator in the wrong place in the
downloaded patch, and I didn’t notice in time. The patch made it into
the official branch, and the only way to fix it in-place was to
rebase. Given the pain that a rebase causes, and the fact that the
patch has relevant tags for stable and syzkaller, a revert of the
munged patch and a reapplication of the original patch with the log
message intact was done.
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (25 commits)
RDMA/mlx5: Fix integer overflow while resizing CQ
Revert "RDMA/mlx5: Fix integer overflow while resizing CQ"
RDMA/ucma: Check that user doesn't overflow QP state
RDMA/mlx5: Fix integer overflow while resizing CQ
RDMA/ucma: Limit possible option size
IB/core: Fix possible crash to access NULL netdev
RDMA/bnxt_re: Avoid Hard lockup during error CQE processing
RDMA/core: Reduce poll batch for direct cq polling
IB/mlx5: Fix an error code in __mlx5_ib_modify_qp()
IB/mlx5: When not in dual port RoCE mode, use provided port as native
IB/mlx4: Include GID type when deleting GIDs from HW table under RoCE
IB/mlx4: Fix corruption of RoCEv2 IPv4 GIDs
RDMA/qedr: Fix iWARP write and send with immediate
RDMA/qedr: Fix kernel panic when running fio over NFSoRDMA
RDMA/qedr: Fix iWARP connect with port mapper
RDMA/qedr: Fix ipv6 destination address resolution
IB/core : Add null pointer check in addr_resolve
RDMA/bnxt_re: Fix the ib_reg failure cleanup
RDMA/bnxt_re: Fix incorrect DB offset calculation
RDMA/bnxt_re: Unconditionly fence non wire memory operations
...
Pull x86 platform driver fixes from Darren Hart:
"Correct a module loading race condition between the DELL_SMBIOS
backend modules and the first user by converting them to bool features
of the DELL_SMBIOS driver. Fixup the resulting Kconfig dependency
issue with DCDBAS"
* tag 'platform-drivers-x86-v4.16-6' of git://git.infradead.org/linux-platform-drivers-x86:
platform/x86: dell-smbios: Resolve dependency error on DCDBAS
platform/x86: Allow for SMBIOS backend defaults
platform/x86: dell-smbios: Link all dell-smbios-* modules together
platform/x86: dell-smbios: Rename dell-smbios source to dell-smbios-base
platform/x86: dell-smbios: Correct some style warnings
When releasing a client, we need to clear the clienttab[] entry at
first, then call snd_seq_queue_client_leave(). Otherwise, the
in-flight cell in the queue might be picked up by the timer interrupt
via snd_seq_check_queue() before calling snd_seq_queue_client_leave(),
and it's delivered to another queue while the client is clearing
queues. This may eventually result in an uncleared cell remaining in
a queue, and the later snd_seq_pool_delete() may need to wait for a
long time until the event gets really processed.
By moving the clienttab[] clearance at the beginning of release, any
event delivery of a cell belonging to this client will fail at a later
point, since snd_seq_client_ptr() returns NULL. Thus the cell that
was picked up by the timer interrupt will be returned immediately
without further delivery, and the long stall of snd_seq_delete_pool()
can be avoided, too.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Although we've covered the races between concurrent write() and
ioctl() in the previous patch series, there is still a possible UAF in
the following scenario:
A: user client closed B: timer irq
-> snd_seq_release() -> snd_seq_timer_interrupt()
-> snd_seq_free_client() -> snd_seq_check_queue()
-> cell = snd_seq_prioq_cell_peek()
-> snd_seq_prioq_leave()
.... removing all cells
-> snd_seq_pool_done()
.... vfree()
-> snd_seq_compare_tick_time(cell)
... Oops
So the problem is that a cell is peeked and accessed without any
protection until it's retrieved from the queue again via
snd_seq_prioq_cell_out().
This patch tries to address it, also cleans up the code by a slight
refactoring. snd_seq_prioq_cell_out() now receives an extra pointer
argument. When it's non-NULL, the function checks the event timestamp
with the given pointer. The caller needs to pass the right reference
either to snd_seq_tick or snd_seq_realtime depending on the event
timestamp type.
A good news is that the above change allows us to remove the
snd_seq_prioq_cell_peek(), too, thus the patch actually reduces the
code size.
Reviewed-by: Nicolai Stange <nstange@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit 7383d44b added a pointer pdata which get set to the default
platform_data when non was defined in the device. But it did not
pass this pointer to the st_sensors_init_sensor call but still
used the maybe uninitialized platform_data from dev.
This breaks initialization when no platform_data is given and
the optional st,drdy-int-pin devicetree option is not set.
This commit fixes this.
Cc: stable@vger.kernel.org
Fixes: 7383d44b ("iio: st_pressure: st_accel: Initialise sensor platform data properly")
Signed-off-by: Michael Nosthoff <committed@heine.so>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
This reverts commit 585ed27d06.
This removed code which was unused due to a bug in commit 7383d44b.
To fix this bug the code is needed. Thus this revert.
Signed-off-by: Michael Nosthoff <committed@heine.so>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
The meson_sar_adc_lock() function is not supposed to hold the
"indio_dev->mlock" on the error path.
Fixes: 3adbf34273 ("iio: adc: add a driver for the SAR ADC found in Amlogic Meson SoCs")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Pull KVM fixes from Radim Krčmář:
"PPC:
- Fix guest time accounting in the host
- Fix large-page backing for radix guests on POWER9
- Fix HPT guests on POWER9 backed by 2M or 1G pages
- Compile fixes for some configs and gcc versions
s390:
- Fix random memory corruption when running as guest2 (e.g. KVM in
LPAR) and starting guest3 (e.g. nested KVM) with many CPUs
- Export forgotten io interrupt delivery statistics counter"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: s390: fix memory overwrites when not using SCA entries
KVM: PPC: Book3S HV: Fix guest time accounting with VIRT_CPU_ACCOUNTING_GEN
KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backing
KVM: PPC: Book3S HV: Fix handling of large pages in radix page fault handler
KVM: s390: provide io interrupt kvm_stat
KVM: PPC: Book3S: Fix compile error that occurs with some gcc versions
KVM: PPC: Fix compile error that occurs when CONFIG_ALTIVEC=n
Pull xen fix from Juergen Gross:
"Just one fix for the correct error handling after a failed
device_register()"
* tag 'for-linus-4.16a-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: xenbus: use put_device() instead of kfree()
Pull arm64 fixes from Catalin Marinas:
- The SMCCC firmware interface for the spectre variant 2 mitigation has
been updated to allow the discovery of whether the CPU needs the
workaround. This pull request relaxes the kernel check on the return
value from firmware.
- Fix the commit allowing changing from global to non-global page table
entries which inadvertently disallowed other safe attribute changes.
- Fix sleeping in atomic during the arm_perf_teardown_cpu() code.
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Relax ARM_SMCCC_ARCH_WORKAROUND_1 discovery
arm_pmu: Use disable_irq_nosync when disabling SPI in CPU teardown hook
arm64: mm: fix thinko in non-global page table attribute check
Pull Documentation build fix from Jonathan Corbet:
"The Sphinx 1.7 release broke the build process for reasons that are
mostly our fault.
This is a single fix cherry-picked from docs-next that restores docs
buildability for all supported Sphinx versions"
* tag 'docs-4.16-fix' of git://git.lwn.net/linux:
Documentation/sphinx: Fix Directive import error
Merge misc fixes from Andrew Morton:
"8 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
lib/test_kmod.c: fix limit check on number of test devices created
selftests/vm/run_vmtests: adjust hugetlb size according to nr_cpus
mm/page_alloc: fix memmap_init_zone pageblock alignment
mm/memblock.c: hardcode the end_pfn being -1
mm/gup.c: teach get_user_pages_unlocked to handle FOLL_NOWAIT
lib/bug.c: exclude non-BUG/WARN exceptions from report_bug()
bug: use %pB in BUG and stack protector failure
hugetlb: fix surplus pages accounting
As reported by Dan the parentheses is in the wrong place, and since
unlikely() call returns either 0 or 1 it's never less than zero. The
second issue is that signed integer overflows like "INT_MAX + 1" are
undefined behavior.
Since num_test_devs represents the number of devices, we want to stop
prior to hitting the max, and not rely on the wrap arround at all. So
just cap at num_test_devs + 1, prior to assigning a new device.
Link: http://lkml.kernel.org/r/20180224030046.24238-1-mcgrof@kernel.org
Fixes: d9c6a72d6f ("kmod: add test driver to stress test the module loader")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit b92df1de5d ("mm: page_alloc: skip over regions of invalid pfns
where possible") introduced a bug where move_freepages() triggers a
VM_BUG_ON() on uninitialized page structure due to pageblock alignment.
To fix this, simply align the skipped pfns in memmap_init_zone() the
same way as in move_freepages_block().
Seen in one of the RHEL reports:
crash> log | grep -e BUG -e RIP -e Call.Trace -e move_freepages_block -e rmqueue -e freelist -A1
kernel BUG at mm/page_alloc.c:1389!
invalid opcode: 0000 [#1] SMP
--
RIP: 0010:[<ffffffff8118833e>] [<ffffffff8118833e>] move_freepages+0x15e/0x160
RSP: 0018:ffff88054d727688 EFLAGS: 00010087
--
Call Trace:
[<ffffffff811883b3>] move_freepages_block+0x73/0x80
[<ffffffff81189e63>] __rmqueue+0x263/0x460
[<ffffffff8118c781>] get_page_from_freelist+0x7e1/0x9e0
[<ffffffff8118caf6>] __alloc_pages_nodemask+0x176/0x420
--
RIP [<ffffffff8118833e>] move_freepages+0x15e/0x160
RSP <ffff88054d727688>
crash> page_init_bug -v | grep RAM
<struct resource 0xffff88067fffd2f8> 1000 - 9bfff System RAM (620.00 KiB)
<struct resource 0xffff88067fffd3a0> 100000 - 430bffff System RAM ( 1.05 GiB = 1071.75 MiB = 1097472.00 KiB)
<struct resource 0xffff88067fffd410> 4b0c8000 - 4bf9cfff System RAM ( 14.83 MiB = 15188.00 KiB)
<struct resource 0xffff88067fffd480> 4bfac000 - 646b1fff System RAM (391.02 MiB = 400408.00 KiB)
<struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB)
<struct resource 0xffff88067fffd640> 100000000 - 67fffffff System RAM ( 22.00 GiB)
crash> page_init_bug | head -6
<struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB)
<struct page 0xffffea0001ede200> 1fffff00000000 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575
<struct page 0xffffea0001ede200> 505736 505344 <struct page 0xffffea0001ed8000> 505855 <struct page 0xffffea0001edffc0>
<struct page 0xffffea0001ed8000> 0 0 <struct pglist_data 0xffff88047ffd9000> 0 <struct zone 0xffff88047ffd9000> DMA 1 4095
<struct page 0xffffea0001edffc0> 1fffff00000400 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575
BUG, zones differ!
Note that this range follows two not populated sections
68000000-77ffffff in this zone. 7b788000-7b7fffff is the first one
after a gap. This makes memmap_init_zone() skip all the pfns up to the
beginning of this range. But this range is not pageblock (2M) aligned.
In fact no range has to be.
crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b787000 7b788000
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffea0001e00000 78000000 0 0 0 0
ffffea0001ed7fc0 7b5ff000 0 0 0 0
ffffea0001ed8000 7b600000 0 0 0 0 <<<<
ffffea0001ede1c0 7b787000 0 0 0 0
ffffea0001ede200 7b788000 0 0 1 1fffff00000000
Top part of page flags should contain nodeid and zonenr, which is not
the case for page ffffea0001ed8000 here (<<<<).
crash> log | grep -o fffea0001ed[^\ ]* | sort -u
fffea0001ed8000
fffea0001eded20
fffea0001edffc0
crash> bt -r | grep -o fffea0001ed[^\ ]* | sort -u
fffea0001ed8000
fffea0001eded00
fffea0001eded20
fffea0001edffc0
Initialization of the whole beginning of the section is skipped up to
the start of the range due to the commit b92df1de5d. Now any code
calling move_freepages_block() (like reusing the page from a freelist as
in this example) with a page from the beginning of the range will get
the page rounded down to start_page ffffea0001ed8000 and passed to
move_freepages() which crashes on assertion getting wrong zonenr.
> VM_BUG_ON(page_zone(start_page) != page_zone(end_page));
Note, page_zone() derives the zone from page flags here.
From similar machine before commit b92df1de5d:
crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffff73941e00000 78000000 0 0 1 1fffff00000000
fffff73941ed7fc0 7b5ff000 0 0 1 1fffff00000000
fffff73941ed8000 7b600000 0 0 1 1fffff00000000
fffff73941edff80 7b7fe000 0 0 1 1fffff00000000
fffff73941edffc0 7b7ff000 ffff8e67e04d3ae0 ad84 1 1fffff00020068 uptodate,lru,active,mappedtodisk
All the pages since the beginning of the section are initialized.
move_freepages()' not gonna blow up.
The same machine with this fix applied:
crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffea0001e00000 78000000 0 0 0 0
ffffea0001e00000 7b5ff000 0 0 0 0
ffffea0001ed8000 7b600000 0 0 1 1fffff00000000
ffffea0001edff80 7b7fe000 0 0 1 1fffff00000000
ffffea0001edffc0 7b7ff000 ffff88017fb13720 8 2 1fffff00020068 uptodate,lru,active,mappedtodisk
At least the bare minimum of pages is initialized preventing the crash
as well.
Customers started to report this as soon as 7.4 (where b92df1de5d was
merged in RHEL) was released. I remember reports from
September/October-ish times. It's not easily reproduced and happens on
a handful of machines only. I guess that's why. But that does not make
it less serious, I think.
Though there actually is a report here:
https://bugzilla.kernel.org/show_bug.cgi?id=196443
And there are reports for Fedora from July:
https://bugzilla.redhat.com/show_bug.cgi?id=1473242
and CentOS:
https://bugs.centos.org/view.php?id=13964
and we internally track several dozens reports for RHEL bug
https://bugzilla.redhat.com/show_bug.cgi?id=1525121
Link: http://lkml.kernel.org/r/0485727b2e82da7efbce5f6ba42524b429d0391a.1520011945.git.neelx@redhat.com
Fixes: b92df1de5d ("mm: page_alloc: skip over regions of invalid pfns where possible")
Signed-off-by: Daniel Vacek <neelx@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KVM is hanging during postcopy live migration with userfaultfd because
get_user_pages_unlocked is not capable to handle FOLL_NOWAIT.
Earlier FOLL_NOWAIT was only ever passed to get_user_pages.
Specifically faultin_page (the callee of get_user_pages_unlocked caller)
doesn't know that if FAULT_FLAG_RETRY_NOWAIT was set in the page fault
flags, when VM_FAULT_RETRY is returned, the mmap_sem wasn't actually
released (even if nonblocking is not NULL). So it sets *nonblocking to
zero and the caller won't release the mmap_sem thinking it was already
released, but it wasn't because of FOLL_NOWAIT.
Link: http://lkml.kernel.org/r/20180302174343.5421-2-aarcange@redhat.com
Fixes: ce53053ce3 ("kvm: switch get_user_page_nowait() to get_user_pages_unlocked()")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit b8347c2196 ("x86/debug: Handle warnings before the notifier
chain, to fix KGDB crash") changed the ordering of fixups, and did not
take into account the case of x86 processing non-WARN() and non-BUG()
exceptions. This would lead to output of a false BUG line with no other
information.
In the case of a refcount exception, it would be immediately followed by
the refcount WARN(), producing very strange double-"cut here":
lkdtm: attempting bad refcount_inc() overflow
------------[ cut here ]------------
Kernel BUG at 0000000065f29de5 [verbose debug info unavailable]
------------[ cut here ]------------
refcount_t overflow at lkdtm_REFCOUNT_INC_OVERFLOW+0x6b/0x90 in cat[3065], uid/euid: 0/0
WARNING: CPU: 0 PID: 3065 at kernel/panic.c:657 refcount_error_report+0x9a/0xa4
...
In the prior ordering, exceptions were searched first:
do_trap_no_signal(struct task_struct *tsk, int trapnr, char *str,
...
if (fixup_exception(regs, trapnr))
return 0;
- if (fixup_bug(regs, trapnr))
- return 0;
-
As a result, fixup_bugs()'s is_valid_bugaddr() didn't take into account
needing to search the exception list first, since that had already
happened.
So, instead of searching the exception list twice (once in
is_valid_bugaddr() and then again in fixup_exception()), just add a
simple sanity check to report_bug() that will immediately bail out if a
BUG() (or WARN()) entry is not found.
Link: http://lkml.kernel.org/r/20180301225934.GA34350@beast
Fixes: b8347c2196 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash")
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Rue has noticed that libhugetlbfs test suite fails counter test:
# mount_point="/mnt/hugetlb/"
# echo 200 > /proc/sys/vm/nr_hugepages
# mkdir -p "${mount_point}"
# mount -t hugetlbfs hugetlbfs "${mount_point}"
# export LD_LIBRARY_PATH=/root/libhugetlbfs/libhugetlbfs-2.20/obj64
# /root/libhugetlbfs/libhugetlbfs-2.20/tests/obj64/counters
Starting testcase "/root/libhugetlbfs/libhugetlbfs-2.20/tests/obj64/counters", pid 3319
Base pool size: 0
Clean...
FAIL Line 326: Bad HugePages_Total: expected 0, actual 1
The bug was bisected to 0c397daea1 ("mm, hugetlb: further simplify
hugetlb allocation API").
The reason is that alloc_surplus_huge_page() misaccounts per node
surplus pages. We should increase surplus_huge_pages_node rather than
nr_huge_pages_node which is already handled by alloc_fresh_huge_page.
Link: http://lkml.kernel.org/r/20180221191439.GM2231@dhcp22.suse.cz
Fixes: 0c397daea1 ("mm, hugetlb: further simplify hugetlb allocation API")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Dan Rue <dan.rue@linaro.org>
Tested-by: Dan Rue <dan.rue@linaro.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The original commit of this patch has a munged log message that is
missing several of the tags the original author intended to be on the
patch. This was due to patchworks misinterpreting a cut-n-paste
separator line as an end of message line and munging the mbox that was
used to import the patch:
https://patchwork.kernel.org/patch/10264089/
The original patch will be reapplied with a fixed commit message so the
proper tags are applied.
This reverts commit aa0de36a40.
Signed-off-by: Doug Ledford <dledford@redhat.com>
Pull PCI fixes from Bjorn Helgaas:
- fix sparc build issue when OF_IRQ not enabled (Guenter Roeck)
- fix enumeration of devices below switches on DesignWare-based
controllers (Koen Vandeputte)
* tag 'pci-v4.16-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: dwc: Fix enumeration end when reaching root subordinate
PCI: Move of_irq_parse_and_map_pci() declaration under OF_IRQ
Pull fbdev fix from Bartlomiej Zolnierkiewicz:
"Just a single fix to close a kernel data leak in FBIOGETCMAP_SPARC
ioctl"
* tag 'fbdev-v4.16-rc5' of git://github.com/bzolnier/linux:
fbdev: Fixing arbitrary kernel leak in case FBIOGETCMAP_SPARC in sbusfb_ioctl_helper().
Pull drm fixes from Dave Airlie:
"There are a small set of sun4i and i915 fixes, and many more amdgpu
fixes:
sun4i:
- divide by zero fix
- clock and LVDS fixes
i915:
- fix for perf
- race fix
amdgpu:
- a bit more than we are normally comfortable with at this point,
however it does fix a lot of display issues with the new DC code
which result in black screens in various configurations along with
some run of the mill gpu configuration fixes.
I'm happy enough that the fixes are limited to the DC code and
should fix a bunch of issues on the new raven ridge APUs that we
are seeing shipped now"
* tag 'drm-fixes-for-v4.16-rc5' of git://people.freedesktop.org/~airlied/linux: (42 commits)
drm/amd/display: validate plane format on primary plane
drm/amdgpu:Always save uvd vcpu_bo in VM Mode
drm/amdgpu:Correct max uvd handles
drm/amd/display: early return if not in vga mode in disable_vga
drm/amd/display: Fix takover from VGA mode
drm/amd/display: Fix memleaks when atomic check fails.
drm/amd/display: Return success when enabling interrupt
drm/amd/display: Use crtc enable/disable_vblank hooks
drm/amd/display: update infoframe after dig fe is turned on
drm/amd/display: fix boot-up on vega10
drm/amd/display: fix cursor related Pstate hang
drm/amd/display: Set irq state only on existing crtcs
drm/amd/display: Fixed non-native modes not lighting up
drm/amd/display: Call update_stream_signal directly from amdgpu_dm
drm/amd/display: Make create_stream_for_sink more consistent
drm/amd/display: Don't block dual-link DVI modes
drm/amd/display: Don't allow dual-link DVI on all ASICs.
drm/amd/display: Pass signal directly to enable_tmds_output
drm/amd/display: Remove unnecessary fail labels in create_stream_for_sink
drm/amd/display: Move MAX_TMDS_CLOCK define to header
...
Do not log an error if tcpm_register_port() fails with -EPROBE_DEFER.
Fixes: cf140a3569 ("typec: fusb302: Use dev_err during probe")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
William Tu says:
====================
a couple of erspan fixes
The series fixes a couple of erspan issues.
The first patch adds the erspan v2 proto type to the ip6 tunnel lookup.
The second patch improves the error handling when users screws the
version number in metadata. The final patch makes sure the skb has
enough headroom for pushing erspan header when xmit.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch adds skb_cow_header() to ensure enough headroom
at ip6erspan_tunnel_xmit before pushing the erspan header
to the skb.
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When users fill in incorrect erspan version number through
the struct erspan_metadata uapi, current code skips pushing
the erspan header but continue pushing the gre header, which
is incorrect. The patch fixes it by returning error.
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch adds the erspan v2 proto in ip6gre_tunnel_lookup
so the erspan v2 tunnel can be found correctly.
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel says:
====================
mlxsw: ACL and mirroring fixes
The first patch fixes offload of rules using the 'pass' action. Instead
of continuing to evaluate lower priority rules, the binding is
terminated and the packet proceeds to the bridge and router blocks on
ingress, or goes out of the port on egress.
Second patch prevents the user from mirroring more than once from a
given {Port, Direction} as this is not supported by the device.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The Spectrum ASIC doesn't support mirroring more than once from a single
binding point (which is a port-direction pair). Therefore detect that a
second binding of a given binding point is attempted.
To that end, extend struct mlxsw_sp_span_inspected_port to track whether
a given binding point is bound or not. Extend
mlxsw_sp_span_entry_port_find() to look for ports based on the full
unique key: port number, direction, and boundness.
Besides fixing the overt bug where configured mirrors are not offloaded,
this also fixes a more subtle bug: mlxsw_sp_span_inspected_port_del()
just defers to mlxsw_sp_span_entry_bound_port_find(), and that used to
find the first port with the right number (disregarding the type). Thus
by adding and removing egress and ingress mirrors in the right order,
one could trick the system into believing it has no egress mirrors when
in fact it did have some. That then caused that
mlxsw_sp_span_port_mtu_update() didn't update mirroring buffer when MTU
was changed.
Fixes: 763b4b70af ("mlxsw: spectrum: Add support in matchall mirror TC offloading")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For ok GACT action, TERMINATE binding_cmd should be used in action set
passed down to HW.
Fixes: b2925957ec ("mlxsw: spectrum_flower: Offload "ok" termination action")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull sound fixes from Takashi Iwai:
"Two type of fixes:
- The usual stuff, a handful HD-audio quirks for various machines
- Further hardening against ALSA sequencer ioctl/write races that are
triggered by fuzzer"
* tag 'sound-4.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda: add dock and led support for HP ProBook 640 G2
ALSA: hda: add dock and led support for HP EliteBook 820 G3
ALSA: hda/realtek - Make dock sound work on ThinkPad L570
ALSA: seq: Remove superfluous snd_seq_queue_client_leave_cells() call
ALSA: seq: More protection for concurrent write and ioctl races
ALSA: seq: Don't allow resizing pool in use
ALSA: hda/realtek - Fix dock line-out volume on Dell Precision 7520
ALSA: hda/realtek: Limit mic boost on T480
ALSA: hda/realtek - Add headset mode support for Dell laptop
ALSA: hda/realtek - Add support headset mode for DELL WYSE
ALSA: hda - Fix a wrong FIXUP for alc289 on Dell machines
Currently the driver attempts to spin lock on udc->lock before a NULL
pointer check is performed on udc, hence there is a potential null
pointer dereference on udc->lock. Fix this by moving the null check
on udc before the lock occurs.
Fixes: ea6873a45a ("usbip: vudc: Add SysFS infrastructure for VUDC")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shuah Khan <shuahkh@osg.samsung.com>
Reviewed-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A recent update to the ARM SMCCC ARCH_WORKAROUND_1 specification
allows firmware to return a non zero, positive value to describe
that although the mitigation is implemented at the higher exception
level, the CPU on which the call is made is not affected.
Let's relax the check on the return value from ARCH_WORKAROUND_1
so that we only error out if the returned value is negative.
Fixes: b092201e00 ("arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Pull overlayfs fixes from Miklos Szeredi:
"This fixes a corner case for NFS exporting (introduced in this cycle)
as well as fixing miscellaneous bugs"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: update Kconfig texts
ovl: redirect_dir=nofollow should not follow redirect for opaque lower
ovl: fix ptr_ret.cocci warnings
ovl: check ERR_PTR() return value from ovl_lookup_real()
ovl: check lower ancestry on encode of lower dir file handle
ovl: hash non-dir by lower inode for fsnotify
Pull xfs fixes from Darrick Wong:
- Fix some iomap locking problems
- Don't allocate cow blocks when we're zeroing file data
* tag 'xfs-4.16-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: don't block on the ilock for RWF_NOWAIT
xfs: don't start out with the exclusive ilock for direct I/O
xfs: don't allocate COW blocks for zeroing holes or unwritten extents
When the DELL_SMBIOS_SMM backend is enabled, the DELL_SMBIOS symbol
depends on DELL_DCDBAS, and we must avoid the situation where
DELL_SMBIOS=y and DCDBAS=m.
Adding the conditional dependency to DELL_SMBIOS such as:
depends !DELL_SMBIOS_SMM || (DCDBAS || DCDBAS=n)
results in the Kconfig tooling complaining about a circular dependency,
although it appears to work in practice.
Avoid the errors by simplifying the dependency and forcing DELL_SMBIOS
to be <= DCDBAS if DCDBAS is enabled (thanks to Greg KH for the
suggestion).
Cc: Mario.Limonciello@dell.com
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Avoid accidental configurations by setting default y for DELL_SMBIOS
backends. Avoid this impacting the default build size, by making them
dependent on DELL_SMBIOS, so they only appear when DELL_SMBIOS is
manually selected, or by DELL_LAPTOP or DELL_WMI.
While DELL_SMBIOS does have a prompt, it does not have any dependencies.
Keeping DELL_SMBIOS visible, despite being "select"ed by DELL_LAPTOP and
DELL_WMI, is a deliberate choice to provide context for the WMI and SMM
backends, which would otherwise appear to float without context within
the menu.
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Some race conditions were raised due to dell-smbios and its backends
not being ready by the time that a consumer would call one of the
exported methods.
To avoid this problem, guarantee that all initialization has been
done by linking them all together and running init for them all.
As part of this change the Kconfig needs to be adjusted so that
CONFIG_DELL_SMBIOS_SMM and CONFIG_DELL_SMBIOS_WMI are boolean
rather than modules.
CONFIG_DELL_SMBIOS is a visually selectable option again and both
CONFIG_DELL_SMBIOS_WMI and CONFIG_DELL_SMBIOS_SMM are optional.
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
[dvhart: Update prompt and help text for DELL_SMBIOS_* backends]
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
This is being done to faciliate a later change to link all the dell-smbios
drivers together.
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
WARNING: function definition argument 'struct calling_interface_buffer *'
should also have an identifier name
+ int (*call_fn)(struct calling_interface_buffer *);
WARNING: Block comments use * on subsequent lines
+ /* 4 bytes of table header, plus 7 bytes of Dell header,
plus at least
+ 6 bytes of entry */
WARNING: Block comments use a trailing */ on a separate line
+ 6 bytes of entry */
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Pull powerpc fixes from Michael Ellerman:
"One notable fix to properly advertise our support for a new firmware
feature, caused by two series conflicting semantically but not
textually.
There's a new ioctl for the new ocxl driver, which is not a fix, but
needed to complete the userspace API and good to have before the
driver is in a released kernel.
Finally three minor selftest fixes, and a fix for intermittent build
failures for some obscure platforms, caused by a missing make
dependency.
Thanks to: Alastair D'Silva, Bharata B Rao, Guenter Roeck"
* tag 'powerpc-4.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/pseries: Fix vector5 in ibm architecture vector table
ocxl: Document the OCXL_IOCTL_GET_METADATA IOCTL
ocxl: Add get_metadata IOCTL to share OCXL information to userspace
selftests/powerpc: Skip the subpage_prot tests if the syscall is unavailable
selftests/powerpc: Fix missing clean of pmu/lib.o
powerpc/boot: Fix random libfdt related build errors
selftests/powerpc: Skip tm-trap if transactional memory is not enabled
When a USB device gets plugged on ASUS PRIME B350M-A's front ports, the
xHC stops working:
[ 549.114587] xhci_hcd 0000:02:00.0: WARN: xHC CMD_RUN timeout
[ 549.114608] suspend_common(): xhci_pci_suspend+0x0/0xc0 returns -110
[ 549.114638] xhci_hcd 0000:02:00.0: can't suspend (hcd_pci_runtime_suspend returned -110)
Delay before running xHC command CMD_RUN can workaround the issue.
Use a new quirk to make the delay only targets to the affected xHC.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch reverts the commit 835e4241e7 ("usb: host: xhci-plat:
enable clk in resume timing") because this driver also has runtime PM
and the commit 560869100b ("clk: renesas: cpg-mssr: Restore module
clocks during resume") will restore the clock on R-Car H3 environment.
If the xhci_plat_suspend() disables the clk, the system cannot enable
the clk in resume like the following behavior:
< In resume >
- genpd_resume_noirq() runs and enable the clk (enable_count = 1)
- cpg_mssr_resume_noirq() restores the clk register.
-- Since the clk was disabled in suspend, cpg_mssr_resume_noirq()
will disable the clk and keep the enable_count.
- Even if xhci_plat_resume() calls clk_prepare_enable(), since
the enable_count is 1, the clk will be not enabled.
After this patch is applied, the cpg-mssr driver will save the clk
as enable, so the clk will be enabled in resume.
Fixes: 835e4241e7 ("usb: host: xhci-plat: enable clk in resume timing")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jason Wang says:
====================
Several fixes for vhost_net ptr_ring usage
This small series try to fix several bugs of ptr_ring usage in
vhost_net. Please review.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit fc72d1d54d ("tuntap: XDP transmission"), we can
actually queueing XDP pointers in the pointer ring, so we should
examine the pointer type before freeing the pointer.
Fixes: fc72d1d54d ("tuntap: XDP transmission")
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We get pointer ring from the exported sock, this means we should keep
rx_ring and vq->private synced during both vq stop and backend set,
otherwise we may see stale rx_ring.
Fixes: c67df11f6e ("vhost_net: try batch dequing from skb array")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This enables AVE_GI_RXDROP interrupt factor. This factor indicates
depletion of Rx descriptors and the handler counts the number
of dropped packets.
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As well as the basic conversion, I noticed that a lot of the
SCTP code checks gso_type without first checking skb_is_gso()
so I have added that where appropriate.
Also, document the helper.
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following slab-out-of-bounds kasan report in
ndisc_fill_redirect_hdr_option when the incoming ipv6 packet is not
linear and the accessed data are not in the linear data region of orig_skb.
[ 1503.122508] ==================================================================
[ 1503.122832] BUG: KASAN: slab-out-of-bounds in ndisc_send_redirect+0x94e/0x990
[ 1503.123036] Read of size 1184 at addr ffff8800298ab6b0 by task netperf/1932
[ 1503.123220] CPU: 0 PID: 1932 Comm: netperf Not tainted 4.16.0-rc2+ #124
[ 1503.123347] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
[ 1503.123527] Call Trace:
[ 1503.123579] <IRQ>
[ 1503.123638] print_address_description+0x6e/0x280
[ 1503.123849] kasan_report+0x233/0x350
[ 1503.123946] memcpy+0x1f/0x50
[ 1503.124037] ndisc_send_redirect+0x94e/0x990
[ 1503.125150] ip6_forward+0x1242/0x13b0
[...]
[ 1503.153890] Allocated by task 1932:
[ 1503.153982] kasan_kmalloc+0x9f/0xd0
[ 1503.154074] __kmalloc_track_caller+0xb5/0x160
[ 1503.154198] __kmalloc_reserve.isra.41+0x24/0x70
[ 1503.154324] __alloc_skb+0x130/0x3e0
[ 1503.154415] sctp_packet_transmit+0x21a/0x1810
[ 1503.154533] sctp_outq_flush+0xc14/0x1db0
[ 1503.154624] sctp_do_sm+0x34e/0x2740
[ 1503.154715] sctp_primitive_SEND+0x57/0x70
[ 1503.154807] sctp_sendmsg+0xaa6/0x1b10
[ 1503.154897] sock_sendmsg+0x68/0x80
[ 1503.154987] ___sys_sendmsg+0x431/0x4b0
[ 1503.155078] __sys_sendmsg+0xa4/0x130
[ 1503.155168] do_syscall_64+0x171/0x3f0
[ 1503.155259] entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ 1503.155436] Freed by task 1932:
[ 1503.155527] __kasan_slab_free+0x134/0x180
[ 1503.155618] kfree+0xbc/0x180
[ 1503.155709] skb_release_data+0x27f/0x2c0
[ 1503.155800] consume_skb+0x94/0xe0
[ 1503.155889] sctp_chunk_put+0x1aa/0x1f0
[ 1503.155979] sctp_inq_pop+0x2f8/0x6e0
[ 1503.156070] sctp_assoc_bh_rcv+0x6a/0x230
[ 1503.156164] sctp_inq_push+0x117/0x150
[ 1503.156255] sctp_backlog_rcv+0xdf/0x4a0
[ 1503.156346] __release_sock+0x142/0x250
[ 1503.156436] release_sock+0x80/0x180
[ 1503.156526] sctp_sendmsg+0xbb0/0x1b10
[ 1503.156617] sock_sendmsg+0x68/0x80
[ 1503.156708] ___sys_sendmsg+0x431/0x4b0
[ 1503.156799] __sys_sendmsg+0xa4/0x130
[ 1503.156889] do_syscall_64+0x171/0x3f0
[ 1503.156980] entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ 1503.157158] The buggy address belongs to the object at ffff8800298ab600
which belongs to the cache kmalloc-1024 of size 1024
[ 1503.157444] The buggy address is located 176 bytes inside of
1024-byte region [ffff8800298ab600, ffff8800298aba00)
[ 1503.157702] The buggy address belongs to the page:
[ 1503.157820] page:ffffea0000a62a00 count:1 mapcount:0 mapping:0000000000000000 index:0x0 compound_mapcount: 0
[ 1503.158053] flags: 0x4000000000008100(slab|head)
[ 1503.158171] raw: 4000000000008100 0000000000000000 0000000000000000 00000001800e000e
[ 1503.158350] raw: dead000000000100 dead000000000200 ffff880036002600 0000000000000000
[ 1503.158523] page dumped because: kasan: bad access detected
[ 1503.158698] Memory state around the buggy address:
[ 1503.158816] ffff8800298ab900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1503.158988] ffff8800298ab980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1503.159165] >ffff8800298aba00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1503.159338] ^
[ 1503.159436] ffff8800298aba80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1503.159610] ffff8800298abb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1503.159785] ==================================================================
[ 1503.159964] Disabling lock debugging due to kernel taint
The test scenario to trigger the issue consists of 4 devices:
- H0: data sender, connected to LAN0
- H1: data receiver, connected to LAN1
- GW0 and GW1: routers between LAN0 and LAN1. Both of them have an
ethernet connection on LAN0 and LAN1
On H{0,1} set GW0 as default gateway while on GW0 set GW1 as next hop for
data from LAN0 to LAN1.
Moreover create an ip6ip6 tunnel between H0 and H1 and send 3 concurrent
data streams (TCP/UDP/SCTP) from H0 to H1 through ip6ip6 tunnel (send
buffer size is set to 16K). While data streams are active flush the route
cache on HA multiple times.
I have not been able to identify a given commit that introduced the issue
since, using the reproducer described above, the kasan report has been
triggered from 4.14 and I have not gone back further.
Reported-by: Jianlin Shi <jishi@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Moved 16bit resolution condition check for stoney platform
to acp_hw_params.Depending upon substream required register
value need to be programmed rather than enabling 16bit resolution
support all time in acp init.
Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The following commit:
commit aa4d86163e ("block: loop: switch to VFS ITER_BVEC")
replaced __do_lo_send_write(), which used ITER_KVEC iterators, with
lo_write_bvec() which uses ITER_BVEC iterators. In this change, though,
the WRITE flag was lost:
- iov_iter_kvec(&from, ITER_KVEC | WRITE, &kvec, 1, len);
+ iov_iter_bvec(&i, ITER_BVEC, bvec, 1, bvec->bv_len);
This flag is necessary for the DAX case because we make decisions based on
whether or not the iterator is a READ or a WRITE in dax_iomap_actor() and
in dax_iomap_rw().
We end up going through this path in configurations where we combine a PMEM
device with 4k sectors, a loopback device and DAX. The consequence of this
missed flag is that what we intend as a write actually turns into a read in
the DAX code, so no data is ever written.
The very simplest test case is to create a loopback device and try and
write a small string to it, then hexdump a few bytes of the device to see
if the write took. Without this patch you read back all zeros, with this
you read back the string you wrote.
For XFS this causes us to fail or panic during the following xfstests:
xfs/074 xfs/078 xfs/216 xfs/217 xfs/250
For ext4 we have a similar issue where writes never happen, but we don't
currently have any xfstests that use loopback and show this issue.
Fix this by restoring the WRITE flag argument to iov_iter_bvec(). This
causes the xfstests to all pass.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
Fixes: commit aa4d86163e ("block: loop: switch to VFS ITER_BVEC")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When populating shadow ctx from guest, we should handle oa related
registers in hw ctx, so that they will not be overlapped by guest oa
configs. This patch made it possible to capture oa data from host for
both host and guests.
Signed-off-by: Min He <min.he@intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
If user continuously create vgpu, boot guest, shoutdown guest and destroy
vgpu from remote, the following calltrace exists in dmesg sometimes:
[ 6412.954721] RPM wakelock ref not held during HW access
[ 6412.954795] WARNING: CPU: 7 PID: 11941 at
linux/drivers/gpu/drm/i915/intel_drv.h:1800
intel_uncore_forcewake_get.part.7+0x96/0xa0 [i915]
[ 6412.954915] Call Trace:
[ 6412.954951] intel_uncore_forcewake_get+0x18/0x20 [i915]
[ 6412.954989] intel_gvt_switch_mmio+0x8e/0x770 [i915]
[ 6412.954996] ? __slab_free+0x14d/0x2c0
[ 6412.955001] ? __slab_free+0x14d/0x2c0
[ 6412.955006] ? __slab_free+0x14d/0x2c0
[ 6412.955041] intel_vgpu_stop_schedule+0x92/0xd0 [i915]
[ 6412.955073] intel_gvt_deactivate_vgpu+0x48/0x60 [i915]
[ 6412.955078] __intel_vgpu_release+0x55/0x260 [kvmgt]
when this happens, gvt_switch_mmio is called at vgpu destroy, host i915 is
idle and doesn't hold RPM wakelock, igd is in powersave mode, but
gvt_switch_mmio require igd power on to access register, so
intel_runtime_pm_get should be added to make sure igd power on before
gvt_switch_mmio.
v2: Move runtime_pm_get/put into gvt_switch_mmio.(Zhenyu)
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
When running rcutorture with TREE03 config, CONFIG_PROVE_LOCKING=y, and
kernel cmdline argument "rcutorture.gp_exp=1", lockdep reports a
HARDIRQ-safe->HARDIRQ-unsafe deadlock:
================================
WARNING: inconsistent lock state
4.16.0-rc4+ #1 Not tainted
--------------------------------
inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
takes:
__schedule+0xbe/0xaf0
{IN-HARDIRQ-W} state was registered at:
_raw_spin_lock+0x2a/0x40
scheduler_tick+0x47/0xf0
...
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&rq->lock);
<Interrupt>
lock(&rq->lock);
*** DEADLOCK ***
1 lock held by rcu_torture_rea/724:
rcu_torture_read_lock+0x0/0x70
stack backtrace:
CPU: 2 PID: 724 Comm: rcu_torture_rea Not tainted 4.16.0-rc4+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
Call Trace:
lock_acquire+0x90/0x200
? __schedule+0xbe/0xaf0
_raw_spin_lock+0x2a/0x40
? __schedule+0xbe/0xaf0
__schedule+0xbe/0xaf0
preempt_schedule_irq+0x2f/0x60
retint_kernel+0x1b/0x2d
RIP: 0010:rcu_read_unlock_special+0x0/0x680
? rcu_torture_read_unlock+0x60/0x60
__rcu_read_unlock+0x64/0x70
rcu_torture_read_unlock+0x17/0x60
rcu_torture_reader+0x275/0x450
? rcutorture_booster_init+0x110/0x110
? rcu_torture_stall+0x230/0x230
? kthread+0x10e/0x130
kthread+0x10e/0x130
? kthread_create_worker_on_cpu+0x70/0x70
? call_usermodehelper_exec_async+0x11a/0x150
ret_from_fork+0x3a/0x50
This happens with the following even sequence:
preempt_schedule_irq();
local_irq_enable();
__schedule():
local_irq_disable(); // irq off
...
rcu_note_context_switch():
rcu_note_preempt_context_switch():
rcu_read_unlock_special():
local_irq_save(flags);
...
raw_spin_unlock_irqrestore(...,flags); // irq remains off
rt_mutex_futex_unlock():
raw_spin_lock_irq();
...
raw_spin_unlock_irq(); // accidentally set irq on
<return to __schedule()>
rq_lock():
raw_spin_lock(); // acquiring rq->lock with irq on
which means rq->lock becomes a HARDIRQ-unsafe lock, which can cause
deadlocks in scheduler code.
This problem was introduced by commit 02a7c234e5 ("rcu: Suppress
lockdep false-positive ->boost_mtx complaints"). That brought the user
of rt_mutex_futex_unlock() with irq off.
To fix this, replace the *lock_irq() in rt_mutex_futex_unlock() with
*lock_irq{save,restore}() to make it safe to call rt_mutex_futex_unlock()
with irq off.
Fixes: 02a7c234e5 ("rcu: Suppress lockdep false-positive ->boost_mtx complaints")
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Link: https://lkml.kernel.org/r/20180309065630.8283-1-boqun.feng@gmail.com
The GPIO chip is called davinci_gpio.0 in legacy mode. Fix it, so that
mmc can correctly lookup the wp and cp gpios.
Note that it is the gpio-davinci driver that sets the gpiochip label to
davinci_gpio.0.
Fixes: c69f43fb4f ("ARM: davinci: hawk: use gpio descriptor for mmc pins")
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
[nsekhar@ti.com: add a note on where the chip label is set]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Disable the kprobe probing of the entry trampoline:
.entry_trampoline is a code area that is used to ensure page table
isolation between userspace and kernelspace.
At the beginning of the execution of the trampoline, we load the
kernel's CR3 register. This has the effect of enabling the translation
of the kernel virtual addresses to physical addresses. Before this
happens most kernel addresses can not be translated because the running
process' CR3 is still used.
If a kprobe is placed on the trampoline code before that change of the
CR3 register happens the kernel crashes because int3 handling pages are
not accessible.
To fix this, add the .entry_trampoline section to the kprobe blacklist
to prohibit the probing of code before all the kernel pages are
accessible.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: mathieu.desnoyers@efficios.com
Cc: mhiramat@kernel.org
Link: http://lkml.kernel.org/r/1520565492-4637-2-git-send-email-francis.deslauriers@efficios.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The email address for the Tegra IOMMU maintainer, Hiroshi, is no longer
valid. Hiroshi has not been maintaining the Tegra IOMMU driver for some
time now and so remove Hiroshi as the maintainer and add Thierry Reding
instead. Also add the mailing list information for this driver as well.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
In ctx_resched(), EVENT_FLEXIBLE should be sched_out when EVENT_PINNED is
added. However, ctx_resched() calculates ctx_event_type before checking
this condition. As a result, pinned events will NOT get higher priority
than flexible events.
The following shows this issue on an Intel CPU (where ref-cycles can
only use one hardware counter).
1. First start:
perf stat -C 0 -e ref-cycles -I 1000
2. Then, in the second console, run:
perf stat -C 0 -e ref-cycles:D -I 1000
The second perf uses pinned events, which is expected to have higher
priority. However, because it failed in ctx_resched(). It is never
run.
This patch fixes this by calculating ctx_event_type after re-evaluating
event_type.
Reported-by: Ephraim Park <ephiepark@fb.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <jolsa@redhat.com>
Cc: <kernel-team@fb.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 487f05e18a ("perf/core: Optimize event rescheduling on active contexts")
Link: http://lkml.kernel.org/r/20180306055504.3283731-1-songliubraving@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull NVMe fixes for this series from Keith:
"A few late fixes for 4.16:
* Reverting sysfs slave device links for native nvme multipathing.
The hidden disk attributes broke common user tools.
* A fix for a PPC pci error handling regression.
* Update pci interrupt count to consider the actual IRQ spread, fixing
potentially poor initial queue affinity.
* Off-by-one errors in nvme-fc queue sizes
* A fabrics discovery fix to be more tolerant with user tools."
* 'nvme-4.16-rc5' of git://git.infradead.org/nvme:
nvme_fc: rework sqsize handling
nvme-fabrics: Ignore nr_io_queues option for discovery controllers
Revert "nvme: create 'slaves' and 'holders' entries for hidden controllers"
nvme: pci: pass max vectors as num_possible_cpus() to pci_alloc_irq_vectors
nvme-pci: Fix EEH failure on ppc
Fixes for 4.16. A bit bigger than I would have liked, but most of that
is DC fixes which Harry helped me pull together from the past few weeks.
Highlights:
- Fix DL DVI with DC
- Various RV fixes for DC
- Overlay fixes for DC
- Fix HDMI2 handling on boards without HBR tables in the vbios
- Fix crash with pass-through on SI on amdgpu
- Fix RB harvesting on KV
- Fix hibernation failures on UVD with certain cards
* 'drm-fixes-4.16' of git://people.freedesktop.org/~agd5f/linux: (35 commits)
drm/amd/display: validate plane format on primary plane
drm/amdgpu:Always save uvd vcpu_bo in VM Mode
drm/amdgpu:Correct max uvd handles
drm/amd/display: early return if not in vga mode in disable_vga
drm/amd/display: Fix takover from VGA mode
drm/amd/display: Fix memleaks when atomic check fails.
drm/amd/display: Return success when enabling interrupt
drm/amd/display: Use crtc enable/disable_vblank hooks
drm/amd/display: update infoframe after dig fe is turned on
drm/amd/display: fix boot-up on vega10
drm/amd/display: fix cursor related Pstate hang
drm/amd/display: Set irq state only on existing crtcs
drm/amd/display: Fixed non-native modes not lighting up
drm/amd/display: Call update_stream_signal directly from amdgpu_dm
drm/amd/display: Make create_stream_for_sink more consistent
drm/amd/display: Don't block dual-link DVI modes
drm/amd/display: Don't allow dual-link DVI on all ASICs.
drm/amd/display: Pass signal directly to enable_tmds_output
drm/amd/display: Remove unnecessary fail labels in create_stream_for_sink
drm/amd/display: Move MAX_TMDS_CLOCK define to header
...
sun4i fixes on clk, division by zero and LVDS.
* tag 'drm-misc-fixes-2018-03-07' of git://anongit.freedesktop.org/drm/drm-misc:
drm/sun4i: crtc: Call drm_crtc_vblank_on / drm_crtc_vblank_off
drm/sun4i: rgb: Fix potential division by zero
drm/sun4i: tcon: Reduce the scope of the LVDS error a bit
drm/sun4i: Release exclusive clock lock when disabling TCON
drm/sun4i: Fix dclk_set_phase
otherwise kernel can oops later in seq_release() due to dereferencing null
file->private_data which is only set if seq_open() succeeds.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
IP: seq_release+0xc/0x30
Call Trace:
close_pdeo+0x37/0xd0
proc_reg_release+0x5d/0x60
__fput+0x9d/0x1d0
____fput+0x9/0x10
task_work_run+0x75/0x90
do_exit+0x252/0xa00
do_group_exit+0x36/0xb0
SyS_exit_group+0xf/0x10
Fixes: 516fb7f2e7 ("/proc/module: use the same logic as /proc/kallsyms for address exposure")
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org # 4.15+
Signed-off-by: Leon Yu <chianglungyu@gmail.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Pull MIPS fixes from James Hogan:
"A miscellaneous pile of MIPS fixes for 4.16:
- move put_compat_sigset() to evade hardened usercopy warnings (4.16)
- select ARCH_HAVE_PC_{SERIO,PARPORT} for Loongson64 platforms (4.16)
- fix kzalloc() failure handling in ath25 (3.19) and Octeon (4.0)
- fix disabling of IPIs during BMIPS suspend (3.19)"
* tag 'mips_fixes_4.16_4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
MIPS: BMIPS: Do not mask IPIs during suspend
MIPS: Loongson64: Select ARCH_MIGHT_HAVE_PC_SERIO
MIPS: Loongson64: Select ARCH_MIGHT_HAVE_PC_PARPORT
signals: Move put_compat_sigset to compat.h to silence hardened usercopy
MIPS: OCTEON: irq: Check for null return on kzalloc allocation
MIPS: ath25: Check for kzalloc allocation failure
Pull chrome platform fix from Benson Leung:
"Revert a problematic patch that constified something imporperly"
* tag 'chrome-platform-4.16-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform:
Revert "platform/chrome: chromeos_laptop: make chromeos_laptop const"
We do want to respect the FLUSH_SYNC argument to nfs_commit_inode() to
ensure that all outstanding COMMIT requests to the inode in question are
complete. Currently we may exit early from both nfs_commit_inode() and
nfs_write_inode() even if there are COMMIT requests in flight, or unstable
writes on the commit list.
In order to get the right semantics w.r.t. sync_inode(), we don't need
to have nfs_commit_inode() reset the inode dirty flags when called from
nfs_wb_page() and/or nfs_wb_all(). We just need to ensure that
nfs_write_inode() leaves them in the right state if there are outstanding
commits, or stable pages.
Reported-by: Scott Mayhew <smayhew@redhat.com>
Fixes: dc4fd9ab01 ("nfs: don't wait on commit in nfs_commit_inode()...")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Stephen Hemminger says:
====================
hv_netvsc: fix multicast flags and sync
This set of patches deals with the handling of multicast flags
and addresses in transparent VF mode. The recent set of patches
(in linux-net) had a couple of bugs.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The dev_uc/mc_sync calls need to have the device address list
locked. This was spotted by running with lockdep enabled.
Fixes: bee9d41b37 ("hv_netvsc: propagate rx filters to VF")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rx_mode operation handler is different than other callbacks
in that is not always called with rtnl held. Therefore use
RCU to ensure that references are valid.
Fixes: bee9d41b37 ("hv_netvsc: propagate rx filters to VF")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netvsc driver can get repeated calls to netvsc_rx_mode during
network setup; each of these calls ends up scheduling the lower
layers to update tha packet filter. This update requires an
request/response to the host. So avoid doing this if we already
know that the correct packet filter value is set.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent change to not always enable all multicast and broadcast
was broken; meant to set filter, not change flags.
Fixes: 009f766ca2 ("hv_netvsc: filter multicast/broadcast")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Corrected four outstanding issues in the transport around sqsize.
1: Create Connection LS is sending the 1's-based sqsize, should be
sending the 0's-based value.
2: allocation of hw queue is using the 0's-base size. It should be
using the 1's-based value.
3: normalization of ctrl.sqsize by MQES is using MQES+1 (1's-based
value). It should be MQES (0's-based value).
4: Missing clause to ensure queue_count not larger than ctrl->sqsize.
Corrected by:
Clean up routines that pass queue size around. The queue size value is
the actual count (1's-based) value and determined from ctrl->sqsize + 1.
Routines that send 0's-based value adapt from queue size.
Sset ctrl->sqsize properly for MQES.
Added clause to nsure queue_count not larger than ctrl->sqsize + 1.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Kalle Valo:
====================
wireless-drivers fixes for 4.16
Quote a few fixes as I have not been able to send a pull request
earlier. Most of the fixes for iwlwifi but also few others, nothing
really standing out though.
iwlwifi
* fix a bogus warning when freeing a TFD
* fix severe throughput problem with 9000 series
* fix for a bug that caused queue hangs in certain situations
* fix for an issue with IBSS
* fix an issue with rate-scaling in AP-mode
* fix Channel Switch Announcement (CSA) issues with count 0 and 1
* some firmware debugging fixes
* remov a wrong error message when removing keys
* fix a firmware sysassert most usually triggered in IBSS
* a couple of fixes on multicast queues
* a fix with CCMP 256
rtlwifi
* fix loss of signal for rtl8723be
brcmfmac
* add possibility to obtain firmware error
* fix P2P_DEVICE ethernet address generation
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds missing initialisation for HP 2013 UltraSlim Dock
Line-In/Out PINs and activates keyboard mute/micmute leds
for HP ProBook 640 G2
Signed-off-by: Dennis Wassenberg <dennis.wassenberg@secunet.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This patch adds missing initialisation for HP 2013 UltraSlim Dock
Line-In/Out PINs and activates keyboard mute/micmute leds
for HP EliteBook 820 G3
Signed-off-by: Dennis Wassenberg <dennis.wassenberg@secunet.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pretty minor: just SKB_GSO_TCP -> SKB_GSO_TCPV4 and
SKB_GSO_TCP6 -> SKB_GSO_TCPV6.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull a xen_blkfront fix from Konrad:
"It has one simple fix for the multi-queue support not showing up after
a block device was detached/re-attached."
* 'stable/for-jens-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen-blkfront: move negotiate_mq to cover all cases of new VBDs
The __send_and_alloc_skb() receives a skb ptr as a parameter but in
case it fails the skb is not valid:
- Send failed and released the skb internally.
- Allocation failed.
The current code tries to release the skb in case of failure which
causes redundant freeing.
Fixes: 9b00cf2d10 ("team: implement multipart netlink messages for options transfers")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This removes a dependency on the order options are passed when creating
a fabrics controller. With the old code, if "nr_io_queues" appears before
an "nqn" option specifying the discovery controller, then nr_io_queues
is overridden with zero. If "nr_io_queues" appears after specifying the
discovery controller, then the nr_io_queues option is used to set the
number of queues, and the driver attempts to establish IO connections
to the discovery controller (which doesn't work).
It seems better to ignore (and warn about) the "nr_io_queues" option
if userspace has already asked to connect to the discovery controller.
Signed-off-by: Roland Dreier <roland@purestorage.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
cmd_dt_S_dtb constructs the assembly source to incorporate a devicetree
FDT (that is, the .dtb file) as binary data in the kernel image. This
assembly source contains labels before and after the binary data. The
label names incorporate the file name of the corresponding .dtb file.
Hyphens are not legal characters in labels, so .dtb files built into the
kernel with hyphens in the file name result in errors like the
following:
bcm3368-netgear-cvg834g.dtb.S: Assembler messages:
bcm3368-netgear-cvg834g.dtb.S:5: Error: : no such section
bcm3368-netgear-cvg834g.dtb.S:5: Error: junk at end of line, first unrecognized character is `-'
bcm3368-netgear-cvg834g.dtb.S:6: Error: unrecognized opcode `__dtb_bcm3368-netgear-cvg834g_begin:'
bcm3368-netgear-cvg834g.dtb.S:8: Error: unrecognized opcode `__dtb_bcm3368-netgear-cvg834g_end:'
bcm3368-netgear-cvg834g.dtb.S:9: Error: : no such section
bcm3368-netgear-cvg834g.dtb.S:9: Error: junk at end of line, first unrecognized character is `-'
Fix this by updating cmd_dt_S_dtb to transform all hyphens from the file
name to underscores when constructing the labels.
As of v4.16-rc2, 1139 .dts files across ARM64, ARM, MIPS and PowerPC
contain hyphens in their names, but the issue only currently manifests
on Broadcom MIPS platforms, as that is the only place where such files
are built into the kernel. For example when CONFIG_DT_NETGEAR_CVG834G=y,
or on BMIPS kernels when the dtbs target is used (in the latter case it
admittedly shouldn't really build all the dtb.o files, but thats a
separate issue).
Fixes: 695835511f ("MIPS: BMIPS: rename bcm96358nb4ser to bcm6358-neufbox4-sercom")
Signed-off-by: James Hogan <jhogan@kernel.org>
Reviewed-by: Frank Rowand <frowand.list@gmail.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: <stable@vger.kernel.org> # 4.9+
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The bloat-o-meter script has two typos in the help, fix both.
Fixes: 192efb7a1f ("bloat-o-meter: provide 3 different arguments for data, function and All")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Felipe writes:
usb: fixes for v4.16-rc4
Just small fixes now. The two most important are a fix for a a lock up
on USB ID pin change during system suspend/resume on dwc3 and a
use-after-free fix in ffs_fs_kill_sb().
Apart from that, some DT compatible fixes.
The check_interval file in
/sys/devices/system/machinecheck/machinecheck<cpu number>
directory is a global timer value for MCE polling. If it is changed by one
CPU, mce_restart() broadcasts the event to other CPUs to delete and restart
the MCE polling timer and __mcheck_cpu_init_timer() reinitializes the
mce_timer variable.
If more than one CPU writes a specific value to the check_interval file
concurrently, mce_timer is not protected from such concurrent accesses and
all kinds of explosions happen. Since only root can write to those sysfs
variables, the issue is not a big deal security-wise.
However, concurrent writes to these configuration variables is void of
reason so the proper thing to do is to serialize the access with a mutex.
Boris:
- Make store_int_with_restart() use device_store_ulong() to filter out
negative intervals
- Limit min interval to 1 second
- Correct locking
- Massage commit message
Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20180302202706.9434-1-kkamagui@gmail.com
Never directly free @dev after calling device_register(), even
if it returned an error! Always use put_device() to give up the
reference initialized.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
One version of Lenovo Thinkpad T570 did not use ALC298
(like other Kaby Lake devices). Instead it uses ALC292.
In order to make the Lenovo dock working with that codec
the dock quirk for ALC292 will be used.
Signed-off-by: Dennis Wassenberg <dennis.wassenberg@secunet.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Driver uses alias from Device Tree as an index of pin controller data
array. In case of a wrong DTB or an out-of-tree DTB, the alias could be
outside of this data array leading to out-of-bounds access.
Depending on binary and memory layout, this could be handled properly
(showing error like "samsung-pinctrl 3860000.pinctrl: driver data not
available") or could lead to exceptions.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: <stable@vger.kernel.org>
Fixes: 30574f0db1 ("pinctrl: add samsung pinctrl and gpiolib driver")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Tomasz Figa <tomasz.figa@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
With the previous two fixes for the write / ioctl races:
ALSA: seq: Don't allow resizing pool in use
ALSA: seq: More protection for concurrent write and ioctl races
the cells aren't any longer in queues at the point calling
snd_seq_pool_done() in snd_seq_ioctl_set_client_pool(). Hence the
function call snd_seq_queue_client_leave_cells() can be dropped safely
from there.
Suggested-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This patch is an attempt for further hardening against races between
the concurrent write and ioctls. The previous fix d15d662e89
("ALSA: seq: Fix racy pool initializations") covered the race of the
pool initialization at writer and the pool resize ioctl by the
client->ioctl_mutex (CVE-2018-1000004). However, basically this mutex
should be applied more widely to the whole write operation for
avoiding the unexpected pool operations by another thread.
The only change outside snd_seq_write() is the additional mutex
argument to helper functions, so that we can unlock / relock the given
mutex temporarily during schedule() call for blocking write.
Fixes: d15d662e89 ("ALSA: seq: Fix racy pool initializations")
Reported-by: 范龙飞 <long7573@126.com>
Reported-by: Nicolai Stange <nstange@suse.de>
Reviewed-and-tested-by: Nicolai Stange <nstange@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Display driver assumes it can use clk_set_rate for the display clock
via set-rate-parent mechanism, so add the flag for this to id.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Display driver assumes it can use clk_set_rate for the display clock
via set-rate-parent mechanism, so add the flag for this to it.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Reported-by: Jyri Sarha <jsarha@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Tested-by: Jyri Sarha <jsarha@ti.com>
Certain clkctrl clocks, notably the display ones, use the
CLK_SET_RATE_PARENT feature extensively. Add support for this flag
to the clkctrl clocks.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Reported-by: Jyri Sarha <jsarha@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Tested-by: Jyri Sarha <jsarha@ti.com>
Original idea by Ashok, completely rewritten by Borislav.
Before you read any further: the early loading method is still the
preferred one and you should always do that. The following patch is
improving the late loading mechanism for long running jobs and cloud use
cases.
Gather all cores and serialize the microcode update on them by doing it
one-by-one to make the late update process as reliable as possible and
avoid potential issues caused by the microcode update.
[ Borislav: Rewrite completely. ]
Co-developed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
Link: https://lkml.kernel.org/r/20180228102846.13447-8-bp@alien8.de
The two usb-otg regulators for imx7d-sdb are both called
"regulator-usb-otg1-vbus" and they effectively override each other.
This is most likely a copy-paste error.
Fixes: b877039aa1 ("ARM: dts: imx7d-sdb: Adjust the regulator nodes")
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
This is a fix for a (sort of) fallout in the recent commit
d15d662e89 ("ALSA: seq: Fix racy pool initializations") for
CVE-2018-1000004.
As the pool resize deletes the existing cells, it may lead to a race
when another thread is writing concurrently, eventually resulting a
UAF.
A simple workaround is not to allow the pool resizing when the pool is
in use. It's an invalid behavior in anyway.
Fixes: d15d662e89 ("ALSA: seq: Fix racy pool initializations")
Reported-by: 范龙飞 <long7573@126.com>
Reported-by: Nicolai Stange <nstange@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Since Linux v3.2, vsyscalls have been deprecated and slow. From v3.2
on, Linux had three vsyscall modes: "native", "emulate", and "none".
"emulate" is the default. All known user programs work correctly in
emulate mode, but vsyscalls turn into page faults and are emulated.
This is very slow. In "native" mode, the vsyscall page is easily
usable as an exploit gadget, but vsyscalls are a bit faster -- they
turn into normal syscalls. (This is in contrast to vDSO functions,
which can be much faster than syscalls.) In "none" mode, there are
no vsyscalls.
For all practical purposes, "native" was really just a chicken bit
in case something went wrong with the emulation. It's been over six
years, and nothing has gone wrong. Delete it.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Kernel Hardening <kernel-hardening@lists.openwall.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/519fee5268faea09ae550776ce969fa6e88668b0.1520449896.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This reverts commit a376cd9160 because
chromeos_laptop instances should not be marked as "const" (at this
time), since i2c_peripheral is being modified (we change "state" and
"tries") when we instantiate devices.
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Benson Leung <bleung@chromium.org>
Pull virtio bugfix from Michael Tsirkin:
"A bugfix for error handling in virtio"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_ring: fix num_free handling in error case
Pull input fixes from Dmitry Torokhov:
- we are reverting patch that was switched touchpad on Lenovo T460P
over to native RMI because on these boxes BIOS messes up with SMBus
controller state. We might re-enable it later once SMBus issue is
resolved
- disabling interrupts in matrix_keypad driver was racy
- mms114 now has SPDX header and matching MODULE_LICENSE
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Revert "Input: synaptics - Lenovo Thinkpad T460p devices should use RMI"
Input: matrix_keypad - fix race when disabling interrupts
Input: mms114 - add SPDX identifier
Input: mms114 - fix license module information
Daniel Borkmann says:
====================
pull-request: bpf 2018-03-08
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix various BPF helpers which adjust the skb and its GSO information
with regards to SCTP GSO. The latter is a special case where gso_size
is of value GSO_BY_FRAGS, so mangling that will end up corrupting
the skb, thus bail out when seeing SCTP GSO packets, from Daniel(s).
2) Fix a compilation error in bpftool where BPF_FS_MAGIC is not defined
due to too old kernel headers in the system, from Jiri.
3) Increase the number of x64 JIT passes in order to allow larger images
to converge instead of punting them to interpreter or having them
rejected when the interpreter is not built into the kernel, from Daniel.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 4828296982 which
caused the following issues:
1. On T460p with BIOS version 2.22 touchpad and trackpoint stop working
after suspend-resume cycle. Due to strange state of the device another
suspend is impossible.
The following dmesg errors can be observed:
thinkpad_acpi: EC reports that Thermal Table has changed
rmi4_smbus 7-002c: failed to get SMBus version number!
rmi4_physical rmi4-00: rmi_driver_reset_handler: Failed to read current IRQ mask.
rmi4_f01 rmi4-00.fn01: Failed to restore normal operation: -16.
rmi4_f01 rmi4-00.fn01: Resume failed with code -16.
rmi4_physical rmi4-00: Failed to suspend functions: -16
rmi4_smbus 7-002c: Failed to resume device: -16
PM: resume devices took 0.640 seconds
rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write to F03 TX register (-16).
rmi4_physical rmi4-00: rmi_driver_clear_irq_bits: Failed to change enabled interrupts!
rmi4_physical rmi4-00: rmi_driver_set_irq_bits: Failed to change enabled interrupts!
psmouse: probe of serio3 failed with error -1
2. On another T460p with BIOS version 2.15 two finger scrolling gesture
on the touchpad stops working after suspend-resume cycle (about 75%
reproducibility, when it still works, the scrolling gesture becomes
laggy). Nothing suspicious appears in the dmesg.
Analysis form Richard Schütz:
"RMI is unreliable on the ThinkPad T460p because the device is affected
by the firmware behavior addressed in a7ae81952c ("i2c: i801: Allow
ACPI SystemIO OpRegion to conflict with PCI BAR")."
The affected devices often show:
i801_smbus 0000:00:1f.4: BIOS is accessing SMBus registers
i801_smbus 0000:00:1f.4: Driver SMBus register access inhibited
Reported-by: Richard Schütz <rschuetz@uni-koblenz.de>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Tested-by: Martin Peres <martin.peres@linux.intel.com>
Tested-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
In Cilium some of the main programs we run today are hitting 9 passes
on x64's JIT compiler, and we've had cases already where we surpassed
the limit where the JIT then punts the program to the interpreter
instead, leading to insertion failures due to CONFIG_BPF_JIT_ALWAYS_ON
or insertion failures due to the prog array owner being JITed but the
program to insert not (both must have the same JITed/non-JITed property).
One concrete case the program image shrunk from 12,767 bytes down to
10,288 bytes where the image converged after 16 steps. I've measured
that this took 340us in the JIT until it converges on my i7-6600U. Thus,
increase the original limit we had from day one where the JIT covered
cBPF only back then before we run into the case (as similar with the
complexity limit) where we trip over this and hit program rejections.
Also add a cond_resched() into the compilation loop, the JIT process
runs without any locks and may sleep anyway.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Prior to 25520d55cd ("block: Inline blk_integrity in struct gendisk")
we needed to temporarily add a zero-capacity disk before registering for
blk-integrity. But adding a zero-capacity disk caused the partition
table scanning to bail early, and this resulted in partitions not coming
up after a probe of the BTT or blk namespaces.
We can now register for integrity before the disk has been added, and
this fixes the rescan problems.
Fixes: 25520d55cd ("block: Inline blk_integrity in struct gendisk")
Reported-by: Dariusz Dokupil <dariusz.dokupil@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In dce110, the plane configuration is such that plane 0
or the primary plane should be rendered with only RGB data.
This patch adds the validation to ensure that no video data
is rendered on plane 0.
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
HW Engineer's Notes:
During switch from vga->extended, if we set the VGA_TEST_ENABLE and then
hit the VGA_TEST_RENDER_START, then the DCHUBP timing gets updated correctly.
Then vBIOS will have it poll for the VGA_TEST_RENDER_DONE and unset
VGA_TEST_ENABLE, to leave it in the same state as before.
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
While checking plane states for updates during atomic check, we create
dc_plane_states in preparation. These dc states should be freed if
something errors.
Although the input transfer function is also freed by
dc_plane_state_release(), we should free it (on error) under the same
scope as where it is created.
Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Before dig fe is enabled, infoframe can't be programmed. So in
suspend resume case our infoframe programmming was not going through.
This change changes the sequence so that infoframe is programmed
after.
Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixing null-deref on Vega10 due to regression after
'fix cursor related Pstate hang' change.
Added null checks in setting cursor position.
Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Eric Yang <eric.yang2@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move cursor programming to inside the OTG_MASTER_UPDATE_LOCK
If graphics plane go from 1 pipe to hsplit, the cursor updates
after mpc programming and unlock. Which means there is a window
of time where cursor is enabled on the wrong pipe if it's on
the right side of the screen (i.e. case where cursor need to
move from pipe 0 to pipe 3 post split). This will cause pstate hang.
Solution is to program the cursor while still locked.
Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Because AMDGPU_CRTC_IRQ_VLINE1 = 6, it expected 6 more crtcs to be
programed with disabled irq state in amdgpu_irq_disable_all. That caused errors and accessed
the wrong memory location.
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is no need to call drm_mode_set_crtcinfo() again once
crtc timing is decided. Otherwise non-native/unsupported timing
might get overwritten.
Signed-off-by: Jerry (Fangzhi) Zuo <Jerry.Zuo@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There's no good place in DC to cover all place where stream signal should
be updated. update_stream_signal depends on timing which comes from DM.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We've got a helper function to call dc_create_stream_for_sink and one
other place that calls it directly. Make sure we call the helper
functions always since we need to update a bunch of things in stream and
don't want to miss that.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When topology changed and rehook up MST display to the same DP
connector, need to take care of drm_dp_mst_port object.
Due to the topology is changed, drm_dp_mst_port and corresponding
i2c_algorithm object could be NULL in such situation.
Signed-off-by: Jerry (Fangzhi) Zuo <Jerry.Zuo@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The below commit
"drm/atomic: Try to preserve the crtc enabled state in drm_atomic_remove_fb, v2"
introduces a slight behavioral change to rmfb. Instead of disabling a crtc
when the primary plane is disabled, it now preserves it.
This change leads to BUG hit while performing atomic commit on amd driver.
As a fix this patch ensures that we disable the CRTC's with NULL FB by returning
-EINVAL and hence triggering fall back to the old behavior and turning off the
crtc in atomic_remove_fb().
V2: Added error check for plane_state and removed sanity check for crtc.
Signed-off-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch updates the dc's plane state with the parameters set by the
user side.
This is needed to validate the plane capabilities with the parameters
user space wants to set.
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CZ & ST support uptil a limit 2:1 downscaling, this patch
adds validate_plane hook, that shall be used to validate
the plane attributes sent by the user space based
on dce110 capabilities.
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu_dm_atomic_check() is used to validate the entire configuration of
planes and crtc's that the user space wants to commit.
However amdgpu_dm_atomic_check() depends upon DRM_MODE_ATOMIC_ALLOW_MODESET
flag else its mostly dummy.
Its not mandatory for the user space to set DRM_MODE_ATOMIC_ALLOW_MODESET,
and in general its not set either along with DRM_MODE_ATOMIC_TEST_ONLY.
Considering its importantance, this patch defers the allow_modeset check
in dm_update_planes_state(), so that there shall be scope to validate
the configuration sent from user space, without impacting the population
of dc/dm related data structures.
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
it is required if a platform supports PCIe root complex
core voltage reduction. After receiving this notification,
SBIOS can apply default PCIe root complex power policy.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
negotiate_mq should happen in all cases of a new VBD being discovered by
xen-blkfront, whether called through _probe() or a hot-attached new VBD
from dom-0 via xenstore. Otherwise, hot-attached new VBDs are left
configured without multi-queue.
Signed-off-by: Bhavesh Davda <bhavesh.davda@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Do not set 'needs_free_netdev' as we do call free_netdev
for mgmt net devices, doing both hits BUG_ON.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
instantiation of VF's on different adapters fails, copy
adapter index and chip type to PF0-3 adapter instances
to fix the issue.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The user can provide very large cqe_size which will cause to integer
overflow as it can be seen in the following UBSAN warning:
Signed-off-by: Doug Ledford <dledford@redhat.com>
Users of ucma are supposed to provide size of option level,
in most paths it is supposed to be equal to u8 or u16, but
it is not the case for the IB path record, where it can be
multiple of struct ib_path_rec_data.
This patch takes simplest possible approach and prevents providing
values more than possible to allocate.
Reported-by: syzbot+a38b0e9f694c379ca7ce@syzkaller.appspotmail.com
Fixes: 7ce86409ad ("RDMA/ucma: Allow user space to set service type")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
resolved_dev returned might be NULL as ifindex is transient number.
Ignoring NULL check of resolved_dev might crash the kernel.
Therefore perform NULL check before accessing resolved_dev.
Additionally rdma_resolve_ip_route() invokes addr_resolve() which
performs check and address translation for loopback ifindex.
Therefore, checking it again in rdma_resolve_ip_route() is not helpful.
Therefore, the code is simplified to avoid IFF_LOOPBACK check.
Fixes: 200298326b ("IB/core: Validate route when we init ah")
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Starting with v4.16-rc1 we've been seeing a higher than usual number
of requests for the kernel to load networking modules, even on events
which shouldn't trigger a module load (e.g. ioctl(TCGETS)). Stephen
Smalley suggested the problem may lie in commit 44c02a2c3d
("dev_ioctl(): move copyin/copyout to callers") which moves changes
the network dev_ioctl() function to always call dev_load(),
regardless of the requested ioctl.
This patch moves the dev_load() calls back into the individual ioctls
while preserving the rest of the original patch.
Reported-by: Dominick Grift <dac.override@gmail.com>
Suggested-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull gfs2 fix from Bob Peterson:
"An additional patch from Andreas Gruenbacher that fixes another
unfortunate GFS2 regression"
* tag 'gfs2-4.16.rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Fixes to "Implement iomap for block_map" (2)
When the connection is aborted, there is no point in
keeping the packets on the write queue until the connection
is closed.
Similar to a27fd7a8ed ('tcp: purge write queue upon RST'),
this is essential for a correct MSG_ZEROCOPY implementation,
because userspace cannot call close(fd) before receiving
zerocopy signals even when the connection is aborted.
Fixes: f214f915e7 ("tcp: enable MSG_ZEROCOPY")
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull s390 fixes from Martin Schwidefsky:
"Nine bug fixes for s390:
- Three fixes for the expoline code, one of them is strictly speaking
a cleanup but as it relates to code added with 4.16 I would like to
include the patch.
- Three timer related fixes in the common I/O layer
- A fix for the handling of internal DASD request which could cause
panics.
- One correction in regard to the accounting of pud page tables vs.
compat tasks.
- The register scrubbing in entry.S caused spurious crashes, this is
fixed now as well"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/entry.S: fix spurious zeroing of r0
s390: Fix runtime warning about negative pgtables_bytes
s390: do not bypass BPENTER for interrupt system calls
s390/cio: clear timer when terminating driver I/O
s390/cio: fix return code after missing interrupt
s390/cio: fix ccw_device_start_timeout API
s390/clean-up: use CFI_* macros in entry.S
s390: Replace IS_ENABLED(EXPOLINE_*) with IS_ENABLED(CONFIG_EXPOLINE_*)
s390/dasd: fix handling of internal requests
Pull regulator fixes from Mark Brown:
"A couple of fixes here:
- another half of the supend to idle fix from Geert that went in
earlier, both he and I are confused as to why he didn't notice that
this was missing when his earlier fix was merged.
- a simple fix for a test done the wrong way round in the
stm32-vrefbuf driver"
* tag 'regulator-fix-v4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: Fix resume from suspend to idle
regulator: stm32-vrefbuf: fix check on ready flag
Pull SCSI fixes from James Bottomley:
"This is mostly fixes for driver specific issues (nine of them) and the
storvsc performance improvement with interrupt handling which was
dropped from the previous fixes pull request.
We also have two regressions: one is a double call_rcu() in ATA error
handling and the other is a missed conversion to BLK_STS_OK in
__scsi_error_from_host_byte()"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qedi: Fix kernel crash during port toggle
scsi: qla2xxx: Fix FC-NVMe LUN discovery
scsi: core: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()
scsi: core: Avoid that ATA error handling can trigger a kernel hang or oops
scsi: qla2xxx: ensure async flags are reset correctly
scsi: qla2xxx: do not check login_state if no loop id is assigned
scsi: qla2xxx: Fixup locking for session deletion
scsi: qla2xxx: Fix NULL pointer crash due to active timer for ABTS
scsi: mpt3sas: wait for and flush running commands on shutdown/unload
scsi: mpt3sas: fix oops in error handlers after shutdown/unload
scsi: storvsc: Spread interrupts when picking a channel for I/O requests
scsi: megaraid_sas: Do not use 32-bit atomic request descriptor for Ventura controllers
It turns out that commit 3229c18c0d6b2 'Fixes to "Implement iomap for
block_map"' introduced another bug in gfs2_iomap_begin that can cause
gfs2_block_map to set bh->b_size of an actual buffer to 0. This can
lead to arbitrary incorrect behavior including crashes or disk
corruption. Revert the incorrect part of that commit.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
dccp_disconnect() sets 'dp->dccps_hc_tx_ccid' tx handler to NULL,
therefore if DCCP socket is disconnected and dccp_sendmsg() is
called after it, it will cause a NULL pointer dereference in
dccp_write_xmit().
This crash and the reproducer was reported by syzbot. Looks like
it is reproduced if commit 69c64866ce ("dccp: CVE-2017-8824:
use-after-free in DCCP code") is applied.
Reported-by: syzbot+f99ab3887ab65d70f816@syzkaller.appspotmail.com
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
syzkaller found an issue caused by lack of sufficient checks
in l2tp_tunnel_create()
RAW sockets can not be considered as UDP ones for instance.
In another patch, we shall replace all pr_err() by less intrusive
pr_debug() so that syzkaller can find other bugs faster.
Acked-by: Guillaume Nault <g.nault@alphalink.fr>
Acked-by: James Chapman <jchapman@katalix.com>
==================================================================
BUG: KASAN: slab-out-of-bounds in setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69
dst_release: dst:00000000d53d0d0f refcnt:-1
Write of size 1 at addr ffff8801d013b798 by task syz-executor3/6242
CPU: 1 PID: 6242 Comm: syz-executor3 Not tainted 4.16.0-rc2+ #253
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report+0x23b/0x360 mm/kasan/report.c:412
__asan_report_store1_noabort+0x17/0x20 mm/kasan/report.c:435
setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69
l2tp_tunnel_create+0x1354/0x17f0 net/l2tp/l2tp_core.c:1596
pppol2tp_connect+0x14b1/0x1dd0 net/l2tp/l2tp_ppp.c:707
SYSC_connect+0x213/0x4a0 net/socket.c:1640
SyS_connect+0x24/0x30 net/socket.c:1621
do_syscall_64+0x280/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
Fixes: fd558d186d ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
inet_evict_bucket() iterates global list, and
several tasks may call it in parallel. All of
them hash the same fq->list_evictor to different
lists, which leads to list corruption.
This patch makes fq be hashed to expired list
only if this has not been made yet by another
task. Since inet_frag_alloc() allocates fq
using kmem_cache_zalloc(), we may rely on
list_evictor is initially unhashed.
The problem seems to exist before async
pernet_operations, as there was possible to have
exit method to be executed in parallel with
inet_frags::frags_work, so I add two Fixes tags.
This also may go to stable.
Fixes: d1fe19444d "inet: frag: don't re-use chainlist for evictor"
Fixes: f84c6821aa "net: Convert pernet_subsys, registered from inet_init()"
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The smsc911x driver will crash if it is rmmod'ed while the netdev
is up like:
Call trace:
phy_detach+0x94/0x150
phy_disconnect+0x40/0x50
smsc911x_stop+0x104/0x128 [smsc911x]
__dev_close_many+0xb4/0x138
dev_close_many+0xbc/0x190
rollback_registered_many+0x140/0x460
rollback_registered+0x68/0xb0
unregister_netdevice_queue+0x100/0x118
unregister_netdev+0x28/0x38
smsc911x_drv_remove+0x58/0x130 [smsc911x]
platform_drv_remove+0x30/0x50
device_release_driver_internal+0x15c/0x1f8
driver_detach+0x54/0x98
bus_remove_driver+0x64/0xe8
driver_unregister+0x34/0x60
platform_driver_unregister+0x20/0x30
smsc911x_cleanup_module+0x14/0xbca8 [smsc911x]
SyS_delete_module+0x1e8/0x238
__sys_trace_return+0x0/0x4
This is caused by the mdiobus being unregistered/free'd
and the code in phy_detach() attempting to manipulate mdio
related structures from unregister_netdev() calling close()
To fix this, we delay the mdiobus teardown until after
the netdev is deregistered.
Reported-by: Matt Sealey <matt.sealey@arm.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, administrative MTU changes on a given netdevice are
not reflected on route exceptions for MTU-less routes, with a
set PMTU value, for that device:
# ip -6 route get 2001:db8::b
2001:db8::b from :: dev vti_a proto kernel src 2001:db8::a metric 256 pref medium
# ping6 -c 1 -q -s10000 2001:db8::b > /dev/null
# ip netns exec a ip -6 route get 2001:db8::b
2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
cache expires 571sec mtu 4926 pref medium
# ip link set dev vti_a mtu 3000
# ip -6 route get 2001:db8::b
2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
cache expires 571sec mtu 4926 pref medium
# ip link set dev vti_a mtu 9000
# ip -6 route get 2001:db8::b
2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
cache expires 571sec mtu 4926 pref medium
The first issue is that since commit fb56be83e4 ("net-ipv6: on
device mtu change do not add mtu to mtu-less routes") we don't
call rt6_exceptions_update_pmtu() from rt6_mtu_change_route(),
which handles administrative MTU changes, if the regular route
is MTU-less.
However, PMTU exceptions should be always updated, as long as
RTAX_MTU is not locked. Keep the check for MTU-less main route,
as introduced by that commit, but, for exceptions,
call rt6_exceptions_update_pmtu() regardless of that check.
Once that is fixed, one problem remains: MTU changes are not
reflected if the new MTU is higher than the previous one,
because rt6_exceptions_update_pmtu() doesn't allow that. We
should instead allow PMTU increase if the old PMTU matches the
local MTU, as that implies that the old MTU was the lowest in the
path, and PMTU discovery might lead to different results.
The existing check in rt6_mtu_change_route() correctly took that
case into account (for regular routes only), so factor it out
and re-use it also in rt6_exceptions_update_pmtu().
While at it, fix comments style and grammar, and try to be a bit
more descriptive.
Reported-by: Xiumei Mu <xmu@redhat.com>
Fixes: fb56be83e4 ("net-ipv6: on device mtu change do not add mtu to mtu-less routes")
Fixes: f5bbe7ee79 ("ipv6: prepare rt6_mtu_change() for exception table")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the warning messages/call traces seen if DMA debug is
enabled, In case of fragmented skb's memory was allocated using
dma_map_page but freed using dma_unmap_single. This patch modifies buffer
allocations in TX path to use dma_map_page in all the places and
dma_unmap_page while freeing the buffers.
Signed-off-by: Hemanth Puranik <hpuranik@codeaurora.org>
Acked-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rdma requires ILT Memory to be allocated for it's QPs.
Each ILT entry points to a page used by several Rdma QPs.
To avoid allocating all the memory in advance, the rdma
implementation dynamically allocates memory as more QPs are
added, however it does not dynamically free the memory.
The memory should have been freed on rmmod qedr, but isn't.
This patch adds the memory freeing on rmmod qedr (currently
it will be freed with qed is removed).
An outcome of this bug, is that if qedr is unloaded and loaded
without unloaded qed, there will be no more RoCE traffic.
The reason these are related, is that the logic of detecting the
first QP ever opened is by asking whether ILT memory for RoCE has
been allocated.
In addition, this patch modifies freeing of the Task context to
always use the PROTOCOLID_ROCE and not the protocol passed,
this is because task context for iWARP and ROCE both use the
ROCE protocol id, as opposed to the connection context.
Fixes: dbb799c397 ("qed: Initialize hardware for new protocols")
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2018-03-05
This series contains fixes to e1000e only.
Benjamin Poirier provides all but one fix in this series, starting with
workaround for a VMWare e1000e emulation issue where ICR reads 0x0 on
the emulated device. Partially reverted a previous commit dealing with
the "Other" interrupt throttling to avoid unforeseen fallout from these
changes that are not strictly necessary. Restored the ICS write for
receive and transmit queue interrupts in the case that txq or rxq bits
were set in ICR and the Other interrupt handler read and cleared ICR
before the queue interrupt was raised. Fixed an bug where interrupts
may be missed if ICR is read while INT_ASSERTED is not set, so avoid the
problem by setting all bits related to events that can trigger the Other
interrupt in IMS. Fixed the return value for check_for_link() when
auto-negotiation is off.
Pierre-Yves Kerbrat fixes e1000e to use dma_zalloc_coherent() to make
sure the ring is memset to 0 to prevent the area from containing
garbage.
v2: added an additional e1000e fix to the series
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The subordinate value indicates the highest bus number which can be
reached downstream though a certain device.
Commit a20c7f36bd ("PCI: Do not allocate more buses than available in
parent") ensures that downstream devices cannot assign busnumbers higher
than the upstream device subordinate number, which was indeed illogical.
By default, dw_pcie_setup_rc() inits the Root Complex subordinate to a
value of 0x01.
Due to this combined with above commit, enumeration stops digging deeper
downstream as soon as bus num 0x01 has been assigned, which is always the
case for a bridge device.
This results in all devices behind a bridge bus remaining undetected, as
these would be connected to bus 0x02 or higher.
Fix this by initializing the RC to a subordinate value of 0xff, which is
not altering hardware behaviour in any way, but informs probing function
pci_scan_bridge() later on which reads this value back from register.
The following nasty errors during boot are also fixed by this:
pci_bus 0000:02: busn_res: can not insert [bus 02-ff] under [bus 01] (conflicts with (null) [bus 01])
...
pci_bus 0000:03: [bus 03] partially hidden behind bridge 0000:01 [bus 01]
...
pci_bus 0000:04: [bus 04] partially hidden behind bridge 0000:01 [bus 01]
...
pci_bus 0000:05: [bus 05] partially hidden behind bridge 0000:01 [bus 01]
pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 05
pci_bus 0000:02: busn_res: can not insert [bus 02-05] under [bus 01] (conflicts with (null) [bus 01])
pci_bus 0000:02: [bus 02-05] partially hidden behind bridge 0000:01 [bus 01]
Fixes: a20c7f36bd ("PCI: Do not allocate more buses than available in
parent")
Tested-by: Niklas Cassel <niklas.cassel@axis.com>
Tested-by: Fabio Estevam <fabio.estevam@nxp.com>
Tested-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Signed-off-by: Koen Vandeputte <koen.vandeputte@ncentric.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Cc: stable@vger.kernel.org # v4.15+
Cc: Binghui Wang <wangbinghui@hisilicon.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Jianguo Sun <sunjianguo1@huawei.com>
Cc: Jingoo Han <jingoohan1@gmail.com>
Cc: Kishon Vijay Abraham I <kishon@ti.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Minghuan Lian <minghuan.Lian@freescale.com>
Cc: Mingkai Hu <mingkai.hu@freescale.com>
Cc: Murali Karicheri <m-karicheri2@ti.com>
Cc: Pratyush Anand <pratyush.anand@gmail.com>
Cc: Richard Zhu <hongxing.zhu@nxp.com>
Cc: Roy Zang <tie-fei.zang@freescale.com>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: Stanimir Varbanov <svarbanov@mm-sol.com>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Xiaowei Song <songxiaowei@hisilicon.com>
Cc: Zhou Wang <wangzhou1@hisilicon.com>
When we exceed current packets limit and we have more than one
segment in the list returned by skb_gso_segment(), netem drops
only the first one, skipping the rest, hence kmemleak reports:
unreferenced object 0xffff880b5d23b600 (size 1024):
comm "softirq", pid 0, jiffies 4384527763 (age 2770.629s)
hex dump (first 32 bytes):
00 80 23 5d 0b 88 ff ff 00 00 00 00 00 00 00 00 ..#]............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000d8a19b9d>] __alloc_skb+0xc9/0x520
[<000000001709b32f>] skb_segment+0x8c8/0x3710
[<00000000c7b9bb88>] tcp_gso_segment+0x331/0x1830
[<00000000c921cba1>] inet_gso_segment+0x476/0x1370
[<000000008b762dd4>] skb_mac_gso_segment+0x1f9/0x510
[<000000002182660a>] __skb_gso_segment+0x1dd/0x620
[<00000000412651b9>] netem_enqueue+0x1536/0x2590 [sch_netem]
[<0000000005d3b2a9>] __dev_queue_xmit+0x1167/0x2120
[<00000000fc5f7327>] ip_finish_output2+0x998/0xf00
[<00000000d309e9d3>] ip_output+0x1aa/0x2c0
[<000000007ecbd3a4>] tcp_transmit_skb+0x18db/0x3670
[<0000000042d2a45f>] tcp_write_xmit+0x4d4/0x58c0
[<0000000056a44199>] tcp_tasklet_func+0x3d9/0x540
[<0000000013d06d02>] tasklet_action+0x1ca/0x250
[<00000000fcde0b8b>] __do_softirq+0x1b4/0x5a3
[<00000000e7ed027c>] irq_exit+0x1e2/0x210
Fix it by adding the rest of the segments, if any, to skb 'to_free'
list. Add new __qdisc_drop_all() and qdisc_drop_all() functions
because they can be useful in the future if we need to drop segmented
GSO packets in other places.
Fixes: 6071bd1aa1 ("netem: Segment GSO packets on enqueue")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hedberg says:
====================
pull request: bluetooth 2018-03-05
Here are a few more Bluetooth fixes for the 4.16 kernel:
- btusb: reset/resume fixes for Yoga 920 and Dell OptiPlex 3060
- Fix for missing encryption refresh with the Security Manager protocol
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Blakey says:
====================
rhashtable: Fix rhltable duplicates insertion
On our mlx5 driver fs_core.c, we use the rhltable interface to store
flow groups. We noticed that sometimes we get a warning that flow group isn't
found at removal. This rare case was caused when a specific scenario happened,
insertion of a flow group with a similar match criteria (a duplicate),
but only where the flow group rhash_head was second (or not first)
on the relevant rhashtable bucket list.
The first patch fixes it, and the second one adds a test that show
it is now working.
Paul.
v3 --> v2 changes:
* added missing fix in rhashtable_lookup_one code path as well.
v1 --> v2 changes:
* Changed commit messages to better reflect the change
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tries to insert duplicates in the middle of bucket's chain:
bucket 1: [[val 21 (tid=1)]] -> [[ val 1 (tid=2), val 1 (tid=0) ]]
Reuses tid to distinguish the elements insertion order.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
When inserting duplicate objects (those with the same key),
current rhlist implementation messes up the chain pointers by
updating the bucket pointer instead of prev next pointer to the
newly inserted node. This causes missing elements on removal and
travesal.
Fix that by properly updating pprev pointer to point to
the correct rhash_head next pointer.
Issue: 1241076
Change-Id: I86b2c140bcb4aeb10b70a72a267ff590bb2b17e7
Fixes: ca26893f05 ('rhashtable: Add rhlist interface')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 2b05f6ae1e ("ARM: ux500: remove PMU IRQ bouncer")
deleted some code to bounce and work around the weird PMU
IRQs in the DB8500 ASIC, but did a semantic mistake:
since the auxdata was now unused, the call to
of_platform_populate() was removed, but this does not
work: the default platform population will only kick in
if .init_machine() is assigned NULL, and since the U8540
was still using the callback that was not the case.
Fix this by reinstating the call to of_platform_populate(),
but pass NULL as auxdata.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: 2b05f6ae1e ("ARM: ux500: remove PMU IRQ bouncer")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The Stream Buffer for EtherAVB-IF (STBE) is an optional component, and
is not present on all SoCs.
Document this in the DT bindings, including a list of SoCs that do have
it.
Fixes: 785ec87483 ("ravb: document R8A77970 bindings")
Fixes: f231c4178a ("dt-bindings: net: renesas-ravb: Add support for R8A77995 RAVB")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The firmware has a requirement that the P2P_DEVICE address should
be different from the address of the primary interface. When not
specified by user-space, the driver generates the MAC address for
the P2P_DEVICE interface using the MAC address of the primary
interface and setting the locally administered bit. However, the MAC
address of the primary interface may already have that bit set causing
the creation of the P2P_DEVICE interface to fail with -EBUSY. Fix this
by using a random address instead to determine the P2P_DEVICE address.
Cc: stable@vger.kernel.org # 3.10.y
Reported-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The feature module needs to evaluate the actual firmware error return
upon a control command. This adds a flag to struct brcmf_if that the
caller can set. This flag is checked to determine the error code that
needs to be returned.
Fixes: b69c1df472 ("brcmfmac: separate firmware errors from i/o errors")
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Fixing arbitrary kernel leak in case FBIOGETCMAP_SPARC in
sbusfb_ioctl_helper().
'index' is defined as an int in sbusfb_ioctl_helper().
We retrieve this from the user:
if (get_user(index, &c->index) ||
__get_user(count, &c->count) ||
__get_user(ured, &c->red) ||
__get_user(ugreen, &c->green) ||
__get_user(ublue, &c->blue))
return -EFAULT;
and then we use 'index' in the following way:
red = cmap->red[index + i] >> 8;
green = cmap->green[index + i] >> 8;
blue = cmap->blue[index + i] >> 8;
This is a classic information leak vulnerability. 'index' should be
an unsigned int, given its usage above.
This patch is straight-forward; it changes 'index' to unsigned int
in two switch-cases: FBIOGETCMAP_SPARC && FBIOPUTCMAP_SPARC.
This patch fixes CVE-2018-6412.
Signed-off-by: Peter Malone <peter.malone@gmail.com>
Acked-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Add some hints about overlayfs kernel config options.
Enabling NFS export by default is especially recommended against, as it
incurs a performance penalty even if the filesystem is not actually
exported.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
This reverts commit e9a48034d7.
The slaves and holders link for the hidden gendisks confuse lsblk so that
it errors out on, or doesn't report the nvme multipath devices. Given
that we don't need holder relationships for something that can't even be
directly accessed we should just stop creating those links.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Potnuri Bharat Teja <bharat@chelsio.com>
Cc: stable@vger.kernel.org
Signed-off-by: Keith Busch <keith.busch@intel.com>
Artem Savkov reported that commit 5efec5c655 leads to a packet loss under
IPSec configuration. It appears that his setup consists of a TUN device,
which does not have a MAC header.
Make sure MAC header exists.
Note: TUN device sets a MAC header pointer, although it does not have one.
Fixes: 5efec5c655 ("xfrm: Fix eth_hdr(skb)->h_proto to reflect inner IP version")
Reported-by: Artem Savkov <artem.savkov@gmail.com>
Tested-by: Artem Savkov <artem.savkov@gmail.com>
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Be more robust when drawing arrows in the annotation TUI, avoiding a
segfault when jump instructions have as a target addresses in functions
other that the one currently being annotated. The full fix will come in
the following days, when jumping to other functions will work as call
instructions (Arnaldo Carvalho de Melo)
- Prevent auxtrace_queues__process_index() from queuing AUX area data for
decoding when the --no-itrace option has been used (Adrian Hunter)
- Sync copy of kvm UAPI headers and x86's cpufeatures.h (Arnaldo Carvalho de Melo)
- Fix 'perf stat' CSV output format for non-supported counters (Ilya Pronin)
- Fix crash in 'perf record|perf report' pipe mode (Jiri Olsa)
- Fix annoying 'perf top' overwrite fallback message on older kernels (Kan Liang)
- Fix the usage on the 'perf kallsyms' man page (Sangwon Hong)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Hitting the following hardlockup due to a race condition in
error CQE processing.
[26146.879798] bnxt_en 0000:04:00.0: QPLIB: FP: CQ Processed Req
[26146.886346] bnxt_en 0000:04:00.0: QPLIB: wr_id[1251] = 0x0 with status 0xa
[26156.350935] NMI watchdog: Watchdog detected hard LOCKUP on cpu 4
[26156.357470] Modules linked in: nfsd auth_rpcgss nfs_acl lockd grace
[26156.447957] CPU: 4 PID: 3413 Comm: kworker/4:1H Kdump: loaded
[26156.457994] Hardware name: Dell Inc. PowerEdge R430/0CN7X8,
[26156.466390] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
[26156.472639] Call Trace:
[26156.475379] <NMI> [<ffffffff98d0d722>] dump_stack+0x19/0x1b
[26156.481833] [<ffffffff9873f775>] watchdog_overflow_callback+0x135/0x140
[26156.489341] [<ffffffff9877f237>] __perf_event_overflow+0x57/0x100
[26156.496256] [<ffffffff98787c24>] perf_event_overflow+0x14/0x20
[26156.502887] [<ffffffff9860a580>] intel_pmu_handle_irq+0x220/0x510
[26156.509813] [<ffffffff98d16031>] perf_event_nmi_handler+0x31/0x50
[26156.516738] [<ffffffff98d1790c>] nmi_handle.isra.0+0x8c/0x150
[26156.523273] [<ffffffff98d17be8>] do_nmi+0x218/0x460
[26156.528834] [<ffffffff98d16d79>] end_repeat_nmi+0x1e/0x7e
[26156.534980] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
[26156.543268] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
[26156.551556] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
[26156.559842] <EOE> [<ffffffff98d083e4>] queued_spin_lock_slowpath+0xb/0xf
[26156.567555] [<ffffffff98d15690>] _raw_spin_lock+0x20/0x30
[26156.573696] [<ffffffffc08381a1>] bnxt_qplib_lock_buddy_cq+0x31/0x40 [bnxt_re]
[26156.581789] [<ffffffffc083bbaa>] bnxt_qplib_poll_cq+0x43a/0xf10 [bnxt_re]
[26156.589493] [<ffffffffc083239b>] bnxt_re_poll_cq+0x9b/0x760 [bnxt_re]
The issue happens if RQ poll_cq or SQ poll_cq or Async error event tries to
put the error QP in flush list. Since SQ and RQ of each error qp are added
to two different flush list, we need to protect it using locks of
corresponding CQs. Difference in order of acquiring the lock in
SQ poll_cq and RQ poll_cq can cause a hard lockup.
Revisits the locking strategy and removes the usage of qplib_cq.hwq.lock.
Instead of this lock, introduces qplib_cq.flush_lock to handle
addition/deletion of QPs in flush list. Also, always invoke the flush_lock
in order (SQ CQ lock first and then RQ CQ lock) to avoid any potential
deadlock.
Other than the poll_cq context, the movement of QP to/from flush list can
be done in modify_qp context or from an async error event from HW.
Synchronize these operations using the bnxt_re verbs layer CQ locks.
To achieve this, adds a call back to the HW abstraction layer(qplib) to
bnxt_re ib_verbs layer in case of async error event. Also, removes the
buddy cq functions as it is no longer required.
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Fix warning limit for kernel stack consumption:
drivers/infiniband/core/cq.c: In function 'ib_process_cq_direct':
drivers/infiniband/core/cq.c:78:1: error: the frame size of 1032 bytes
is larger than 1024 bytes [-Werror=frame-larger-than=]
Using smaller ib_wc array on the stack brings us comfortably below that
limit again.
Fixes: 246d8b184c ("IB/cq: Don't force IB_POLL_DIRECT poll context for ib_process_cq_direct")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Sergey Gorenko <sergeygo@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
"err" is either zero or possibly uninitialized here. It should be
-EINVAL.
Fixes: 427c1e7bcd ("{IB, net}/mlx5: Move the modify QP operation table to mlx5_ib")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The series that introduced dual port RoCE mode assumed that we don't have
a dual port HCA that use the mlx5 driver, this is not the case for
Connect-IB HCAs. This reasoning led to assigning 1 as the native port
index which causes issue when the second port is used.
For example query_pkey() when called on the second port will return values
of the first port. Make sure that we assign the right port index as the
native port index.
Fixes: 32f69e4be2 ("{net, IB}/mlx5: Manage port association for multiport RoCE")
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The commit cited below added a gid_type field (RoCEv1 or RoCEv2)
to GID properties.
When adding GIDs, this gid_type field was copied over to the
hardware gid table. However, when deleting GIDs, the gid_type field
was not copied over to the hardware gid table.
As a result, when running RoCEv2, all RoCEv2 gids in the
hardware gid table were set to type RoCEv1 when any gid was deleted.
This problem would persist until the next gid was added (which would again
restore the gid_type field for all the gids in the hardware gid table).
Fix this by copying over the gid_type field to the hardware gid table
when deleting gids, so that the gid_type of all remaining gids is
preserved when a gid is deleted.
Fixes: b699a859d1 ("IB/mlx4: Add gid_type to GID properties")
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When using IPv4 addresses in RoCEv2, the GID format for the mapped
IPv4 address should be: ::ffff:<4-byte IPv4 address>.
In the cited commit, IPv4 mapped IPV6 addresses had the 3 upper dwords
zeroed out by memset, which resulted in deleting the ffff field.
However, since procedure ipv6_addr_v4mapped() already verifies that the
gid has format ::ffff:<ipv4 address>, no change is needed for the gid,
and the memset can simply be removed.
Fixes: 7e57b85c44 ("IB/mlx4: Add support for setting RoCEv2 gids in hardware")
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
If the read-only flag is true on a SCSI disk, re-reading the partition
table sets the flag back to false.
To observe this bug, you can run:
1. blockdev --setro /dev/sda
2. blockdev --rereadpt /dev/sda
3. blockdev --getro /dev/sda
This commit reads the disk's old state and combines it with the device
disk-reported state rather than unconditionally marking it as RW.
Reported-by: Li Ning <lining916740672@icloud.com>
Signed-off-by: Jeremy Cline <jeremy@jcline.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Because of the shifting around of code in qla2x00_probe_one recently,
failures during adapter initialization can lead to problems, i.e. NULL
pointer crashes and doubly freed data structures which cause eventual
panics.
This V2 version makes the relevant memory free routines idempotent, so
repeat calls won't cause any harm. I also removed the problematic
probe_init_failed exit point as it is not needed.
Fixes: d64d6c5671 ("scsi: qla2xxx: Fix NULL pointer crash due to probe failure")
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
iWARP does not support RDMA WRITE or SEND with immediate data.
Driver should check this before submitting to FW and return an
immediate error
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Race in qedr_poll_cq, lastest_cqe wasn't protected by lock,
leading to a case where two context's accessing poll_cq at
the same time lead to one of them having a pointer to an old
latest_cqe and reading an invalid cqe element
Signed-off-by: Amit Radzi <Amit.Radzi@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Fix iWARP connect and listen to use the mapped port for
ipv4 and ipv6. Without this fixed, running on a server
that has iwpmd enabled will not use the correct port
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In practice this is really only meaningful in the context of the DM
multipath target (which uses dm_table_set_type() to set the type of
device DM should create via its "queue_mode" option).
So this change allows a DM multipath device with "queue_mode bio" to be
upgraded from DM_TYPE_BIO_BASED to DM_TYPE_NVME_BIO_BASED -- iff the
underlying device(s) are NVMe.
DM_TYPE_NVME_BIO_BASED is just a DM core implementation detail that
allows for NVMe-specific optimizations (e.g. use direct_make_request
instead of generic_make_request). If in the future there is no benefit
or need to distinguish NVMe vs not: then it will be removed.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
This eliminates the "queue_mode" configuration's "nvme" mode. There
wasn't anything NVMe-specific about that mode. It was named "nvme"
because it was a short name for the mode. But the entire point of the
mode was to optimize the multipath target for underlying devices that
are _not_ SCSI-based. Devices that aren't SCSI have no need for the
various SCSI device handler (scsi_dh) specific code in DM multipath.
But rather than narrowly define this scsi_dh vs not branching in terms
of "nvme": invert the logic so that we're just checking whether a
multipath device is layered on SCSI devices with scsi_dh attached.
This allows any future storage technology to avoid scsi_dh specific code
in the multipath target too.
Fixes: 848b8aefd4 ("dm mpath: optimize NVMe bio-based support")
Suggested-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
The strncmp function should compare 4 bytes.
Fixes: 22c11858e8 ("dm: introduce DM_TYPE_NVME_BIO_BASED")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Upstream commit 4102d9de6d ("dm raid: fix rs_get_progress()
synchronization state/ratio") in combination with commit 7c29744ecc
("dm raid: simplify rs_get_progress()") introduced a regression by
incorrectly reporting a sync_ratio of 0 for degraded raid sets. This
caused lvm2 to fail to repair raid legs automatically.
Fix by identifying the degraded state by checking the MD_RECOVERY_INTR
flag and returning mddev->recovery_cp in case it is set.
MD sets recovery = [ MD_RECOVERY_RECOVER MD_RECOVERY_INTR
MD_RECOVERY_NEEDED ] when a RAID member fails. It then shuts down any
sync thread that is running and leaves us with all MD_RECOVERY_* flags
cleared. The bug occurs if a status is requested in the short time it
takes to shut down any sync thread and clear the flags, because we were
keying in on the MD_RECOVERY_NEEDED - understanding it to be the initial
phase of a “recover” sync thread. However, this is an incorrect
interpretation if MD_RECOVERY_INTR is also set.
This also explains why the bug only happened when automatic repair was
enabled and not a normal ‘manual’ method. It is impossible to react
quick enough to hit the problematic window without it being automated.
Fix passes automatic repair tests.
Fixes: 7c29744ecc ("dm raid: simplify rs_get_progress()")
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Otherwise an underlying device's teardown (e.g. SCSI) may race with the
DM ioctl or persistent reservation and result in dereferencing driver
memory that gets freed when the underlying device's final blkdev_put()
occurs.
bdgrab() only increases the refcount for the block_device's inode to
ensure the block_device struct itself will not be freed, but does not
guarantee the block_device will remain associated with the gendisk or
its storage.
Cc: stable@vger.kernel.org # 4.8+
Reported-by: David Jeffery <djeffery@redhat.com>
Suggested-by: David Jeffery <djeffery@redhat.com>
Reviewed-by: Ben Marzinski <bmarzins@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
gcc-6.3 and earlier show a new warning after a seemingly unrelated
change to the arm64 PAGE_KERNEL definition:
In file included from drivers/md/dm-bufio.c:14:0:
drivers/md/dm-bufio.c: In function 'alloc_buffer':
include/linux/sched/mm.h:182:56: warning: 'noio_flag' may be used uninitialized in this function [-Wmaybe-uninitialized]
current->flags = (current->flags & ~PF_MEMALLOC_NOIO) | flags;
^
The same warning happened earlier on linux-3.18 for MIPS and I did a
workaround for that, but now it's come back.
gcc-7 and newer are apparently smart enough to figure this out, and
other architectures don't show it, so the best I could come up with is
to rework the caller slightly in a way that makes it obvious enough to
all arm64 compilers what is happening here.
Fixes: 41acec6240 ("arm64: kpti: Make use of nG dependent on arm64_kernel_unmapped_at_el0()")
Link: https://patchwork.kernel.org/patch/9692829/
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[snitzer: moved declarations inside conditional, altered vmalloc return]
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Compilation of bpftool on a distro that lacks eBPF support in the installed
kernel headers fails with:
common.c: In function ‘is_bpffs’:
common.c:96:40: error: ‘BPF_FS_MAGIC’ undeclared (first use in this function)
return (unsigned long)st_fs.f_type == BPF_FS_MAGIC;
^
Fix this the same way it is already in tools/lib/bpf/libbpf.c and
tools/lib/api/fs/fs.c.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Only allow ifindex from IP_PKTINFO to override SO_BINDTODEVICE settings
if the index is actually set in the message.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull sigingo fix from Eric Biederman:
"The kbuild test robot found that I accidentally moved si_pkey when I
was cleaning up siginfo_t. A short followed by an int with the int
having 8 byte alignment. Sheesh siginfo_t is a weird structure.
I have now corrected it and added build time checks that with a little
luck will catch any similar future mistakes. The build time checks
were sufficient for me to verify the bug and to verify my fix. So they
are at least useful this once."
* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal/x86: Include the field offsets in the build time checks
signal: Correct the offset of si_pkey in struct siginfo
devm_memremap_pages() was re-worked in e8d5134833 "memremap: change
devm_memremap_pages interface to use struct dev_pagemap" to take a
caller allocated struct dev_pagemap as a function parameter. A call to
devres_free() was left in the error cleanup path which results in a
kernel panic if the remap fails for some reason. Remove it to fix the
panic and let devm_memremap_pages() fail gracefully.
Fixes: e8d5134833 ("memremap: change devm_memremap_pages interface...")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Fix bugs in signaling the Hyper-V host when freeing space in the
host->guest ring buffer:
1. The interrupt_mask must not be used to determine whether to signal
on the host->guest ring buffer
2. The ring buffer write_index must be read (via hv_get_bytes_to_write)
*after* pending_send_sz is read in order to avoid a race condition
3. Comparisons with pending_send_sz must treat the "equals" case as
not-enough-space
4. Don't signal if the pending_send_sz feature is not present. Older
versions of Hyper-V that don't implement this feature will poll.
Fixes: 03bad714a1 ("vmbus: more host signalling avoidance")
Cc: Stable <stable@vger.kernel.org> # 4.14 and above
Signed-off-by: Michael Kelley <mhkelley@outlook.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 57e6f0d7b8 ("typec: tcpm: Only request matching pdos") is causing
a regression, before this commit e.g. the GPD win and GPD pocket devices
were charging at 9V 3A with a PD charger, now they are instead slowly
discharging at 5V 0.4A, as this commit causes the ports max_snk_mv/ma/mw
settings to be completely ignored.
Arguably the way to fix this would be to add a PDO_VAR() describing the
voltage range to the snk_caps of boards which can handle any voltage in
their range, but the "typec: tcpm: Only request matching pdos" commit
looks at the type of PDO advertised by the source/charger and if that
is fixed (as it typically is) only compairs against PDO_FIXED entries
in the snk_caps so supporting a range of voltage would require adding a
PDO_FIXED entry for *every possible* voltage to snk_caps.
AFAICT there is no reason why a fixed source_cap cannot be matched against
a variable snk_cap, so at a minimum the commit should be rewritten to
support that.
For now lets revert the "typec: tcpm: Only request matching pdos" commit,
fixing the regression.
Fixes: 57e6f0d7b8 ("typec: tcpm: Only request matching pdos")
Cc: Badhri Jagan Sridharan <badhri@google.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Without pm_runtime_{get,put}_sync calls in place, reading
vbus status via /sys causes the following error:
Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa0ab060
pgd = b333e822
[fa0ab060] *pgd=48011452(bad)
[<c05261b0>] (musb_default_readb) from [<c0525bd0>] (musb_vbus_show+0x58/0xe4)
[<c0525bd0>] (musb_vbus_show) from [<c04c0148>] (dev_attr_show+0x20/0x44)
[<c04c0148>] (dev_attr_show) from [<c0259f74>] (sysfs_kf_seq_show+0x80/0xdc)
[<c0259f74>] (sysfs_kf_seq_show) from [<c0210bac>] (seq_read+0x250/0x448)
[<c0210bac>] (seq_read) from [<c01edb40>] (__vfs_read+0x1c/0x118)
[<c01edb40>] (__vfs_read) from [<c01edccc>] (vfs_read+0x90/0x144)
[<c01edccc>] (vfs_read) from [<c01ee1d0>] (SyS_read+0x3c/0x74)
[<c01ee1d0>] (SyS_read) from [<c0106fe0>] (ret_fast_syscall+0x0/0x54)
Solution was suggested by Tony Lindgren <tony@atomide.com>.
Signed-off-by: Merlijn Wajer <merlijn@wizzup.org>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Corsair Strafe RGB keyboard does not respond to usb control messages
sometimes and hence generates timeouts.
Commit de3af5bf25 ("usb: quirks: add delay init quirk for Corsair
Strafe RGB keyboard") tried to fix those timeouts by adding
USB_QUIRK_DELAY_INIT.
Unfortunately, even with this quirk timeouts of usb_control_msg()
can still be seen, but with a lower frequency (approx. 1 out of 15):
[ 29.103520] usb 1-8: string descriptor 0 read error: -110
[ 34.363097] usb 1-8: can't set config #1, error -110
Adding further delays to different locations where usb control
messages are issued just moves the timeouts to other locations,
e.g.:
[ 35.400533] usbhid 1-8:1.0: can't add hid device: -110
[ 35.401014] usbhid: probe of 1-8:1.0 failed with error -110
The only way to reliably avoid those issues is having a pause after
each usb control message. In approx. 200 boot cycles no more timeouts
were seen.
Addionaly, keep USB_QUIRK_DELAY_INIT as it turned out to be necessary
to have the delay in hub_port_connect() after hub_port_init().
The overall boot time seems not to be influenced by these additional
delays, even on fast machines and lightweight distributions.
Fixes: de3af5bf25 ("usb: quirks: add delay init quirk for Corsair Strafe RGB keyboard")
Cc: stable@vger.kernel.org
Signed-off-by: Danilo Krummrich <danilokrummrich@dk-develop.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
KVM: s390: Fixes
- Fix random memory corruption when running as guest2 (e.g. KVM in
LPAR) and starting guest3 (nested KVM) with many CPUs (e.g. a nested
guest with 200 vcpu)
- io interrupt delivery counter was not exported
Fixes for PPC KVM:
- Fix guest time accounting in the host
- Fix large-page backing for radix guests on POWER9
- Fix HPT guests on POWER9 backed by 2M or 1G pages
- Compile fixes for some configs and gcc versions
Correct spelling of National Semi-conductor (no hyphen)
in drivers/net/ethernet/.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli says:
====================
net: Use strlcpy() for ethtool::get_strings
After turning on KASAN on one of my systems, I started getting lots of out of
bounds errors while fetching a given port's statistics, and indeed using
memcpy() is unsafe for copying strings which have not been declared as an array
of ETH_GSTRING_LEN bytes, so let's use strlcpy() instead. This allows the best
of both worlds: we still keep the efficient memory usage of variably sized
strings, but we don't copy more than we need to.
Changes in v2:
- dropped the 3 other patches that were not necessary
- use strlcpy() instead of strncpy()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Our statistics strings are allocated at initialization without being
bound to a specific size, yet, we would copy ETH_GSTRING_LEN bytes using
memcpy() which would create out of bounds accesses, this was flagged by
KASAN. Replace this with strlcpy() to make sure we are bound the source
buffer size and we also always NUL-terminate strings.
Fixes: 820ee17b8d ("net: phy: broadcom: Add support code for reading PHY counters")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Our statistics strings are allocated at initialization without being
bound to a specific size, yet, we would copy ETH_GSTRING_LEN bytes using
memcpy() which would create out of bounds accesses, this was flagged by
KASAN. Replace this with strlcpy() to make sure we are bound the source
buffer size and we also always NUL-terminate strings.
Fixes: 2b2427d064 ("phy: micrel: Add ethtool statistics counters")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Our statistics strings are allocated at initialization without being
bound to a specific size, yet, we would copy ETH_GSTRING_LEN bytes using
memcpy() which would create out of bounds accesses, this was flagged by
KASAN. Replace this with strlcpy() to make sure we are bound the source
buffer size and we also always NUL-terminate strings.
Fixes: d2fa47d9dd ("phy: marvell: Add ethtool statistics counters")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Our statistics strings are allocated at initialization without being
bound to a specific size, yet, we would copy ETH_GSTRING_LEN bytes using
memcpy() which would create out of bounds accesses, this was flagged by
KASAN. Replace this with strlcpy() to make sure we are bound the source
buffer size and we also always NUL-terminate strings.
Fixes: 967dd82ffc ("net: dsa: b53: Add support for Broadcom RoboSwitch")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ashmem_mutex may create a chain of dependencies like:
CPU0 CPU1
mmap syscall ioctl syscall
-> mmap_sem (acquired) -> ashmem_ioctl
-> ashmem_mmap -> ashmem_mutex (acquired)
-> ashmem_mutex (try to acquire) -> copy_from_user
-> mmap_sem (try to acquire)
There is a lock odering problem between mmap_sem and ashmem_mutex causing
a lockdep splat[1] during a syzcaller test. This patch fixes the problem
by move copy_from_user out of ashmem_mutex.
[1] https://www.spinics.net/lists/kernel/msg2733200.html
Fixes: ce8a3a9e76 (staging: android: ashmem: Fix a race condition in pin ioctls)
Reported-by: syzbot+d7a918a7a8e1c952bc36@syzkaller.appspotmail.com
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Second batch of iwlwifi fixes intended for 4.16:
* Fix CSA issues with count 0 and 1;
* Some firmware debugging fixes;
* Removed a wrong error message when removing keys;
* Fix a firmware sysassert most usually triggered in IBSS;
* A couple of fixes on multicast queues;
* A fix with CCMP 256;
trigger_on() means that the trigger is available but not ready, however
trigger_on() was making it ready. That can segfault if the signal comes
before trigger_ready(). e.g. (USR2 signal delivery not shown)
$ perf record -e intel_pt//u -S sleep 1
perf: Segmentation fault
Obtained 16 stack frames.
/home/ahunter/bin/perf(sighandler_dump_stack+0x40) [0x4ec550]
/lib/x86_64-linux-gnu/libc.so.6(+0x36caf) [0x7fa76411acaf]
/home/ahunter/bin/perf(perf_evsel__disable+0x26) [0x4b9dd6]
/home/ahunter/bin/perf() [0x43a45b]
/lib/x86_64-linux-gnu/libc.so.6(+0x36caf) [0x7fa76411acaf]
/lib/x86_64-linux-gnu/libc.so.6(__xstat64+0x15) [0x7fa7641d2cc5]
/home/ahunter/bin/perf() [0x4ec6c9]
/home/ahunter/bin/perf() [0x4ec73b]
/home/ahunter/bin/perf() [0x4ec73b]
/home/ahunter/bin/perf() [0x4ec73b]
/home/ahunter/bin/perf() [0x4eca15]
/home/ahunter/bin/perf(machine__create_kernel_maps+0x257) [0x4f0b77]
/home/ahunter/bin/perf(perf_session__new+0xc0) [0x4f86f0]
/home/ahunter/bin/perf(cmd_record+0x722) [0x43c132]
/home/ahunter/bin/perf() [0x4a11ae]
/home/ahunter/bin/perf(main+0x5d4) [0x427fb4]
Note, for testing purposes, this is hard to hit unless you add some sleep()
in builtin-record.c before record__open().
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org
Fixes: 3dcc4436fa ("perf tools: Introduce trigger class")
Link: http://lkml.kernel.org/r/1519807144-30694-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When printing stats in CSV mode, 'perf stat' appends extra separators
when a counter is not supported:
<not supported>,,L1-dcache-store-misses,mesos/bd442f34-2b4a-47df-b966-9b281f9f56fc,0,100.00,,,,
Which causes a failure when parsing fields. The numbers of separators
should be the same for each line, no matter if the counter is or not
supported.
Signed-off-by: Ilya Pronin <ipronin@twitter.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20180306064353.31930-1-xiyou.wangcong@gmail.com
Fixes: 92a61f6412 ("perf stat: Implement CSV metrics output")
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The dock line-out pin (NID 0x17 of ALC3254 codec) on Dell Precision
7520 may route to three different DACs, 0x02, 0x03 and 0x06. The
first two DACS have the volume amp controls while the last one
doesn't. And unfortunately, the auto-parser assigns this pin to DAC3,
resulting in the non-working volume control for the line out.
Fix it by disabling the routing to DAC3 on the corresponding pin.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199029
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Even if we don't have extended SCA support, we can have more than 64 CPUs
if we don't enable any HW features that might use the SCA entries.
Now, this works just fine, but we missed a return, which is why we
would actually store the SCA entries. If we have more than 64 CPUs, this
means writing outside of the basic SCA - bad.
Let's fix this. This allows > 64 CPUs when running nested (under vSIE)
without random crashes.
Fixes: a6940674c3 ("KVM: s390: allow 255 VCPUs when sca entries aren't used")
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180306132758.21034-1-david@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
With ibm,dynamic-memory-v2 and ibm,drc-info coming around the same
time, byte22 in vector5 of ibm architecture vector table got set twice
separately. The end result is that guest kernel isn't advertising
support for ibm,dynamic-memory-v2.
Fix this by removing the duplicate assignment of byte22.
Fixes: 02ef6dd810 ("powerpc: Enable support for ibm,drc-info devtree property")
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The internal mic boost on the T480 is too high. Fix this by applying the
ALC269_FIXUP_LIMIT_INT_MIC_BOOST fixup to the machine to limit the gain.
Signed-off-by: Benjamin Berg <bberg@redhat.com>
Tested-by: Benjamin Berg <bberg@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This platform was only one phone Jack.
Add dummy lineout verb to fix automute mode disable.
This just the workaround.
[ More background information:
since the platform has only a headphone jack without speaker, the
driver doesn't create the auto-mute control. Meanwhile we do update
the headset mode via the automute hook in the driver, thus with this
setup, the headset won't be updated any longer.
By adding a dummy line-out pin here, the auto-mute is added by the
driver, and the headset update is triggered properly.
Note that this is different from the other
ALC274_FIXUP_DELL_AIO_LINEOUT_VERB, which has the real line-out pin,
while this quirk adds a dummy line-out pin. -- tiwai ]
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
when a system call is interrupted we might call the critical section
cleanup handler that re-does some of the operations. When we are between
.Lsysc_vtime and .Lsysc_do_svc we might also redo the saving of the
problem state registers r0-r7:
.Lcleanup_system_call:
[...]
0: # update accounting time stamp
mvc __LC_LAST_UPDATE_TIMER(8),__LC_SYNC_ENTER_TIMER
# set up saved register r11
lg %r15,__LC_KERNEL_STACK
la %r9,STACK_FRAME_OVERHEAD(%r15)
stg %r9,24(%r11) # r11 pt_regs pointer
# fill pt_regs
mvc __PT_R8(64,%r9),__LC_SAVE_AREA_SYNC
---> stmg %r0,%r7,__PT_R0(%r9)
The problem is now, that we might have already zeroed out r0.
The fix is to move the zeroing of r0 after sysc_do_svc.
Reported-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Fixes: 7041d28115 ("s390: scrub registers on kernel entry and KVM exit")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Due to an oversight when refactoring siginfo_t si_pkey has been in the
wrong position since 4.16-rc1. Add an explicit check of the offset of
every user space field in siginfo_t and compat_siginfo_t to make a
mistake like this hard to make in the future.
I have run this code on 4.15 and 4.16-rc1 with the position of si_pkey
fixed and all of the fields show up in the same location.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
The change moving addr_lsb into the _sigfault union failed to take
into account that _sigfault._addr_bnd._lower being a pointer forced
the entire union to have pointer alignment. In practice this only
mattered for the offset of si_pkey which is why this has taken so long
to discover.
To correct this change _dummy_pkey and _dummy_bnd to have pointer type.
Reported-by: kernel test robot <shun.hao@intel.com>
Fixes: b68a68d3dc ("signal: Move addr_lsb into the _sigfault union for clarity")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Pull ia64 cleanups from Tony Luck:
- More atomic cleanup from willy
- Fix a python script to work with version 3
- Some other small cleanups
* tag 'please-pull-ia64_misc' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
ia64/err-inject: fix spelling mistake: "capapbilities" -> "capabilities"
ia64/err-inject: Use get_user_pages_fast()
ia64: doc: tweak whitespace for 'console=' parameter
ia64: Convert remaining atomic operations
ia64: convert unwcheck.py to python3
Since commit 4670d610d5 ("PCI: Move OF-related PCI functions into
PCI core"), sparc:allmodconfig fails to build with the following error.
pcie-cadence-host.c:(.text+0x4c4): undefined reference to `of_irq_parse_and_map_pci'
pcie-cadence-host.c:(.text+0x4c8): undefined reference to `of_irq_parse_and_map_pci'
of_irq_parse_and_map_pci() is now only available if OF_IRQ is enabled.
Make its declaration and its dummy function dependent on OF_IRQ to solve
the problem.
Fixes: 4670d610d5 ("PCI: Move OF-related PCI functions into PCI core")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rob Herring <robh@kernel.org>
Commit a3e6c1eff5 ("MIPS: IRQ: Fix disable_irq on CPU IRQs") fixes an
issue where disable_irq did not actually disable the irq. The bug caused
our IPIs to not be disabled, which actually is the correct behavior.
With the addition of commit a3e6c1eff5 ("MIPS: IRQ: Fix disable_irq on
CPU IRQs"), the IPIs were getting disabled going into suspend, thus
schedule_ipi() was not being called. This caused deadlocks where
schedulable task were not being scheduled and other cpus were waiting
for them to do something.
Add the IRQF_NO_SUSPEND flag so an irq_disable will not be called on the
IPIs during suspend.
Signed-off-by: Justin Chen <justinpopo6@gmail.com>
Fixes: a3e6c1eff5 ("MIPS: IRQ: Fix disabled_irq on CPU IRQs")
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/17385/
[jhogan@kernel.org: checkpatch: wrap long lines and fix commit refs]
Signed-off-by: James Hogan <jhogan@kernel.org>
At the point of sysfs callback, the call to gup is
done without mmap_sem (or any lock for that matter).
This is racy. As such, use the get_user_pages_fast()
alternative and safely avoid taking the lock, if possible.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
While we've only seen inlining problems with atomic_sub_return(),
the other atomic operations could have the same problem. Convert all
remaining operations to use the same solution as atomic_sub_return().
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Since my system use python3 as default, arch/ia64/scripts/unwcheck.py no
longer run.
This patch convert it to the python3 syntax.
I have ran it with python2/python3 while printing values of
start/end/rlen_sum which could be impacted by this change and I see no difference.
Fixes: 94a4708352 ("scripts: change scripts to use system python instead of env")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This can happen e.g. during disk cloning.
This is an incomplete fix: it does not catch duplicate UUIDs earlier
when things are still unattached. It does not unregister the device.
Further changes to cope better with this are planned but conflict with
Coly's ongoing improvements to handling device errors. In the meantime,
one can manually stop the device after this has happened.
Attempts to attach a duplicate device result in:
[ 136.372404] loop: module loaded
[ 136.424461] bcache: register_bdev() registered backing device loop0
[ 136.424464] bcache: bch_cached_dev_attach() Tried to attach loop0 but duplicate UUID already attached
My test procedure is:
dd if=/dev/sdb1 of=imgfile bs=1024 count=262144
losetup -f imgfile
Signed-off-by: Michael Lyle <mlyle@lyle.org>
Reviewed-by: Tang Junhui <tang.junhui@zte.com.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Descriptor rings were not initialized at zero when allocated
When area contained garbage data, it caused skb_over_panic in
e1000_clean_rx_irq (if data had E1000_RXD_STAT_DD bit set)
This patch makes use of dma_zalloc_coherent to make sure the
ring is memset at 0 to prevent the area from containing garbage.
Following is the signature of the panic:
IODDR0@0.0: skbuff: skb_over_panic: text:80407b20 len:64010 put:64010 head:ab46d800 data:ab46d842 tail:0xab47d24c end:0xab46df40 dev:eth0
IODDR0@0.0: BUG: failure at net/core/skbuff.c:105/skb_panic()!
IODDR0@0.0: Kernel panic - not syncing: BUG!
IODDR0@0.0:
IODDR0@0.0: Process swapper/0 (pid: 0, threadinfo=81728000, task=8173cc00 ,cpu: 0)
IODDR0@0.0: SP = <815a1c0c>
IODDR0@0.0: Stack: 00000001
IODDR0@0.0: b2d89800 815e33ac
IODDR0@0.0: ea73c040 00000001
IODDR0@0.0: 60040003 0000fa0a
IODDR0@0.0: 00000002
IODDR0@0.0:
IODDR0@0.0: 804540c0 815a1c70
IODDR0@0.0: b2744000 602ac070
IODDR0@0.0: 815a1c44 b2d89800
IODDR0@0.0: 8173cc00 815a1c08
IODDR0@0.0:
IODDR0@0.0: 00000006
IODDR0@0.0: 815a1b50 00000000
IODDR0@0.0: 80079434 00000001
IODDR0@0.0: ab46df40 b2744000
IODDR0@0.0: b2d89800
IODDR0@0.0:
IODDR0@0.0: 0000fa0a 8045745c
IODDR0@0.0: 815a1c88 0000fa0a
IODDR0@0.0: 80407b20 b2789f80
IODDR0@0.0: 00000005 80407b20
IODDR0@0.0:
IODDR0@0.0:
IODDR0@0.0: Call Trace:
IODDR0@0.0: [<804540bc>] skb_panic+0xa4/0xa8
IODDR0@0.0: [<80079430>] console_unlock+0x2f8/0x6d0
IODDR0@0.0: [<80457458>] skb_put+0xa0/0xc0
IODDR0@0.0: [<80407b1c>] e1000_clean_rx_irq+0x2dc/0x3e8
IODDR0@0.0: [<80407b1c>] e1000_clean_rx_irq+0x2dc/0x3e8
IODDR0@0.0: [<804079c8>] e1000_clean_rx_irq+0x188/0x3e8
IODDR0@0.0: [<80407b1c>] e1000_clean_rx_irq+0x2dc/0x3e8
IODDR0@0.0: [<80468b48>] __dev_kfree_skb_any+0x88/0xa8
IODDR0@0.0: [<804101ac>] e1000e_poll+0x94/0x288
IODDR0@0.0: [<8046e9d4>] net_rx_action+0x19c/0x4e8
IODDR0@0.0: ...
IODDR0@0.0: Maximum depth to print reached. Use kstack=<maximum_depth_to_print> To specify a custom value (where 0 means to display the full backtrace)
IODDR0@0.0: ---[ end Kernel panic - not syncing: BUG!
Signed-off-by: Pierre-Yves Kerbrat <pkerbrat@kalray.eu>
Signed-off-by: Marius Gligor <mgligor@kalray.eu>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clean fake sink flag after detecting link on downstream port.
Fixing display light-up after "hot-unplug&plug again" downstream
of an active dongle.
Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pull kselftest fixes from Shuah Khan:
"A fix for regression in memory-hotplug install script that prevents
the test from running on the target"
* tag 'linux-kselftest-4.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: memory-hotplug: fix emit_tests regression
Pull networking fixes from David Miller:
1) Use an appropriate TSQ pacing shift in mac80211, from Toke
Høiland-Jørgensen.
2) Just like ipv4's ip_route_me_harder(), we have to use skb_to_full_sk
in ip6_route_me_harder, from Eric Dumazet.
3) Fix several shutdown races and similar other problems in l2tp, from
James Chapman.
4) Handle missing XDP flush properly in tuntap, for real this time.
From Jason Wang.
5) Out-of-bounds access in powerpc ebpf tailcalls, from Daniel
Borkmann.
6) Fix phy_resume() locking, from Andrew Lunn.
7) IFLA_MTU values are ignored on newlink for some tunnel types, fix
from Xin Long.
8) Revert F-RTO middle box workarounds, they only handle one dimension
of the problem. From Yuchung Cheng.
9) Fix socket refcounting in RDS, from Ka-Cheong Poon.
10) Don't allow ppp unit registration to an unregistered channel, from
Guillaume Nault.
11) Various hv_netvsc fixes from Stephen Hemminger.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (98 commits)
hv_netvsc: propagate rx filters to VF
hv_netvsc: filter multicast/broadcast
hv_netvsc: defer queue selection to VF
hv_netvsc: use napi_schedule_irqoff
hv_netvsc: fix race in napi poll when rescheduling
hv_netvsc: cancel subchannel setup before halting device
hv_netvsc: fix error unwind handling if vmbus_open fails
hv_netvsc: only wake transmit queue if link is up
hv_netvsc: avoid retry on send during shutdown
virtio-net: re enable XDP_REDIRECT for mergeable buffer
ppp: prevent unregistered channels from connecting to PPP units
tc-testing: skbmod: fix match value of ethertype
mlxsw: spectrum_switchdev: Check success of FDB add operation
net: make skb_gso_*_seglen functions private
net: xfrm: use skb_gso_validate_network_len() to check gso sizes
net: sched: tbf: handle GSO_BY_FRAGS case in enqueue
net: rename skb_gso_validate_mtu -> skb_gso_validate_network_len
rds: Incorrect reference counting in TCP socket creation
net: ethtool: don't ignore return from driver get_fecparam method
vrf: check forwarding on the original netdevice when generating ICMP dest unreachable
...
When autoneg is off, the .check_for_link callback functions clear the
get_link_status flag and systematically return a "pseudo-error". This means
that the link is not detected as up until the next execution of the
e1000_watchdog_task() 2 seconds later.
Fixes: 19110cfbb3 ("e1000e: Separate signaling for link check/link up")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The 82574 specification update errata 12 states that interrupts may be
missed if ICR is read while INT_ASSERTED is not set. Avoid that problem by
setting all bits related to events that can trigger the Other interrupt in
IMS.
The Other interrupt is raised for such events regardless of whether or not
they are set in IMS. However, only when they are set is the INT_ASSERTED
bit also set in ICR.
By doing this, we ensure that INT_ASSERTED is always set when we read ICR
in e1000_msix_other() and steer clear of the errata. This also ensures that
ICR will automatically be cleared on read, therefore we no longer need to
clear bits explicitly.
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Restores the ICS write for Rx/Tx queue interrupts which was present before
commit 16ecba59bc ("e1000e: Do not read ICR in Other interrupt", v4.5-rc1)
but was not restored in commit 4aea7a5c5e
("e1000e: Avoid receiver overrun interrupt bursts", v4.15-rc1).
This re-raises the queue interrupts in case the txq or rxq bits were set in
ICR and the Other interrupt handler read and cleared ICR before the queue
interrupt was raised.
Fixes: 4aea7a5c5e ("e1000e: Avoid receiver overrun interrupt bursts")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This partially reverts commit 4aea7a5c5e.
We keep the fix for the first part of the problem (1) described in the log
of that commit, that is to read ICR in the other interrupt handler. We
remove the fix for the second part of the problem (2), Other interrupt
throttling.
Bursts of "Other" interrupts may once again occur during rxo (receive
overflow) traffic conditions. This is deemed acceptable in the interest of
avoiding unforeseen fallout from changes that are not strictly necessary.
As discussed, the e1000e driver should be in "maintenance mode".
Link: https://www.spinics.net/lists/netdev/msg480675.html
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It was reported that emulated e1000e devices in vmware esxi 6.5 Build
7526125 do not link up after commit 4aea7a5c5e ("e1000e: Avoid receiver
overrun interrupt bursts", v4.15-rc1). Some tracing shows that after
e1000e_trigger_lsc() is called, ICR reads out as 0x0 in e1000_msix_other()
on emulated e1000e devices. In comparison, on real e1000e 82574 hardware,
icr=0x80000004 (_INT_ASSERTED | _LSC) in the same situation.
Some experimentation showed that this flaw in vmware e1000e emulation can
be worked around by not setting Other in EIAC. This is how it was before
16ecba59bc ("e1000e: Do not read ICR in Other interrupt", v4.5-rc1).
Fixes: 4aea7a5c5e ("e1000e: Avoid receiver overrun interrupt bursts")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently we can crash perf record when running in pipe mode, like:
$ perf record ls | perf report
# To display the perf.data header info, please use --header/--header-only options.
#
perf: Segmentation fault
Error:
The - file has no samples!
The callstack of the crash is:
0x0000000000515242 in perf_event__synthesize_event_update_name
3513 ev = event_update_event__new(len + 1, PERF_EVENT_UPDATE__NAME, evsel->id[0]);
(gdb) bt
#0 0x0000000000515242 in perf_event__synthesize_event_update_name
#1 0x00000000005158a4 in perf_event__synthesize_extra_attr
#2 0x0000000000443347 in record__synthesize
#3 0x00000000004438e3 in __cmd_record
#4 0x000000000044514e in cmd_record
#5 0x00000000004cbc95 in run_builtin
#6 0x00000000004cbf02 in handle_internal_command
#7 0x00000000004cc054 in run_argv
#8 0x00000000004cc422 in main
The reason of the crash is that the evsel does not have ids array
allocated and the pipe's synthesize code tries to access it.
We don't force evsel ids allocation when we have single event, because
it's not needed. However we need it when we are in pipe mode even for
single event as a key for evsel update event.
Fixing this by forcing evsel ids allocation event for single event, when
we are in pipe mode.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180302161354.30192-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This first happened with a gcc function, _cpp_lex_token, that has the
usual jumps:
│1159e6c: ↓ jne 115aa32 <_cpp_lex_token@@Base+0xf92>
I.e. jumps to a label inside that function (_cpp_lex_token), and those
works, but also this kind:
│1159e8b: ↓ jne c469be <cpp_named_operator2name@@Base+0xa72>
I.e. jumps to another function, outside _cpp_lex_token, which are not
being correctly handled generating as a side effect references to
ab->offset[] entries that are set to NULL, so to make this code more
robust, check that here.
A proper fix for will be put in place, looking at the function name
right after the '<' token and probably treating this like a 'call'
instruction.
For now just don't draw the arrow.
Reported-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: https://lkml.kernel.org/n/tip-5tzvb875ep2sel03aeefgmud@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On older (e.g. v4.4) kernels, an annoying fallback message can be
observed in 'perf top':
┌─Warning:──────────────────────┐
│fall back to non-overwrite mode│
│ │
│ │
│Press any key... │
└───────────────────────────────┘
The 'perf top' utility has been changed to overwrite mode since commit
ebebbf0823 ("perf top: Switch default mode to overwrite mode").
For older kernels which don't have overwrite mode support, 'perf top'
will fall back to non-overwrite mode and print out the fallback message
using ui__warning(), which needs user's input to close.
The fallback message is not critical for end users. Turning it to debug
message which is printed when running with -vv.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Fixes: ebebbf0823 ("perf top: Switch default mode to overwrite mode")
Link: http://lkml.kernel.org/r/1519669030-176549-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kconfig.h was excluded from consideration by fixdep by
6a5be57f0f (fixdep: fix extraneous dependencies) to avoid some false
positive hits
(1) include/config/.h
(2) include/config/h.h
(3) include/config/foo.h
(1) occurred because kconfig.h contains the string CONFIG_ in a
comment. However, since dee81e9886 (fixdep: faster CONFIG_ search), we
have a check that the part after CONFIG_ is non-empty, so this does not
happen anymore (and CONFIG_ appears by itself elsewhere, so that check
is worthwhile).
(2) comes from the include guard, __LINUX_KCONFIG_H. But with the
previous patch, we no longer match that either.
That leaves (3), which amounts to one [1] false dependency (aka stat() call
done by make), which I think we can live with:
We've already had one case [2] where the lack of include/linux/kconfig.h in
the .o.cmd file caused a missing rebuild, and while I originally thought
we should just put kconfig.h in the dependency list without parsing it
for the CONFIG_ pattern, we actually do have some real CONFIG_ symbols
mentioned in it, and one can imagine some translation unit that just
does '#ifdef __BIG_ENDIAN' but doesn't through some other header
actually depend on CONFIG_CPU_BIG_ENDIAN - so changing the target
endianness could end up rebuilding the world, minus that small
TU. Quoting Linus,
... when missing dependencies cause a missed re-compile, the resulting
bugs can be _really_ subtle.
[1] well, two, we now also have CONFIG_BOOGER/booger.h - we could change
that to FOO if we care
[2] https://lkml.org/lkml/2018/2/22/838
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The string CONFIG_ quite often appears after other alphanumerics,
meaning that that instance cannot be referencing a Kconfig
symbol. Omitting these means make has fewer files to stat() when
deciding what needs to be rebuilt - for a defconfig build, this seems to
remove about 2% of the (wildcard ...) lines from the .o.cmd files.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
uml-config.h hasn't existed in this decade (87e299e5c7 - x86, um: get
rid of uml-config.h). The few remaining UML_CONFIG instances are defined
directly in terms of their real CONFIG symbol in common-offsets.h, so
unlike when the symbols got defined via a sed script, anything that uses
UML_CONFIG_FOO now should also automatically pick up a dependency on
CONFIG_FOO via the normal fixdep mechanism (since common-offsets.h
should at least recursively be a dependency). Hence I believe we should
actually be able to ignore the HELLO_CONFIG_BOOM cases.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Richard Weinberger <richard@nod.at>
Cc: user-mode-linux-devel@lists.sourceforge.net
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
dtc now gives the following warning:
arch/arm/boot/dts/rk3288-tinker.dtb: Warning (sound_dai_property): /sound/simple-audio-card,codec: Missing property '#sound-dai-cells' in node /hdmi@ff980000 or bad phandle (referred from sound-dai[0])
Add the missing #sound-dai-cells property.
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: linux-rockchip@lists.infradead.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
If a start bit is detected, then reset the receive buffer counter to 0.
This ensures that no stale data is in the buffer if a message is
broken off midstream due to e.g. a Low Drive condition and then
retransmitted.
The only Rx interrupts we need to listen to are RX_REGISTER_FULL (i.e.
a valid byte was received) and RX_START_BIT_DETECTED (i.e. a new
message starts and we need to reset the counter).
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Cc: <stable@vger.kernel.org> # for v4.15 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Since commit 02d742f4b2 ("media: lirc: lirc daemon fails to detect raw
IR device"), the feature LIRC_CAN_SEND_SCANCODE is no longer used as it
tripped up lircd. The ability to send scancodes for IR Tx is implied by
LIRC_CAN_SEND_PULSE (i.e. any device that can send can use IR Tx encoders).
So, remove LIRC_CAN_SEND_SCANCODE since it never used. This fixes:
Documentation/output/lirc.h.rst:6: WARNING: undefined label:
lirc-can-send-scancode (if the link has no caption the label must precede
a section header
As this flag was added for kernel 4.16, let's remove it, while not too
late.
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
When I debug a kernel crash issue in funcitonfs, found ffs_data.ref
overflowed, While functionfs is unmounting, ffs_data is put twice.
Commit 43938613c6 ("drivers, usb: convert ffs_data.ref from atomic_t to
refcount_t") can avoid refcount overflow, but that is risk some situations.
So no need put ffs data in ffs_fs_kill_sb, already put in ffs_data_closed.
The issue can be reproduced in Mediatek mt6763 SoC, ffs for ADB device.
KASAN enabled configuration reports use-after-free errro.
BUG: KASAN: use-after-free in refcount_dec_and_test+0x14/0xe0 at addr ffffffc0579386a0
Read of size 4 by task umount/4650
====================================================
BUG kmalloc-512 (Tainted: P W O ): kasan: bad access detected
-----------------------------------------------------------------------------
INFO: Allocated in ffs_fs_mount+0x194/0x844 age=22856 cpu=2 pid=566
alloc_debug_processing+0x1ac/0x1e8
___slab_alloc.constprop.63+0x640/0x648
__slab_alloc.isra.57.constprop.62+0x24/0x34
kmem_cache_alloc_trace+0x1a8/0x2bc
ffs_fs_mount+0x194/0x844
mount_fs+0x6c/0x1d0
vfs_kern_mount+0x50/0x1b4
do_mount+0x258/0x1034
INFO: Freed in ffs_data_put+0x25c/0x320 age=0 cpu=3 pid=4650
free_debug_processing+0x22c/0x434
__slab_free+0x2d8/0x3a0
kfree+0x254/0x264
ffs_data_put+0x25c/0x320
ffs_data_closed+0x124/0x15c
ffs_fs_kill_sb+0xb8/0x110
deactivate_locked_super+0x6c/0x98
deactivate_super+0xb0/0xbc
INFO: Object 0xffffffc057938600 @offset=1536 fp=0x (null)
......
Call trace:
[<ffffff900808cf5c>] dump_backtrace+0x0/0x250
[<ffffff900808d3a0>] show_stack+0x14/0x1c
[<ffffff90084a8c04>] dump_stack+0xa0/0xc8
[<ffffff900826c2b4>] print_trailer+0x158/0x260
[<ffffff900826d9d8>] object_err+0x3c/0x40
[<ffffff90082745f0>] kasan_report_error+0x2a8/0x754
[<ffffff9008274f84>] kasan_report+0x5c/0x60
[<ffffff9008273208>] __asan_load4+0x70/0x88
[<ffffff90084cd81c>] refcount_dec_and_test+0x14/0xe0
[<ffffff9008d98f9c>] ffs_data_put+0x80/0x320
[<ffffff9008d9d904>] ffs_fs_kill_sb+0xc8/0x110
[<ffffff90082852a0>] deactivate_locked_super+0x6c/0x98
[<ffffff900828537c>] deactivate_super+0xb0/0xbc
[<ffffff90082af0c0>] cleanup_mnt+0x64/0xec
[<ffffff90082af1b0>] __cleanup_mnt+0x10/0x18
[<ffffff90080d9e68>] task_work_run+0xcc/0x124
[<ffffff900808c8c0>] do_notify_resume+0x60/0x70
[<ffffff90080866e4>] work_pending+0x10/0x14
Cc: stable@vger.kernel.org
Signed-off-by: Xinyong <xinyong.fang@linux.alibaba.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The commit 9d9491a7da ("mmc: dw_mmc: Fix the DTO timeout calculation")
and commit 4c2357f57d ("mmc: dw_mmc: Fix the CTO timeout calculation")
made changes, which cause multiply overflow for 32-bit systems. The broken
timeout calculations leads to unexpected ETIMEDOUT errors and causes
stacktrace splat (such as below) during normal data exchange with SD-card.
| Running : 4M-check-reassembly-tcp-cmykw2-rotatew2.out -v0 -w1
| - Info: Finished target initialization.
| mmcblk0: error -110 transferring data, sector 320544, nr 2048, cmd
| response 0x900, card status 0x0
DIV_ROUND_UP_ULL helps to escape usage of __udivdi3() from libgcc and so
code gets compiled on all 32-bit platforms as opposed to usage of
DIV_ROUND_UP when we may only compile stuff on a very few arches.
Lets cast this multiply to u64 type to prevent the overflow.
Fixes: 9d9491a7da ("mmc: dw_mmc: Fix the DTO timeout calculation")
Fixes: 4c2357f57d ("mmc: dw_mmc: Fix the CTO timeout calculation")
Tested-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com> # ARC STAR 9001306872 HSDK, sdio: board crashes when copying big files
Signed-off-by: Evgeniy Didin <Evgeniy.Didin@synopsys.com>
Cc: <stable@vger.kernel.org> # 4.14
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Since commit ab82fa7da4 ("gpio: rcar: Prevent module clock disable
when wake-up is enabled"), when a GPIO is used for wakeup, the GPIO
block's module clock (if exists) is manually kept running during system
suspend, to make sure the device stays active.
However, this explicit clock handling is merely a workaround for a
failure to properly communicate wakeup information to the device core.
Instead, set the device's power.wakeup_path field, to indicate this
device is part of the wakeup path. Depending on the PM Domain's
active_wakeup configuration, the genpd core code will keep the device
enabled (and the clock running) during system suspend when needed.
This allows for the removal of all explicit clock handling code from the
driver.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Stephen Hemminger says:
====================
hv_netvsc: minor fixes
These are improvements to netvsc driver. They aren't functionality
changes so not targeting net-next; and they are not show stopper
bugs that need to go to stable either.
v2
- drop the irq flags patch, defer it to net-next
- split the multicast filter flag patch out
- change propogate rx mode patch to handle startup of vf
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The netvsc device should propagate filters to the SR-IOV VF
device (if present). The flags also need to be propagated to the
VF device as well. This only really matters on local Hyper-V
since Azure does not support multiple addresses.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netvsc driver was always enabling all multicast and broadcast
even if netdevice flag had not enabled it.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When VF is used for accelerated networking it will likely have
more queues (and different policy) than the synthetic NIC.
This patch defers the queue policy to the VF so that all the
queues can be used. This impacts workloads like local generate UDP.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the netvsc_channel_cb is already called in interrupt
context from vmbus, there is no need to do irqsave/restore.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a race between napi_reschedule and re-enabling interrupts
which could lead to missed host interrrupts. This occurs when
interrupts are re-enabled (hv_end_read) and vmbus irq callback
(netvsc_channel_cb) has already scheduled NAPI.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Block setup of multiple channels earlier in the teardown
process. This avoids possible races between halt and subchannel
initialization.
Suggested-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the initialization order so that the device is ready to transmit
(ie connect vsp is completed) before setting the internal reference
to the device with RCU.
This avoids any races on initialization and prevents retry issues
on shutdown.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
XDP_REDIRECT support for mergeable buffer was removed since commit
7324f5399b ("virtio_net: disable XDP_REDIRECT in receive_mergeable()
case"). This is because we don't reserve enough tailroom for struct
skb_shared_info which breaks XDP assumption. So this patch fixes this
by reserving enough tailroom and using fixed size of rx buffer.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PPP units don't hold any reference on the channels connected to it.
It is the channel's responsibility to ensure that it disconnects from
its unit before being destroyed.
In practice, this is ensured by ppp_unregister_channel() disconnecting
the channel from the unit before dropping a reference on the channel.
However, it is possible for an unregistered channel to connect to a PPP
unit: register a channel with ppp_register_net_channel(), attach a
/dev/ppp file to it with ioctl(PPPIOCATTCHAN), unregister the channel
with ppp_unregister_channel() and finally connect the /dev/ppp file to
a PPP unit with ioctl(PPPIOCCONNECT).
Once in this situation, the channel is only held by the /dev/ppp file,
which can be released at anytime and free the channel without letting
the parent PPP unit know. Then the ppp structure ends up with dangling
pointers in its ->channels list.
Prevent this scenario by forbidding unregistered channels from
connecting to PPP units. This maintains the code logic by keeping
ppp_unregister_channel() responsible from disconnecting the channel if
necessary and avoids modification on the reference counting mechanism.
This issue seems to predate git history (successfully reproduced on
Linux 2.6.26 and earlier PPP commits are unrelated).
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
iproute2 print_skbmod() prints the configured ethertype using format 0x%X:
therefore, test 9aa8 systematically fails, because it configures action #4
using ethertype 0x0031, and expects 0x0031 when it reads it back. Changing
the expected value to 0x31 lets the test result 'not ok' become 'ok'.
tested with:
# ./tdc.py -e 9aa8
Test 9aa8: Get a single skbmod action from a list
All test results:
1..1
ok 1 9aa8 Get a single skbmod action from a list
Fixes: cf797ac49b ("tc-testing: Add test cases for police and skbmod")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Until now, we assumed that in case of error when adding FDB entries, the
write operation will fail, but this is not the case. Instead, we need to
check that the number of entries reported in the response is equal to
the number of entries specified in the request.
Fixes: 56ade8fe3f ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Reported-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Axtens says:
====================
GSO_BY_FRAGS correctness improvements
As requested [1], I went through and had a look at users of gso_size to
see if there were things that need to be fixed to consider
GSO_BY_FRAGS, and I have tried to improve our helper functions to deal
with this case.
I found a few. This fixes bugs relating to the use of
skb_gso_*_seglen() where GSO_BY_FRAGS is not considered.
Patch 1 renames skb_gso_validate_mtu to skb_gso_validate_network_len.
This is follow-up to my earlier patch 2b16f04872 ("net: create
skb_gso_validate_mac_len()"), and just makes everything a bit clearer.
Patches 2 and 3 replace the final users of skb_gso_network_seglen() -
which doesn't consider GSO_BY_FRAGS - with
skb_gso_validate_network_len(), which does. This allows me to make the
skb_gso_*_seglen functions private in patch 4 - now future users won't
accidentally do the wrong comparison.
Two things remain. One is qdisc_pkt_len_init, which is discussed at
[2] - it's caught up in the GSO_DODGY mess. I don't have any expertise
in GSO_DODGY, and it looks like a good clean fix will involve
unpicking the whole validation mess, so I have left it for now.
Secondly, there are 3 eBPF opcodes that change the gso_size of an SKB
and don't consider GSO_BY_FRAGS. This is going through the bpf tree.
Regards,
Daniel
[1] https://patchwork.ozlabs.org/comment/1852414/
[2] https://www.spinics.net/lists/netdev/msg482397.html
PS: This is all in the core networking stack. For a driver to be
affected by this it would need to support NETIF_F_GSO_SCTP /
NETIF_F_GSO_SOFTWARE and then either use gso_size or not be a purely
virtual device. (Many drivers look at gso_size, but do not support
SCTP segmentation, so the core network will segment an SCTP gso before
it hits them.) Based on that, the only driver that may be affected is
sunvnet, but I have no way of testing it, so I haven't looked at it.
v2: split out bpf stuff
fix review comments from Dave Miller
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
They're very hard to use properly as they do not consider the
GSO_BY_FRAGS case. Code should use skb_gso_validate_network_len
and skb_gso_validate_mac_len as they do consider this case.
Make the seglen functions static, which stops people using them
outside of skbuff.c
Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace skb_gso_network_seglen() with
skb_gso_validate_network_len(), as it considers the GSO_BY_FRAGS
case.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tbf_enqueue() checks the size of a packet before enqueuing it.
However, the GSO size check does not consider the GSO_BY_FRAGS
case, and so will drop GSO SCTP packets, causing a massive drop
in throughput.
Use skb_gso_validate_mac_len() instead, as it does consider that
case.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If you take a GSO skb, and split it into packets, will the network
length (L3 headers + L4 headers + payload) of those packets be small
enough to fit within a given MTU?
skb_gso_validate_mtu gives you the answer to that question. However,
we recently added to add a way to validate the MAC length of a split GSO
skb (L2+L3+L4+payload), and the names get confusing, so rename
skb_gso_validate_mtu to skb_gso_validate_network_len
Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Like the Highpoint Rocketraid 642L and cards using a Marvel 88SE9235
controller in general, this RAID card also supports AHCI mode and short
of a custom driver, this is the only way to make it work under Linux.
Note that even though the card is called to 644L, it has a product-id
of 0x0645.
Cc: stable@vger.kernel.org
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1534106
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Pull x86 fixes from Thomas Gleixner:
"A small set of fixes for x86:
- Add missing instruction suffixes to assembly code so it can be
compiled by newer GAS versions without warnings.
- Switch refcount WARN exceptions to UD2 as we did in general
- Make the reboot on Intel Edison platforms work
- A small documentation update so text and sample command match"
* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation, x86, resctrl: Make text and sample command match
x86/platform/intel-mid: Handle Intel Edison reboot correctly
x86/asm: Add instruction suffixes to bitops
x86/entry/64: Add instruction suffix
x86/refcounts: Switch to UD2 for exceptions
Pull x86/pti fixes from Thomas Gleixner:
"Three fixes related to melted spectrum:
- Sync the cpu_entry_area page table to initial_page_table on 32 bit.
Otherwise suspend/resume fails because resume uses
initial_page_table and triggers a triple fault when accessing the
cpu entry area.
- Zero the SPEC_CTL MRS on XEN before suspend to address a
shortcoming in the hypervisor.
- Fix another switch table detection issue in objtool"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu_entry_area: Sync cpu_entry_area to initial_page_table
objtool: Fix another switch table detection issue
x86/xen: Zero MSR_IA32_SPEC_CTRL before suspend
Pull timer fixes from Thomas Gleixner:
"A small set of fixes from the timer departement:
- Add a missing timer wheel clock forward when migrating timers off a
unplugged CPU to prevent operating on a stale clock base and
missing timer deadlines.
- Use the proper shift count to extract data from a register value to
prevent evaluating unrelated bits
- Make the error return check in the FSL timer driver work correctly.
Checking an unsigned variable for less than zero does not really
work well.
- Clarify the confusing comments in the ARC timer code"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timers: Forward timer base before migrating timers
clocksource/drivers/arc_timer: Update some comments
clocksource/drivers/mips-gic-timer: Use correct shift count to extract data
clocksource/drivers/fsl_ftm_timer: Fix error return checking
Pull irq fixlet from Thomas Gleixner:
"Just a documentation update for the missing device tree property of
the R-Car M3N interrupt controller"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
dt-bindings/irqchip/renesas-irqc: Document R-Car M3-N support
Pull btrfs fixes from David Sterba:
- when NR_CPUS is large, a SRCU structure can significantly inflate
size of the main filesystem structure that would not be possible to
allocate by kmalloc, so the kvalloc fallback is used
- improved error handling
- fix endiannes when printing some filesystem attributes via sysfs,
this is could happen when a filesystem is moved between different
endianity hosts
- send fixes: the NO_HOLE mode should not send a write operation for a
file hole
- fix log replay for for special files followed by file hardlinks
- fix log replay failure after unlink and link combination
- fix max chunk size calculation for DUP allocation
* tag 'for-4.16-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Btrfs: fix log replay failure after unlink and link combination
Btrfs: fix log replay failure after linking special file and fsync
Btrfs: send, fix issuing write op when processing hole in no data mode
btrfs: use proper endianness accessors for super_copy
btrfs: alloc_chunk: fix DUP stripe size handling
btrfs: Handle btrfs_set_extent_delalloc failure in relocate_file_extent_cluster
btrfs: handle failure of add_pending_csums
btrfs: use kvzalloc to allocate btrfs_fs_info
As the kernel doc describes too the code is supposed to skip adding
multicast TT entries if both the WANT_ALL_IPV4 and WANT_ALL_IPV6 flags
are present.
Unfortunately, the current code even skips adding multicast TT entries
if only either the WANT_ALL_IPV4 or WANT_ALL_IPV6 is present.
This could lead to IPv6 multicast packet loss if only an IGMP but not an
MLD querier is present for instance or vice versa.
Fixes: 687937ab34 ("batman-adv: Add multicast optimization support for bridged setups")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Simon Wunderlich says:
====================
Here are some batman-adv bugfixes:
- fix skb checksum issues, by Matthias Schiffer (2 patches)
- fix exception handling when dumping data objects through netlink,
by Sven Eckelmann (4 patches)
- fix handling of interface indices, by Sven Eckelmann
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull i2c fixes from Wolfram Sang:
"A driver fix and a documentation fix (which makes dependency handling
for the next cycle easier)"
* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: octeon: Prevent error message on bus error
dt-bindings: at24: sort manufacturers alphabetically
Pull libnvdimm fixes from Dan Williams:
"A 4.16 regression fix, three fixes for -stable, and a cleanup fix:
- During the merge window support for the new ACPI NVDIMM Platform
Capabilities structure disabled support for "deep flush", a
force-unit- access like mechanism for persistent memory. Restore
that mechanism.
- VFIO like RDMA is yet one more memory registration / pinning
interface that is incompatible with Filesystem-DAX. Disable long
term pins of Filesystem-DAX mappings via VFIO.
- The Filesystem-DAX detection to prevent long terms pins mistakenly
also disabled Device-DAX pins which are not subject to the same
block- map collision concerns.
- Similar to the setup path, softlockup warnings can trigger in the
shutdown path for large persistent memory namespaces. Teach
for_each_device_pfn() to perform cond_resched() in all cases.
- Boaz noticed that the might_sleep() in dax_direct_access() is stale
as of the v4.15 kernel.
These have received a build success notification from the 0day robot,
and the longterm pin fixes have appeared in -next. However, I recently
rebased the tree to remove some other fixes that need to be reworked
after review feedback.
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
memremap: fix softlockup reports at teardown
libnvdimm: re-enable deep flush for pmem devices via fsync()
vfio: disable filesystem-dax page pinning
dax: fix vma_is_fsdax() helper
dax: ->direct_access does not sleep anymore
SCTP GSO skbs have a gso_size of GSO_BY_FRAGS, so any sort of
unconditionally mangling of that will result in nonsense value
and would corrupt the skb later on.
Therefore, i) add two helpers skb_increase_gso_size() and
skb_decrease_gso_size() that would throw a one time warning and
bail out for such skbs and ii) refuse and return early with an
error in those BPF helpers that are affected. We do need to bail
out as early as possible from there before any changes on the
skb have been performed.
Fixes: 6578171a7f ("bpf: add bpf_skb_change_proto helper")
Co-authored-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Pull Kbuild fixes from Masahiro Yamada:
- suppress sparse warnings about unknown attributes
- fix typos and stale comments
- fix build error of arch/sh
- fix wrong use of ld-option vs cc-ldoption
- remove redundant GCC_PLUGINS_CFLAGS assignment
- fix another memory leak of Kconfig
- fix line number in error messages of Kconfig
- do not write confusing CONFIG_DEFCONFIG_LIST out to .config
- add xstrdup() to Kconfig to handle memory shortage errors
- show also a Debian package name if ncurses is missing
* tag 'kbuild-fixes-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
MAINTAINERS: take over Kconfig maintainership
kconfig: fix line number in recursive inclusion error message
Coccinelle: memdup: Fix typo in warning messages
kconfig: Update ncurses package names for menuconfig
kbuild/kallsyms: trivial typo fix
kbuild: test --build-id linker flag by ld-option instead of cc-ldoption
kbuild: drop superfluous GCC_PLUGINS_CFLAGS assignment
kconfig: Don't leak choice names during parsing
sh: fix build error for empty CONFIG_BUILTIN_DTB_SOURCE
kconfig: set SYMBOL_AUTO to the symbol marked with defconfig_list
kconfig: add xstrdup() helper
kbuild: disable sparse warnings about unknown attributes
Makefile: Fix lying comment re. silentoldconfig
Pull media fixes from Mauro Carvalho Chehab:
- some build fixes with randconfigs
- an m88ds3103 fix to prevent an OOPS if the chip doesn't provide the
right version during probe (with can happen if the hardware hangs)
- a potential out of array bounds reference in tvp5150
- some fixes and improvements in the DVB memory mapped API (added for
kernel 4.16)
* tag 'media/v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: vb2: Makefile: place vb2-trace together with vb2-core
media: Don't let tvp5150_get_vbi() go out of vbi_ram_default array
media: dvb: update buffer mmaped flags and frame counter
media: dvb: add continuity error indicators for memory mapped buffers
media: dmxdev: Fix the logic that enables DMA mmap support
media: dmxdev: fix error code for invalid ioctls
media: m88ds3103: don't call a non-initalized function
media: au0828: add VIDEO_V4L2 dependency
media: dvb: fix DVB_MMAP dependency
media: dvb: fix DVB_MMAP symbol name
media: videobuf2: fix build issues with vb2-trace
media: videobuf2: Add VIDEOBUF2_V4L2 Kconfig option for VB2 V4L2 part
Gen8 and prior Proliant systems supported the "CRU" interface
to firmware. This interfaces allows linux to "call back" into firmware
to source the cause of an NMI. This feature isn't fully utilized
as the actual source of the NMI isn't printed, the driver only
indicates that the source couldn't be determined when the call
fails.
With the advent of Gen9, iCRU replaces the CRU. The call back
feature is no longer available in firmware. To be compatible and
not attempt to call back into firmware on system not supporting CRU,
the SMBIOS table is consulted to determine if it is safe to
make the call back or not.
This results in about half of the driver code being devoted
to either making CRU calls or determing if it is safe to make
CRU calls. As noted, the driver isn't really using the results of
the CRU calls.
Furthermore, as a consequence of the Spectre security issue, the
BIOS/EFI calls are being wrapped into Spectre-disabling section.
Removing the call back in hpwdt_pretimeout assists in this effort.
As the CRU sourcing of the NMI isn't required for handling the
NMI and there are security concerns with making the call back, remove
the legacy (pre Gen9) NMI sourcing and the DMI code to determine if
the system had the CRU interface.
Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
According to SBSA spec v3.1 section 5.3:
All registers are 32 bits in size and should be accessed using
32-bit reads and writes. If an access size other than 32 bits
is used then the results are IMPLEMENTATION DEFINED.
[...]
The Generic Watchdog is little-endian
The current code uses readq to read the watchdog compare register
which does a 64-bit access. This fails on ThunderX2 which does not
implement 64-bit access to this register.
Fix this by using lo_hi_readq() that does two 32-bit reads.
Signed-off-by: Jayachandran C <jnair@caviumnetworks.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Watchdog close is "expected" when any byte is 'V' not just the last one.
Writing "V" to the device fails because the last byte is the end of string.
$ echo V > /dev/watchdog
f71808e_wdt: Unexpected close, not stopping watchdog!
Signed-off-by: Igor Pylypiv <igor.pylypiv@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Since commit 8b24e69fc4 ("KVM: PPC: Book3S HV: Close race with testing
for signals on guest entry"), if CONFIG_VIRT_CPU_ACCOUNTING_GEN is set, the
guest time is not accounted to guest time and user time, but instead to
system time.
This is because guest_enter()/guest_exit() are called while interrupts
are disabled and the tick counter cannot be updated between them.
To fix that, move guest_exit() after local_irq_enable(), and as
guest_enter() is called with IRQ disabled, call guest_enter_irqoff()
instead.
Fixes: 8b24e69fc4 ("KVM: PPC: Book3S HV: Close race with testing for signals on guest entry")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Pull KVM fixes from Radim Krčmář:
"x86:
- fix NULL dereference when using userspace lapic
- optimize spectre v1 mitigations by allowing guests to use LFENCE
- make microcode revision configurable to prevent guests from
unnecessarily blacklisting spectre v2 mitigation feature"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: fix vcpu initialization with userspace lapic
KVM: X86: Allow userspace to define the microcode version
KVM: X86: Introduce kvm_get_msr_feature()
KVM: SVM: Add MSR-based feature support for serializing LFENCE
KVM: x86: Add a framework for supporting MSR-based features
Re-enable deep flush so that users always have a way to be sure that a
write makes it all the way out to media. Writes from the PMEM driver
always arrive at the NVDIMM since movnt is used to bypass the cache, and
the driver relies on the ADR (Asynchronous DRAM Refresh) mechanism to
flush write buffers on power failure. The Deep Flush mechanism is there
to explicitly write buffers to protect against (rare) ADR failure. This
change prevents a regression in deep flush behavior so that applications
can continue to depend on fsync() as a mechanism to trigger deep flush
in the filesystem-DAX case.
Fixes: 06e8ccdab1 ("acpi: nfit: Add support for detect platform CPU cache...")
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Filesystem-DAX is incompatible with 'longterm' page pinning. Without
page cache indirection a DAX mapping maps filesystem blocks directly.
This means that the filesystem must not modify a file's block map while
any page in a mapping is pinned. In order to prevent the situation of
userspace holding of filesystem operations indefinitely, disallow
'longterm' Filesystem-DAX mappings.
RDMA has the same conflict and the plan there is to add a 'with lease'
mechanism to allow the kernel to notify userspace that the mapping is
being torn down for block-map maintenance. Perhaps something similar can
be put in place for vfio.
Note that xfs and ext4 still report:
"DAX enabled. Warning: EXPERIMENTAL, use at your own risk"
...at mount time, and resolving the dax-dma-vs-truncate problem is one
of the last hurdles to remove that designation.
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org>
Reported-by: Haozhong Zhang <haozhong.zhang@intel.com>
Tested-by: Haozhong Zhang <haozhong.zhang@intel.com>
Fixes: d475c6346a ("dax,ext2: replace XIP read and write with DAX I/O")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Pull PCI fixes from Bjorn Helgaas:
- Update pci.ids location (documentation only) (Randy Dunlap)
- Fix a crash when BIOS didn't assign a BAR and we try to enlarge it
(Christian König)
* tag 'pci-v4.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Allow release of resources that were never assigned
PCI: Update location of pci.ids file
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter fixes for your net tree,
they are:
1) Put back reference on CLUSTERIP configuration structure from the
error path, patch from Florian Westphal.
2) Put reference on CLUSTERIP configuration instead of freeing it,
another cpu may still be walking over it, also from Florian.
3) Refetch pointer to IPv6 header from nf_nat_ipv6_manip_pkt() given
packet manipulation may reallocation the skbuff header, from Florian.
4) Missing match size sanity checks in ebt_among, from Florian.
5) Convert BUG_ON to WARN_ON in ebtables, from Florian.
6) Sanity check userspace offsets from ebtables kernel, from Florian.
7) Missing checksum replace call in flowtable IPv4 DNAT, from Felix
Fietkau.
8) Bump the right stats on checksum error from bridge netfilter,
from Taehee Yoo.
9) Unset interface flag in IPv6 fib lookups otherwise we get
misleading routing lookup results, from Florian.
10) Missing sk_to_full_sk() in ip6_route_me_harder() from Eric Dumazet.
11) Don't allow devices to be part of multiple flowtables at the same
time, this may break setups.
12) Missing netlink attribute validation in flowtable deletion.
13) Wrong array index in nf_unregister_net_hook() call from error path
in flowtable addition path.
14) Fix FTP IPVS helper when NAT mangling is in place, patch from
Julian Anastasov.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit afcc90f862 ("usercopy: WARN() on slab cache usercopy
region violations"), MIPS systems booting with a compat root filesystem
emit a warning when copying compat siginfo to userspace:
WARNING: CPU: 0 PID: 953 at mm/usercopy.c:81 usercopy_warn+0x98/0xe8
Bad or missing usercopy whitelist? Kernel memory exposure attempt
detected from SLAB object 'task_struct' (offset 1432, size 16)!
Modules linked in:
CPU: 0 PID: 953 Comm: S01logging Not tainted 4.16.0-rc2 #10
Stack : ffffffff808c0000 0000000000000000 0000000000000001 65ac85163f3bdc4a
65ac85163f3bdc4a 0000000000000000 90000000ff667ab8 ffffffff808c0000
00000000000003f8 ffffffff808d0000 00000000000000d1 0000000000000000
000000000000003c 0000000000000000 ffffffff808c8ca8 ffffffff808d0000
ffffffff808d0000 ffffffff80810000 fffffc0000000000 ffffffff80785c30
0000000000000009 0000000000000051 90000000ff667eb0 90000000ff667db0
000000007fe0d938 0000000000000018 ffffffff80449958 0000000020052798
ffffffff808c0000 90000000ff664000 90000000ff667ab0 00000000100c0000
ffffffff80698810 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 ffffffff8010d02c 65ac85163f3bdc4a
...
Call Trace:
[<ffffffff8010d02c>] show_stack+0x9c/0x130
[<ffffffff80698810>] dump_stack+0x90/0xd0
[<ffffffff80137b78>] __warn+0x100/0x118
[<ffffffff80137bdc>] warn_slowpath_fmt+0x4c/0x70
[<ffffffff8021e4a8>] usercopy_warn+0x98/0xe8
[<ffffffff8021e68c>] __check_object_size+0xfc/0x250
[<ffffffff801bbfb8>] put_compat_sigset+0x30/0x88
[<ffffffff8011af24>] setup_rt_frame_n32+0xc4/0x160
[<ffffffff8010b8b4>] do_signal+0x19c/0x230
[<ffffffff8010c408>] do_notify_resume+0x60/0x78
[<ffffffff80106f50>] work_notifysig+0x10/0x18
---[ end trace 88fffbf69147f48a ]---
Commit 5905429ad8 ("fork: Provide usercopy whitelisting for
task_struct") noted that:
"While the blocked and saved_sigmask fields of task_struct are copied to
userspace (via sigmask_to_save() and setup_rt_frame()), it is always
copied with a static length (i.e. sizeof(sigset_t))."
However, this is not true in the case of compat signals, whose sigset
is copied by put_compat_sigset and receives size as an argument.
At most call sites, put_compat_sigset is copying a sigset from the
current task_struct. This triggers a warning when
CONFIG_HARDENED_USERCOPY is active. However, by marking this function as
static inline, the warning can be avoided because in all of these cases
the size is constant at compile time, which is allowed. The only site
where this is not the case is handling the rt_sigpending syscall, but
there the copy is being made from a stack local variable so does not
trigger the warning.
Move put_compat_sigset to compat.h, and mark it static inline. This
fixes the WARN on MIPS.
Fixes: afcc90f862 ("usercopy: WARN() on slab cache usercopy region violations")
Signed-off-by: Matt Redfearn <matt.redfearn@mips.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Dmitry V . Levin" <ldv@altlinux.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/18639/
Signed-off-by: James Hogan <jhogan@kernel.org>
Pull parisc fixes from Helge Deller:
- a patch to change the ordering of cache and TLB flushes to hopefully
fix the random segfaults we very rarely face (by Dave Anglin).
- a patch to hide the virtual kernel memory layout due to security
reasons.
- two small patches to make the kernel run more smoothly under qemu.
* 'parisc-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Reduce irq overhead when run in qemu
parisc: Use cr16 interval timers unconditionally on qemu
parisc: Check if secondary CPUs want own PDC calls
parisc: Hide virtual kernel memory layout
parisc: Fix ordering of cache and TLB flushes
Pull xen fixes from Juergen Gross:
"Five minor fixes for Xen-specific drivers"
* tag 'for-linus-4.16a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
pvcalls-front: 64-bit align flags
x86/xen: add tty0 and hvc0 as preferred consoles for dom0
xen-netfront: Fix hang on device removal
xen/pirq: fix error path cleanup when binding MSIs
xen/pvcalls: fix null pointer dereference on map->sock
Pull ceph fixes from Ilya Dryomov:
"A cap handling fix from Zhi that ensures that metadata writeback isn't
delayed and three error path memory leak fixups from Chengguang"
* tag 'ceph-for-4.16-rc4' of git://github.com/ceph/ceph-client:
ceph: fix potential memory leak in init_caches()
ceph: fix dentry leak when failing to init debugfs
libceph, ceph: avoid memory leak when specifying same option several times
ceph: flush dirty caps of unlinked inode ASAP
Pull block fixes from Jens Axboe:
"A collection of fixes for this series. This is a little larger than
usual at this time, but that's mainly because I was out on vacation
last week. Nothing in here is major in any way, it's just two weeks of
fixes. This contains:
- NVMe pull from Keith, with a set of fixes from the usual suspects.
- mq-deadline zone unlock fix from Damien, fixing an issue with the
SMR zone locking added for 4.16.
- two bcache fixes sent in by Michael, with changes from Coly and
Tang.
- comment typo fix from Eric for blktrace.
- return-value error handling fix for nbd, from Gustavo.
- fix a direct-io case where we don't defer to a completion handler,
making us sleep from IRQ device completion. From Jan.
- a small series from Jan fixing up holes around handling of bdev
references.
- small set of regression fixes from Jiufei, mostly fixing problems
around the gendisk pointer -> partition index change.
- regression fix from Ming, fixing a boundary issue with the discard
page cache invalidation.
- two-patch series from Ming, fixing both a core blk-mq-sched and
kyber issue around token freeing on a requeue condition"
* tag 'for-linus-20180302' of git://git.kernel.dk/linux-block: (24 commits)
block: fix a typo
block: display the correct diskname for bio
block: fix the count of PGPGOUT for WRITE_SAME
mq-deadline: Make sure to always unlock zones
nvmet: fix PSDT field check in command format
nvme-multipath: fix sysfs dangerously created links
nbd: fix return value in error handling path
bcache: fix kcrashes with fio in RAID5 backend dev
bcache: correct flash only vols (check all uuids)
blktrace_api.h: fix comment for struct blk_user_trace_setup
blockdev: Avoid two active bdev inodes for one device
genhd: Fix BUG in blkdev_open()
genhd: Fix use after free in __blkdev_get()
genhd: Add helper put_disk_and_module()
genhd: Rename get_disk() to get_disk_and_module()
genhd: Fix leaked module reference for NVME devices
direct-io: Fix sleep in atomic due to sync AIO
nvme-pci: Fix nvme queue cleanup if IRQ setup fails
block: kyber: fix domain token leak during requeue
blk-mq: don't call io sched's .requeue_request when requeueing rq to ->dispatch
...
Commit 16c513b134
("selftests: memory-hotplug: silence test command echo")
introduced regression in emit_tests and results in the following
failure when selftests are installed and run. Fix it.
Running tests in memory-hotplug
========================================
./run_kselftest.sh: line 121: @./mem-on-off-test.sh: No such file or
directory
selftests: memory-hotplug [FAIL]
Fixes: 16c513b134 (selftests: memory-hotplug: silence test command echo")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Pull MMC fixes from Ulf Hansson:
"MMC core:
- mmc: core: Avoid hang when claiming host
MMC host:
- dw_mmc: Avoid hang when accessing registers
- dw_mmc: Fix out-of-bounds access for slot's caps
- dw_mmc-k3: Fix out-of-bounds access through DT alias
- sdhci-pci: Fix S0i3 for Intel BYT-based controllers"
* tag 'mmc-v4.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: core: Avoid hanging to claim host for mmc via some nested calls
mmc: dw_mmc: Avoid accessing registers in runtime suspended state
mmc: dw_mmc: Fix out-of-bounds access for slot's caps
mmc: dw_mmc: Factor out dw_mci_init_slot_caps
mmc: dw_mmc-k3: Fix out-of-bounds access through DT alias
mmc: sdhci-pci: Fix S0i3 for Intel BYT-based controllers
Pull power management fixes from Rafael Wysocki:
"These fix three issues in cpufreq drivers: one recent regression, one
leftover Kconfig dependency and one old but "stable" material.
Specifics:
- Make the task scheduler load and utilization signals be
frequency-invariant again after recent changes in the SCPI cpufreq
driver (Dietmar Eggemann).
- Drop an unnecessary leftover Kconfig dependency from the SCPI
cpufreq driver (Sudeep Holla).
- Fix the initialization of the s3c24xx cpufreq driver (Viresh
Kumar)"
* tag 'pm-4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: s3c24xx: Fix broken s3c_cpufreq_init()
cpufreq: scpi: Fix incorrect arm_big_little config dependency
cpufreq: scpi: invoke frequency-invariance setter function
When recursive inclusion is detected, the line number of the last
'included from:' is wrong.
[Test Case]
Kconfig:
-------->8--------
source "Kconfig2"
-------->8--------
Kconfig2:
-------->8--------
source "Kconfig3"
-------->8--------
Kconfig3:
-------->8--------
source "Kconfig"
-------->8--------
[Result]
$ make allyesconfig
scripts/kconfig/conf --allyesconfig Kconfig
Kconfig:1: recursive inclusion detected. Inclusion path:
current file : 'Kconfig'
included from: 'Kconfig3:1'
included from: 'Kconfig2:1'
included from: 'Kconfig:3'
scripts/kconfig/Makefile:89: recipe for target 'allyesconfig' failed
make[1]: *** [allyesconfig] Error 1
Makefile:512: recipe for target 'allyesconfig' failed
make: *** [allyesconfig] Error 2
where we expect
current file : 'Kconfig'
included from: 'Kconfig3:1'
included from: 'Kconfig2:1'
included from: 'Kconfig:1'
The 'iter->lineno+1' in the second fpinrtf() should be 'iter->lineno-1'.
I refactored the code to merge the two fprintf() calls.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
Johannes Berg says:
====================
Three more patches:
* fix for a regression in 4-addr mode with fast-RX
* fix for a Kconfig problem with the new regdb
* fix for the long-standing TCP performance issue in
wifi using the new sk_pacing_shift_update()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 0933a578cd ("rds: tcp: use sock_create_lite() to create the
accept socket") has a reference counting issue in TCP socket creation
when accepting a new connection. The code uses sock_create_lite() to
create a kernel socket. But it does not do __module_get() on the
socket owner. When the connection is shutdown and sock_release() is
called to free the socket, the owner's reference count is decremented
and becomes incorrect. Note that this bug only shows up when the socket
owner is configured as a kernel module.
v2: Update comments
Fixes: 0933a578cd ("rds: tcp: use sock_create_lite() to create the accept socket")
Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When running s390 images with 'compat' processes, the following
BUG is seen repeatedly.
BUG: non-zero pgtables_bytes on freeing mm: -16384
Bisect points to commit b4e98d9ac7 ("mm: account pud page tables").
Analysis shows that init_new_context() is called with
mm->context.asce_limit set to _REGION3_SIZE. In this situation,
pgtables_bytes remains set to 0 and is not increased. The message is
displayed when the affected process dies and mm_dec_nr_puds() is called.
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Fixes: b4e98d9ac7 ("mm: account pud page tables")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The error message:
[Fri Feb 16 13:42:13 2018] i2c-thunderx 0000:01:09.4: unhandled state: 0
is mis-leading as state 0 (bus error) is not an unknown state.
Return -EIO as before but avoid printing the message. Also rename
STAT_ERROR to STATE_BUS_ERROR.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
When run under QEMU, calling mfctl(16) creates some overhead because the
qemu timer has to be scaled and moved into the register. This patch
reduces the number of calls to mfctl(16) by moving the calls out of the
loops.
Additionally, increase the minimal time interval to 8000 cycles instead
of 500 to compensate possible QEMU delays when delivering interrupts.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.14+
When running on qemu we know that the (emulated) cr16 cpu-internal
clocks are syncronized. So let's use them unconditionally on qemu.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.14+
The architecture specification says (for 64-bit systems): PDC is a per
processor resource, and operating system software must be prepared to
manage separate pointers to PDCE_PROC for each processor. The address
of PDCE_PROC for the monarch processor is stored in the Page Zero
location MEM_PDC. The address of PDCE_PROC for each non-monarch
processor is passed in gr26 when PDCE_RESET invokes OS_RENDEZ.
Currently we still use one PDC for all CPUs, but in case we face a
machine which is following the specification let's warn about it.
Signed-off-by: Helge Deller <deller@gmx.de>
The change to flush_kernel_vmap_range() wasn't sufficient to avoid the
SMP stalls. The problem is some drivers call these routines with
interrupts disabled. Interrupts need to be enabled for flush_tlb_all()
and flush_cache_all() to work. This version adds checks to ensure
interrupts are not disabled before calling routines that need IPI
interrupts. When interrupts are disabled, we now drop into slower code.
The attached change fixes the ordering of cache and TLB flushes in
several cases. When we flush the cache using the existing PTE/TLB
entries, we need to flush the TLB after doing the cache flush. We don't
need to do this when we flush the entire instruction and data caches as
these flushes don't use the existing TLB entries. The same is true for
tmpalias region flushes.
The flush_kernel_vmap_range() and invalidate_kernel_vmap_range()
routines have been updated.
Secondly, we added a new purge_kernel_dcache_range_asm() routine to
pacache.S and use it in invalidate_kernel_vmap_range(). Nominally,
purges are faster than flushes as the cache lines don't have to be
written back to memory.
Hopefully, this is sufficient to resolve the remaining problems due to
cache speculation. So far, testing indicates that this is the case. I
did work up a patch using tmpalias flushes, but there is a performance
hit because we need the physical address for each page, and we also need
to sequence access to the tmpalias flush code. This increases the
probability of stalls.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
With the alc289, the Pin 0x1b is Headphone-Mic, so we should assign
ALC269_FIXUP_DELL4_MIC_NO_PRESENCE rather than
ALC225_FIXUP_DELL1_MIC_NO_PRESENCE to it. And this change is suggested
by Kailang of Realtek and is verified on the machine.
Fixes: 3f2f7c553d ("ALSA: hda - Fix headset mic detection problem for two Dell machines")
Cc: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
In the scheduler config command, the meaning of tid == 0xf was intended
to indicate the configuration is for management frames. However,
tid == 0xf was also used for the multicast queue that was meant only
for multicast data frames, which resulted with the FW not encrypting
multicast data frames.
As multicast frames do not have a QoS header, fix this by setting
tid == 0, to indicate that this is a data queue and not management
one.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Multicast frames for NL80211_IFTYPE_AP and NL80211_IFTYPE_ADHOC were
directed to the broadcast station, however, as the broadcast station
did not have keys configured, these frames were sent unencrypted.
Fix this by using the multicast station which is the station for which
encryption keys are configured.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
When the GTK is installed, we install it to HW with the
station ID of the AP.
Mac80211 will try to remove it only after the AP sta is
removed, which will result in a failure to remove key
since we do not have any station for it.
This is a valid situation, but a previous commit removed
the early return and added a return with error value, which
resulted in an error message that is confusing to users.
Remove the error return value.
Fixes: 85aeb58cec ("iwlwifi: mvm: Enable security on new TX API")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Trying to collect firmware debug data while firmware
is not loaded causes various errors (e.g. failing NIC access).
This causes even a bigger issue if at that time the
HW radio is off.
In that case, when later turning the radio on, the Driver
fails to read the HW (registers contain garbage values).
(It may be that the CSR_GP_CNTRL_REG_FLAG_RFKILL_WAKE_L1A_EN
bit is cleared on faulty NIC access - since the same behavior
was seen in HW RFKILL toggling before setting that bit.)
Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We should add the multicast station before adding the
broadcast station.
However, in older FW, the firmware will start beaconing
when we add the multicast station, and since the broadcast
station is not added at this point so the transmission
of the beacon will fail on assert 0x2b00.
This is fixed in later firmware, so make the order
of addition depend on the TLV.
Fixes: 26d6c16bed ("iwlwifi: mvm: add multicast station")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
It was assumed that apply_time==0 implies immediate scheduling, which is
wrong. Instead, the fw expects the START_IMMEDIATELY flag to be set.
Otherwise, this resulted in 0x3063 assert.
Fix that.
While at it rename the T2_V2_START_IMMEDIATELY to
TE_V2_START_IMMEDIATELY.
Fixes: f5d8f50f27 ("iwlwifi: mvm: Fix channel switch in case of count <= 1")
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We don't have enough room in the TX command for a CCMP 256
key, and need to use key from table.
Fixes: 3264bf032bd9 ("[BUGFIX] iwlwifi: mvm: Fix CCMP IV setting")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
While entering to D3 mode there is a gap between the time the
driver handles the D3_CONFIG_CMD response to the time the host is going
to sleep.
In between there might be cases which MARKER_CMD can tailgate.
Also during resume flow the MARKER_CMD might get sent while D0I3_CMD
is being handled in the FW.
Cancel MARKER_CMD timer and set it again properly during suspend
resume flows to prevent this command from being sent accidentlly.
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
This reverts commit c301b327ae.
While this works splendidly on rk3399-gru devices using the cros-ec
extcon, other rk3399-based devices using the fusb302 or no power-delivery
controller at all don't probe at all anymore, as the typec-phy currently
always expects the extcon to be available and therefore defers probing
indefinitly on these.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
The current code for initializing the VRMA (virtual real memory area)
for HPT guests requires the page size of the backing memory to be one
of 4kB, 64kB or 16MB. With a radix host we have the possibility that
the backing memory page size can be 2MB or 1GB. In these cases, if the
guest switches to HPT mode, KVM will not initialize the VRMA and the
guest will fail to run.
In fact it is not necessary that the VRMA page size is the same as the
backing memory page size; any VRMA page size less than or equal to the
backing memory page size is acceptable. Therefore we now choose the
largest page size out of the set {4k, 64k, 16M} which is not larger
than the backing memory page size.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This fixes several bugs in the radix page fault handler relating to
the way large pages in the memory backing the guest were handled.
First, the check for large pages only checked for explicit huge pages
and missed transparent huge pages. Then the check that the addresses
(host virtual vs. guest physical) had appropriate alignment was
wrong, meaning that the code never put a large page in the partition
scoped radix tree; it was always demoted to a small page.
Fixing this exposed bugs in kvmppc_create_pte(). We were never
invalidating a 2MB PTE, which meant that if a page was initially
faulted in without write permission and the guest then attempted
to store to it, we would never update the PTE to have write permission.
If we find a valid 2MB PTE in the PMD, we need to clear it and
do a TLB invalidation before installing either the new 2MB PTE or
a pointer to a page table page.
This also corrects an assumption that get_user_pages_fast would set
the _PAGE_DIRTY bit if we are writing, which is not true. Instead we
mark the page dirty explicitly with set_page_dirty_lock(). This
also means we don't need the dirty bit set on the host PTE when
providing write access on a read fault.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Daniel Borkmann says:
====================
pull-request: bpf 2018-02-28
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Add schedule points and reduce the number of loop iterations
the test_bpf kernel module is performing in order to not hog
the CPU for too long, from Eric.
2) Fix an out of bounds access in tail calls in the ppc64 BPF
JIT compiler, from Daniel.
3) Fix a crash on arm64 on unaligned BPF xadd operations that
could be triggered via interpreter and JIT, from Daniel.
Please not that once you merge net into net-next at some point, there
is a minor merge conflict in test_verifier.c since test cases had
been added at the end in both trees. Resolution is trivial: keep all
the test cases from both trees.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If ethtool_ops->get_fecparam returns an error, pass that error on to the
user, rather than ignoring it.
Fixes: 1a5f3da20b ("net: ethtool: add support for forward error correction modes")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When ip_error() is called the device is the l3mdev master instead of the
original device. So the forwarding check should be on the original one.
Changes from v2:
- Handle the original device disappearing (per David Ahern)
- Minimize the change in code order
Changes from v1:
- Only need to reset the device on which __in_dev_get_rcu() is done (per
David Ahern).
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Setting an interface into a VRF fails with 'RTNETLINK answers: File
exists' if one of its VLAN interfaces is already in the same VRF.
As the VRF is an upper device of the VLAN interface, it is also showing
up as an upper device of the interface itself. The solution is to
restrict this check to devices other than master. As only one master
device can be linked to a device, the check in this case is that the
upper device (VRF) being linked to is not the same as the master device
instead of it not being any one of the upper devices.
The following example shows an interface ens12 (with a VLAN interface
ens12.10) being set into VRF green, which behaves as expected:
# ip link add link ens12 ens12.10 type vlan id 10
# ip link set dev ens12 master vrfgreen
# ip link show dev ens12
3: ens12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel
master vrfgreen state UP mode DEFAULT group default qlen 1000
link/ether 52:54:00:4c:a0:45 brd ff:ff:ff:ff:ff:ff
But if the VLAN interface has previously been set into the same VRF,
then setting the interface into the VRF fails:
# ip link set dev ens12 nomaster
# ip link set dev ens12.10 master vrfgreen
# ip link show dev ens12.10
39: ens12.10@ens12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
qdisc noqueue master vrfgreen state UP mode DEFAULT group default
qlen 1000 link/ether 52:54:00:4c:a0:45 brd ff:ff:ff:ff:ff:ff
# ip link set dev ens12 master vrfgreen
RTNETLINK answers: File exists
The workaround is to move the VLAN interface back into the default VRF
beforehand, but it has to be shut first so as to avoid the risk of
traffic leaking from the VRF. This fix avoids needing this workaround.
Signed-off-by: Mike Manning <mmanning@att.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some required information is not exposed to userspace currently (eg. the
PASID), pass this information back, along with other information which
is currently communicated via sysfs, which saves some parsing effort in
userspace.
Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
commit a4239945b8 ("scsi: qla2xxx: Add switch command to simplify
fabric discovery") introduced regression when it did not consider
FC-NVMe code path which broke NVMe LUN discovery.
Fixes: a4239945b8 ("scsi: qla2xxx: Add switch command to simplify fabric discovery")
Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When converting __scsi_error_from_host_byte() to BLK_STS error codes the
case DID_OK was forgotten, resulting in it always returning an error.
Fixes: 2a842acab1 ("block: introduce new block status code type")
Cc: Doug Gilbert <dgilbert@interlog.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The fcport flags FCF_ASYNC_ACTIVE and FCF_ASYNC_SENT are used to
throttle the state machine, so we need to ensure to always set and unset
them correctly. Not doing so will lead to the state machine getting
confused and no login attempt into remote ports.
Cc: Quinn Tran <quinn.tran@cavium.com>
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Fixes: 3dbec59bdf ("scsi: qla2xxx: Prevent multiple active discovery commands per session")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit d8630bb95f ('Serialize session deletion by using work_lock')
tries to fixup a deadlock when deleting sessions, but fails to take into
account the locking rules. This patch resolves the situation by
introducing a separate lock for processing the GNLIST response, and
ensures that sess_lock is released before calling
qlt_schedule_sess_delete().
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Cc: Quinn Tran <quinn.tran@cavium.com>
Fixes: d8630bb95f ("scsi: qla2xxx: Serialize session deletion by using work_lock")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The subpage_prot syscall is only functional when the system is using
the Hash MMU. Since commit 5b2b807147 ("powerpc/mm: Invalidate
subpage_prot() system call on radix platforms") it returns ENOENT when
the Radix MMU is active. Currently this just makes the test fail.
Additionally the syscall is not available if the kernel is built with
4K pages, or if CONFIG_PPC_SUBPAGE_PROT=n, in which case it returns
ENOSYS because the syscall is missing entirely.
So check explicitly for ENOENT and ENOSYS and skip if we see either of
those.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
'--build-id' is passed to $(LD), so it should be tested by 'ld-option'.
This seems a kind of misconversion when ld-option was renamed to
cc-ldoption.
Commit f86fd30660 ("kbuild: rename ld-option to cc-ldoption") renamed
all instances of 'ld-option' to 'cc-ldoption'.
Then, commit 691ef3e7fd ("kbuild: introduce ld-option") re-added
'ld-option' as a new implementation.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
GCC_PLUGINS_CFLAGS is already in the environment, so it is superfluous
to add it in commandline of final build of init/.
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The named choice is not used in the kernel tree, but if it were used,
it would not be freed.
The intention of the named choice can be seen in the log of
commit 5a1aa8a1af ("kconfig: add named choice group").
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
If CONFIG_USE_BUILTIN_DTB is enabled, but CONFIG_BUILTIN_DTB_SOURCE
is empty (for example, allmodconfig), it fails to build, like this:
make[2]: *** No rule to make target 'arch/sh/boot/dts/.dtb.o',
needed by 'arch/sh/boot/dts/built-in.o'. Stop.
Surround obj-y with ifneq ... endif.
I replaced $(CONFIG_USE_BUILTIN_DTB) with 'y' since this is always
the case from the following code from arch/sh/Makefile:
core-$(CONFIG_USE_BUILTIN_DTB) += arch/sh/boot/dts/
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The 'defconfig_list' is a weird attribute. If the '.config' is
missing, conf_read_simple() iterates over all visible defaults,
then it uses the first one for which fopen() succeeds.
config DEFCONFIG_LIST
string
depends on !UML
option defconfig_list
default "/lib/modules/$UNAME_RELEASE/.config"
default "/etc/kernel-config"
default "/boot/config-$UNAME_RELEASE"
default "$ARCH_DEFCONFIG"
default "arch/$ARCH/defconfig"
However, like other symbols, the first visible default is always
written out to the .config file. This might be different from what
has been actually used.
For example, on my machine, the third one "/boot/config-$UNAME_RELEASE"
is opened, like follows:
$ rm .config
$ make oldconfig 2>/dev/null
scripts/kconfig/conf --oldconfig Kconfig
#
# using defaults found in /boot/config-4.4.0-112-generic
#
*
* Restart config...
*
*
* IRQ subsystem
*
Expose irq internals in debugfs (GENERIC_IRQ_DEBUGFS) [N/y/?] (NEW)
However, the resulted .config file contains the first one since it is
visible:
$ grep CONFIG_DEFCONFIG_LIST .config
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
In order to stop confusing people, prevent this CONFIG option from
being written to the .config file.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
Pull drm fixes from Dave Airlie:
"Pretty much run of the mill drm fixes.
amdgpu:
- power management fixes
- some display fixes
- one ppc 32-bit dma fix
i915:
- two display fixes
- three gem fixes
sun4i:
- display regression fixes
nouveau:
- display regression fix
virtio-gpu:
- dumb airlied ioctl fix"
* tag 'drm-fixes-for-v4.16-rc4' of git://people.freedesktop.org/~airlied/linux: (25 commits)
drm/amdgpu: skip ECC for SRIOV in gmc late_init
drm/amd/amdgpu: Correct VRAM width for APUs with GMC9
drm/amdgpu: fix&cleanups for wb_clear
drm/amdgpu: Correct sdma_v4 get_wptr(v2)
drm/amd/powerplay: fix power over limit on Fiji
drm/amdgpu:Fixed wrong emit frame size for enc
drm/amdgpu: move WB_FREE to correct place
drm/amdgpu: only flush hotplug work without DC
drm/amd/display: check for ipp before calling cursor operations
drm/i915: Make global seqno known in i915_gem_request_execute tracepoint
drm/i915: Clear the in-use marker on execbuf failure
drm/i915/cnl: Fix PORT_TX_DW5/7 register address
drm/i915/audio: fix check for av_enc_map overflow
drm/i915: Fix rsvd2 mask when out-fence is returned
virtio-gpu: fix ioctl and expose the fixed status to userspace.
drm/sun4i: Protect the TCON pixel clocks
drm/sun4i: Enable the output on the pins (tcon0)
drm/nouveau: prefer XBGR2101010 for addfb ioctl
drm/radeon: insist on 32-bit DMA for Cedar on PPC64/PPC64LE
drm/amd/display: VGA black screen from s3 when attached to hook
...
Pull ARC fixes from Vineet Gupta:
- MCIP aka ARconnect fixes for SMP builds [Euginey]
- preventive fix for SLC (L2 cache) flushing [Euginey]
- Kconfig default fix [Ulf Magnusson]
- trailing semicolon fixes [Luis de Bethencourt]
- other assorted minor fixes
* tag 'arc-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: setup cpu possible mask according to possible-cpus dts property
ARC: mcip: update MCIP debug mask when the new cpu came online
ARC: mcip: halt GFRC counter when ARC cores halt
ARCv2: boot log: fix HS48 release number
arc: dts: use 'atmel' as manufacturer for at24 in axs10x_mb
ARC: Fix malformed ARC_EMUL_UNALIGNED default
ARC: boot log: Fix trailing semicolon
ARC: dw2 unwind: Fix trailing semicolon
ARC: Enable fatal signals on boot for dev platforms
ARCv2: Don't pretend we may set L-bit in STATUS32 with kflag instruction
ARCv2: cache: fix slc_entire_op: flush only instead of flush-n-inv
Fix xfs_file_iomap_begin to trylock the ilock if IOMAP_NOWAIT is passed,
so that we don't block io_submit callers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
There is no reason to take the ilock exclusively at the start of
xfs_file_iomap_begin for direct I/O, given that it will be demoted
just before calling xfs_iomap_write_direct anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
The iomap zeroing interface is smart enough to skip zeroing holes or
unwritten extents. Don't subvert this logic for reflink files.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
We've got a kernel panic when using sata disk with sas controller:
[115946.152283] Unable to handle kernel NULL pointer dereference at virtual address 000007d8
[115946.223963] CPU: 0 PID: 22175 Comm: kworker/0:1 Tainted: G W OEL 4.14.0 #1
[115946.232925] Workqueue: events ata_scsi_hotplug
[115946.237938] task: ffff8021ee50b180 task.stack: ffff00000d5d0000
[115946.244717] PC is at sas_find_dev_by_rphy+0x44/0x114
[115946.250224] LR is at sas_find_dev_by_rphy+0x3c/0x114
......
[115946.355701] Process kworker/0:1 (pid: 22175, stack limit = 0xffff00000d5d0000)
[115946.363369] Call trace:
[115946.456356] [<ffff000008878a9c>] sas_find_dev_by_rphy+0x44/0x114
[115946.462908] [<ffff000008878b8c>] sas_target_alloc+0x20/0x5c
[115946.469408] [<ffff00000885a31c>] scsi_alloc_target+0x250/0x308
[115946.475781] [<ffff00000885ba30>] __scsi_add_device+0xb0/0x154
[115946.481991] [<ffff0000088b520c>] ata_scsi_scan_host+0x180/0x218
[115946.488367] [<ffff0000088b53d8>] ata_scsi_hotplug+0xb0/0xcc
[115946.494801] [<ffff0000080ebd70>] process_one_work+0x144/0x390
[115946.501115] [<ffff0000080ec100>] worker_thread+0x144/0x418
[115946.507093] [<ffff0000080f2c98>] kthread+0x10c/0x138
[115946.512792] [<ffff0000080855dc>] ret_from_fork+0x10/0x18
We found that Ding Xiang has reported a similar bug before:
https://patchwork.kernel.org/patch/9179817/
And this bug still exists in mainline. Since libsas handles hotplug and
device adding/removing itself, do not need to schedule ata hot plug task
here if it is a sas host.
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Cc: Ding Xiang <dingxiang@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Linux (among the others) has checks to make sure that certain features
aren't enabled on a certain family/model/stepping if the microcode version
isn't greater than or equal to a known good version.
By exposing the real microcode version, we're preventing buggy guests that
don't check that they are running virtualized (i.e., they should trust the
hypervisor) from disabling features that are effectively not buggy.
Suggested-by: Filippo Sironi <sironi@amazon.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Commit 1fdb926974 ("Bluetooth: btusb: Use DMI matching for QCA
reset_resume quirking"), added the Lenovo Yoga 920 to the
btusb_needs_reset_resume_table.
Testing has shown that this is a false positive and the problems where
caused by issues with the initial fix: commit fd865802c6 ("Bluetooth:
btusb: fix QCA Rome suspend/resume"), which has already been reverted.
So the QCA Rome BT in the Yoga 920 does not need a reset-resume quirk at
all and this commit removes it from the btusb_needs_reset_resume_table.
Note that after this commit the btusb_needs_reset_resume_table is now
empty. It is kept around on purpose, since this whole series of commits
started for a reason and there are actually broken platforms around,
which need to be added to it.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1514836
Fixes: 1fdb926974 ("Bluetooth: btusb: Use DMI matching for QCA ...")
Cc: stable@vger.kernel.org
Cc: Brian Norris <briannorris@chromium.org>
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Kevin Fenzi <kevin@scrye.com>
Suggested-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Pull x86 platform drivers fixes from Andy Shevchenko:
- fix a regression on laptops like Dell XPS 9360 where keyboard stopped
working.
- correct sysfs wakeup attribute after removal of some drivers to
reflect that they are not able to wake system up anymore.
* tag 'platform-drivers-x86-v4.16-5' of git://git.infradead.org/linux-platform-drivers-x86:
platform/x86: wmi: Fix misuse of vsprintf extension %pULL
platform/x86: intel-hid: Reset wakeup capable flag on removal
platform/x86: intel-vbtn: Reset wakeup capable flag on removal
platform/x86: intel-vbtn: Only activate tablet mode switch on 2-in-1's
The newly introudced ip_min_valid_pmtu variable is only used when
CONFIG_SYSCTL is set:
net/ipv4/route.c:135:12: error: 'ip_min_valid_pmtu' defined but not used [-Werror=unused-variable]
This moves it to the other variables like it, to avoid the harmless
warning.
Fixes: c7272c2f12 ("net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull MD bugfixes from Shaohua Li:
- fix raid5-ppl flush request handling hang from Artur
- fix a potential deadlock in raid5/10 reshape from BingJing
- fix a deadlock for dm-raid from Heinz
- fix two md-cluster of raid10 from Lidong and Guoqing
- fix a NULL deference problem in device removal from Neil
- fix a NULL deference problem in raid1/raid10 in specific condition
from Yufen
- other cleanup and fixes
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
md/raid1: fix NULL pointer dereference
md: fix a potential deadlock of raid5/raid10 reshape
md-cluster: choose correct label when clustered layout is not supported
md: raid5: avoid string overflow warning
raid5-ppl: fix handling flush requests
md raid10: fix NULL deference in handle_write_completed()
md: only allow remove_and_add_spares when no sync_thread running.
md: document lifetime of internal rdev pointer.
md: fix md_write_start() deadlock w/o metadata devices
MD: Free bioset when md_run fails
raid10: change the size of resync window for clustered raid
md-multipath: Use seq_putc() in multipath_status()
md/raid1: Fix trailing semicolon
md/raid5: simplify uninitialization of shrinker
Pull printk fix from Petr Mladek:
"Make sure that we wake up userspace loggers. This fixes a race
introduced by the console waiter logic during this merge window"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
printk: Wake klogd when passing console_lock owner
%pULL doesn't officially exist but %pUL does.
Miscellanea:
o Add missing newlines to a couple logging messages
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
In order to determine if LFENCE is a serializing instruction on AMD
processors, MSR 0xc0011029 (MSR_F10H_DECFG) must be read and the state
of bit 1 checked. This patch will add support to allow a guest to
properly make this determination.
Add the MSR feature callback operation to svm.c and add MSR 0xc0011029
to the list of MSR-based features. If LFENCE is serializing, then the
feature is supported, allowing the hypervisor to set the value of the
MSR that guest will see. Support is also added to write (hypervisor only)
and read the MSR value for the guest. A write by the guest will result in
a #GP. A read by the guest will return the value as set by the host. In
this way, the support to expose the feature to the guest is controlled by
the hypervisor.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Provide a new KVM capability that allows bits within MSRs to be recognized
as features. Two new ioctls are added to the /dev/kvm ioctl routine to
retrieve the list of these MSRs and then retrieve their values. A kvm_x86_ops
callback is used to determine support for the listed MSR-based features.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Tweaked documentation. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
has switched to do irq vectors spread among all possible CPUs, so
pass num_possible_cpus() as max vecotrs to be assigned.
For example, in a 8 cores system, 0~3 online, 4~8 offline/not present,
see 'lscpu':
[ming@box]$lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 2
NUMA node(s): 2
...
NUMA node0 CPU(s): 0-3
NUMA node1 CPU(s):
...
1) before this patch, follows the allocated vectors and their affinity:
irq 47, cpu list 0,4
irq 48, cpu list 1,6
irq 49, cpu list 2,5
irq 50, cpu list 3,7
2) after this patch, follows the allocated vectors and their affinity:
irq 43, cpu list 0
irq 44, cpu list 1
irq 45, cpu list 2
irq 46, cpu list 3
irq 47, cpu list 4
irq 48, cpu list 6
irq 49, cpu list 5
irq 50, cpu list 7
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
ashmem_mutex create a chain of dependencies like so:
(1)
mmap syscall ->
mmap_sem -> (acquired)
ashmem_mmap
ashmem_mutex (try to acquire)
(block)
(2)
llseek syscall ->
ashmem_llseek ->
ashmem_mutex -> (acquired)
inode_lock ->
inode->i_rwsem (try to acquire)
(block)
(3)
getdents ->
iterate_dir ->
inode_lock ->
inode->i_rwsem (acquired)
copy_to_user ->
mmap_sem (try to acquire)
There is a lock ordering created between mmap_sem and inode->i_rwsem
causing a lockdep splat [2] during a syzcaller test, this patch fixes
the issue by unlocking the mutex earlier. Functionally that's Ok since
we don't need to protect vfs_llseek.
[1] https://patchwork.kernel.org/patch/10185031/
[2] https://lkml.org/lkml/2018/1/10/48
Acked-by: Todd Kjos <tkjos@google.com>
Cc: Arve Hjonnevag <arve@android.com>
Cc: stable@vger.kernel.org
Reported-by: syzbot+8ec30bb7bf1a981a2012@syzkaller.appspotmail.com
Signed-off-by: Joel Fernandes <joelaf@google.com>
Acked-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Triggering PPC EEH detection and handling requires a memory mapped read
failure. The NVMe driver removed the periodic health check MMIO, so
there's no early detection mechanism to trigger the recovery. Instead,
the detection now happens when the nvme driver handles an IO timeout
event. This takes the pci channel offline, so we do not want the driver
to proceed with escalating its own recovery efforts that may conflict
with the EEH handler.
This patch ensures the driver will observe the channel was set to offline
after a failed MMIO read and resets the IO timer so the EEH handler has
a chance to recover the device.
Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
[updated change log]
Signed-off-by: Keith Busch <keith.busch@intel.com>
Pull sound fixes from Takashi Iwai:
"The only core change is the fix for possible memory corruption by ALSA
ctl API since 4.14 kernel due to a thinko.
The rest are all device-specific: in addition to the usual suspects
(HD-audio and USB-audio fixups), a few LPE HDMI audio fixes came in at
this time"
* tag 'sound-4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: x86: Fix potential crash at error path
ALSA: x86: Fix missing spinlock and mutex initializations
ALSA: control: Fix memory corruption risk in snd_ctl_elem_read
ALSA: hda - Fix pincfg at resume on Lenovo T470 dock
ALSA: usb-audio: Add a quirck for B&W PX headphones
ALSA: hda: Add a power_save blacklist
ALSA: x86: hdmi: Add single_port option for compatible behavior
Pull pin control fixes from Linus Walleij:
"Two smallish pin control fixes: one actual code fix for the Meson and
a MAINTAINERS update.
Summary:
- fix a pin group on the Meson
- assign maintainers for Freescale/NXP pin controllers"
* tag 'pinctrl-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
MAINTAINERS: add Freescale pin controllers
pinctrl: meson-axg: adjust uart_ao_b pin group naming
Pull GPIO fixes from Linus Walleij:
"Fix up device tree properties readout caused by my own refactorings"
* tag 'gpio-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: Handle deferred probing in of_find_gpio() properly
gpiolib: Keep returning EPROBE_DEFER when we should
Fix a typo in pkt_start_recovery.
Fixes: 74d46992e0 ("block: replace bi_bdev with a gendisk pointer and partitions index")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
bio_devname use __bdevname to display the device name, and can
only show the major and minor of the part0,
Fix this by using disk_name to display the correct name.
Fixes: 74d46992e0 ("block: replace bi_bdev with a gendisk pointer and partitions index")
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There is lack of cache destroy operation for ceph_file_cachep
when failing from fscache register.
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
In case of a failed write request (all retries failed) and when using
libata, the SCSI error handler calls scsi_finish_command(). In the
case of blk-mq this means that scsi_mq_done() does not get called,
that blk_mq_complete_request() does not get called and also that the
mq-deadline .completed_request() method is not called. This results in
the target zone of the failed write request being left in a locked
state, preventing that any new write requests are issued to the same
zone.
Fix this by replacing the .completed_request() method with the
.finish_request() method as this method is always called whether or
not a request completes successfully. Since the .finish_request()
method is only called by the blk-mq core if a .prepare_request()
method exists, add a dummy .prepare_request() method.
Fixes: 5700f69178 ("mq-deadline: Introduce zone locking support")
Cc: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
[ bvanassche: edited patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We already have xmalloc(), xcalloc(), and xrealloc((). Add xstrdup()
as well to save tedious error handling.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Currently, sparse issues warnings on code using an attribute
it doesn't know about.
One of the problem with this is that these warnings have no
value for the developer, it's just noise for him. At best these
warnings tell something about some deficiencies of sparse itself
but not about a potential problem with code analyzed.
A second problem with this is that sparse release are, alas,
less frequent than new attributes are added to GCC.
So, avoid the noise by asking sparse to not warn about
attributes it doesn't know about.
Reference: https://marc.info/?l=linux-sparse&m=151871600016790
Reference: https://marc.info/?l=linux-sparse&m=151871725417322
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The comment above the silentoldconfig invocation is outdated.
'make oldconfig' updates just .config and doesn't touch the
include/config/ tree.
This came up in https://lkml.org/lkml/2018/2/12/415.
While fixing the comment, make it more informative by explaining the
purpose of the unfortunately named silentoldconfig.
I can't make sense of the comment re. auto.conf.cmd and a cleaned tree.
include/config/auto.conf and include/config/auto.conf.cmd are both
created simultaneously by silentoldconfig (in
scripts/kconfig/confdata.c, by conf_write_autoconf()), and nothing seems
to remove auto.conf.cmd that wouldn't remove auto.conf. Remove that part
of the comment rather than blindly copying it. It might be a leftover
from an older way of doing things.
The include/config/auto.conf.cmd prerequisite might be there to ensure
that silentoldconfig gets rerun if conf_write_autoconf() fails between
writing out auto.conf.cmd and auto.conf (a comment in the function
indicates that auto.conf is deliberately written out last to mark
completion of the operation). It seems the Makefile dependency between
include/config/auto.conf and .config would already take care of that
though, since include/config/auto.conf would still be out of date re.
.config if the operation fails.
Cop out and leave the prerequisite in for now.
Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
If we have a file with 2 (or more) hard links in the same directory,
remove one of the hard links, create a new file (or link an existing file)
in the same directory with the name of the removed hard link, and then
finally fsync the new file, we end up with a log that fails to replay,
causing a mount failure.
Example:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ mkdir /mnt/testdir
$ touch /mnt/testdir/foo
$ ln /mnt/testdir/foo /mnt/testdir/bar
$ sync
$ unlink /mnt/testdir/bar
$ touch /mnt/testdir/bar
$ xfs_io -c "fsync" /mnt/testdir/bar
<power failure>
$ mount /dev/sdb /mnt
mount: mount(2) failed: /mnt: No such file or directory
When replaying the log, for that example, we also see the following in
dmesg/syslog:
[71813.671307] BTRFS info (device dm-0): failed to delete reference to bar, inode 258 parent 257
[71813.674204] ------------[ cut here ]------------
[71813.675694] BTRFS: Transaction aborted (error -2)
[71813.677236] WARNING: CPU: 1 PID: 13231 at fs/btrfs/inode.c:4128 __btrfs_unlink_inode+0x17b/0x355 [btrfs]
[71813.679669] Modules linked in: btrfs xfs f2fs dm_flakey dm_mod dax ghash_clmulni_intel ppdev pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper evdev psmouse i2c_piix4 parport_pc i2c_core pcspkr sg serio_raw parport button sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod ata_generic sd_mod virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel floppy virtio e1000 scsi_mod [last unloaded: btrfs]
[71813.679669] CPU: 1 PID: 13231 Comm: mount Tainted: G W 4.15.0-rc9-btrfs-next-56+ #1
[71813.679669] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[71813.679669] RIP: 0010:__btrfs_unlink_inode+0x17b/0x355 [btrfs]
[71813.679669] RSP: 0018:ffffc90001cef738 EFLAGS: 00010286
[71813.679669] RAX: 0000000000000025 RBX: ffff880217ce4708 RCX: 0000000000000001
[71813.679669] RDX: 0000000000000000 RSI: ffffffff81c14bae RDI: 00000000ffffffff
[71813.679669] RBP: ffffc90001cef7c0 R08: 0000000000000001 R09: 0000000000000001
[71813.679669] R10: ffffc90001cef5e0 R11: ffffffff8343f007 R12: ffff880217d474c8
[71813.679669] R13: 00000000fffffffe R14: ffff88021ccf1548 R15: 0000000000000101
[71813.679669] FS: 00007f7cee84c480(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000
[71813.679669] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[71813.679669] CR2: 00007f7cedc1abf9 CR3: 00000002354b4003 CR4: 00000000001606e0
[71813.679669] Call Trace:
[71813.679669] btrfs_unlink_inode+0x17/0x41 [btrfs]
[71813.679669] drop_one_dir_item+0xfa/0x131 [btrfs]
[71813.679669] add_inode_ref+0x71e/0x851 [btrfs]
[71813.679669] ? __lock_is_held+0x39/0x71
[71813.679669] ? replay_one_buffer+0x53/0x53a [btrfs]
[71813.679669] replay_one_buffer+0x4a4/0x53a [btrfs]
[71813.679669] ? rcu_read_unlock+0x3a/0x57
[71813.679669] ? __lock_is_held+0x39/0x71
[71813.679669] walk_up_log_tree+0x101/0x1d2 [btrfs]
[71813.679669] walk_log_tree+0xad/0x188 [btrfs]
[71813.679669] btrfs_recover_log_trees+0x1fa/0x31e [btrfs]
[71813.679669] ? replay_one_extent+0x544/0x544 [btrfs]
[71813.679669] open_ctree+0x1cf6/0x2209 [btrfs]
[71813.679669] btrfs_mount_root+0x368/0x482 [btrfs]
[71813.679669] ? trace_hardirqs_on_caller+0x14c/0x1a6
[71813.679669] ? __lockdep_init_map+0x176/0x1c2
[71813.679669] ? mount_fs+0x64/0x10b
[71813.679669] mount_fs+0x64/0x10b
[71813.679669] vfs_kern_mount+0x68/0xce
[71813.679669] btrfs_mount+0x13e/0x772 [btrfs]
[71813.679669] ? trace_hardirqs_on_caller+0x14c/0x1a6
[71813.679669] ? __lockdep_init_map+0x176/0x1c2
[71813.679669] ? mount_fs+0x64/0x10b
[71813.679669] mount_fs+0x64/0x10b
[71813.679669] vfs_kern_mount+0x68/0xce
[71813.679669] do_mount+0x6e5/0x973
[71813.679669] ? memdup_user+0x3e/0x5c
[71813.679669] SyS_mount+0x72/0x98
[71813.679669] entry_SYSCALL_64_fastpath+0x1e/0x8b
[71813.679669] RIP: 0033:0x7f7cedf150ba
[71813.679669] RSP: 002b:00007ffca71da688 EFLAGS: 00000206
[71813.679669] Code: 7f a0 e8 51 0c fd ff 48 8b 43 50 f0 0f ba a8 30 2c 00 00 02 72 17 41 83 fd fb 74 11 44 89 ee 48 c7 c7 7d 11 7f a0 e8 38 f5 8d e0 <0f> ff 44 89 e9 ba 20 10 00 00 eb 4d 48 8b 4d b0 48 8b 75 88 4c
[71813.679669] ---[ end trace 83bd473fc5b4663b ]---
[71813.854764] BTRFS: error (device dm-0) in __btrfs_unlink_inode:4128: errno=-2 No such entry
[71813.886994] BTRFS: error (device dm-0) in btrfs_replay_log:2307: errno=-2 No such entry (Failed to recover log tree)
[71813.903357] BTRFS error (device dm-0): cleaner transaction attach returned -30
[71814.128078] BTRFS error (device dm-0): open_ctree failed
This happens because the log has inode reference items for both inode 258
(the first file we created) and inode 259 (the second file created), and
when processing the reference item for inode 258, we replace the
corresponding item in the subvolume tree (which has two names, "foo" and
"bar") witht he one in the log (which only has one name, "foo") without
removing the corresponding dir index keys from the parent directory.
Later, when processing the inode reference item for inode 259, which has
a name of "bar" associated to it, we notice that dir index entries exist
for that name and for a different inode, so we attempt to unlink that
name, which fails because the inode reference item for inode 258 no longer
has the name "bar" associated to it, making a call to btrfs_unlink_inode()
fail with a -ENOENT error.
Fix this by unlinking all the names in an inode reference item from a
subvolume tree that are not present in the inode reference item found in
the log tree, before overwriting it with the item from the log tree.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If in the same transaction we rename a special file (fifo, character/block
device or symbolic link), create a hard link for it having its old name
then sync the log, we will end up with a log that can not be replayed and
at when attempting to replay it, an EEXIST error is returned and mounting
the filesystem fails. Example scenario:
$ mkfs.btrfs -f /dev/sdc
$ mount /dev/sdc /mnt
$ mkdir /mnt/testdir
$ mkfifo /mnt/testdir/foo
# Make sure everything done so far is durably persisted.
$ sync
# Create some unrelated file and fsync it, this is just to create a log
# tree. The file must be in the same directory as our special file.
$ touch /mnt/testdir/f1
$ xfs_io -c "fsync" /mnt/testdir/f1
# Rename our special file and then create a hard link with its old name.
$ mv /mnt/testdir/foo /mnt/testdir/bar
$ ln /mnt/testdir/bar /mnt/testdir/foo
# Create some other unrelated file and fsync it, this is just to persist
# the log tree which was modified by the previous rename and link
# operations. Alternatively we could have modified file f1 and fsync it.
$ touch /mnt/f2
$ xfs_io -c "fsync" /mnt/f2
<power failure>
$ mount /dev/sdc /mnt
mount: mount /dev/sdc on /mnt failed: File exists
This happens because when both the log tree and the subvolume's tree have
an entry in the directory "testdir" with the same name, that is, there
is one key (258 INODE_REF 257) in the subvolume tree and another one in
the log tree (where 258 is the inode number of our special file and 257
is the inode for directory "testdir"). Only the data of those two keys
differs, in the subvolume tree the index field for inode reference has
a value of 3 while the log tree it has a value of 5. Because the same key
exists in both trees, but have different index, the log replay fails with
an -EEXIST error when attempting to replay the inode reference from the
log tree.
Fix this by setting the last_unlink_trans field of the inode (our special
file) to the current transaction id when a hard link is created, as this
forces logging the parent directory inode, solving the conflict at log
replay time.
A new generic test case for fstests was also submitted.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When doing an incremental send of a filesystem with the no-holes feature
enabled, we end up issuing a write operation when using the no data mode
send flag, instead of issuing an update extent operation. Fix this by
issuing the update extent operation instead.
Trivial reproducer:
$ mkfs.btrfs -f -O no-holes /dev/sdc
$ mkfs.btrfs -f /dev/sdd
$ mount /dev/sdc /mnt/sdc
$ mount /dev/sdd /mnt/sdd
$ xfs_io -f -c "pwrite -S 0xab 0 32K" /mnt/sdc/foobar
$ btrfs subvolume snapshot -r /mnt/sdc /mnt/sdc/snap1
$ xfs_io -c "fpunch 8K 8K" /mnt/sdc/foobar
$ btrfs subvolume snapshot -r /mnt/sdc /mnt/sdc/snap2
$ btrfs send /mnt/sdc/snap1 | btrfs receive /mnt/sdd
$ btrfs send --no-data -p /mnt/sdc/snap1 /mnt/sdc/snap2 \
| btrfs receive -vv /mnt/sdd
Before this change the output of the second receive command is:
receiving snapshot snap2 uuid=f6922049-8c22-e544-9ff9-fc6755918447...
utimes
write foobar, offset 8192, len 8192
utimes foobar
BTRFS_IOC_SET_RECEIVED_SUBVOL uuid=f6922049-8c22-e544-9ff9-...
After this change it is:
receiving snapshot snap2 uuid=564d36a3-ebc8-7343-aec9-bf6fda278e64...
utimes
update_extent foobar: offset=8192, len=8192
utimes foobar
BTRFS_IOC_SET_RECEIVED_SUBVOL uuid=564d36a3-ebc8-7343-aec9-bf6fda278e64...
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The fs_info::super_copy is a byte copy of the on-disk structure and all
members must use the accessor macros/functions to obtain the right
value. This was missing in update_super_roots and in sysfs readers.
Moving between opposite endianness hosts will report bogus numbers in
sysfs, and mount may fail as the root will not be restored correctly. If
the filesystem is always used on a same endian host, this will not be a
problem.
Fix this by using the btrfs_set_super...() functions to set
fs_info::super_copy values, and for the sysfs, use the cached
fs_info::nodesize/sectorsize values.
CC: stable@vger.kernel.org
Fixes: df93589a17 ("btrfs: export more from FS_INFO to sysfs")
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
In case of using DUP, we search for enough unallocated disk space on a
device to hold two stripes.
The devices_info[ndevs-1].max_avail that holds the amount of unallocated
space found is directly assigned to stripe_size, while it's actually
twice the stripe size.
Later on in the code, an unconditional division of stripe_size by
dev_stripes corrects the value, but in the meantime there's a check to
see if the stripe_size does not exceed max_chunk_size. Since during this
check stripe_size is twice the amount as intended, the check will reduce
the stripe_size to max_chunk_size if the actual correct to be used
stripe_size is more than half the amount of max_chunk_size.
The unconditional division later tries to correct stripe_size, but will
actually make sure we can't allocate more than half the max_chunk_size.
Fix this by moving the division by dev_stripes before the max chunk size
check, so it always contains the right value, instead of putting a duct
tape division in further on to get it fixed again.
Since in all other cases than DUP, dev_stripes is 1, this change only
affects DUP.
Other attempts in the past were made to fix this:
* 37db63a400 "Btrfs: fix max chunk size check in chunk allocator" tried
to fix the same problem, but still resulted in part of the code acting
on a wrongly doubled stripe_size value.
* 86db25785a "Btrfs: fix max chunk size on raid5/6" unintentionally
broke this fix again.
The real problem was already introduced with the rest of the code in
73c5de0051.
The user visible result however will be that the max chunk size for DUP
will suddenly double, while it's actually acting according to the limits
in the code again like it was 5 years ago.
Reported-by: Naohiro Aota <naohiro.aota@wdc.com>
Link: https://www.spinics.net/lists/linux-btrfs/msg69752.html
Fixes: 73c5de0051 ("btrfs: quasi-round-robin for chunk allocation")
Fixes: 86db25785a ("Btrfs: fix max chunk size on raid5/6")
Signed-off-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update comment ]
Signed-off-by: David Sterba <dsterba@suse.com>
Essentially duplicate the error handling from the above block which
handles the !PageUptodate(page) case and additionally clear
EXTENT_BOUNDARY.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
add_pending_csums was added as part of the new data=ordered
implementation in e6dcd2dc9c ("Btrfs: New data=ordered
implementation"). Even back then it called the btrfs_csum_file_blocks
which can fail but it never bothered handling the failure. In ENOMEM
situation this could lead to the filesystem failing to write the
checksums for a particular extent and not detect this. On read this
could lead to the filesystem erroring out due to crc mismatch. Fix it by
propagating failure from add_pending_csums and handling them.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The srcu_struct in btrfs_fs_info scales in size with NR_CPUS. On
kernels built with NR_CPUS=8192, this can result in kmalloc failures
that prevent mounting.
There is work in progress to try to resolve this for every user of
srcu_struct but using kvzalloc will work around the failures until
that is complete.
As an example with NR_CPUS=512 on x86_64: the overall size of
subvol_srcu is 3460 bytes, fs_info is 6496.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The intel-hid device will not be able to wake up the system any more
after removing the notify handler provided by its driver, so make
its sysfs attributes reflect that.
Fixes: ef884112e5 (platform: x86: intel-hid: Wake up the system from suspend-to-idle)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
The intel-vbtn device will not be able to wake up the system any more
after removing the notify handler provided by its driver, so make
its sysfs attributes reflect that.
Fixes: 91f9e850d4 (platform: x86: intel-vbtn: Wake up the system from suspend-to-idle)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
The separation of the cpu_entry_area from the fixmap missed the fact that
on 32bit non-PAE kernels the cpu_entry_area mapping might not be covered in
initial_page_table by the previous synchronizations.
This results in suspend/resume failures because 32bit utilizes initial page
table for resume. The absence of the cpu_entry_area mapping results in a
triple fault, aka. insta reboot.
With PAE enabled this works by chance because the PGD entry which covers
the fixmap and other parts incindentally provides the cpu_entry_area
mapping as well.
Synchronize the initial page table after setting up the cpu entry
area. Instead of adding yet another copy of the same code, move it to a
function and invoke it from the various places.
It needs to be investigated if the existing calls in setup_arch() and
setup_per_cpu_areas() can be replaced by the later invocation from
setup_cpu_entry_areas(), but that's beyond the scope of this fix.
Fixes: 92a0f81d89 ("x86/cpu_entry_area: Move it out of the fixmap")
Reported-by: Woody Suwalski <terraluna977@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Woody Suwalski <terraluna977@gmail.com>
Cc: William Grant <william.grant@canonical.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1802282137290.1392@nanos.tec.linutronix.de
Back in the early days when gru devices were still under development
we found an issue where the WiFi reset line needed to be configured as
early as possible during the boot process to avoid the WiFi module
being in a bad state.
We found that the way to get the kernel to do this in the earliest
possible place was to configure this line in the pinctrl hogs, so
that's what we did. For some history here you can see
<http://crosreview.com/368770>. After the time that change landed in
the kernel, we landed a firmware change to configure this line even
earlier. See <http://crosreview.com/399919>. However, even after the
firmware change landed we kept the kernel change to deal with the fact
that some people working on devices might take a little while to
update their firmware.
At this there are definitely zero devices out in the wild that have
firmware without the fix in it. Specifically looking in the firmware
branch several critically important fixes for memory stability landed
after the patch in coreboot and I know we didn't ship without those.
Thus, by now, everyone should have the new firmware and it's safe to
not have the kernel set this up in a pinctrl hog.
Historically, even though it wasn't needed to have this in a pinctrl
hog, we still kept it since it didn't hurt. Pinctrl would apply the
default hog at bootup and then would never touch things again. That
all changed with commit 981ed1bfbc ("pinctrl: Really force states
during suspend/resume"). After that commit then we'll re-apply the
default hog at resume time and that can screw up the reset state of
WiFi. ...and on rk3399 if you touch a device on PCIe in the wrong way
then the whole system can go haywire. That's what was happening.
Specifically you'd resume a rk3399-gru-* device and it would mostly
resume, then would crash with some crazy weird crash.
One could say, perhaps, that the recent pinctrl change was at fault
(and should be fixed) since it changed behavior. ...but that's not
really true. The device tree for rk3399-gru is really to blame.
Specifically since the pinctrl is defined in the hog and not in the
"wlan-pd-n" node then the actual user of this pin doesn't have a
pinctrl entry for it. That's bad.
Let's fix our problems by just moving the control of
"wlan_module_reset_l pinctrl" out of the hog and put them in the
proper place.
NOTE: in theory, I think it should actually be possible to have a pin
controlled _both_ by the hog and by an actual device. Once the device
claims the pin I think the hog is supposed to let go. I'm not 100%
sure that this works and in any case this solution would be more
complex than is necessary.
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Fixes: 48f4d9796d ("arm64: dts: rockchip: add Gru/Kevin DTS")
Fixes: 981ed1bfbc ("pinctrl: Really force states during suspend/resume")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
When IPsec offloading was introduced, we accidentally incremented
the sequence number counter on the xfrm_state by one packet
too much in the ESN case. This leads to a sequence number gap of
one packet after each GSO packet. Fix this by setting the sequence
number to the correct value.
Fixes: d7dbefc45c ("xfrm: Add xfrm_replay_overflow functions for offloading")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
We are using test_and_* operations on the status and flag fields of
struct sock_mapping. However, these functions require the operand to be
64-bit aligned on arm64. Currently, only status is 64-bit aligned.
Make status and flags explicitly 64-bit aligned.
Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
A few misc fixes for 4.16.
* 'drm-fixes-4.16' of git://people.freedesktop.org/~agd5f/linux:
drm/amdgpu: skip ECC for SRIOV in gmc late_init
drm/amd/amdgpu: Correct VRAM width for APUs with GMC9
drm/amdgpu: fix&cleanups for wb_clear
drm/amdgpu: Correct sdma_v4 get_wptr(v2)
drm/amd/powerplay: fix power over limit on Fiji
drm/amdgpu:Fixed wrong emit frame size for enc
drm/amdgpu: move WB_FREE to correct place
drm/amdgpu: only flush hotplug work without DC
drm/amd/display: check for ipp before calling cursor operations
Two regression fixes here: a fb format regression on nouveau and a 4.16-rc1
regression with on LVDS with one sun4i device. Plus a sun4i and a virtio-gpu
fixes.
* tag 'drm-misc-fixes-2018-02-28' of git://people.freedesktop.org/drm-misc:
virtio-gpu: fix ioctl and expose the fixed status to userspace.
drm/sun4i: Protect the TCON pixel clocks
drm/sun4i: Enable the output on the pins (tcon0)
drm/nouveau: prefer XBGR2101010 for addfb ioctl
- 2 display fixes: audio av_enc_map overflow check, and Cannonlake PLL related register offset.
- 3 gem fixes: Clear for in-fence out-fence, fix for clearing exec_flags on execbuf failure, and add back global seqno to tracepoints that had been removed recently by other fence related patch.
* tag 'drm-intel-fixes-2018-02-28' of git://anongit.freedesktop.org/drm/drm-intel:
drm/i915: Make global seqno known in i915_gem_request_execute tracepoint
drm/i915: Clear the in-use marker on execbuf failure
drm/i915/cnl: Fix PORT_TX_DW5/7 register address
drm/i915/audio: fix check for av_enc_map overflow
drm/i915: Fix rsvd2 mask when out-fence is returned
Pull ARM SoC fixes from Arnd Bergmann:
"This is the first set of bugfixes for ARM SoCs, fixing a couple of
stability problems, mostly on TI OMAP and Rockchips platforms:
- OMAP2 hwmod clocks must be enabled in the correct order
- OMAP3 Wakeup from resume through PRM IRQ was unreliable
- one regression on OMAP5 caused by a kexec fix
- Rockchip ethernet needs some settings for stable operation on
Rock64
- Rockchip based Chrombook Plus needs another clock setting for
stable display suspend/resume
- Rockchip based phyCORE-RK3288 was able to run at an invalid CPU
clock frequency
- Rockchip MMC link was sometimes unreliable
- multiple fixes to avoid crashes in the Broadcom STB DPFE driver
Other minor changes include:
- Devicetree fixes for incorrect hardware description (rockchip,
omap, Gemini, amlogic)
- some MAINTAINER file updates to correct email and git addresses
- some fixes addressing 'make W=1' dtc warnings (broadcom, amlogic,
cavium, qualcomm, hisilicon, zx)
- fixes for LTO-compilation (orion, davinci, clps711x)
- one fix for an incorrect Kconfig errata selection
- a memory leak in the OMAP timer driver
- a kernel data leak in OMAP1 debugfs files"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (38 commits)
MAINTAINERS: update entries for ARM/STM32
ARM: dts: bcm283x: Move arm-pmu out of soc node
ARM: dts: bcm283x: Fix unit address of local_intc
ARM: dts: NSP: Fix amount of RAM on BCM958625HR
ARM: dts: Set D-Link DNS-313 SATA to muxmode 0
ARM: omap2: set CONFIG_LIRC=y in defconfig
ARM: dts: imx6dl: Include correct dtsi file for Engicam i.CoreM6 DualLite/Solo RQS
memory: brcmstb: dpfe: support new way of passing data from the DCPU
memory: brcmstb: dpfe: fix type declaration of variable "ret"
memory: brcmstb: dpfe: properly mask vendor error bits
ARM: BCM: dts: Remove leading 0x and 0s from bindings notation
ARM: orion: fix orion_ge00_switch_board_info initialization
ARM: davinci: mark spi_board_info arrays as const
ARM: clps711x: mark clps711x_compat as const
arm: zx: dts: Remove leading 0x and 0s from bindings notation
arm64: dts: Remove leading 0x and 0s from bindings notation
arm64: dts: cavium: fix PCI bus dtc warnings
MAINTAINERS: ARM: at91: update my email address
soc: imx: gpc: de-register power domains only if initialized
ARM: dts: rockchip: Fix DWMMC clocks
...
Pull RISC-V fix from Palmer Dabbelt:
"This week we have a single fix: replacing smp_mb() with __smp_mb().
We were the only architecture with smp_mb() and it appears to just be
clearly wrong, so I think this is a pretty safe patch for an RC"
* tag 'riscv-for-linus-4.16-rc4_smp_mb' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
riscv/barrier: Define __smp_{mb,rmb,wmb}
On CPU hotunplug the enqueued timers of the unplugged CPU are migrated to a
live CPU. This happens from the control thread which initiated the unplug.
If the CPU on which the control thread runs came out from a longer idle
period then the base clock of that CPU might be stale because the control
thread runs prior to any event which forwards the clock.
In such a case the timers from the unplugged CPU are queued on the live CPU
based on the stale clock which can cause large delays due to increased
granularity of the outer timer wheels which are far away from base:;clock.
But there is a worse problem than that. The following sequence of events
illustrates it:
- CPU0 timer1 is queued expires = 59969 and base->clk = 59131.
The timer is queued at wheel level 2, with resulting expiry time = 60032
(due to level granularity).
- CPU1 enters idle @60007, with next timer expiry @60020.
- CPU0 is hotplugged at @60009
- CPU1 exits idle and runs the control thread which migrates the
timers from CPU0
timer1 is now queued in level 0 for immediate handling in the next
softirq because the requested expiry time 59969 is before CPU1 base->clk
60007
- CPU1 runs code which forwards the base clock which succeeds because the
next expiring timer. which was collected at idle entry time is still set
to 60020.
So it forwards beyond 60007 and therefore misses to expire the migrated
timer1. That timer gets expired when the wheel wraps around again, which
takes between 63 and 630ms depending on the HZ setting.
Address both problems by invoking forward_timer_base() for the control CPUs
timer base. All other places, which might run into a similar problem
(mod_timer()/add_timer_on()) already invoke forward_timer_base() to avoid
that.
[ tglx: Massaged comment and changelog ]
Fixes: a683f390b9 ("timers: Forward the wheel clock whenever possible")
Co-developed-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Lingutla Chandrasekhar <clingutla@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: linux-arm-msm@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180118115022.6368-1-clingutla@codeaurora.org
Pull "Broadcom drivers fixes for 4.16" from Florian Fainelli:
This pull request contains Broadcom SoCs drivers fixes for 4.16, please
pull the following:
- Markus provides two minor fixes to the Broadcom STB DPFE driver, one
to properly mask bits, and a second one to use the correct type. The
third commit is a consequence of a newer DFPE firmware which would
unfortunately crash without appropriate kernel changes.
* tag 'arm-soc/for-4.16/drivers-fixes' of https://github.com/Broadcom/stblinux:
memory: brcmstb: dpfe: support new way of passing data from the DCPU
memory: brcmstb: dpfe: fix type declaration of variable "ret"
memory: brcmstb: dpfe: properly mask vendor error bits
Pull "Broadcom devicetree fixes for 4.16" from Florian Fainelli:
This pull request contains Broadcom ARM-based SoCs Device Tree fixes for
4.16, please pull the following:
- Mathieu fixes leading 0x and 0's from bindings and Device Tree source
files, he has done this treewide and most of his changes are already in
4.16
- Stefan provides two changes to the BCM283x DTS files in order to fix
DTC warnings
- Florian fixes the amount of RAM on the BCM958625HR reference board to
properly limit to what is initialized by the bootloader
* tag 'arm-soc/for-4.16/devicetree-fixes' of https://github.com/Broadcom/stblinux:
ARM: dts: bcm283x: Move arm-pmu out of soc node
ARM: dts: bcm283x: Fix unit address of local_intc
ARM: dts: NSP: Fix amount of RAM on BCM958625HR
ARM: BCM: dts: Remove leading 0x and 0s from bindings notation
Pull "i.MX fixes for 4.16" from Shawn Guo:
- Fix i.MX GPC driver to remove power domains only when they are
initialized in imx_gpc_probe().
- Fix the broken Engicam i.CoreM6 DualLite/Solo RQS board DT to include
imx6dl.dtsi instead of imx6q.dtsi.
* tag 'imx-fixes-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx6dl: Include correct dtsi file for Engicam i.CoreM6 DualLite/Solo RQS
soc: imx: gpc: de-register power domains only if initialized
Pull kselftest fixes from Shuah Khan:
"Fixes for various problems in test output, compile errors, and missing
configs"
* tag 'linux-kselftest-4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: vm: update .gitignore with new test
selftests: memory-hotplug: silence test command echo
selftests/futex: Fix line continuation in Makefile
selftests: memfd: add config fragment for fuse
selftests: pstore: Adding config fragment CONFIG_PSTORE_RAM=m
selftests/android: Fix line continuation in Makefile
selftest/vDSO: fix O=
selftests: sync: missing CFLAGS while compiling
fix:
should do right shift on wb before clearing
cleanups:
1,should memset all wb buffer
2,set max wb number to 128 (total 4KB) is big enough
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
power containment disabled only on Fiji and compute
power profile. It violates PCIe spec and may cause power
supply failed. Enabling it will fix the issue, even the
fix will drop performance of some compute tests.
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Emit frame size should match with corresponding function,
uvd_v6_0_enc_ring_emit_vm_flush has 5 amdgpu_ring_write
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
WB_FREE should be put after all engines's hw_fini
done, otherwise the invalid wptr/rptr_addr would still
be used by engines which trigger abnormal bugs.
This fixes couple DMAR reading error in host side for SRIOV
after guest kmd is unloaded.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Currently all cursor related functions are made to all
pipes that are attached to a particular stream.
This is not applicable to pipes that do not have cursor plane
initialised like underlay.
Hence this patch allows cursor related operations on a pipe
only if ipp in available on that particular pipe.
The check is added to set_cursor_position & set_cursor_attribute.
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Pull xfs fixes from Darrick Wong:
- fix some compiler warnings
- fix block reservations for transactions created during log recovery
- fix resource leaks when respecifying mount options
* tag 'xfs-4.16-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix potential memory leak in mount option parsing
xfs: reserve blocks for refcount / rmap log item recovery
xfs: use memset to initialize xfs_scrub_agfl_info
Today the tty0 and hvc0 consoles are added as a preferred consoles for
pv domUs only. As this requires a boot parameter for getting dom0
messages per default, add them for dom0, too.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
A toolstack may delete the vif frontend and backend xenstore entries
while xen-netfront is in the removal code path. In that case, the
checks for xenbus_read_driver_state would return XenbusStateUnknown, and
xennet_remove would hang indefinitely. This hang prevents system
shutdown.
xennet_remove must be able to handle XenbusStateUnknown, and
netback_changed must also wake up the wake_queue for that state as well.
Fixes: 5b5971df3b ("xen-netfront: remove warning when unloading module")
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Cc: Eduardo Otubo <otubo@redhat.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Current cleanup in the error path of xen_bind_pirq_msi_to_irq is
wrong. First of all there's an off-by-one in the cleanup loop, which
can lead to unbinding wrong IRQs.
Secondly IRQs not bound won't be freed, thus leaking IRQ numbers.
Note that there's no need to differentiate between bound and unbound
IRQs when freeing them, __unbind_from_irq will deal with both of them
correctly.
Fixes: 4892c9b4ad ("xen: add support for MSI message groups")
Reported-by: Hooman Mirhadi <mirhadih@amazon.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Amit Shah <aams@amazon.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Pull NVMe fixes from Keith for 4.16-rc.
* 'for-jens' of git://git.infradead.org/nvme:
nvmet: fix PSDT field check in command format
nvme-multipath: fix sysfs dangerously created links
nvme-pci: Fix nvme queue cleanup if IRQ setup fails
nvmet-loop: use blk_rq_payload_bytes for sgl selection
nvme-rdma: use blk_rq_payload_bytes instead of blk_rq_bytes
nvme-fabrics: don't check for non-NULL module in nvmf_register_transport
Pull dma-mapping fix from Christoph Hellwig:
"A single fix for a memory leak regression in the dma-debug code"
* tag 'dma-mapping-4.16-3' of git://git.infradead.org/users/hch/dma-mapping:
dma-debug: fix memory leak in debug_dma_alloc_coherent
Release the netdev references in the cleanup path. Invokes the cleanup
routines if bnxt_re_ib_reg fails.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
To support host systems with non 4K page size, l2_db_size shall be
calculated with 4096 instead of PAGE_SIZE. Also, supply the host page size
to FW during initialization.
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
HW requires an unconditonal fence for all non-wire memory operations
through SQ. This guarantees the completions of these memory operations.
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
During IB device registration process, if query_device() fails or if
ib_core fails to registers sysfs entries, rdma cgroup cleanup is
skipped.
Cc: <stable@vger.kernel.org> # v4.2+
Fixes: 4be3a4fa51 ("IB/core: Fix kernel crash during fail to initialize device")
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
IB spec says that a lid should be ignored when link layer is Ethernet,
for example when building or parsing a CM request message (CA17-34).
However, since ib_lid_be16() and ib_lid_cpu16() validates the slid,
not only when link layer is IB, we set the slid to zero to prevent
false warnings in the kernel log.
Fixes: 62ede77799 ("Add OPA extended LID support")
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
All other mlx5_events report the port number as 1 based, which is how FW
reports it in the port event EQE. Reporting 0 for this event causes
mlx5_ib to not raise a fatal event notification to registered clients
due to a seemingly invalid port.
All switch cases in mlx5_ib_event that go through the port check are
supposed to set the port now, so just do it once at variable
declaration.
Fixes: 89d44f0a6c73("net/mlx5_core: Add pci error handlers to mlx5_core driver")
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
During QP creation, the mlx5 driver translates the QP type to an
internal value which is passed on to FW. There was no check to make
sure that the translated value is valid, and -EINVAL was coerced into
the mailbox command.
Current firmware refuses this as an invalid QP type, but future/past
firmware may do something else.
Fixes: 09a7d9eca1 ('{net,IB}/mlx5: QP/XRCD commands via mlx5 ifc')
Reviewed-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The IPS_NAT_MASK check in 4.12 replaced previous check for nfct_nat()
which was needed to fix a crash in 2.6.36-rc, see
commit 7bcbf81a22 ("ipvs: avoid oops for passive FTP").
But as IPVS does not set the IPS_SRC_NAT and IPS_DST_NAT bits,
checking for IPS_NAT_MASK prevents PASV response to be properly
mangled and blocks the transfer. Remove the check as it is not
needed after 3.12 commit 41d73ec053 ("netfilter: nf_conntrack:
make sequence number adjustments usuable without NAT") which
changes nfct_nat() with nfct_seqadj() and especially after 3.13
commit b25adce160 ("ipvs: correct usage/allocation of seqadj
ext in ipvs").
Thanks to Li Shuang and Florian Westphal for reporting the problem!
Reported-by: Li Shuang <shuali@redhat.com>
Fixes: be7be6e161 ("netfilter: ipvs: fix incorrect conflict resolution")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
As we have option in u-boot to set CPU mask for running linux,
we want to pass information to kernel about CPU cores should
be brought up. So we patch kernel dtb in u-boot to set
possible-cpus property.
This also allows us to have correctly setuped MCIP debug mask.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
As of today we use hardcoded MCIP debug mask, so if we launch
kernel via debugger and kick fever cores than HW has all cpus
hang at the momemt of setup MCIP debug mask.
So update MCIP debug mask when the new cpu came online, instead of
use hardcoded MCIP debug mask.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
In SMP systems, GFRC is used for clocksource. However by default the
counter keeps running even when core is halted (say when debugging via a
JTAG debugger). This confuses Linux timekeeping and triggers flase RCU stall
splat such as below:
| [ARCLinux]# while true; do ./shm_open_23-1.run-test ; done
| Running with 1000 processes for 1000 objects
| hrtimer: interrupt took 485060 ns
|
| create_cnt: 1000
| Running with 1000 processes for 1000 objects
| [ARCLinux]# INFO: rcu_preempt self-detected stall on CPU
| 2-...: (1 GPs behind) idle=a01/1/0 softirq=135770/135773 fqs=0
| INFO: rcu_preempt detected stalls on CPUs/tasks:
| 0-...: (1 GPs behind) idle=71e/0/0 softirq=135264/135264 fqs=0
| 2-...: (1 GPs behind) idle=a01/1/0 softirq=135770/135773 fqs=0
| 3-...: (1 GPs behind) idle=4e0/0/0 softirq=134304/134304 fqs=0
| (detected by 1, t=13648 jiffies, g=31493, c=31492, q=1)
Starting from ARC HS v3.0 it's possible to tie GFRC to state of up-to 4
ARC cores with help of GFRC's CORE register where we set a mask for
cores which state we need to rely on.
We update cpu mask every time new cpu came online instead of using
hardcoded one or using mask generated from "possible_cpus" as we
want it set correctly even if we run kernel on HW which has fewer cores
than expected (or we launch kernel via debugger and kick fever cores
than HW has)
Note that GFRC halts when all cores have halted and thus relies on
programming of Inter-Core-dEbug register to halt all cores when one
halts.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: rewrote changelog]
Pull DeviceTree fixes from Rob Herring:
- update i.MX thermal binding example to use current binding, not the
deprecated one
- move arm-charlcd to auxdisplay/
- fix misspelling of "debounce-interval"
* tag 'devicetree-fixes-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: power: Fix "debounce-interval" property misspelling
auxdisplay: Move arm-charlcd binding to correct folder
dt-bindings: thermal: imx: update the binding to new method
Jiri Pirko says:
====================
mlxsw: couple of fixes
Couple of unrelated fixes for mlxsw.
---
v1->v2:
-patch 2:
- rebase on top of current -net tree
- removed forgotten empty line
-patch 3:
- new patch
-patch 4:
- new patch
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
One of the basic construct in the device is a port-VLAN pair, which can
be bound to a FID or a RIF in order to direct packets to the bridge or
the router, respectively.
Since not all the netdevs are configured with a VLAN (e.g., sw1p1 vs.
sw1p1.10), VID 1 is used to represent these and thus this VID can be
used by both upper devices of mlxsw ports and by the driver itself.
However, this VID is not reference counted and therefore might be freed
prematurely, which can result in various WARNINGs. For example:
$ ip link add name br0 type bridge vlan_filtering 1
$ teamd -t team0 -d -c '{"runner": {"name": "lacp"}}'
$ ip link set dev team0 master br0
$ ip link set dev enp1s0np1 master team0
$ ip address add 192.0.2.1/24 dev enp1s0np1
The enslavement to team0 will fail because team0 already has an upper
and thus vlan_vids_del_by_dev() will be executed as part of team's error
path which will delete VID 1 from enp1s0np1 (added by br0 as PVID). The
WARNING will be generated when the driver will realize it can't find VID
1 on the port and bind it to a RIF.
Fix this by adding a reference count to the VLAN entries on the port, in
a similar fashion to the reference counting used by the corresponding
'vlan_vid_info' structure in the 8021q driver.
Fixes: c57529e1d5 ("mlxsw: spectrum: Replace vPorts with Port-VLAN")
Reported-by: Tal Bar <talb@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Tal Bar <talb@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When multicast snooping is enabled, the Linux bridge resorts to flooding
unregistered multicast packets to all ports only in case it did not
detect a querier in the network.
The above condition is not reflected to underlying drivers, which is
especially problematic in IPv6 environments, as multicast snooping is
enabled by default and since neighbour solicitation packets might be
treated as unregistered multicast packets in case there is no
corresponding MDB entry.
Until the Linux bridge reflects its querier state to underlying drivers,
simply treat unregistered multicast packets as broadcast and allow them
to reach their destination.
Fixes: 9df552ef3e ("mlxsw: spectrum: Improve IPv6 unregistered multicast flooding")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current code uses global variables, adjusts them and passes pointer down
to devlink. With every other mlxsw_core instance, the previously passed
pointer values are rewritten. Fix this by de-globalize the variables and
also memcpy size_params during devlink resource registration.
Also, introduce a convenient size_param_init helper.
Fixes: ef3116e540 ("mlxsw: spectrum: Register KVD resources with devlink")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IP_TTL, IP_ECN and IP_DSCP are using the same offset within the
scratchpad as L4 ports. Fix this by shifting all up.
Fixes: 5f57e09091 ("mlxsw: acl: Add ip ttl acl element")
Fixes: i80d0fe4710c ("mlxsw: acl: Add ip tos acl element")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun says:
====================
net/smc: fixes 2018-02-28
here are 3 smc bug fixes for the net-tree. Karsten's first patch is
the reworked version of last week's
"[PATCH net-next 2/5] net/smc: fix structure size"
patch, now solved without using __packed, and now targetted for net
instead of net-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The CONFIRM LINK reply message must contain the link_id sent
by the server. And set the link_id explicitly when
initializing the link.
Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sizeof(struct smc_cdc_msg) evaluates to 48 bytes instead of the
required 44 bytes. We need to use the constant value of
SMC_WR_TX_SIZE to set and check the control message length.
Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We try to disable NAPI to prevent a single XDP TX queue being used by
multiple cpus. But we don't check if device is up (NAPI is enabled),
this could result stall because of infinite wait in
napi_disable(). Fixing this by checking device state through
netif_running() before.
Fixes: 4941d472bf ("virtio-net: do not reset during XDP set")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the Intel Edison module is powered with 3.3V, the reboot command makes
the module stuck. If the module is powered at a greater voltage, like 4.4V
(as the Edison Mini Breakout board does), reboot works OK.
The official Intel Edison BSP sends the IPCMSG_COLD_RESET message to the
SCU by default. The IPCMSG_COLD_BOOT which is used by the upstream kernel
is only sent when explicitely selected on the kernel command line.
Use IPCMSG_COLD_RESET unconditionally which makes reboot work independent
of the power supply voltage.
[ tglx: Massaged changelog ]
Fixes: bda7b072de ("x86/platform/intel-mid: Implement power off sequence")
Signed-off-by: Sebastian Panceac <sebastian@resin.io>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1519810849-15131-1-git-send-email-sebastian@resin.io
PSDT field section according to NVM_Express-1.3:
"This field specifies whether PRPs or SGLs are used for any data
transfer associated with the command. PRPs shall be used for all
Admin commands for NVMe over PCIe. SGLs shall be used for all Admin
and I/O commands for NVMe over Fabrics. This field shall be set to
01b for NVMe over Fabrics 1.0 implementations.
Suggested-by: Idan Burstein <idanb@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
For tests that are using the maximal number of BPF instruction, each
run takes 20 usec. Looping 10,000 times on them totals 200 ms, which
is bad when the loop is not preemptible.
test_bpf: #264 BPF_MAXINSNS: Call heavy transformations jited:1 19248
18548 PASS
test_bpf: #269 BPF_MAXINSNS: ld_abs+get_processor_id jited:1 20896 PASS
Lets divide by ten the number of iterations, so that max latency is
20ms. We could use need_resched() to break the loop earlier if we
believe 20 ms is too much.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
When the connection is reset, there is no point in
keeping the packets on the write queue until the connection
is closed.
RFC 793 (page 70) and RFC 793-bis (page 64) both suggest
purging the write queue upon RST:
https://tools.ietf.org/html/draft-ietf-tcpm-rfc793bis-07
Moreover, this is essential for a correct MSG_ZEROCOPY
implementation, because userspace cannot call close(fd)
before receiving zerocopy signals even when the connection
is reset.
Fixes: f214f915e7 ("tcp: enable MSG_ZEROCOPY")
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuchung Cheng says:
====================
tcp: revert a F-RTO extension due to broken middle-boxes
This patch series reverts a (non-standard) TCP F-RTO extension that aimed
to detect more spurious timeouts. Unfortunately it could result in poor
performance due to broken middle-boxes that modify TCP packets. E.g.
https://www.spinics.net/lists/netdev/msg484154.html
We believe the best and simplest solution is to just revert the change.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 89fe18e44f.
While the patch could detect more spurious timeouts, it could cause
poor TCP performance on broken middle-boxes that modifies TCP packets
(e.g. receive window, SACK options). Since the performance gain is
much smaller compared to the potential loss. The best solution is
to fully revert the change.
Fixes: 89fe18e44f ("tcp: extend F-RTO to catch more spurious timeouts")
Reported-by: Teodor Milkov <tm@del.bg>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit cc663f4d4c. While fixing
some broken middle-boxes that modifies receive window fields, it does not
address middle-boxes that strip off SACK options. The best solution is
to fully revert this patch and the root F-RTO enhancement.
Fixes: cc663f4d4c ("tcp: restrict F-RTO to work-around broken middle-boxes")
Reported-by: Teodor Milkov <tm@del.bg>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann says:
====================
s390/qeth: fixes 2018-02-27
please apply some more qeth patches for -net and stable.
One patch fixes a performance bug in the TSO path. Then there's several
more fixes for IP management on L3 devices - including a revert, so that
the subsequent fix cleanly applies to earlier kernels.
The final patch takes care of a race in the control IO code that causes
qeth to miss the cmd response, and subsequently trigger device recovery.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If multiple IPA commands are build & sent out concurrently,
fill_ipacmd_header() may assign a seqno value to a command that's
different from what send_control_data() later assigns to this command's
reply.
This is due to other commands passing through send_control_data(),
and incrementing card->seqno.ipa along the way.
So one IPA command has no reply that's waiting for its seqno, while some
other IPA command has multiple reply objects waiting for it.
Only one of those waiting replies wins, and the other(s) times out and
triggers a recovery via send_ipa_cmd().
Fix this by making sure that the same seqno value is assigned to
a command and its reply object.
Do so immediately before submitting the command & while holding the
irq_pending "lock", to produce nicely ascending seqnos.
As a side effect, *all* IPA commands now use a reply object that's
waiting for its actual seqno. Previously, early IPA commands that were
submitted while the card was still DOWN used the "catch-all" IDX seqno.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current code ("qeth_l3_ip_from_hash()") matches a queried address object
against objects in the IP table by IP address, Mask/Prefix Length and
MAC address ("qeth_l3_ipaddrs_is_equal()"). But what callers actually
require is either
a) "is this IP address registered" (ie. match by IP address only),
before adding a new address.
b) or "is this address object registered" (ie. match all relevant
attributes), before deleting an address.
Right now
1. the ADD path is too strict in its lookup, and eg. doesn't detect
conflicts between an existing NORMAL address and a new VIPA address
(because the NORMAL address will have mask != 0, while VIPA has
a mask == 0),
2. the DELETE path is not strict enough, and eg. allows del_rxip() to
delete a VIPA address as long as the IP address matches.
Fix all this by adding helpers (_addr_match_ip() and _addr_match_all())
that do the appropriate checking.
Note that the ADD path for NORMAL addresses is special, as qeth keeps
track of how many times such an address is in use (and there is no
immediate way of returning errors to the caller). So when a requested
NORMAL address _fully_ matches an existing one, it's not considered a
conflict and we merely increment the refcount.
Fixes: 5f78e29cee ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit cb816192d9.
The issue this attempted to fix never actually occurs.
l3_add_rxip() checks (via l3_ip_from_hash()) if the requested address
was previously added to the card. If so, it returns -EEXIST and doesn't
call l3_add_ip().
As a result, the "address exists" path in l3_add_ip() is never taken
for rxip addresses, and this patch had no effect.
Fixes: cb816192d9 ("s390/qeth: fix using of ref counter for rxip addresses")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Registering an IPv4 address with the HW takes quite a while, so we
temporarily drop the ip_htable lock. Any concurrent add/remove of the
same IP adjusts the IP's use count, and (on remove) is then blocked by
addr->in_progress.
After the register call has completed, we check the use count for
concurrently attempted add/remove calls - and possibly straight-away
deregister the IP again. This happens via l3_delete_ip(), which
1) looks up the queried IP in the htable (getting a reference to the
*same* queried object),
2) deregisters the IP from the HW, and
3) frees the IP object.
The caller in l3_add_ip() then does a second free on the same object.
For this case, skip all the extra checks and lookups in l3_delete_ip()
and just deregister & free the IP object ourselves.
Fixes: 5f78e29cee ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the HW is not reachable, then none of the IPs in qeth's internal
table has been registered with the HW yet. So when deleting such an IP,
there's no need to stage it for deregistration - just drop it from
the table.
This fixes the "add-delete-add" scenario on an offline card, where the
the second "add" merely increments the IP's use count. But as the IP is
still set to DISP_ADDR_DELETE from the previous "delete" step,
l3_recover_ip() won't register it with the HW when the card goes online.
Fixes: 5f78e29cee ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
qeth_get_elements_for_range() doesn't know how to handle a 0-length
range (ie. start == end), and returns 1 when it should return 0.
Such ranges occur on TSO skbs, where the L2/L3/L4 headers (and thus all
of the skb's linear data) are skipped when mapping the skb into regular
buffer elements.
This overestimation may cause several performance-related issues:
1. sub-optimal IO buffer selection, where the next buffer gets selected
even though the skb would actually still fit into the current buffer.
2. forced linearization, if the element count for a non-linear skb
exceeds QETH_MAX_BUFFER_ELEMENTS.
Rather than modifying qeth_get_elements_for_range() and adding overhead
to every caller, fix up those callers that are in risk of passing a
0-length range.
Fixes: 2863c61334 ("qeth: refactor calculation of SBALE count")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't include in the Rx bytecount of the packet sent up the stack:
the FCB (frame control block), and the padding bytes inserted by
the controller into the frame payload, nor the FCS. All these are
being pulled out of the skb by gfar_process_frame().
This issue is old, likely from the driver's beginnings, however
it was amplified by recent:
commit d903ec7711 ("gianfar: simplify FCS handling and fix memory leak")
which basically added the FCS to the Rx bytecount, and so brought
this to my attention.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 6de3f79112 ("arm_pmu: explicitly enable/disable SPIs at hotplug")
moved all of the arm_pmu IRQ enable/disable calls to the CPU hotplug hooks,
regardless of whether they are implemented as PPIs or SPIs. This can
lead to us sleeping from atomic context due to disable_irq blocking:
| BUG: sleeping function called from invalid context at kernel/irq/manage.c:112
| in_atomic(): 1, irqs_disabled(): 128, pid: 15, name: migration/1
| no locks held by migration/1/15.
| irq event stamp: 192
| hardirqs last enabled at (191): [<00000000803c2507>]
| _raw_spin_unlock_irq+0x2c/0x4c
| hardirqs last disabled at (192): [<000000007f57ad28>] multi_cpu_stop+0x9c/0x140
| softirqs last enabled at (0): [<0000000004ee1b58>]
| copy_process.isra.77.part.78+0x43c/0x1504
| softirqs last disabled at (0): [< (null)>] (null)
| CPU: 1 PID: 15 Comm: migration/1 Not tainted 4.16.0-rc3-salvator-x #1651
| Hardware name: Renesas Salvator-X board based on r8a7796 (DT)
| Call trace:
| dump_backtrace+0x0/0x140
| show_stack+0x14/0x1c
| dump_stack+0xb4/0xf0
| ___might_sleep+0x1fc/0x218
| __might_sleep+0x70/0x80
| synchronize_irq+0x40/0xa8
| disable_irq+0x20/0x2c
| arm_perf_teardown_cpu+0x80/0xac
Since the interrupt is always CPU-affine and this code is running with
interrupts disabled, we can just use disable_irq_nosync as we know there
isn't a concurrent invocation of the handler to worry about.
Fixes: 6de3f79112 ("arm_pmu: explicitly enable/disable SPIs at hotplug")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Omitting suffixes from instructions in AT&T mode is bad practice when
operand size cannot be determined by the assembler from register
operands, and is likely going to be warned about by upstream gas in the
future (mine does already). Add the missing suffixes here. Note that for
64-bit this means some operations change from being 32-bit to 64-bit.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/5A93F98702000078001ABACC@prv-mh.provo.novell.com
Omitting suffixes from instructions in AT&T mode is bad practice when
operand size cannot be determined by the assembler from register
operands, and is likely going to be warned about by upstream gas in the
future (mine does already). Add the single missing suffix here.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/5A93F96902000078001ABAC8@prv-mh.provo.novell.com
__gic_clocksource_init() extracts the GIC_CONFIG_COUNTBITS field from
read_gic_config() by right shifting the register value. The shift count is
determined by the most significant bit (__fls) of the bitmask which is
wrong as it shifts out the complete bitfield.
Use the least significant bit (__ffs) instead to shift the bitfield down to
bit 0.
Fixes: e07127a077 ("clocksource: mips-gic-timer: Use new GIC accessor functions")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: daniel.lezcano@linaro.org
Cc: paul.burton@imgtec.com
Link: https://lkml.kernel.org/r/20180228095610.50341-1-nbd@nbd.name
Only one of the two is really required, not both:
* have_rtscts or
* have_rtsgpio
In imx_rs485_config() this is done correctly, so RS485 is working,
just the error message is false.
Signed-off-by: Phil Eichinger <phil@zankapfel.net>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Fixes: b8f3bff057 ("serial: imx: Support common rs485 binding for RTS polarity"
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the TTY buffers fill up to the configured maximum, a system lockup
occurs:
[ 598.820128] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 598.825796] 0-...!: (1 GPs behind) idle=5a6/2/0 softirq=1974/1974 fqs=1
[ 598.832577] (detected by 3, t=62517 jiffies, g=296, c=295, q=126)
[ 598.838755] Task dump for CPU 0:
[ 598.841977] swapper/0 R running task 0 0 0 0x00000022
[ 598.849023] Call trace:
[ 598.851476] __switch_to+0x98/0xb0
[ 598.854870] (null)
This can be prevented by doing a dummy read of the RX data register.
This issue affects both HSCIF and SCIF ports. Reported for R-Car H3 ES2.0;
reproduced and fixed on H3 ES1.1. Probably affects other R-Car platforms
as well.
Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Ulrich Hecht <ulrich.hecht+renesas@gmail.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: stable <stable@vger.kernel.org>
Tested-by: Nguyen Viet Dung <dung.nguyen.aj@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Do not fail on multiport cards in serial_pci_is_class_communication().
It restores behaviour for SUNIX multiport cards, that enumerated by
class and have a custom board data.
Moreover it allows users to reenumerate port-by-port from user space.
Fixes: 7d8905d064 ("serial: 8250_pci: Enable device after we check black list")
Reported-by: Nikola Ciprich <nikola.ciprich@linuxbox.cz>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Nikola Ciprich <nikola.ciprich@linuxbox.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A tty is hung up by __tty_hangup() setting file->f_op to
hung_up_tty_fops, which is skipped on ttys whose write operation isn't
tty_write(). This means that, for example, /dev/console whose write
op is redirected_tty_write() is never actually marked hung up.
Because n_tty_read() uses the hung up status to decide whether to
abort the waiting readers, the lack of hung-up marking can lead to the
following scenario.
1. A session contains two processes. The leader and its child. The
child ignores SIGHUP.
2. The leader exits and starts disassociating from the controlling
terminal (/dev/console).
3. __tty_hangup() skips setting f_op to hung_up_tty_fops.
4. SIGHUP is delivered and ignored.
5. tty_ldisc_hangup() is invoked. It wakes up the waits which should
clear the read lockers of tty->ldisc_sem.
6. The reader wakes up but because tty_hung_up_p() is false, it
doesn't abort and goes back to sleep while read-holding
tty->ldisc_sem.
7. The leader progresses to tty_ldisc_lock() in tty_ldisc_hangup()
and is now stuck in D sleep indefinitely waiting for
tty->ldisc_sem.
The following is Alan's explanation on why some ttys aren't hung up.
http://lkml.kernel.org/r/20171101170908.6ad08580@alans-desktop
1. It broke the serial consoles because they would hang up and close
down the hardware. With tty_port that *should* be fixable properly
for any cases remaining.
2. The console layer was (and still is) completely broken and doens't
refcount properly. So if you turn on console hangups it breaks (as
indeed does freeing consoles and half a dozen other things).
As neither can be fixed quickly, this patch works around the problem
by introducing a new flag, TTY_HUPPING, which is used solely to tell
n_tty_read() that hang-up is in progress for the console and the
readers should be aborted regardless of the hung-up status of the
device.
The following is a sample hung task warning caused by this issue.
INFO: task agetty:2662 blocked for more than 120 seconds.
Not tainted 4.11.3-dbg-tty-lockup-02478-gfd6c7ee-dirty #28
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
0 2662 1 0x00000086
Call Trace:
__schedule+0x267/0x890
schedule+0x36/0x80
schedule_timeout+0x23c/0x2e0
ldsem_down_write+0xce/0x1f6
tty_ldisc_lock+0x16/0x30
tty_ldisc_hangup+0xb3/0x1b0
__tty_hangup+0x300/0x410
disassociate_ctty+0x6c/0x290
do_exit+0x7ef/0xb00
do_group_exit+0x3f/0xa0
get_signal+0x1b3/0x5d0
do_signal+0x28/0x660
exit_to_usermode_loop+0x46/0x86
do_syscall_64+0x9c/0xb0
entry_SYSCALL64_slow_path+0x25/0x25
The following is the repro. Run "$PROG /dev/console". The parent
process hangs in D state.
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <signal.h>
#include <time.h>
#include <termios.h>
int main(int argc, char **argv)
{
struct sigaction sact = { .sa_handler = SIG_IGN };
struct timespec ts1s = { .tv_sec = 1 };
pid_t pid;
int fd;
if (argc < 2) {
fprintf(stderr, "test-hung-tty /dev/$TTY\n");
return 1;
}
/* fork a child to ensure that it isn't already the session leader */
pid = fork();
if (pid < 0) {
perror("fork");
return 1;
}
if (pid > 0) {
/* top parent, wait for everyone */
while (waitpid(-1, NULL, 0) >= 0)
;
if (errno != ECHILD)
perror("waitpid");
return 0;
}
/* new session, start a new session and set the controlling tty */
if (setsid() < 0) {
perror("setsid");
return 1;
}
fd = open(argv[1], O_RDWR);
if (fd < 0) {
perror("open");
return 1;
}
if (ioctl(fd, TIOCSCTTY, 1) < 0) {
perror("ioctl");
return 1;
}
/* fork a child, sleep a bit and exit */
pid = fork();
if (pid < 0) {
perror("fork");
return 1;
}
if (pid > 0) {
nanosleep(&ts1s, NULL);
printf("Session leader exiting\n");
exit(0);
}
/*
* The child ignores SIGHUP and keeps reading from the controlling
* tty. Because SIGHUP is ignored, the child doesn't get killed on
* parent exit and the bug in n_tty makes the read(2) block the
* parent's control terminal hangup attempt. The parent ends up in
* D sleep until the child is explicitly killed.
*/
sigaction(SIGHUP, &sact, NULL);
printf("Child reading tty\n");
while (1) {
char buf[1024];
if (read(fd, buf, sizeof(buf)) < 0) {
perror("read");
return 1;
}
}
return 0;
}
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alan Cox <alan@llwyncelyn.cymru>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The tm-resched-dscr test links against pmu/lib.o, but we don't have a
rule to clean pmu/lib.o. This can lead to a build break if you build
for big endian and then little, or vice versa.
Fix it by making tm-resched-dscr depend on pmu/lib.c, causing the code
to be built directly in, meaning no .o is generated.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Once in a while I see build errors similar to the following
when building images from a clean tree.
Building powerpc:virtex-ml507:44x/virtex5_defconfig ... failed
------------
Error log:
arch/powerpc/boot/treeboot-akebono.c:37:20: fatal error:
libfdt.h: No such file or directory
Building powerpc:bamboo:smpdev:44x/bamboo_defconfig ... failed
------------
Error log:
arch/powerpc/boot/treeboot-akebono.c:37:20: fatal error:
libfdt.h: No such file or directory
arch/powerpc/boot/treeboot-currituck.c:35:20: fatal error:
libfdt.h: No such file or directory
Rebuilds will succeed.
Turns out that several source files in arch/powerpc/boot/ include
libfdt.h, but Makefile dependencies are incomplete. Let's fix that.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Normal 512-byte get/set of a TLV isn't supported but we were
registering the normal get/set anyway and relying on omitting
the SNDRV_CTL_ELEM_ACCESS_[READ|WRITE] flags to prevent them
being called.
Trouble is if this gets broken in the core ALSA code - as it has
been since at least 4.14 - the standard get/set can be called
unexpectedly and corrupt memory.
There's no point providing functions that won't be called and
it's a trivial change. The benefit is that if the ALSA core gets
broken again we get a big fat immediate NULL dereference instead
of a memory corruption timebomb.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
If multipathing is enabled, each NVMe subsystem creates a head
namespace (e.g., nvme0n1) and multiple private namespaces
(e.g., nvme0c0n1 and nvme0c1n1) in sysfs. When creating links for
private namespaces, links of head namespace are used, so the
namespace creation order must be followed (e.g., nvme0n1 ->
nvme0c1n1). If the order is not followed, links of sysfs will be
incomplete or kernel panic will occur.
The kernel panic was:
kernel BUG at fs/sysfs/symlink.c:27!
Call Trace:
nvme_mpath_add_disk_links+0x5d/0x80 [nvme_core]
nvme_validate_ns+0x5c2/0x850 [nvme_core]
nvme_scan_work+0x1af/0x2d0 [nvme_core]
Correct order
Context A Context B
nvme0n1
nvme0c0n1 nvme0c1n1
Incorrect order
Context A Context B
nvme0c1n1
nvme0n1
nvme0c0n1
The nvme_mpath_add_disk (for creating head namespace) is called
just before the nvme_mpath_add_disk_links (for creating private
namespaces). In nvme_mpath_add_disk, the first context acquires
the lock of subsystem and creates a head namespace, and other
contexts do nothing by checking GENHD_FL_UP of a head namespace
after waiting to acquire the lock. We verified the code with or
without multipathing using three vendors of dual-port NVMe SSDs.
Signed-off-by: Baegjae Sung <baegjae@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
To reproduce the lock up do the following
- connect otg host adapter and a USB device to the dual-role port
so that it is in host mode.
- suspend to mem.
- disconnect otg adapter.
- resume the system.
If we call dwc3_host_exit() before tasks are thawed
xhci_plat_remove() seems to lock up at the second usb_remove_hcd() call.
To work around this we queue the _dwc3_set_mode() work on
the system_freezable_wq.
Fixes: 41ce1456e1 ("usb: dwc3: core: make dwc3_set_mode() work properly")
Cc: <stable@vger.kernel.org> # v4.12+
Suggested-by: Manu Gautam <mgautam@codeaurora.org>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
When LPE audio driver gets some error at probing, it may lead to a
crash because of canceling the pending work in hdmi_lpe_audio_free(),
since some of ports might be still not initialized.
For assuring the proper free of each port, initialize all ports at the
beginning of the probe.
Fixes: b4eb0d522f ("ALSA: x86: Split snd_intelhad into card and PCM specific structures")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The commit change for supporting the multiple ports moved involved
some code shuffling, and there the initializations of spinlock and
mutex in snd_intelhad object were dropped mistakenly.
This patch adds the missing initializations again for each port.
Fixes: b4eb0d522f ("ALSA: x86: Split snd_intelhad into card and PCM specific structures")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The patch "ALSA: control: code refactoring for ELEM_READ/ELEM_WRITE
operations" introduced a potential for kernel memory corruption due
to an incorrect if statement allowing non-readable controls to fall
through and call the get function. For TLV controls a driver can omit
SNDRV_CTL_ELEM_ACCESS_READ to ensure that only the TLV get function
can be called. Instead the normal get() can be invoked unexpectedly
and as the driver expects that this will only be called for controls
<= 512 bytes, potentially try to copy >512 bytes into the 512 byte
return array, so corrupting kernel memory.
The problem is an attempt to refactor the snd_ctl_elem_read function
to invert the logic so that it conditionally aborted if the control
is unreadable instead of conditionally executing. But the if statement
wasn't inverted correctly.
The correct inversion of
if (a && !b)
is
if (!a || b)
Fixes: becf9e5d55 ("ALSA: control: code refactoring for ELEM_READ/ELEM_WRITE operations")
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
- Powerplay fixes for cards with no displays attached
- Couple of DC fixes
- radeon workaround for PPC64
* 'drm-fixes-4.16' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: insist on 32-bit DMA for Cedar on PPC64/PPC64LE
drm/amd/display: VGA black screen from s3 when attached to hook
drm/amdgpu: Unify the dm resume calls into one
drm/amdgpu: Add a missing lock for drm_mm_takedown
Revert "drm/radeon/pm: autoswitch power state when in balanced mode"
drm/amd/powerplay/smu7: allow mclk switching with no displays
drm/amd/powerplay/vega10: allow mclk switching with no displays
The ARM PMU doesn't have a reg address, so fix the following DTC warning
(requires W=1):
Node /soc/arm-pmu missing or empty reg/ranges property
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
This patch fixes the following DTC warning (requires W=1):
Node /soc/local_intc simple-bus unit address format error, expected "40000000"
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Jon attempted to fix the amount of RAM on the BCM958625HR in commit
c53beb47f6 ("ARM: dts: NSP: Correct RAM amount for BCM958625HR board")
but it seems like we tripped over some poorly documented schematics.
The top-level page of the schematics says the board has 2GB, but when
you end-up scrolling to page 6, you see two chips of 4GBit (512MB) but
what the bootloader really initializes only 512MB, any attempt to use
more than that results in data aborts. Fix this again back to 512MB.
Fixes: c53beb47f6 ("ARM: dts: NSP: Correct RAM amount for BCM958625HR board")
Acked-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
It seems that the proper value to return in this particular case is the
one contained into variable new_index instead of ret.
Addresses-Coverity-ID: 1465148 ("Copy-paste error")
Fixes: e46c7287b1 ("nbd: add a basic netlink interface")
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull seccomp fix from James Morris:
"This disables the seccomp samples when cross compiling.
We've seen too many build issues here, so it's best to just disable
it, especially since they're just the samples"
* 'fixes-v4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
samples/seccomp: do not compile when cross compiled
The Cinterion PL8 is an LTE modem with 2 possible WWAN interfaces.
The modem is controlled via AT commands through the exposed TTYs.
AT^SWWAN write command can be used to activate or deactivate a WWAN
connection for a PDP context defined with AT+CGDCONT. UE supports
two WWAN adapter. Both WWAN adapters can be activated a the same time
Signed-off-by: Bassem Boubaker <bassem.boubaker@actia.fr>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tls ulp overrides sk->prot with a new tls specific proto structs.
The tls specific structs were previously based on the ipv4 specific
tcp_prot sturct.
As a result, attaching the tls ulp to an ipv6 tcp socket replaced
some ipv6 callback with the ipv4 equivalents.
This patch adds ipv6 tls proto structs and uses them when
attached to ipv6 sockets.
Fixes: 3c4d755915 ('tls: kernel TLS support')
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have uninlined the sh_eth_{read|write}() functions introduced in the
commit 4a55530f38 ("net: sh_eth: modify the definitions of register").
Now remove *inline* from sh_eth_tsu_{read|write}() as well and move
these functions from the header to the driver itself. This saves 684
more bytes of object code (ARM gcc 4.8.5)...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long says:
====================
net: fix IFLA_MTU ignored on NEWLINK for some ip and ipv6 tunnels
The fix for ip_gre follows the way other ip tunnels do: not to
set mtu in ndo_init, as ip_tunnel_newlink will take care of it
properly.
The fix for ip6_tunnel and sit follows the way ipv6 tunenls do:
to set mtu again according to IFLA_MTU after, as all bind_dev
are called in ndo_init where it can't get the tb[IFLA_MTU].
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 128bb975dc ("ip6_gre: init dev->mtu and dev->hard_header_len
correctly") fixed IFLA_MTU ignored on NEWLINK for ip6_gre. The same
mtu fix is also needed for sit.
Note that dev->hard_header_len setting for sit works fine, no need to
fix it. sit is actually ipv4 tunnel, it can't call ip6_tnl_change_mtu
to set mtu.
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 128bb975dc ("ip6_gre: init dev->mtu and dev->hard_header_len
correctly") fixed IFLA_MTU ignored on NEWLINK for ip6_gre. The same
mtu fix is also needed for ip6_tunnel.
Note that dev->hard_header_len setting for ip6_tunnel works fine,
no need to fix it.
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's safe to remove the setting of dev's needed_headroom and mtu in
__gre_tunnel_init, as discussed in [1], ip_tunnel_newlink can do it
properly.
Now Eric noticed that it could cover the mtu value set in do_setlink
when creating a ip_gre dev. It makes IFLA_MTU param not take effect.
So this patch is to remove them to make IFLA_MTU work, as in other
ipv4 tunnels.
[1]: https://patchwork.ozlabs.org/patch/823504/
Fixes: c544193214 ("GRE: Refactor GRE tunneling code.")
Reported-by: Eric Garver <e@erig.me>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit f5e64032a7 ("net: phy: fix resume handling") changes the
locking semantics for phy_resume() such that the caller now needs to
hold the phy mutex. Not all call sites were adopted to this new
semantic, resulting in warnings from the added
WARN_ON(!mutex_is_locked(&phydev->lock)). Rather than change the
semantics, add a __phy_resume() and restore the old behavior of
phy_resume().
Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Fixes: f5e64032a7 ("net: phy: fix resume handling")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- do not build samples when cross compiling (Michal Hocko)
From Kees: "This disables the seccomp samples when cross compiling. We're seen too many build issues here, so
it's best to just disable it, especially since they're just the samples."
Use the right loop index, not the number of devices in the array that we
need to remove, the following message uncovered the problem:
[ 5437.044119] hook not found, pf 5 num 0
[ 5437.044140] WARNING: CPU: 2 PID: 24983 at net/netfilter/core.c:376 __nf_unregister_net_hook+0x250/0x280
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Commit 2831231d4c ("bcache: reduce cache_set devices iteration by
devices_max_used") adds c->devices_max_used to reduce iteration of
c->uuids elements, this value is updated in bcache_device_attach().
But for flash only volume, when calling flash_devs_run(), the function
bcache_device_attach() is not called yet and c->devices_max_used is not
updated. The unexpected result is, the flash only volume won't be run
by flash_devs_run().
This patch fixes the issue by iterate all c->uuids elements in
flash_devs_run(). c->devices_max_used will be updated properly when
bcache_device_attach() gets called.
[mlyle: commit subject edited for character limit]
Fixes: 2831231d4c ("bcache: reduce cache_set devices iteration by devices_max_used")
Reported-by: Tang Junhui <tang.junhui@zte.com.cn>
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull tpm fixes from James Morris:
"Bugfixes for TPM, from Jeremy Boone, via Jarkko Sakkinen"
* 'fixes-v4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
tpm: fix potential buffer overruns caused by bit glitches on the bus
tpm: st33zp24: fix potential buffer overruns caused by bit glitches on the bus
tpm_i2c_infineon: fix potential buffer overruns caused by bit glitches on the bus
tpm_i2c_nuvoton: fix potential buffer overruns caused by bit glitches on the bus
tpm_tis: fix potential buffer overruns caused by bit glitches on the bus
In commit 60c2530696 ("tipc: fix race between poll() and
setsockopt()") we introduced a pointer from struct tipc_group to the
'group_is_connected' flag in struct tipc_sock, so that this field can
be checked without dereferencing the group pointer of the latter struct.
The initial value for this flag is correctly set to 'false' when a
group is created, but we miss the case when no group is created at
all, in which case the initial value should be 'true'. This has the
effect that SOCK_RDM/DGRAM sockets sending datagrams never receive
POLLOUT if they request so.
This commit corrects this bug.
Fixes: 60c2530696 ("tipc: fix race between poll() and setsockopt()")
Reported-by: Hoang Le <hoang.h.le@dektek.com.au>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit a307a1e6bc "cpufreq: s3c: use cpufreq_generic_init()"
accidentally broke cpufreq on s3c2410 and s3c2412.
These two platforms don't have a CPU frequency table and used to skip
calling cpufreq_table_validate_and_show() for them. But with the
above commit, we started calling it unconditionally and that will
eventually fail as the frequency table pointer is NULL.
Fix this by calling cpufreq_table_validate_and_show() conditionally
again.
Fixes: a307a1e6bc "cpufreq: s3c: use cpufreq_generic_init()"
Cc: 3.13+ <stable@vger.kernel.org> # v3.13+
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
According to RFC 1191 sections 3 and 4, ICMP frag-needed messages
indicating an MTU below 68 should be rejected:
A host MUST never reduce its estimate of the Path MTU below 68
octets.
and (talking about ICMP frag-needed's Next-Hop MTU field):
This field will never contain a value less than 68, since every
router "must be able to forward a datagram of 68 octets without
fragmentation".
Furthermore, by letting net.ipv4.route.min_pmtu be set to negative
values, we can end up with a very large PMTU when (-1) is cast into u32.
Let's also make ip_rt_min_pmtu a u32, since it's only ever compared to
unsigned ints.
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hedberg says:
====================
pull request: bluetooth 2018-02-26
Here are a two Bluetooth driver fixes for the 4.16 kernel.
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The current implementation checks the combined size of the children with
the 'size' of the parent. The correct behavior is to check the combined
size vs the pending change and to compare vs the 'size_new'.
Fixes: d9f9b9a4d0 ("devlink: Add support for resource abstraction")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Tested-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to R-Car Gen3 Rev.0.80 manual, the DMATCR can be set to
16,777,215 as maximum. So, this patch fixes the max_chunk_size for
safety on all of SoCs. Otherwise, a system may hang if the DMATCR
is set to 0 on R-Car Gen3.
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
As the block layer, since the conversion to blkmq, claims the host using a
context, a following nested call to mmc_claim_host(), which isn't using a
context, may hang.
Calling mmc_interrupt_hpi() and mmc_read_bkops_status() via the mmc block
layer, may suffer from this problem, as these functions are calling
mmc_claim|release_host().
Let's fix the problem by removing the calls to mmc_claim|release_host()
from the above mentioned functions and instead make the callers responsible
of claiming/releasing the host. As a matter of fact, the existing callers
already deals with it.
Fixes: 81196976ed ("mmc: block: Add blk-mq support")
Reported-by: Dmitry Osipenko <digetx@gmail.com>
Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
cat /sys/kernel/debug/mmc0/regs will hang up the system since
it's in runtime suspended state, so the genpd and biu_clk is
off. This patch fixes this problem by calling pm_runtime_get_sync
to wake it up before reading the registers.
Fixes: e9ed8835e9 ("mmc: dw_mmc: add runtime PM callback")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Factor out dw_mci_init_slot_caps to consolidate parsing
all differents types of capabilities from host contrllers.
No functional change intended.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Fixes: 800d78bfcc ("mmc: dw_mmc: add support for implementation specific callbacks")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The hs_timing_cfg[] array is indexed using a value derived from the
"mshcN" alias in DT, which may lead to an out-of-bounds access.
Fix this by adding a range check.
Fixes: 361c7fe9b0 ("mmc: dw_mmc-k3: add sd support for hi3660")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
New options introduced by the patch this fixes are still
enabled even if CFG80211 is disabled.
.config:
# CONFIG_CFG80211 is not set
CONFIG_CFG80211_REQUIRE_SIGNED_REGDB=y
CONFIG_CFG80211_USE_KERNEL_REGDB_KEYS=y
# CONFIG_LIB80211 is not set
When CFG80211_REQUIRE_SIGNED_REGDB is enabled, it selects
SYSTEM_DATA_VERIFICATION which selects SYSTEM_TRUSTED_KEYRING
that need extract-cert tool. extract-cert needs some openssl
headers to be installed on the build machine.
Instead of adding missing "depends on CFG80211", it's
easier to use a 'if' block around all options related
to CFG80211, so do that.
Fixes: 90a53e4432 ("cfg80211: implement regdb signature checking")
Signed-off-by: Romain Naour <romain.naour@gmail.com>
[touch up commit message a bit]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
wake_klogd is a local variable in console_unlock(). The information
is lost when the console_lock owner using the busy wait added by
the commit dbdda842fe ("printk: Add console owner and waiter
logic to load balance console writes"). The following race is
possible:
CPU0 CPU1
console_unlock()
for (;;)
/* calling console for last message */
printk()
log_store()
log_next_seq++;
/* see new message */
if (seen_seq != log_next_seq) {
wake_klogd = true;
seen_seq = log_next_seq;
}
console_lock_spinning_enable();
if (console_trylock_spinning())
/* spinning */
if (console_lock_spinning_disable_and_check()) {
printk_safe_exit_irqrestore(flags);
return;
console_unlock()
if (seen_seq != log_next_seq) {
/* already seen */
/* nothing to do */
Result: Nobody would wakeup klogd.
One solution would be to make a global variable from wake_klogd.
But then we would need to manipulate it under a lock or so.
This patch wakes klogd also when console_lock is passed to the
spinning waiter. It looks like the right way to go. Also userspace
should have a chance to see and store any "flood" of messages.
Note that the very late klogd wake up was a historic solution.
It made sense on single CPU systems or when sys_syslog() operations
were synchronized using the big kernel lock like in v2.1.113.
But it is questionable these days.
Fixes: dbdda842fe ("printk: Add console owner and waiter logic to load balance console writes")
Link: http://lkml.kernel.org/r/20180226155734.dzwg3aovqnwtvkoy@pathway.suse.cz
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Suggested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Tuning can leave the IP in an active state (Buffer Read Enable bit set)
which prevents the entry to low power states (i.e. S0i3). Data reset will
clear it.
Generally tuning is followed by a data transfer which will anyway sort out
the state, so it is rare that S0i3 is actually prevented.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
of_get_named_gpiod_flags() used directly in of_find_gpio() or indirectly
through of_find_spi_gpio() or of_find_regulator_gpio() can return
-EPROBE_DEFER. This gets overwritten by the subsequent of_find_*_gpio()
calls.
This patch fixes this by trying of_find_spi_gpio() or
of_find_regulator_gpio() only if deferred probing was not requested by
the previous of_get_named_gpiod_flags() call.
Fixes: 6a537d4846 ("gpio: of: Support regulator nonstandard GPIO properties")
Fixes: c858233902 ("gpio: of: Support SPI nonstandard GPIO properties")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
[Augmented to fit with Maxime's patch]
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Commits c858233902 ("gpio: of: Support SPI nonstandard GPIO properties")
and 6a537d4846 ("gpio: of: Support regulator nonstandard GPIO
properties") have introduced a regression in the way error codes from
of_get_named_gpiod_flags are handled.
Previously, those errors codes were returned immediately, but the two
commits mentioned above are now overwriting the error pointer, meaning that
whatever value has been returned will be dropped in favor of whatever the
two new functions will return.
This might not be a big deal except for EPROBE_DEFER, on which GPIOlib
customers will depend on, and that will now be returned as an hard error
which means that they will not probe anymore, instead of gently deferring
their probe.
Since EPROBE_DEFER basically means that we have found a valid property but
there was no GPIO controller registered to handle it, fix this issues by
returning it as soon as we encounter it.
Fixes: c858233902 ("gpio: of: Support SPI nonstandard GPIO properties")
Fixes: 6a537d4846 ("gpio: of: Support regulator nonstandard GPIO properties")
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
[Fold in fix to the fix]
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
If the netdevice is already part of a flowtable, return EBUSY. I cannot
find a valid usecase for having two flowtables bound to the same
netdevice. We can still have two flowtable where the device set is
disjoint.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
While working on 16338a9b3a ("bpf, arm64: fix out of bounds access in
tail call") I noticed that ppc64 JIT is partially affected as well. While
the bound checking is correctly performed as unsigned comparison, the
register with the index value however, is never truncated into 32 bit
space, so e.g. a index value of 0x100000000ULL with a map of 1 element
would pass with PPC_CMPLW() whereas we later on continue with the full
64 bit register value. Therefore, as we do in interpreter and other JITs
truncate the value to 32 bit initially in order to fix access.
Fixes: ce0761419f ("powerpc/bpf: Implement support for tail calls")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
r8152 driver handles TSO packets (limited to ~16KB) quite well,
but pretends each TSO logical packet is a single packet on the wire.
There is also some error since headers are accounted once, but
error rate is small enough that we do not care.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Discrete TPMs are often connected over slow serial buses which, on
some platforms, can have glitches causing bit flips. In all the
driver _recv() functions, we need to use a u32 to unmarshal the
response size, otherwise a bit flip of the 31st bit would cause the
expected variable to go negative, which would then try to read a huge
amount of data. Also sanity check that the expected amount of data is
large enough for the TPM header.
Signed-off-by: Jeremy Boone <jeremy.boone@nccgroup.trust>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Discrete TPMs are often connected over slow serial buses which, on
some platforms, can have glitches causing bit flips. In all the
driver _recv() functions, we need to use a u32 to unmarshal the
response size, otherwise a bit flip of the 31st bit would cause the
expected variable to go negative, which would then try to read a huge
amount of data. Also sanity check that the expected amount of data is
large enough for the TPM header.
Signed-off-by: Jeremy Boone <jeremy.boone@nccgroup.trust>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Discrete TPMs are often connected over slow serial buses which, on
some platforms, can have glitches causing bit flips. In all the
driver _recv() functions, we need to use a u32 to unmarshal the
response size, otherwise a bit flip of the 31st bit would cause the
expected variable to go negative, which would then try to read a huge
amount of data. Also sanity check that the expected amount of data is
large enough for the TPM header.
Signed-off-by: Jeremy Boone <jeremy.boone@nccgroup.trust>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Discrete TPMs are often connected over slow serial buses which, on
some platforms, can have glitches causing bit flips. In all the
driver _recv() functions, we need to use a u32 to unmarshal the
response size, otherwise a bit flip of the 31st bit would cause the
expected variable to go negative, which would then try to read a huge
amount of data. Also sanity check that the expected amount of data is
large enough for the TPM header.
Signed-off-by: Jeremy Boone <jeremy.boone@nccgroup.trust>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
The Makefile lacks a couple of line continuation backslashes
in an `if' clause, which produces an error when make versions
prior to 4.x are used for building the tests.
$ make
make[1]: Entering directory `/[...]/linux/tools/testing/selftests/futex'
/bin/sh: -c: line 5: syntax error: unexpected end of file
make[1]: *** [all] Error 1
make[1]: Leaving directory `/[...]/linux/tools/testing/selftests/futex'
make: *** [all] Error 2
Signed-off-by: Daniel Díaz <daniel.diaz@linaro.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Commit 343a8d17fa (cpufreq: scpi: remove arm_big_little dependency)
changed the cpufreq driver on juno from arm_big_little to scpi.
The scpi set_target function does not call the frequency-invariance
setter function arch_set_freq_scale() like the arm_big_little set_target
function does. As a result the task scheduler load and utilization
signals are not frequency-invariant on this platform anymore.
Fix this by adding a call to arch_set_freq_scale() into
scpi_cpufreq_set_target().
Fixes: 343a8d17fa (cpufreq: scpi: remove arm_big_little dependency)
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull idr fixes from Matthew Wilcox:
"One test-suite build fix for you and one run-time regression fix.
The regression fix includes new tests to make sure they don't pop back
up."
* 'idr-2018-02-06' of git://git.infradead.org/users/willy/linux-dax:
idr: Fix handling of IDs above INT_MAX
radix tree test suite: Fix build
It is entirely possible that the BIOS wasn't able to assign resources to a
device. In this case don't crash in pci_release_resource() when we try to
resize the resource.
Fixes: 8bb705e3e7 ("PCI: Add pci_resize_resource() for resizing BARs")
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
CC: stable@vger.kernel.org # v4.15+
This stops the driver from trying to probe the ATA slave
interface. The vendor code enables the slave interface
but the driver in the vendor tree does not make use of
it.
Setting it to muxmode 0 disables the slave interface:
the hardware only has the master interface connected
to the one harddrive slot anyways.
Without this change booting takes excessive time, so it
is very annoying to end users.
Fixes: dd5c0561db ("ARM: dts: Add basic devicetree for D-Link DNS-313")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The CONFIG_LIRC symbol has changed from 'tristate' to 'bool, so we now
get a warning for omap2plus_defconfig:
arch/arm/configs/omap2plus_defconfig:322:warning: symbol value 'm' invalid for LIRC
This changes the file to mark the symbol as built-in to get rid of the
warning.
Fixes: a60d64b15c ("media: lirc: lirc interface should not be a raw decoder")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Gerd reports that ->i_mode may contain other bits besides S_IFCHR. Use
S_ISCHR() instead. Otherwise, get_user_pages_longterm() may fail on
device-dax instances when those are meant to be explicitly allowed.
Fixes: 2bb6d28370 ("mm: introduce get_user_pages_longterm")
Cc: <stable@vger.kernel.org>
Reported-by: Gerd Rausch <gerd.rausch@oracle.com>
Acked-by: Jane Chu <jane.chu@oracle.com>
Reported-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In Patch:
[7a862fb] brd: remove dax support
Dan Williams has removed the only might_sleep
implementation of ->direct_access.
So we no longer need to check for it.
CC: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Boaz Harrosh <boazh@netapp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This reverts commit 5c38bd1b82.
skb->mark contains the mark the encapsulated traffic which
can result in incorrect routing decisions being made such
as routing loops if the route chosen is via tunnel itself.
The correct method should be to use tunnel->fwmark.
Signed-off-by: Thomas Winter <thomas.winter@alliedtelesis.co.nz>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a VLAN is added on a port, a reference is taken on the
corresponding master VLAN entry. If it does not already exist, then it
is created and a reference taken.
However, in the second case a reference is not really taken when
CONFIG_REFCOUNT_FULL is enabled as refcount_inc() is replaced by
refcount_inc_not_zero().
Fix this by using refcount_set() on a newly created master VLAN entry.
Fixes: 2512775985 ("net, bridge: convert net_bridge_vlan.refcnt from atomic_t to refcount_t")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added MODULE_ALIAS("rpmsg:IPCRTR") to ensure qrtr-smd and qrtr will load
when IPCRTR channel is detected.
Signed-off-by: Ramon Fried <rfried@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
test_bpf() is taking 1.6 seconds nowadays, it is time
to add a schedule point in it.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Khalid reported that the kernel selftests are currently failing:
selftests: test_bpf.sh
========================================
test_bpf: [FAIL]
not ok 1..8 selftests: test_bpf.sh [FAIL]
He bisected it to 6ce711f275 ("idr: Make
1-based IDRs more efficient").
The root cause is doing a signed comparison in idr_alloc_u32() instead
of an unsigned comparison. I went looking for any similar problems and
found a couple (which would each result in the failure to warn in two
situations that aren't supposed to happen).
I knocked up a few test-cases to prove that I was right and added them
to the test-suite.
Reported-by: Khalid Aziz <khalid.aziz@oracle.com>
Tested-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Sometimes when physical lines have a just good noise to make the protocol
handshaking fail, but the carrier detect still good. Then after remove of
the noise, nobody will trigger this protocol to be start again to cause
the link to never come back. The fix is when the carrier is still on, not
terminate the protocol handshaking.
Signed-off-by: Denis Du <dudenis2000@yahoo.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
'struct blk_user_trace_setup' is passed to BLKTRACESETUP, not
BLKTRACESTART.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We don't flush batched XDP packets through xdp_do_flush_map(), this
will cause packets stall at TX queue. Consider we don't do XDP on NAPI
poll(), the only possible fix is to call xdp_do_flush_map()
immediately after xdp_do_redirect().
Note, this in fact won't try to batch packets through devmap, we could
address in the future.
Reported-by: Christoffer Dall <christoffer.dall@linaro.org>
Fixes: 761876c857 ("tap: XDP support")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Except for tuntap, all other drivers' XDP was implemented at NAPI
poll() routine in a bh. This guarantees all XDP operation were done at
the same CPU which is required by e.g BFP_MAP_TYPE_PERCPU_ARRAY. But
for tuntap, we do it in process context and we try to protect XDP
processing by RCU reader lock. This is insufficient since
CONFIG_PREEMPT_RCU can preempt the RCU reader critical section which
breaks the assumption that all XDP were processed in the same CPU.
Fixing this by simply disabling preemption during XDP processing.
Fixes: 761876c857 ("tap: XDP support")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 762c330d67. The
reason is we try to batch packets for devmap which causes calling
xdp_do_flush() in the process context. Simply disabling preemption
may not work since process may move among processors which lead
xdp_do_flush() to miss some flushes on some processors.
So simply revert the patch, a follow-up patch will add the xdp flush
correctly.
Reported-by: Christoffer Dall <christoffer.dall@linaro.org>
Fixes: 762c330d67 ("tuntap: add missing xdp flush")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is not valid for orion5x to use mac_pton().
First of all, the orion5x buffer is not NULL terminated. mac_pton()
has no business operating on non-NULL terminated buffers because
only the caller can know that this is valid and in what manner it
is ok to parse this NULL'less buffer.
Second of all, orion5x operates on an __iomem pointer, which cannot
be dereferenced using normal C pointer operations. Accesses to
such areas much be performed with the proper iomem accessors.
Fixes: 4904dbda41 ("ARM: orion5x: use mac_pton() helper")
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull EDAC fix from Borislav Petkov:
"sb_edac: Prevent memory corruption on KNL (from Anna Karbownik)"
* tag 'edac_fixes_for_4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC, sb_edac: Fix out of bound writes during DIMM configuration on KNL
When specifying string type mount option (e.g., logdev)
several times in a mount, current option parsing may
cause memory leak. Hence, call kfree for previous one
in this case.
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Pull x86 fixes from Thomas Gleixner:
"Yet another pile of melted spectrum related changes:
- sanitize the array_index_nospec protection mechanism: Remove the
overengineered array_index_nospec_mask_check() magic and allow
const-qualified types as index to avoid temporary storage in a
non-const local variable.
- make the microcode loader more robust by properly propagating error
codes. Provide information about new feature bits after micro code
was updated so administrators can act upon.
- optimizations of the entry ASM code which reduce code footprint and
make the code simpler and faster.
- fix the {pmd,pud}_{set,clear}_flags() implementations to work
properly on paravirt kernels by removing the address translation
operations.
- revert the harmful vmexit_fill_RSB() optimization
- use IBRS around firmware calls
- teach objtool about retpolines and add annotations for indirect
jumps and calls.
- explicitly disable jumplabel patching in __init code and handle
patching failures properly instead of silently ignoring them.
- remove indirect paravirt calls for writing the speculation control
MSR as these calls are obviously proving the same attack vector
which is tried to be mitigated.
- a few small fixes which address build issues with recent compiler
and assembler versions"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely()
KVM/x86: Remove indirect MSR op calls from SPEC_CTRL
objtool, retpolines: Integrate objtool with retpoline support more closely
x86/entry/64: Simplify ENCODE_FRAME_POINTER
extable: Make init_kernel_text() global
jump_label: Warn on failed jump_label patching attempt
jump_label: Explicitly disable jump labels in __init code
x86/entry/64: Open-code switch_to_thread_stack()
x86/entry/64: Move ASM_CLAC to interrupt_entry()
x86/entry/64: Remove 'interrupt' macro
x86/entry/64: Move the switch_to_thread_stack() call to interrupt_entry()
x86/entry/64: Move ENTER_IRQ_STACK from interrupt macro to interrupt_entry
x86/entry/64: Move PUSH_AND_CLEAR_REGS from interrupt macro to helper function
x86/speculation: Move firmware_restrict_branch_speculation_*() from C to CPP
objtool: Add module specific retpoline rules
objtool: Add retpoline validation
objtool: Use existing global variables for options
x86/mm/sme, objtool: Annotate indirect call in sme_encrypt_execute()
x86/boot, objtool: Annotate indirect jump in secondary_startup_64()
x86/paravirt, objtool: Annotate indirect calls
...
Pull KVM fixes from Paolo Bonzini:
"s390:
- optimization for the exitless interrupt support that was merged in 4.16-rc1
- improve the branch prediction blocking for nested KVM
- replace some jump tables with switch statements to improve expoline performance
- fixes for multiple epoch facility
ARM:
- fix the interaction of userspace irqchip VMs with in-kernel irqchip VMs
- make sure we can build 32-bit KVM/ARM with gcc-8.
x86:
- fixes for AMD SEV
- fixes for Intel nested VMX, emulated UMIP and a dump_stack() on VM startup
- fixes for async page fault migration
- small optimization to PV TLB flush (new in 4.16-rc1)
- syzkaller fixes
Generic:
- compiler warning fixes
- syzkaller fixes
- more improvements to the kvm_stat tool
Two more small Spectre fixes are going to reach you via Ingo"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (40 commits)
KVM: SVM: Fix SEV LAUNCH_SECRET command
KVM: SVM: install RSM intercept
KVM: SVM: no need to call access_ok() in LAUNCH_MEASURE command
include: psp-sev: Capitalize invalid length enum
crypto: ccp: Fix sparse, use plain integer as NULL pointer
KVM: X86: Avoid traversing all the cpus for pv tlb flush when steal time is disabled
x86/kvm: Make parse_no_xxx __init for kvm
KVM: x86: fix backward migration with async_PF
kvm: fix warning for non-x86 builds
kvm: fix warning for CONFIG_HAVE_KVM_EVENTFD builds
tools/kvm_stat: print 'Total' line for multiple events only
tools/kvm_stat: group child events indented after parent
tools/kvm_stat: separate drilldown and fields filtering
tools/kvm_stat: eliminate extra guest/pid selection dialog
tools/kvm_stat: mark private methods as such
tools/kvm_stat: fix debugfs handling
tools/kvm_stat: print error on invalid regex
tools/kvm_stat: fix crash when filtering out all non-child trace events
tools/kvm_stat: avoid 'is' for equality checks
tools/kvm_stat: use a more pythonic way to iterate over dictionaries
...
James Chapman says:
====================
l2tp: fix API races discovered by syzbot
This patch series addresses several races with L2TP APIs discovered by
syzbot. There are no functional changes.
The set of patches 1-5 in combination fix the following syzbot reports.
19c09769f WARNING in debug_print_object
347bd5acd KASAN: use-after-free Read in inet_shutdown
6e6a5ec8d general protection fault in pppol2tp_connect
9df43faf0 KASAN: use-after-free Read in pppol2tp_connect
My first attempts to fix these issues were as net-next patches but
the series included other refactoring and cleanup work. I was asked to
separate out the bugfixes and redo for the net tree, which is what
these patches are.
The changes are:
1. Fix inet_shutdown races when L2TP tunnels and sessions close. (patches 1-2)
2. Fix races with tunnel and its socket. (patch 3)
3. Fix race in pppol2tp_release with session and its socket. (patch 4)
4. Fix tunnel lookup use-after-free. (patch 5)
All of the syzbot reproducers hit races in the tunnel and pppol2tp
session create and destroy paths. These tests create and destroy
pppol2tp tunnels and sessions rapidly using multiple threads,
provoking races in several tunnel/session create/destroy paths. The
key problem was that each tunnel/session socket could be destroyed
while its associated tunnel/session object still existed (patches 3,
4). Patch 5 addresses a problem with the way tunnels are removed from
the tunnel list. Patch 5 is tagged that it addresses all four syzbot
issues, though all 5 patches are needed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
pppol2tp_release uses call_rcu to put the final ref on its socket. But
the session object doesn't hold a ref on the session socket so may be
freed while the pppol2tp_put_sk RCU callback is scheduled. Fix this by
having the session hold a ref on its socket until the session is
destroyed. It is this ref that is dropped via call_rcu.
Sessions are also deleted via l2tp_tunnel_closeall. This must now also put
the final ref via call_rcu. So move the call_rcu call site into
pppol2tp_session_close so that this happens in both destroy paths. A
common destroy path should really be implemented, perhaps with
l2tp_tunnel_closeall calling l2tp_session_delete like pppol2tp_release
does, but this will be looked at later.
ODEBUG: activate active (active state 1) object type: rcu_head hint: (null)
WARNING: CPU: 3 PID: 13407 at lib/debugobjects.c:291 debug_print_object+0x166/0x220
Modules linked in:
CPU: 3 PID: 13407 Comm: syzbot_19c09769 Not tainted 4.16.0-rc2+ #38
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
RIP: 0010:debug_print_object+0x166/0x220
RSP: 0018:ffff880013647a00 EFLAGS: 00010082
RAX: dffffc0000000008 RBX: 0000000000000003 RCX: ffffffff814d3333
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88001a59f6d0
RBP: ffff880013647a40 R08: 0000000000000000 R09: 0000000000000001
R10: ffff8800136479a8 R11: 0000000000000000 R12: 0000000000000001
R13: ffffffff86161420 R14: ffffffff85648b60 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88001a580000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020e77000 CR3: 0000000006022000 CR4: 00000000000006e0
Call Trace:
debug_object_activate+0x38b/0x530
? debug_object_assert_init+0x3b0/0x3b0
? __mutex_unlock_slowpath+0x85/0x8b0
? pppol2tp_session_destruct+0x110/0x110
__call_rcu.constprop.66+0x39/0x890
? __call_rcu.constprop.66+0x39/0x890
call_rcu_sched+0x17/0x20
pppol2tp_release+0x2c7/0x440
? fcntl_setlk+0xca0/0xca0
? sock_alloc_file+0x340/0x340
sock_release+0x92/0x1e0
sock_close+0x1b/0x20
__fput+0x296/0x6e0
____fput+0x1a/0x20
task_work_run+0x127/0x1a0
do_exit+0x7f9/0x2ce0
? SYSC_connect+0x212/0x310
? mm_update_next_owner+0x690/0x690
? up_read+0x1f/0x40
? __do_page_fault+0x3c8/0xca0
do_group_exit+0x10d/0x330
? do_group_exit+0x330/0x330
SyS_exit_group+0x22/0x30
do_syscall_64+0x1e0/0x730
? trace_hardirqs_off_thunk+0x1a/0x1c
entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x7f362e471259
RSP: 002b:00007ffe389abe08 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f362e471259
RDX: 00007f362e471259 RSI: 000000000000002e RDI: 0000000000000000
RBP: 00007ffe389abe30 R08: 0000000000000000 R09: 00007f362e944270
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000400b60
R13: 00007ffe389abf50 R14: 0000000000000000 R15: 0000000000000000
Code: 8d 3c dd a0 8f 64 85 48 89 fa 48 c1 ea 03 80 3c 02 00 75 7b 48 8b 14 dd a0 8f 64 85 4c 89 f6 48 c7 c7 20 85 64 85 e
8 2a 55 14 ff <0f> 0b 83 05 ad 2a 68 04 01 48 83 c4 18 5b 41 5c 41 5d 41 5e 41
Fixes: ee40fb2e1e ("l2tp: protect sock pointer of struct pppol2tp_session with RCU")
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tunnel socket tunnel->sock (struct sock) is accessed when
preparing a new ppp session on a tunnel at pppol2tp_session_init. If
the socket is closed by a thread while another is creating a new
session, the threads race. In pppol2tp_connect, the tunnel object may
be created if the pppol2tp socket is associated with the special
session_id 0 and the tunnel socket is looked up using the provided
fd. When handling this, pppol2tp_connect cannot sock_hold the tunnel
socket to prevent it being destroyed during pppol2tp_connect since
this may itself may race with the socket being destroyed. Doing
sockfd_lookup in pppol2tp_connect isn't sufficient to prevent
tunnel->sock going away either because a given tunnel socket fd may be
reused between calls to pppol2tp_connect. Instead, have
l2tp_tunnel_create sock_hold the tunnel socket before it does
sockfd_put. This ensures that the tunnel's socket is always extant
while the tunnel object exists. Hold a ref on the socket until the
tunnel is destroyed and ensure that all tunnel destroy paths go
through a common function (l2tp_tunnel_delete) since this will do the
final sock_put to release the tunnel socket.
Since the tunnel's socket is now guaranteed to exist if the tunnel
exists, we no longer need to use sockfd_lookup via l2tp_sock_to_tunnel
to derive the tunnel from the socket since this is always
sk_user_data.
Also, sessions no longer sock_hold the tunnel socket since sessions
already hold a tunnel ref and the tunnel sock will not be freed until
the tunnel is freed. Removing these sock_holds in
l2tp_session_register avoids a possible sock leak in the
pppol2tp_connect error path if l2tp_session_register succeeds but
attaching a ppp channel fails. The pppol2tp_connect error path could
have been fixed instead and have the sock ref dropped when the session
is freed, but doing a sock_put of the tunnel socket when the session
is freed would require a new session_free callback. It is simpler to
just remove the sock_hold of the tunnel socket in
l2tp_session_register, now that the tunnel socket lifetime is
guaranteed.
Finally, some init code in l2tp_tunnel_create is reordered to ensure
that the new tunnel object's refcount is set and the tunnel socket ref
is taken before the tunnel socket destructor callbacks are set.
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 0 PID: 4360 Comm: syzbot_19c09769 Not tainted 4.16.0-rc2+ #34
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
RIP: 0010:pppol2tp_session_init+0x1d6/0x500
RSP: 0018:ffff88001377fb40 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: ffff88001636a940 RCX: ffffffff84836c1d
RDX: 0000000000000045 RSI: 0000000055976744 RDI: 0000000000000228
RBP: ffff88001377fb60 R08: ffffffff84836bc8 R09: 0000000000000002
R10: ffff88001377fab8 R11: 0000000000000001 R12: 0000000000000000
R13: ffff88001636aac8 R14: ffff8800160f81c0 R15: 1ffff100026eff76
FS: 00007ffb3ea66700(0000) GS:ffff88001a400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020e77000 CR3: 0000000016261000 CR4: 00000000000006f0
Call Trace:
pppol2tp_connect+0xd18/0x13c0
? pppol2tp_session_create+0x170/0x170
? __might_fault+0x115/0x1d0
? lock_downgrade+0x860/0x860
? __might_fault+0xe5/0x1d0
? security_socket_connect+0x8e/0xc0
SYSC_connect+0x1b6/0x310
? SYSC_bind+0x280/0x280
? __do_page_fault+0x5d1/0xca0
? up_read+0x1f/0x40
? __do_page_fault+0x3c8/0xca0
SyS_connect+0x29/0x30
? SyS_accept+0x40/0x40
do_syscall_64+0x1e0/0x730
? trace_hardirqs_off_thunk+0x1a/0x1c
entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x7ffb3e376259
RSP: 002b:00007ffeda4f6508 EFLAGS: 00000202 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000020e77012 RCX: 00007ffb3e376259
RDX: 000000000000002e RSI: 0000000020e77000 RDI: 0000000000000004
RBP: 00007ffeda4f6540 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000400b60
R13: 00007ffeda4f6660 R14: 0000000000000000 R15: 0000000000000000
Code: 80 3d b0 ff 06 02 00 0f 84 07 02 00 00 e8 13 d6 db fc 49 8d bc 24 28 02 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 f
a 48 c1 ea 03 <80> 3c 02 00 0f 85 ed 02 00 00 4d 8b a4 24 28 02 00 00 e8 13 16
Fixes: 80d84ef3ff ("l2tp: prevent l2tp_tunnel_delete racing with userspace close")
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, if a ppp session was closed, we called inet_shutdown to mark
the socket as unconnected such that userspace would get errors and
then close the socket. This could race with userspace closing the
socket. Instead, leave userspace to close the socket in its own time
(our session will be detached anyway).
BUG: KASAN: use-after-free in inet_shutdown+0x5d/0x1c0
Read of size 4 at addr ffff880010ea3ac0 by task syzbot_347bd5ac/8296
CPU: 3 PID: 8296 Comm: syzbot_347bd5ac Not tainted 4.16.0-rc1+ #91
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Call Trace:
dump_stack+0x101/0x157
? inet_shutdown+0x5d/0x1c0
print_address_description+0x78/0x260
? inet_shutdown+0x5d/0x1c0
kasan_report+0x240/0x360
__asan_load4+0x78/0x80
inet_shutdown+0x5d/0x1c0
? pppol2tp_show+0x80/0x80
pppol2tp_session_close+0x68/0xb0
l2tp_tunnel_closeall+0x199/0x210
? udp_v6_flush_pending_frames+0x90/0x90
l2tp_udp_encap_destroy+0x6b/0xc0
? l2tp_tunnel_del_work+0x2e0/0x2e0
udpv6_destroy_sock+0x8c/0x90
sk_common_release+0x47/0x190
udp_lib_close+0x15/0x20
inet_release+0x85/0xd0
inet6_release+0x43/0x60
sock_release+0x53/0x100
? sock_alloc_file+0x260/0x260
sock_close+0x1b/0x20
__fput+0x19f/0x380
____fput+0x1a/0x20
task_work_run+0xd2/0x110
exit_to_usermode_loop+0x18d/0x190
do_syscall_64+0x389/0x3b0
entry_SYSCALL_64_after_hwframe+0x26/0x9b
RIP: 0033:0x7fe240a45259
RSP: 002b:00007fe241132df8 EFLAGS: 00000297 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fe240a45259
RDX: 00007fe240a45259 RSI: 0000000000000000 RDI: 00000000000000a5
RBP: 00007fe241132e20 R08: 00007fe241133700 R09: 0000000000000000
R10: 00007fe241133700 R11: 0000000000000297 R12: 0000000000000000
R13: 00007ffc49aff84f R14: 0000000000000000 R15: 00007fe241141040
Allocated by task 8331:
save_stack+0x43/0xd0
kasan_kmalloc+0xad/0xe0
kasan_slab_alloc+0x12/0x20
kmem_cache_alloc+0x144/0x3e0
sock_alloc_inode+0x22/0x130
alloc_inode+0x3d/0xf0
new_inode_pseudo+0x1c/0x90
sock_alloc+0x30/0x110
__sock_create+0xaa/0x4c0
SyS_socket+0xbe/0x130
do_syscall_64+0x128/0x3b0
entry_SYSCALL_64_after_hwframe+0x26/0x9b
Freed by task 8314:
save_stack+0x43/0xd0
__kasan_slab_free+0x11a/0x170
kasan_slab_free+0xe/0x10
kmem_cache_free+0x88/0x2b0
sock_destroy_inode+0x49/0x50
destroy_inode+0x77/0xb0
evict+0x285/0x340
iput+0x429/0x530
dentry_unlink_inode+0x28c/0x2c0
__dentry_kill+0x1e3/0x2f0
dput.part.21+0x500/0x560
dput+0x24/0x30
__fput+0x2aa/0x380
____fput+0x1a/0x20
task_work_run+0xd2/0x110
exit_to_usermode_loop+0x18d/0x190
do_syscall_64+0x389/0x3b0
entry_SYSCALL_64_after_hwframe+0x26/0x9b
Fixes: fd558d186d ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When blkdev_open() races with device removal and creation it can happen
that unhashed bdev inode gets associated with newly created gendisk
like:
CPU0 CPU1
blkdev_open()
bdev = bd_acquire()
del_gendisk()
bdev_unhash_inode(bdev);
remove device
create new device with the same number
__blkdev_get()
disk = get_gendisk()
- gets reference to gendisk of the new device
Now another blkdev_open() will not find original 'bdev' as it got
unhashed, create a new one and associate it with the same 'disk' at
which point problems start as we have two independent page caches for
one device.
Fix the problem by verifying that the bdev inode didn't get unhashed
before we acquired gendisk reference. That way we make sure gendisk can
get associated only with visible bdev inodes.
Tested-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When two blkdev_open() calls for a partition race with device removal
and recreation, we can hit BUG_ON(!bd_may_claim(bdev, whole, holder)) in
blkdev_open(). The race can happen as follows:
CPU0 CPU1 CPU2
del_gendisk()
bdev_unhash_inode(part1);
blkdev_open(part1, O_EXCL) blkdev_open(part1, O_EXCL)
bdev = bd_acquire() bdev = bd_acquire()
blkdev_get(bdev)
bd_start_claiming(bdev)
- finds old inode 'whole'
bd_prepare_to_claim() -> 0
bdev_unhash_inode(whole);
<device removed>
<new device under same
number created>
blkdev_get(bdev);
bd_start_claiming(bdev)
- finds new inode 'whole'
bd_prepare_to_claim()
- this also succeeds as we have
different 'whole' here...
- bad things happen now as we
have two exclusive openers of
the same bdev
The problem here is that block device opens can see various intermediate
states while gendisk is shutting down and then being recreated.
We fix the problem by introducing new lookup_sem in gendisk that
synchronizes gendisk deletion with get_gendisk() and furthermore by
making sure that get_gendisk() does not return gendisk that is being (or
has been) deleted. This makes sure that once we ever manage to look up
newly created bdev inode, we are also guaranteed that following
get_gendisk() will either return failure (and we fail open) or it
returns gendisk for the new device and following bdget_disk() will
return new bdev inode (i.e., blkdev_open() follows the path as if it is
completely run after new device is created).
Reported-and-analyzed-by: Hou Tao <houtao1@huawei.com>
Tested-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When two blkdev_open() calls race with device removal and recreation,
__blkdev_get() can use looked up gendisk after it is freed:
CPU0 CPU1 CPU2
del_gendisk(disk);
bdev_unhash_inode(inode);
blkdev_open() blkdev_open()
bdev = bd_acquire(inode);
- creates and returns new inode
bdev = bd_acquire(inode);
- returns the same inode
__blkdev_get(devt) __blkdev_get(devt)
disk = get_gendisk(devt);
- got structure of device going away
<finish device removal>
<new device gets
created under the same
device number>
disk = get_gendisk(devt);
- got new device structure
if (!bdev->bd_openers) {
does the first open
}
if (!bdev->bd_openers)
- false
} else {
put_disk_and_module(disk)
- remember this was old device - this was last ref and disk is
now freed
}
disk_unblock_events(disk); -> oops
Fix the problem by making sure we drop reference to disk in
__blkdev_get() only after we are really done with it.
Reported-by: Hou Tao <houtao1@huawei.com>
Tested-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add a proper counterpart to get_disk_and_module() -
put_disk_and_module(). Currently it is opencoded in several places.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Rename get_disk() to get_disk_and_module() to make sure what the
function does. It's not a great name but at least it is now clear that
put_disk() is not it's counterpart.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commit 8ddcd65325 "block: introduce GENHD_FL_HIDDEN" added handling of
hidden devices to get_gendisk() but forgot to drop module reference
which is also acquired by get_disk(). Drop the reference as necessary.
Arguably the function naming here is misleading as put_disk() is *not*
the counterpart of get_disk() but let's fix that in the follow up
commit since that will be more intrusive.
Fixes: 8ddcd65325
CC: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Introduce __smp_{mb,rmb,wmb}, and rely on the generic definitions
for smp_{mb,rmb,wmb}. A first consequence is that smp_{mb,rmb,wmb}
map to a compiler barrier on !SMP (while their definition remains
unchanged on SMP). As a further consequence, smp_load_acquire and
smp_store_release have "fence rw,rw" instead of "fence iorw,iorw".
Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
We don't want a separate module for vb2-trace.
That fixes this warning:
WARNING: modpost: missing MODULE_LICENSE() in drivers/media/common/videobuf2/vb2-trace.o
When building as module.
While here, add a SPDX header.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Currently if map is null then a potential null pointer deference
occurs when calling sock_release on map->sock. I believe the
actual intention was to call sock_release on sock instead. Fix
this.
Fixes: 5db4d286a8 ("xen/pvcalls: implement connect command")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Commit e864f39569 "fs: add RWF_DSYNC aand RWF_SYNC" added additional
way for direct IO to become synchronous and thus trigger fsync from the
IO completion handler. Then commit 9830f4be15 "fs: Use RWF_* flags for
AIO operations" allowed these flags to be set for AIO as well. However
that commit forgot to update the condition checking whether the IO
completion handling should be defered to a workqueue and thus AIO DIO
with RWF_[D]SYNC set will call fsync() from IRQ context resulting in
sleep in atomic.
Fix the problem by checking directly iocb flags (the same way as it is
done in dio_complete()) instead of checking all conditions that could
lead to IO being synchronous.
CC: Christoph Hellwig <hch@lst.de>
CC: Goldwyn Rodrigues <rgoldwyn@suse.com>
CC: stable@vger.kernel.org
Reported-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 9830f4be15
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
redirect_dir=nofollow should not follow a redirect. But in a specific
configuration it can still follow it. For example try this.
$ mkdir -p lower0 lower1/foo upper work merged
$ touch lower1/foo/lower-file.txt
$ setfattr -n "trusted.overlay.opaque" -v "y" lower1/foo
$ mount -t overlay -o lowerdir=lower1:lower0,workdir=work,upperdir=upper,redirect_dir=on none merged
$ cd merged
$ mv foo foo-renamed
$ umount merged
# mount again. This time with redirect_dir=nofollow
$ mount -t overlay -o lowerdir=lower1:lower0,workdir=work,upperdir=upper,redirect_dir=nofollow none merged
$ ls merged/foo-renamed/
# This lists lower-file.txt, while it should not have.
Basically, we are doing redirect check after we check for d.stop. And
if this is not last lower, and we find an opaque lower, d.stop will be
set.
ovl_lookup_single()
if (!d->last && ovl_is_opaquedir(this)) {
d->stop = d->opaque = true;
goto out;
}
To fix this, first check redirect is allowed. And after that check if
d.stop has been set or not.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Fixes: 438c84c2f0 ("ovl: don't follow redirects if redirect_dir=off")
Cc: <stable@vger.kernel.org> #v4.15
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
When failing from ceph_fs_debugfs_init() in ceph_real_mount(),
there is lack of dput of root_dentry and it causes slab errors,
so change the calling order of ceph_fs_debugfs_init() and
open_root_dentry() and do some cleanups to avoid this issue.
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
When parsing string option, in order to avoid memory leak we need to
carefully free it first in case of specifying same option several times.
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
We've added a quirk to enable the recent Lenovo dock support, where it
overwrites the pin configs of NID 0x17 and 19, not only updating the
pin config cache. It works right after the boot, but the problem is
that the pin configs are occasionally cleared when the machine goes to
PM. Meanwhile the quirk writes the pin configs only at the pre-probe,
so this won't be applied any longer.
For addressing that issue, this patch moves the code to overwrite the
pin configs into HDA_FIXUP_ACT_INIT section so that it's always
applied at both probe and resume time.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195161
Fixes: 61fcf8ece9 ("ALSA: hda/realtek - Enable Thinkpad Dock device for ALC298 platform")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The routine pgattr_change_is_safe() was extended in commit 4e60205655
("arm64: mm: Permit transitioning from Global to Non-Global without BBM")
to permit changing the nG attribute from not set to set, but did so in a
way that inadvertently disallows such changes if other permitted attribute
changes take place at the same time. So update the code to take this into
account.
Fixes: 4e60205655 ("arm64: mm: Permit transitioning from Global to ...")
Cc: <stable@vger.kernel.org> # 4.14.x-
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The error checks on freq for a negative error return always fails because
freq is unsigned and can never be negative. Fix this by making freq a
signed long.
Detected with Coccinelle:
drivers/clocksource/fsl_ftm_timer.c:287:5-9: WARNING: Unsigned expression
compared with zero: freq <= 0
drivers/clocksource/fsl_ftm_timer.c:291:5-9: WARNING: Unsigned expression
compared with zero: freq <= 0
Fixes: 2529c3a330 ("clocksource: Add Freescale FlexTimer Module (FTM) timer support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: kernel-janitors@vger.kernel.org
Link: https://lkml.kernel.org/r/20180226113614.3092-1-colin.king@canonical.com
fs/overlayfs/export.c:459:10-16: WARNING: PTR_ERR_OR_ZERO can be used
Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR
Generated by: scripts/coccinelle/api/ptr_ret.cocci
Fixes: 4b91c30a5a ("ovl: lookup connected ancestor of dir in inode cache")
CC: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
I noticed that with 4.16-rc1 LVDS output on A83T based TBS A711 tablet doesn't
work (there's output but it's garbled). I compared some older patches for LVDS
support with the mainlined ones and this change is missing from mainline Linux.
I don't know what the register does exactly and the harcoded register value
doesn't inspire much confidence that it will work in a general case, so I'm
sending this RFC.
This patch fixes the issue on A83T.
Signed-off-by: Ondrej Jirman <megous@megous.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222161217.23904-1-megous@megous.com
This patch fixes nvme queue cleanup if requesting an IRQ handler for
the queue's vector fails. It does this by resetting the cq_vector to
the uninitialized value of -1 so it is ignored for a controller reset.
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
[changelog updates, removed misc whitespace changes]
Signed-off-by: Keith Busch <keith.busch@intel.com>
Some processor revisions do not support transactional memory, and
additionally kernel support can be disabled. In either case the
tm-trap test should be skipped, otherwise it will fail with a SIGILL.
Fixes: a08082f8e4 ("powerpc/selftests: Check endianness on trap in TM")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull Xtensa fixes from Max Filippov:
"Two fixes for reserved memory/DMA buffers allocation in high memory on
xtensa architecture
- fix memory accounting when reserved memory is in high memory region
- fix DMA allocation from high memory"
* tag 'xtensa-20180225' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: support DMA buffers in high memory
xtensa: fix high memory/reserved memory collision
Pull x86 fixes from Thomas Gleixner:
"A small set of fixes:
- UAPI data type correction for hyperv
- correct the cpu cores field in /proc/cpuinfo on CPU hotplug
- return proper error code in the resctrl file system failure path to
avoid silent subsequent failures
- correct a subtle accounting issue in the new vector allocation code
which went unnoticed for a while and caused suspend/resume
failures"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/topology: Update the 'cpu cores' field in /proc/cpuinfo correctly across CPU hotplug operations
x86/topology: Fix function name in documentation
x86/intel_rdt: Fix incorrect returned value when creating rdgroup sub-directory in resctrl file system
x86/apic/vector: Handle vector release on CPU unplug correctly
genirq/matrix: Handle CPU offlining proper
x86/headers/UAPI: Use __u64 instead of u64 in <uapi/asm/hyperv.h>
Pull perf fix from Thomas Gleixner:
"A single commit which shuts up a bogus GCC-8 warning"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/oprofile: Fix bogus GCC-8 warning in nmi_setup()
Pull locking fixes from Thomas Gleixner:
"Three patches to fix memory ordering issues on ALPHA and a comment to
clarify the usage scope of a mutex internal function"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/xchg/alpha: Fix xchg() and cmpxchg() memory ordering bugs
locking/xchg/alpha: Clean up barrier usage by using smp_mb() in place of __ASM__MB
locking/xchg/alpha: Add unconditional memory barrier to cmpxchg()
locking/mutex: Add comment to __mutex_owner() to deter usage
Pull cleanup patchlet from Thomas Gleixner:
"A single commit removing a bunch of bogus double semicolons all over
the tree"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
treewide/trivial: Remove ';;$' typo noise
Pull NFS client bugfixes from Trond Myklebust:
- fix a broken cast in nfs4_callback_recallany()
- fix an Oops during NFSv4 migration events
- make struct nlmclnt_fl_close_lock_ops static
* tag 'nfs-for-4.16-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: make struct nlmclnt_fl_close_lock_ops static
nfs: system crashes after NFS4ERR_MOVED recovery
NFSv4: Fix broken cast in nfs4_callback_recallany()
According to the devicetree binding the shutdown and device wake
GPIOs are optional. Since commit 3e81a4ca51 ("Bluetooth: hci_bcm:
Mandate presence of shutdown and device wake GPIO") this driver
won't probe anymore on Raspberry Pi 3 and Zero W (no device wake GPIO
connected). So fix this regression by reverting this commit partially.
Fixes: 3e81a4ca51 ("Bluetooth: hci_bcm: Mandate presence of shutdown and device wake GPIO")
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
For some reason, Florian forgot to apply to ip6_route_me_harder
the fix that went in commit 29e09229d9 ("netfilter: use
skb_to_full_sk in ip_route_me_harder")
Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
batman-adv uses internal indices for each enabled and active interface.
It is currently used by the B.A.T.M.A.N. IV algorithm to identifify the
correct position in the ogm_cnt bitmaps.
The type for the number of enabled interfaces (which defines the next
interface index) was set to char. This type can be (depending on the
architecture) either signed (limiting batman-adv to 127 active slave
interfaces) or unsigned (limiting batman-adv to 255 active slave
interfaces).
This limit was not correctly checked when an interface was enabled and thus
an overflow happened. This was only catched on systems with the signed char
type when the B.A.T.M.A.N. IV code tried to resize its counter arrays with
a negative size.
The if_num interface index was only a s16 and therefore significantly
smaller than the ifindex (int) used by the code net code.
Both &batadv_hard_iface->if_num and &batadv_priv->num_ifaces must be
(unsigned) int to support the same number of slave interfaces as the net
core code. And the interface activation code must check the number of
active slave interfaces to avoid integer overflows.
Fixes: c6c8fea297 ("net: Add batman-adv meshing protocol")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
"fib" starts to behave strangely when an ipv6 default route is
added - the FIB lookup returns a route using 'oif' in this case.
This behaviour was inherited from ip6tables rpfilter so change
this as well.
Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1221
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In the ip_rcv, IPSTATS_MIB_CSUMERRORS is increased when
checksum error is occurred.
bridge netfilter routine should increase IPSTATS_MIB_CSUMERRORS.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The function batadv_bla_backbone_dump_bucket must be able to handle
non-complete dumps of a single bucket. It tries to do that by saving the
latest dumped index in *idx_skip to inform the caller about the current
state.
But the caller only assumes that buckets were not completely dumped when
the return code is non-zero. This function must therefore also return a
non-zero index when the dumping of an entry failed. Otherwise the caller
will just skip all remaining buckets.
And the function must also reset *idx_skip back to zero when it finished a
bucket. Otherwise it will skip the same number of entries in the next
bucket as the previous one had.
Fixes: ea4152e117 ("batman-adv: add backbone table netlink support")
Reported-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The function batadv_bla_claim_dump_bucket must be able to handle
non-complete dumps of a single bucket. It tries to do that by saving the
latest dumped index in *idx_skip to inform the caller about the current
state.
But the caller only assumes that buckets were not completely dumped when
the return code is non-zero. This function must therefore also return a
non-zero index when the dumping of an entry failed. Otherwise the caller
will just skip all remaining buckets.
And the function must also reset *idx_skip back to zero when it finished a
bucket. Otherwise it will skip the same number of entries in the next
bucket as the previous one had.
Fixes: 04f3f5bf18 ("batman-adv: add B.A.T.M.A.N. Dump BLA claims via netlink")
Reported-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The function batadv_v_gw_dump stops the processing loop when
batadv_v_gw_dump_entry returns a non-0 return code. This should only
happen when the buffer is full. Otherwise, an empty message may be
returned by batadv_gw_dump. This empty message will then stop the netlink
dumping of gateway entries. At worst, not a single entry is returned to
userspace even when plenty of possible gateways exist.
Fixes: b71bb6f924 ("batman-adv: add B.A.T.M.A.N. V bat_gw_dump implementations")
Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The function batadv_iv_gw_dump stops the processing loop when
batadv_iv_gw_dump_entry returns a non-0 return code. This should only
happen when the buffer is full. Otherwise, an empty message may be
returned by batadv_gw_dump. This empty message will then stop the netlink
dumping of gateway entries. At worst, not a single entry is returned to
userspace even when plenty of possible gateways exist.
Fixes: efb766af06 ("batman-adv: add B.A.T.M.A.N. IV bat_gw_dump implementations")
Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
We need to make sure the offsets are not out of range of the
total size.
Also check that they are in ascending order.
The WARN_ON triggered by syzkaller (it sets panic_on_warn) is
changed to also bail out, no point in continuing parsing.
Briefly tested with simple ruleset of
-A INPUT --limit 1/s' --log
plus jump to custom chains using 32bit ebtables binary.
Reported-by: <syzbot+845a53d13171abf8bf29@syzkaller.appspotmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
All of these conditions are not fatal and should have
been WARN_ONs from the get-go.
Convert them to WARN_ONs and bail out.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ebt_among is special, it has a dynamic match size and is exempt
from the central size checks.
Therefore it must check that the size of the match structure
provided from userspace is sane by making sure em->match_size
is at least the minimum size of the expected structure.
The module has such a check, but its only done after accessing
a structure that might be out of bounds.
tested with: ebtables -A INPUT ... \
--among-dst fe:fe:fe:fe:fe:fe
--among-dst fe:fe:fe:fe:fe:fe --among-src fe:fe:fe:fe:ff:f,fe:fe:fe:fe:fe:fb,fe:fe:fe:fe:fc:fd,fe:fe:fe:fe:fe:fd,fe:fe:fe:fe:fe:fe
--among-src fe:fe:fe:fe:ff:f,fe:fe:fe:fe:fe:fa,fe:fe:fe:fe:fe:fd,fe:fe:fe:fe:fe:fe,fe:fe:fe:fe:fe:fe
Reported-by: <syzbot+fe0b19af568972814355@syzkaller.appspotmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Once struct is added to per-netns list it becomes visible to other cpus,
so we cannot use kfree().
Also delay setting entries refcount to 1 until after everything is
initialised so that when we call clusterip_config_put() in this spot
entries is still zero.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This needs to put() the entry to avoid a resource leak in error path.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
A more sophisticated implementation could try to combine fragment checksums
when all fragments have CHECKSUM_COMPLETE and are split at even offsets.
For now, we just set ip_summed to CHECKSUM_NONE to avoid "hw csum failure"
warnings in the kernel log when fragmented frames are received. In
consequence, skb_pull_rcsum() can be replaced with skb_pull().
Note that in usual setups, packets don't reach batman-adv with
CHECKSUM_COMPLETE (I assume NICs bail out of checksumming when they see
batadv's ethtype?), which is why the log messages do not occur on every
system using batman-adv. I could reproduce this issue by stacking
batman-adv on top of a VXLAN interface.
Fixes: 610bfc6bc9 ("batman-adv: Receive fragmented packets and merge")
Tested-by: Maximilian Wilhelm <max@sdn.clinic>
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
eth_type_trans() internally calls skb_pull(), which does not adjust the
skb checksum; skb_postpull_rcsum() is necessary to avoid log spam of the
form "bat0: hw csum failure" when packets with CHECKSUM_COMPLETE are
received.
Note that in usual setups, packets don't reach batman-adv with
CHECKSUM_COMPLETE (I assume NICs bail out of checksumming when they see
batadv's ethtype?), which is why the log messages do not occur on every
system using batman-adv. I could reproduce this issue by stacking
batman-adv on top of a VXLAN interface.
Fixes: c6c8fea297 ("net: Add batman-adv meshing protocol")
Tested-by: Maximilian Wilhelm <max@sdn.clinic>
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
There is a potential deadlock if mount/umount happens when
raid5_finish_reshape() tries to grow the size of emulated disk.
How the deadlock happens?
1) The raid5 resync thread finished reshape (expanding array).
2) The mount or umount thread holds VFS sb->s_umount lock and tries to
write through critical data into raid5 emulated block device. So it
waits for raid5 kernel thread handling stripes in order to finish it
I/Os.
3) In the routine of raid5 kernel thread, md_check_recovery() will be
called first in order to reap the raid5 resync thread. That is,
raid5_finish_reshape() will be called. In this function, it will try
to update conf and call VFS revalidate_disk() to grow the raid5
emulated block device. It will try to acquire VFS sb->s_umount lock.
The raid5 kernel thread cannot continue, so no one can handle mount/
umount I/Os (stripes). Once the write-through I/Os cannot be finished,
mount/umount will not release sb->s_umount lock. The deadlock happens.
The raid5 kernel thread is an emulated block device. It is responible to
handle I/Os (stripes) from upper layers. The emulated block device
should not request any I/Os on itself. That is, it should not call VFS
layer functions. (If it did, it will try to acquire VFS locks to
guarantee the I/Os sequence.) So we have the resync thread to send
resync I/O requests and to wait for the results.
For solving this potential deadlock, we can put the size growth of the
emulated block device as the final step of reshape thread.
2017/12/29:
Thanks to Guoqing Jiang <gqjiang@suse.com>,
we confirmed that there is the same deadlock issue in raid10. It's
reproducible and can be fixed by this patch. For raid10.c, we can remove
the similar code to prevent deadlock as well since they has been called
before.
Reported-by: Alex Wu <alexwu@synology.com>
Reviewed-by: Alex Wu <alexwu@synology.com>
Reviewed-by: Chung-Chiang Cheng <cccheng@synology.com>
Signed-off-by: BingJing Chang <bingjingc@synology.com>
Signed-off-by: Shaohua Li <sh.li@alibaba-inc.com>
i_dir_seq is subject to concurrent modification by a cmpxchg or
store-release operation, so ensure that the relaxed access in
d_alloc_parallel uses READ_ONCE.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
If d_alloc_parallel runs concurrently with __d_add, it is possible for
d_alloc_parallel to continuously retry whilst i_dir_seq has been
incremented to an odd value by __d_add:
CPU0:
__d_add
n = start_dir_add(dir);
cmpxchg(&dir->i_dir_seq, n, n + 1) == n
CPU1:
d_alloc_parallel
retry:
seq = smp_load_acquire(&parent->d_inode->i_dir_seq) & ~1;
hlist_bl_lock(b);
bit_spin_lock(0, (unsigned long *)b); // Always succeeds
CPU0:
__d_lookup_done(dentry)
hlist_bl_lock
bit_spin_lock(0, (unsigned long *)b); // Never succeeds
CPU1:
if (unlikely(parent->d_inode->i_dir_seq != seq)) {
hlist_bl_unlock(b);
goto retry;
}
Since the simple bit_spin_lock used to implement hlist_bl_lock does not
provide any fairness guarantees, then CPU1 can starve CPU0 of the lock
and prevent it from reaching end_dir_add(dir), therefore CPU1 cannot
exit its retry loop because the sequence number always has the bottom
bit set.
This patch resolves the livelock by not taking hlist_bl_lock in
d_alloc_parallel if the sequence counter is odd, since any subsequent
masked comparison with i_dir_seq will fail anyway.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Naresh Madhusudana <naresh.madhusudana@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Some laptops such as the XPS 9360 support the intel-vbtn INT33D6
interface but don't initialize the bit that intel-vbtn uses to
represent switching tablet mode.
By running this only on real 2-in-1's it shouldn't cause false
positives.
Fixes: 30323fb6d5 ("Support tablet mode switch")
Reported-by: Jeremy Cline <jeremy@jcline.org>
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Tested-by: Jeremy Cline <jeremy@jcline.org>
Tested-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
- Add an empty linux/compiler_types.h (now being included by kconfig.h)
- Add __GFP_ZERO
- Add kzalloc
- Test __GFP_DIRECT_RECLAIM instead of __GFP_NOWARN
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
In commit c713fb071e ("rtlwifi: rtl8821ae: Fix connection lost problem
correctly") a problem in rtl8821ae that caused loss of signal was fixed.
That same problem has now been reported for rtl8723be. Accordingly,
the ASPM L1 latency has been increased from 0 to 7 to fix the instability.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Stable <stable@vger.kernel.org>
Tested-by: James Cameron <quozl@laptop.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
First batch of iwlwifi fixes intended for 4.16:
* A couple of bugzilla fixes
- Fix a bogus warning when freeing a TFD;
- Fix severe throughput problem with 9000 series;
* Fix for a bug that caused queue hangs in certain situations;
* Fix for an issue with IBSS;
* Fix an issue with rate-scalingin AP-mode.
Pull powerpc fixes from Michael Ellerman:
- Add handling for a missing instruction in our 32-bit BPF JIT so that
it can be used for seccomp filtering.
- Add a missing NULL pointer check before a function call in new EEH
code.
- Fix an error path in the new ocxl driver to correctly return EFAULT.
- The support for the new ibm,drc-info device tree property turns out
to need several fixes, so for now we just stop advertising to
firmware that we support it until the bugs can be ironed out.
- One fix for the new drmem code which was incorrectly modifying the
device tree in place.
- Finally two fixes for the RFI flush support, so that firmware can
advertise to us that it should be disabled entirely so as not to
affect performance.
Thanks to: Bharata B Rao, Frederic Barrat, Juan J. Alvarez, Mark Lord,
Michael Bringmann.
* tag 'powerpc-4.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/powernv: Support firmware disable of RFI flush
powerpc/pseries: Support firmware disable of RFI flush
powerpc/mm/drmem: Fix unexpected flag value in ibm,dynamic-memory-v2
powerpc/bpf/jit: Fix 32-bit JIT for seccomp_data access
powerpc/pseries: Revert support for ibm,drc-info devtree property
powerpc/pseries: Fix duplicate firmware feature for DRC_INFO
ocxl: Fix potential bad errno on irq allocation
powerpc/eeh: Fix crashes in eeh_report_resume()
__blk_mq_requeue_request() covers two cases:
- one is that the requeued request is added to hctx->dispatch, such as
blk_mq_dispatch_rq_list()
- another case is that the request is requeued to io scheduler, such as
blk_mq_requeue_request().
We should call io sched's .requeue_request callback only for the 2nd
case.
Cc: Paolo Valente <paolo.valente@linaro.org>
Cc: Omar Sandoval <osandov@fb.com>
Fixes: bd166ef183 ("blk-mq-sched: add framework for MQ capable IO schedulers")
Cc: stable@vger.kernel.org
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io-channel-cells should be <0> since sigma delta modulator exports only
one channel, as described in ../iio/iio-bindings.txt "IIO providers"
section. Only the phandle is necessary for IIO consumers in this case.
Fixes: af11143757 ("IIO: Add DT bindings for sigma delta adc modulator")
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Arnaud Pouliquen <arnaud.pouliquen@st.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
When several channels are registered (e.g. via st,adc-channels property):
- channels array is wrongly filled in. Only 1st element in array is being
initialized with last registered channel.
Fix it by passing reference to relevant channel (e.g. array[index]).
- only last initialized channel can work properly (e.g. unique 'ch_id'
is used). Converting any other channel result in conversion timeout.
Fix it by getting rid of 'ch_id', use chan->channel instead.
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Arnaud Pouliquen <arnaud.pouliquen@st.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
CCS811 has different I2C register maps in boot and application mode. When
CCS811 is in boot mode, register APP_START (0xF4) is used to transit the
firmware state from boot to application mode. However, APP_START is not a
valid register location when CCS811 is in application mode (refer to
"CCS811 Bootloader Register Map" and "CCS811 Application Register Map" in
CCS811 datasheet). The driver should not attempt to perform a write to
APP_START while CCS811 is in application mode, as this is not a valid or
documented register location.
When prob function is being called, the driver assumes the CCS811 sensor
is in boot mode, and attempts to perform a write to APP_START. Although
CCS811 powers-up in boot mode, it may have already been transited to
application mode by previous instances, e.g. unload and reload device
driver by the system, or explicitly by user. Depending on the system
design, CCS811 sensor may be permanently connected to system power source
rather than power controlled by GPIO, hence it is possible that the sensor
is never power reset, thus the firmware could be in either boot or
application mode at any given time when driver prob function is being
called.
This patch checks the STATUS register before attempting to send a write to
APP_START. Only if the firmware is not in application mode and has valid
firmware application loaded, then it will continue to start transiting the
firmware boot to application mode.
Signed-off-by: Richard Lai <richard@richardman.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
The capture interface doesn't work and the playback interface only
supports 48 kHz sampling rate even though it advertises more rates.
Signed-off-by: Erik Veijola <erik.veijola@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
On some boards setting power_save to a non 0 value leads to clicking /
popping sounds when ever we enter/leave powersaving mode. Ideally we would
figure out how to avoid these sounds, but that is not always feasible.
This commit adds a blacklist for devices where powersaving is known to
cause problems and disables it on these devices.
Note I tried to put this blacklist in userspace first:
https://github.com/systemd/systemd/pull/8128
But the systemd maintainers rightfully pointed out that it would be
impossible to then later remove entries once we actually find a way to
make power-saving work on listed boards without issues. Having this list
in the kernel will allow removal of the blacklist entry in the same commit
which fixes the clicks / plops.
The blacklist only applies to the default power_save module-option value,
if a user explicitly sets the module-option then the blacklist is not
used.
[ added an ifdef CONFIG_PM for the build error -- tiwai]
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1525104
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=198611
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This patch fixes the wrongly included dtsi file which
was breaking mainline support for Engicam i.CoreM6 DualLite/Solo RQS.
As per the board name, the correct file should be imx6dl.dtsi instead
of imx6q.dtsi
Reported-by: Michael Trimarchi <michael@amarulasolutions.com>
Suggested-by: Jagan Teki <jagan@amarulasolutions.com>
Signed-off-by: Shyam Saini <shyam@amarulasolutions.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Fixes: 7a9caba55a ("ARM: dts: imx6dl: Add Engicam i.CoreM6 DualLite/Solo RQS initial support")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
In case when dentry passed to lock_parent() is protected from freeing only
by the fact that it's on a shrink list and trylock of parent fails, we
could get hit by __dentry_kill() (and subsequent dentry_kill(parent))
between unlocking dentry and locking presumed parent. We need to recheck
that dentry is alive once we lock both it and parent *and* postpone
rcu_read_unlock() until after that point. Otherwise we could return
a pointer to struct dentry that already is rcu-scheduled for freeing, with
->d_lock held on it; caller's subsequent attempt to unlock it can end
up with memory corruption.
Cc: stable@vger.kernel.org # 3.12+, counting backports
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
RSM instruction is used by the SMM handler to return from SMM mode.
Currently, rsm causes a #UD - which results in instruction fetch, decode,
and emulate. By installing the RSM intercept we can avoid the instruction
fetch since we know that #VMEXIT was due to rsm.
The patch is required for the SEV guest, because in case of SEV guest
memory is encrypted with guest-specific key and hypervisor will not
able to fetch the instruction bytes from the guest memory.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoid traversing all the cpus for pv tlb flush when steal time
is disabled since pv tlb flush depends on the field in steal time
for shared data.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim KrÄmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Guests on new hypersiors might set KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT
bit when enabling async_PF, but this bit is reserved on old hypervisors,
which results in a failure upon migration.
To avoid breaking different cases, we are checking for CPUID feature bit
before enabling the feature and nothing else.
Fixes: 52a5c155cf ("KVM: async_pf: Let guest support delivery of async_pf from guest mode")
Cc: <stable@vger.kernel.org>
Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix the following sparse warning by moving the prototype
of kvm_arch_mmu_notifier_invalidate_range() to linux/kvm_host.h .
CHECK arch/s390/kvm/../../../virt/kvm/kvm_main.c
arch/s390/kvm/../../../virt/kvm/kvm_main.c:138:13: warning: symbol 'kvm_arch_mmu_notifier_invalidate_range' was not declared. Should it be static?
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the kvm_arch_irq_routing_update() prototype outside of
ifdef CONFIG_HAVE_KVM_EVENTFD guards to fix the following sparse warning:
arch/s390/kvm/../../../virt/kvm/irqchip.c:171:28: warning: symbol 'kvm_arch_irq_routing_update' was not declared. Should it be static?
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 'Total' line looks a bit weird when we have a single event only. This
can happen e.g. due to filters. Therefore suppress when there's only a
single event in the output.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We keep the current logic that sorts all events (parent and child), but
re-shuffle the events afterwards, grouping the children after the
respective parent. Note that the percentage column for child events
gives the percentage of the parent's total.
Since we rework the logic anyway, we modify the total average
calculation to use the raw numbers instead of the (rounded) averages.
Note that this can result in differing numbers (between total average
and the sum of the individual averages) due to rounding errors.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Drilldown (i.e. toggle display of child trace events) was implemented by
overriding the fields filter. This resulted in inconsistencies: E.g. when
drilldown was not active, adding a filter that also matches child trace
events would not only filter fields according to the filter, but also add
in the child trace events matching the filter. E.g. on x86, setting
'kvm_userspace_exit' as the fields filter after startup would result in
display of kvm_userspace_exit(DCR), although that wasn't previously
present - not exactly what one would expect from a filter.
This patch addresses the issue by keeping drilldown and fields filter
separate. While at it, we also fix a PEP8 issue by adding a blank line
at one place (since we're in the area...).
We implement this by adding a framework that also allows to define a
taxonomy among the debugfs events to identify child trace events. I.e.
drilldown using 'x' can now also work with debugfs. A respective parent-
child relationship is only known for S390 at the moment, but could be
added adjusting other platforms' ARCH.dbg_is_child() methods
accordingly.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We can do with a single dialog that takes both, pids and guest names.
Note that we keep both interactive commands, 'p' and 'g' for now, to
avoid confusion among users used to a specific key.
While at it, we improve on some minor glitches regarding curses usage,
e.g. cursor still visible when not supposed to be.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Helps quite a bit reading the code when it's obvious when a method is
intended for internal use only.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Te checks for debugfs assumed that debugfs is always mounted at
/sys/kernel/debug - which is likely, but not guaranteed. This is addressed
by checking /proc/mounts for the actual location.
Furthermore, when debugfs was mounted, but the kvm module not loaded, a
misleading error pointing towards debugfs not present was given.
To reproduce,
(a) run kvm_stat with debugfs mounted at a place different from
/sys/kernel/debug
(b) run kvm_stat with debugfs mounted but kvm module not loaded
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Entering an invalid regular expression did not produce any indication of an
error so far.
To reproduce, press 'f' and enter 'foo(' (with an unescaped bracket).
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When we apply a filter that will only leave child trace events, we
receive a ZeroDivisionError when calculating the percentages.
In that case, provide percentages based on child events only.
To reproduce, run 'kvm_stat -f .*[\(].*'.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use '==' for equality checks and 'is' when comparing identities.
An example where '==' and 'is' behave differently:
>>> a = 4242
>>> a == 4242
True
>>> a is 4242
False
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If it's clear that the values of a dictionary will be used then use
the '.items()' method.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Tested-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
[Include fix for logging mode by Stefan Raspl]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use a namedtuple for storing the values as it allows to access the
fields of a tuple via names. This makes the overall code much easier
to read and to understand. Access by index is still possible as
before.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Tested-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 'sortkey' function references a value in its enclosing
scope (closure). This is not common practice for a sort key function
so let's replace it. Additionally, the function 'sorted' has already a
parameter for reversing the result therefore the inversion of the
values is unneeded. The check for stats[x][1] is also superfluous as
it's ensured that this value is initialized with 0.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Tested-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported by syzkaller:
WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4
RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
Call Trace:
vmx_handle_exit+0xbd/0xe20 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm]
kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
do_vfs_ioctl+0xa4/0x6a0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x25/0x9c
The testcase creates a first thread to issue KVM_SMI ioctl, and then creates
a second thread to mmap and operate on the same vCPU. This triggers a race
condition when running the testcase with multiple threads. Sometimes one thread
exits with a triple fault while another thread mmaps and operates on the same
vCPU. Because CS=0x3000/IP=0x8000 is not mapped, accessing the SMI handler
results in an EPT misconfig. This patch fixes it by returning RET_PF_EMULATE
in kvm_handle_bad_page(), which will go on to cause an emulation failure and an
exit with KVM_EXIT_INTERNAL_ERROR.
Reported-by: syzbot+c1d9517cab094dae65e446c0c5b4de6c40f4dc58@syzkaller.appspotmail.com
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Although L2 is in halt state, it will be in the active state after
VM entry if the VM entry is vectoring according to SDM 26.6.2 Activity
State. Halting the vcpu here means the event won't be injected to L2
and this decision isn't reported to L1. Thus L0 drops an event that
should be injected to L2.
Cc: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On x86, special KVM memslots such as the TSS region have anonymous
memory mappings created on behalf of userspace, and these mappings are
removed when the VM is destroyed.
It is however possible for removing these mappings via vm_munmap() to
fail. This can most easily happen if the thread receives SIGKILL while
it's waiting to acquire ->mmap_sem. This triggers the 'WARN_ON(r < 0)'
in __x86_set_memory_region(). syzkaller was able to hit this, using
'exit()' to send the SIGKILL. Note that while the vm_munmap() failure
results in the mapping not being removed immediately, it is not leaked
forever but rather will be freed when the process exits.
It's not really possible to handle this failure properly, so almost
every other caller of vm_munmap() doesn't check the return value. It's
a limitation of having the kernel manage these mappings rather than
userspace.
So just remove the WARN_ON() so that users can't spam the kernel log
with this warning.
Fixes: f0d648bdf0 ("KVM: x86: map/unmap private slots in __x86_set_memory_region")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
L1 might want to use SECONDARY_EXEC_DESC, so we must not clear the VMCS
bit if UMIP is not being emulated.
We must still set the bit when emulating UMIP as the feature can be
passed to L2 where L0 will do the emulation and because L2 can change
CR4 without a VM exit, we should clear the bit if UMIP is disabled.
Fixes: 0367f205a3 ("KVM: vmx: add support for emulating UMIP")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
The initial reset of the local APIC is performed before the VMCS has been
created, but it tries to do a vmwrite:
vmwrite error: reg 810 value 4a00 (err 18944)
CPU: 54 PID: 38652 Comm: qemu-kvm Tainted: G W I 4.16.0-0.rc2.git0.1.fc28.x86_64 #1
Hardware name: Intel Corporation S2600CW/S2600CW, BIOS SE5C610.86B.01.01.0003.090520141303 09/05/2014
Call Trace:
vmx_set_rvi [kvm_intel]
vmx_hwapic_irr_update [kvm_intel]
kvm_lapic_reset [kvm]
kvm_create_lapic [kvm]
kvm_arch_vcpu_init [kvm]
kvm_vcpu_init [kvm]
vmx_create_vcpu [kvm_intel]
kvm_vm_ioctl [kvm]
Move it later, after the VMCS has been created.
Fixes: 4191db26b7 ("KVM: x86: Update APICv on APIC reset")
Cc: stable@vger.kernel.org
Cc: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull networking fixes from David Miller:
1) Fix TTL offset calculation in mac80211 mesh code, from Peter Oh.
2) Fix races with procfs in ipt_CLUSTERIP, from Cong Wang.
3) Memory leak fix in lpm_trie BPF map code, from Yonghong Song.
4) Need to use GFP_ATOMIC in BPF cpumap allocations, from Jason Wang.
5) Fix potential deadlocks in netfilter getsockopt() code paths, from
Paolo Abeni.
6) Netfilter stackpointer size checks really are needed to validate
user input, from Florian Westphal.
7) Missing timer init in x_tables, from Paolo Abeni.
8) Don't use WQ_MEM_RECLAIM in mac80211 hwsim, from Johannes Berg.
9) When an ibmvnic device is brought down then back up again, it can be
sent queue entries from a previous session, handle this properly
instead of crashing. From Thomas Falcon.
10) Fix TCP checksum on LRO buffers in mlx5e, from Gal Pressman.
11) When we are dumping filters in cls_api, the output SKB is empty, and
the filter we are dumping is too large for the space in the SKB, we
should return -EMSGSIZE like other netlink dump operations do.
Otherwise userland has no signal that is needs to increase the size
of its read buffer. From Roman Kapl.
12) Several XDP fixes for virtio_net, from Jesper Dangaard Brouer.
13) Module refcount leak in netlink when a dump start fails, from Jason
Donenfeld.
14) Handle sub-optimal GSO sizes better in TCP BBR congestion control,
from Eric Dumazet.
15) Releasing bpf per-cpu arraymaps can take a long time, add a
condtional scheduling point. From Eric Dumazet.
16) Implement retpolines for tail calls in x64 and arm64 bpf JITs. From
Daniel Borkmann.
17) Fix page leak in gianfar driver, from Andy Spencer.
18) Missed clearing of estimator scratch buffer, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
net_sched: gen_estimator: fix broken estimators based on percpu stats
gianfar: simplify FCS handling and fix memory leak
ipv6 sit: work around bogus gcc-8 -Wrestrict warning
macvlan: fix use-after-free in macvlan_common_newlink()
bpf, arm64: fix out of bounds access in tail call
bpf, x64: implement retpoline for tail call
rxrpc: Fix send in rxrpc_send_data_packet()
net: aquantia: Fix error handling in aq_pci_probe()
bpf: fix rcu lockdep warning for lpm_trie map_free callback
bpf: add schedule points in percpu arrays management
regulatory: add NUL to request alpha2
ibmvnic: Fix early release of login buffer
net/smc9194: Remove bogus CONFIG_MAC reference
net: ipv4: Set addr_type in hash_keys for forwarded case
tcp_bbr: better deal with suboptimal GSO
smsc75xx: fix smsc75xx_set_features()
netlink: put module reference if dump start fails
selftests/bpf/test_maps: exit child process without error in ENOMEM case
selftests/bpf: update gitignore with test_libbpf_open
selftests/bpf: tcpbpf_kern: use in6_* macros from glibc
..
Pull security subsystem fixes from James Morris:
- keys fixes via David Howells:
"A collection of fixes for Linux keyrings, mostly thanks to Eric
Biggers:
- Fix some PKCS#7 verification issues.
- Fix handling of unsupported crypto in X.509.
- Fix too-large allocation in big_key"
- Seccomp updates via Kees Cook:
"These are fixes for the get_metadata interface that landed during
-rc1. While the new selftest is strictly not a bug fix, I think
it's in the same spirit of avoiding bugs"
- an IMA build fix from Randy Dunlap
* 'fixes-v4.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
integrity/security: fix digsig.c build error with header file
KEYS: Use individual pages in big_key for crypto buffers
X.509: fix NULL dereference when restricting key with unsupported_sig
X.509: fix BUG_ON() when hash algorithm is unsupported
PKCS#7: fix direct verification of SignerInfo signature
PKCS#7: fix certificate blacklisting
PKCS#7: fix certificate chain verification
seccomp: add a selftest for get_metadata
ptrace, seccomp: tweak get_metadata behavior slightly
seccomp, ptrace: switch get_metadata types to arch independent
Pull MIPS fix from James Hogan:
"A single MIPS fix for mismatching struct compat_flock, resulting in
bus errors starting Firefox on Debian 8 since 4.13"
* tag 'mips_fixes_4.16_3' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
MIPS: Drop spurious __unused in struct compat_flock
Pull printk fixlet from Petr Mladek:
"People expect to see the real pointer value for %px.
Let's substitute '(null)' only for the other %p? format modifiers that
need to deference the pointer"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
vsprintf: avoid misleading "(null)" for %px
Pull i2c fixes from Wolfram Sang:
"Two bugfixes, one v4.16 regression fix, and two documentation fixes"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: designware: Consider SCL GPIO optional
i2c: busses: i2c-sirf: Fix spelling: "formular" -> "formula".
i2c: bcm2835: Set up the rising/falling edge delays
i2c: i801: Add missing documentation entries for Braswell and Kaby Lake
i2c: designware: must wait for enable
The requirements around atomic_add() / atomic64_add() resp. their
JIT implementations differ across architectures. E.g. while x86_64
seems just fine with BPF's xadd on unaligned memory, on arm64 it
triggers via interpreter but also JIT the following crash:
[ 830.864985] Unable to handle kernel paging request at virtual address ffff8097d7ed6703
[...]
[ 830.916161] Internal error: Oops: 96000021 [#1] SMP
[ 830.984755] CPU: 37 PID: 2788 Comm: test_verifier Not tainted 4.16.0-rc2+ #8
[ 830.991790] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.29 07/17/2017
[ 830.998998] pstate: 80400005 (Nzcv daif +PAN -UAO)
[ 831.003793] pc : __ll_sc_atomic_add+0x4/0x18
[ 831.008055] lr : ___bpf_prog_run+0x1198/0x1588
[ 831.012485] sp : ffff00001ccabc20
[ 831.015786] x29: ffff00001ccabc20 x28: ffff8017d56a0f00
[ 831.021087] x27: 0000000000000001 x26: 0000000000000000
[ 831.026387] x25: 000000c168d9db98 x24: 0000000000000000
[ 831.031686] x23: ffff000008203878 x22: ffff000009488000
[ 831.036986] x21: ffff000008b14e28 x20: ffff00001ccabcb0
[ 831.042286] x19: ffff0000097b5080 x18: 0000000000000a03
[ 831.047585] x17: 0000000000000000 x16: 0000000000000000
[ 831.052885] x15: 0000ffffaeca8000 x14: 0000000000000000
[ 831.058184] x13: 0000000000000000 x12: 0000000000000000
[ 831.063484] x11: 0000000000000001 x10: 0000000000000000
[ 831.068783] x9 : 0000000000000000 x8 : 0000000000000000
[ 831.074083] x7 : 0000000000000000 x6 : 000580d428000000
[ 831.079383] x5 : 0000000000000018 x4 : 0000000000000000
[ 831.084682] x3 : ffff00001ccabcb0 x2 : 0000000000000001
[ 831.089982] x1 : ffff8097d7ed6703 x0 : 0000000000000001
[ 831.095282] Process test_verifier (pid: 2788, stack limit = 0x0000000018370044)
[ 831.102577] Call trace:
[ 831.105012] __ll_sc_atomic_add+0x4/0x18
[ 831.108923] __bpf_prog_run32+0x4c/0x70
[ 831.112748] bpf_test_run+0x78/0xf8
[ 831.116224] bpf_prog_test_run_xdp+0xb4/0x120
[ 831.120567] SyS_bpf+0x77c/0x1110
[ 831.123873] el0_svc_naked+0x30/0x34
[ 831.127437] Code: 97fffe97 17ffffec 00000000 f9800031 (885f7c31)
Reason for this is because memory is required to be aligned. In
case of BPF, we always enforce alignment in terms of stack access,
but not when accessing map values or packet data when the underlying
arch (e.g. arm64) has CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS set.
xadd on packet data that is local to us anyway is just wrong, so
forbid this case entirely. The only place where xadd makes sense in
fact are map values; xadd on stack is wrong as well, but it's been
around for much longer. Specifically enforce strict alignment in case
of xadd, so that we handle this case generically and avoid such crashes
in the first place.
Fixes: 17a5267067 ("bpf: verifier (add verifier core)")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The 'lend' parameter of truncate_inode_pages_range is required to be
inclusive, so follow the rule.
This patch fixes one memory corruption triggered by discard.
Cc: <stable@vger.kernel.org>
Cc: Dmitry Monakhov <dmonakhov@openvz.org>
Fixes: 351499a172 ("block: Invalidate cache on discard v2")
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull SCSI fixes from James Bottomley:
"These are mostly fixes for problems with merge window code.
In addition we have one doc update (alua) and two dead code removals
(aiclib and octogon) a spurious assignment removal (csiostor) and a
performance improvement for storvsc involving better interrupt
spreading and increasing the command per lun handling"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla4xxx: skip error recovery in case of register disconnect.
scsi: aacraid: fix shutdown crash when init fails
scsi: qedi: Cleanup local str variable
scsi: qedi: Fix truncation of CHAP name and secret
scsi: qla2xxx: Fix incorrect handle for abort IOCB
scsi: qla2xxx: Fix double free bug after firmware timeout
scsi: storvsc: Increase cmd_per_lun for higher speed devices
scsi: qla2xxx: Fix a locking imbalance in qlt_24xx_handle_els()
scsi: scsi_dh: Document alua_rtpg_queue() arguments
scsi: Remove Makefile entry for oktagon files
scsi: aic7xxx: remove aiclib.c
scsi: qla2xxx: Avoid triggering undefined behavior in qla2x00_mbx_completion()
scsi: mptfusion: Add bounds check in mptctl_hp_targetinfo()
scsi: sym53c8xx_2: iterator underflow in sym_getsync()
scsi: bnx2fc: Fix check in SCSI completion handler for timed out request
scsi: csiostor: remove redundant assignment to pointer 'ln'
scsi: ufs: Enable quirk to ignore sending WRITE_SAME command
scsi: ibmvfc: fix misdefined reserved field in ibmvfc_fcp_rsp_info
scsi: qla2xxx: Fix memory corruption during hba reset test
scsi: mpt3sas: fix an out of bound write
The DCPU can now send message data in two ways:
- via the data RAM, as before (this is now message type 0)
- via the message RAM (this is message type 1)
In order to support both methods, we check the message type of the
response (bits 31:28) and then treat the offset (bits 27:0)
accordingly.
Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
In some functions, variable "ret" should be ssize_t, so we fix it.
Fixes: 2f330caff5 ("memory: brcmstb: Add driver for DPFE")
Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
We were printing the entire 32 bit register rather than just the lower
8 bits. Anything above bit 7 is reserved and may be any random value.
Fixes: 2f330caff5 ("memory: brcmstb: Add driver for DPFE")
Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Pull drm fixes from Dave Airlie:
"A bunch of fixes for rc3:
Exynos:
- fixes for using monotonic timestamps
- register definitions
- removal of unused file
ipu-v3L
- minor changes
- make some register arrays const+static
- fix some leaks
meson:
- fix for vsync
atomic:
- fix for memory leak
EDID parser:
- add quirks for some more non-desktop devices
- 6-bit panel fix.
drm_mm:
- fix a bug in the core drm mm hole handling
cirrus:
- fix lut loading regression
Lastly there is a deadlock fix around runtime suspend for secondary
GPUs.
There was a deadlock between one thread trying to wait for a workqueue
job to finish in the runtime suspend path, and the workqueue job it
was waiting for in turn waiting for a runtime_get_sync to return.
The fixes avoids it by not doing the runtime sync in the workqueue as
then we always wait for all those tasks to complete before we runtime
suspend"
* tag 'drm-fixes-for-v4.16-rc3' of git://people.freedesktop.org/~airlied/linux: (25 commits)
drm/tve200: fix kernel-doc documentation comment include
drm/edid: quirk Sony PlayStation VR headset as non-desktop
drm/edid: quirk Windows Mixed Reality headsets as non-desktop
drm/edid: quirk Oculus Rift headsets as non-desktop
drm/meson: fix vsync buffer update
drm: Handle unexpected holes in color-eviction
drm: exynos: Use proper macro definition for HDMI_I2S_PIN_SEL_1
drm/exynos: remove exynos_drm_rotator.h
drm/exynos: g2d: Delete an error message for a failed memory allocation in two functions
drm/exynos: fix comparison to bitshift when dealing with a mask
drm/exynos: g2d: use monotonic timestamps
drm/edid: Add 6 bpc quirk for CPT panel in Asus UX303LA
gpu: ipu-csi: add 10/12-bit grayscale support to mbus_code_to_bus_cfg
gpu: ipu-cpmem: add 16-bit grayscale support to ipu_cpmem_set_image
gpu: ipu-v3: prg: fix device node leak in ipu_prg_lookup_by_phandle
gpu: ipu-v3: pre: fix device node leak in ipu_pre_lookup_by_phandle
drm/amdgpu: Fix deadlock on runtime suspend
drm/radeon: Fix deadlock on runtime suspend
drm/nouveau: Fix deadlock on runtime suspend
drm: Allow determining if current task is output poll worker
...
KVM: s390: fixes for multiple epoch facility
We have certain cases where the multiple epoch facility is broken:
- timer wakeup during epoch change
- cpu hotplug
- SCK instruction
- stp sync checks
Fix those.
KVM/ARM Fixes for v4.16, Round 1
Fix the interaction of userspace irqchip VMs with in-kernl irqchip VMs
and make sure we can build 32-bit KVM/ARM with gcc-8.
pfifo_fast got percpu stats lately, uncovering a bug I introduced last
year in linux-4.10.
I missed the fact that we have to clear our temporary storage
before calling __gnet_stats_copy_basic() in the case of percpu stats.
Without this fix, rate estimators (tc qd replace dev xxx root est 1sec
4sec pfifo_fast) are utterly broken.
Fixes: 1c0d32fde5 ("net_sched: gen_estimator: complete rewrite of rate estimators")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf 2018-02-22
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) two urgent fixes for bpf_tail_call logic for x64 and arm64 JITs, from Daniel.
2) cond_resched points in percpu array alloc/free paths, from Eric.
3) lockdep and other minor fixes, from Yonghong, Arnd, Anders, Li.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, buffer descriptors containing only the frame check sequence
(FCS) were skipped and not added to the skb. However, the page reference
count was still incremented, leading to a memory leak.
Fixing this inside gfar_add_rx_frag() is difficult due to reserved
memory handling and page reuse. Instead, move the FCS handling to
gfar_process_frame() and trim off the FCS before passing the skb up the
networking stack.
Signed-off-by: Andy Spencer <aspencer@spacex.com>
Signed-off-by: Jim Gruen <jgruen@spacex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a large BPF percpu map is destroyed, I have seen
pcpu_balance_workfn() holding cpu for hundreds of milliseconds.
On KASAN config and 112 hyperthreads, average time to destroy a chunk
is ~4 ms.
[ 2489.841376] destroy chunk 1 in 4148689 ns
...
[ 2490.093428] destroy chunk 32 in 4072718 ns
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
As pointed by Dan, possible values for bits[3:0] of te Line Mode Registers
can range from 0x0 to 0xf, but the check logic allow values ranging
from 0x0 to 0xe.
As static arrays are initialized with zero, using a value without
an explicit initializer at the array won't cause any harm.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Now that we have support for a buffer counter and for
error flags, update them at DMX_DQBUF.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
gcc-8 has a new warning that detects overlapping input and output arguments
in memcpy(). It triggers for sit_init_net() calling ipip6_tunnel_clone_6rd(),
which is actually correct:
net/ipv6/sit.c: In function 'sit_init_net':
net/ipv6/sit.c:192:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
The problem here is that the logic detecting the memcpy() arguments finds them
to be the same, but the conditional that tests for the input and output of
ipip6_tunnel_clone_6rd() to be identical is not a compile-time constant.
We know that netdev_priv(t->dev) is the same as t for a tunnel device,
and comparing "dev" directly here lets the compiler figure out as well
that 'dev == sitn->fb_tunnel_dev' when called from sit_init_net(), so
it no longer warns.
This code is old, so Cc stable to make sure that we don't get the warning
for older kernels built with new gcc.
Cc: Martin Sebor <msebor@gmail.com>
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83456
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following use-after-free was reported by KASan when running
LTP macvtap01 test on 4.16-rc2:
[10642.528443] BUG: KASAN: use-after-free in
macvlan_common_newlink+0x12ef/0x14a0 [macvlan]
[10642.626607] Read of size 8 at addr ffff880ba49f2100 by task ip/18450
...
[10642.963873] Call Trace:
[10642.994352] dump_stack+0x5c/0x7c
[10643.035325] print_address_description+0x75/0x290
[10643.092938] kasan_report+0x28d/0x390
[10643.137971] ? macvlan_common_newlink+0x12ef/0x14a0 [macvlan]
[10643.207963] macvlan_common_newlink+0x12ef/0x14a0 [macvlan]
[10643.275978] macvtap_newlink+0x171/0x260 [macvtap]
[10643.334532] rtnl_newlink+0xd4f/0x1300
...
[10646.256176] Allocated by task 18450:
[10646.299964] kasan_kmalloc+0xa6/0xd0
[10646.343746] kmem_cache_alloc_trace+0xf1/0x210
[10646.397826] macvlan_common_newlink+0x6de/0x14a0 [macvlan]
[10646.464386] macvtap_newlink+0x171/0x260 [macvtap]
[10646.522728] rtnl_newlink+0xd4f/0x1300
...
[10647.022028] Freed by task 18450:
[10647.061549] __kasan_slab_free+0x138/0x180
[10647.111468] kfree+0x9e/0x1c0
[10647.147869] macvlan_port_destroy+0x3db/0x650 [macvlan]
[10647.211411] rollback_registered_many+0x5b9/0xb10
[10647.268715] rollback_registered+0xd9/0x190
[10647.319675] register_netdevice+0x8eb/0xc70
[10647.370635] macvlan_common_newlink+0xe58/0x14a0 [macvlan]
[10647.437195] macvtap_newlink+0x171/0x260 [macvtap]
Commit d02fd6e7d2 ("macvlan: Fix one possible double free") handles
the case when register_netdevice() invokes ndo_uninit() on error and
as a result free the port. But 'macvlan_port_get_rtnl(dev))' check
(returns dev->rx_handler_data), which was added by this commit in order
to prevent double free, is not quite correct:
* for macvlan it always returns NULL because 'lowerdev' is the one that
was used to register rx handler (port) in macvlan_port_create() as
well as to unregister it in macvlan_port_destroy().
* for macvtap it always returns a valid pointer because macvtap registers
its own rx handler before macvlan_common_newlink().
Fixes: d02fd6e7d2 ("macvlan: Fix one possible double free")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
do_task_stat() calls get_wchan(), which further does unwind_frame().
unwind_frame() restores frame->pc to original value in case function
graph tracer has modified a return address (LR) in a stack frame to hook
a function return. However, if function graph tracer has hit a filtered
function, then we can't unwind it as ftrace_push_return_trace() has
biased the index(frame->graph) with a 'huge negative'
offset(-FTRACE_NOTRACE_DEPTH).
Moreover, arm64 stack walker defines index(frame->graph) as unsigned
int, which can not compare a -ve number.
Similar problem we can have with calling of walk_stackframe() from
save_stack_trace_tsk() or dump_backtrace().
This patch fixes unwind_frame() to test the index for -ve value and
restore index accordingly before we can restore frame->pc.
Reproducer:
cd /sys/kernel/debug/tracing/
echo schedule > set_graph_notrace
echo 1 > options/display-graph
echo wakeup > current_tracer
ps -ef | grep -i agent
Above commands result in:
Unable to handle kernel paging request at virtual address ffff801bd3d1e000
pgd = ffff8003cbe97c00
[ffff801bd3d1e000] *pgd=0000000000000000, *pud=0000000000000000
Internal error: Oops: 96000006 [#1] SMP
[...]
CPU: 5 PID: 11696 Comm: ps Not tainted 4.11.0+ #33
[...]
task: ffff8003c21ba000 task.stack: ffff8003cc6c0000
PC is at unwind_frame+0x12c/0x180
LR is at get_wchan+0xd4/0x134
pc : [<ffff00000808892c>] lr : [<ffff0000080860b8>] pstate: 60000145
sp : ffff8003cc6c3ab0
x29: ffff8003cc6c3ab0 x28: 0000000000000001
x27: 0000000000000026 x26: 0000000000000026
x25: 00000000000012d8 x24: 0000000000000000
x23: ffff8003c1c04000 x22: ffff000008c83000
x21: ffff8003c1c00000 x20: 000000000000000f
x19: ffff8003c1bc0000 x18: 0000fffffc593690
x17: 0000000000000000 x16: 0000000000000001
x15: 0000b855670e2b60 x14: 0003e97f22cf1d0f
x13: 0000000000000001 x12: 0000000000000000
x11: 00000000e8f4883e x10: 0000000154f47ec8
x9 : 0000000070f367c0 x8 : 0000000000000000
x7 : 00008003f7290000 x6 : 0000000000000018
x5 : 0000000000000000 x4 : ffff8003c1c03cb0
x3 : ffff8003c1c03ca0 x2 : 00000017ffe80000
x1 : ffff8003cc6c3af8 x0 : ffff8003d3e9e000
Process ps (pid: 11696, stack limit = 0xffff8003cc6c0000)
Stack: (0xffff8003cc6c3ab0 to 0xffff8003cc6c4000)
[...]
[<ffff00000808892c>] unwind_frame+0x12c/0x180
[<ffff000008305008>] do_task_stat+0x864/0x870
[<ffff000008305c44>] proc_tgid_stat+0x3c/0x48
[<ffff0000082fde0c>] proc_single_show+0x5c/0xb8
[<ffff0000082b27e0>] seq_read+0x160/0x414
[<ffff000008289e6c>] __vfs_read+0x58/0x164
[<ffff00000828b164>] vfs_read+0x88/0x144
[<ffff00000828c2e8>] SyS_read+0x60/0xc0
[<ffff0000080834a0>] __sys_trace_return+0x0/0x4
Fixes: 20380bb390 (arm64: ftrace: fix a stack tracer's output under function graph tracer)
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
[catalin.marinas@arm.com: replace WARN_ON with WARN_ON_ONCE]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Nouveau only exposes support for XBGR2101010. Prior to the atomic
conversion, drm would pass in the wrong format in the framebuffer, but
it was always ignored -- both userspace (xf86-video-nouveau) and the
kernel driver agreed on the layout, so the fact that the format was
wrong didn't matter.
With the atomic conversion, nouveau all of a sudden started caring about
the exact format, and so the previously-working code in
xf86-video-nouveau no longer functioned since the (internally-assigned)
format from the addfb ioctl was wrong.
This change adds infrastructure to allow a drm driver to specify that it
prefers the XBGR format variant for the addfb ioctl, and makes nouveau's
nv50 display driver set it. (Prior gens had no support for 30bpp at all.)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: stable@vger.kernel.org # v4.10+
Acked-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180203191123.31507-1-imirkin@alum.mit.edu
Fixes rx for 4-addr packets in AP mode. These may be used for setting
up a 4-addr link for stations that are allowed to do so.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Since we now have the convenient helper to do so, actually adjust the
TSQ pacing shift for packets going out over a WiFi interface. This
significantly improves throughput for locally-originated TCP
connections. The default pacing shift of 10 corresponds to ~1ms of
queued packet data. Adjusting this to a shift of 8 (i.e. ~4ms) improves
1-hop throughput for ath9k by a factor of 3, whereas increasing it more
has diminishing returns.
Achieved throughput for different values of sk_pacing_shift (average of
5 iterations of 10-sec netperf runs to a host on the other side of the
WiFi hop):
sk_pacing_shift 10: 43.21 Mbps (pre-patch)
sk_pacing_shift 9: 78.17 Mbps
sk_pacing_shift 8: 123.94 Mbps
sk_pacing_shift 7: 128.31 Mbps
Latency for competing flows increases from ~3 ms to ~10 ms with this
change. This is about the same magnitude of queueing latency induced by
flows that are not originated on the WiFi device itself (and so are not
limited by TSQ).
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
While userspace can detect discontinuity errors, it is useful to
also let Kernelspace reporting discontinuity, as it can help to
identify if the data loss happened either at Kernel or userspace side.
Update documentation accordingly.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Some conditions required for DVB mmap support to work are reversed.
Also, the logic is not too clear.
So, improve the logic, making it easier to be handled.
PS.: I'm pretty sure that I fixed it while testing, but, somehow,
the change got lost.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Returning -EINVAL when an ioctl is not implemented is a very
bad idea, as it is hard to distinguish from other error
contitions that an ioctl could lead. Replace it by its
right error code: -ENOTTY.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
After the move of videobuf2 into the common directory, selecting the
au0828 driver with CONFIG_V4L2 disabled started causing a link failure,
as we now attempt to build videobuf2 but it still requires v4l2:
ERROR: "v4l2_event_pending" [drivers/media/common/videobuf/videobuf2-v4l2.ko] undefined!
ERROR: "v4l2_fh_release" [drivers/media/common/videobuf/videobuf2-v4l2.ko] undefined!
ERROR: "video_devdata" [drivers/media/common/videobuf/videobuf2-v4l2.ko] undefined!
ERROR: "__tracepoint_vb2_buf_done" [drivers/media/common/videobuf/videobuf2-core.ko] undefined!
ERROR: "__tracepoint_vb2_dqbuf" [drivers/media/common/videobuf/videobuf2-core.ko] undefined!
ERROR: "v4l_vb2q_enable_media_source" [drivers/media/common/videobuf/videobuf2-core.ko] undefined!
This adds the same dependency in au0828 that the other users of videobuf2
have.
Fixes: 03fbdb2fc2 ("media: move videobuf2 to drivers/media/common")
Fixes: 05439b1a36 ("[media] media: au0828 - convert to use videobuf2")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Enabling CONFIG_DVB_MMAP without CONFIG_VIDEOBUF2_VMALLOC results
in a link error:
drivers/media/dvb-core/dvb_vb2.o: In function `_stop_streaming':
dvb_vb2.c:(.text+0x894): undefined reference to `vb2_buffer_done'
drivers/media/dvb-core/dvb_vb2.o: In function `dvb_vb2_init':
dvb_vb2.c:(.text+0xbec): undefined reference to `vb2_vmalloc_memops'
dvb_vb2.c:(.text+0xc4c): undefined reference to `vb2_core_queue_init'
drivers/media/dvb-core/dvb_vb2.o: In function `dvb_vb2_release':
dvb_vb2.c:(.text+0xe14): undefined reference to `vb2_core_queue_release'
drivers/media/dvb-core/dvb_vb2.o: In function `dvb_vb2_stream_on':
dvb_vb2.c:(.text+0xeb8): undefined reference to `vb2_core_streamon'
drivers/media/dvb-core/dvb_vb2.o: In function `dvb_vb2_stream_off':
dvb_vb2.c:(.text+0xfe8): undefined reference to `vb2_core_streamoff'
drivers/media/dvb-core/dvb_vb2.o: In function `dvb_vb2_fill_buffer':
dvb_vb2.c:(.text+0x13ec): undefined reference to `vb2_plane_vaddr'
dvb_vb2.c:(.text+0x149c): undefined reference to `vb2_buffer_done'
This adds a 'select' statement for it, plus a dependency that
ensures that videobuf2 in turn works, as it in turn depends on
VIDEO_V4L2 to link, and that must not be a module if videobuf2
is built-in.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
CONFIG_DVB_MMAP was misspelled either as CONFIG_DVB_MMSP
or DVB_MMAP, so it had no effect at all. This fixes that,
to make it possible to build it again.
Fixes: 4021053ed5 ("media: dvb-core: make DVB mmap API optional")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
There was a trouble with vb2-trace: instead of being part of
VB2 core, it was stored at V4L2 videodev. That was wrong,
as it doesn't actually belong to V4L2 core.
Now that vb2 is not part of v4l2-core, its trace functions
should be moved altogether. So, move it to its rightful
place: at videobuf2-core.
That fixes those errors:
drivers/media/common/videobuf2/videobuf2-core.o: In function `__read_once_size':
./include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_buf_queue'
./include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_buf_queue'
./include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_buf_done'
./include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_buf_done'
./include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_qbuf'
./include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_qbuf'
./include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_dqbuf'
./include/linux/compiler.h:183: undefined reference to `__tracepoint_vb2_dqbuf'
drivers/media/common/videobuf2/videobuf2-core.o:(__jump_table+0x10): undefined reference to `__tracepoint_vb2_buf_queue'
drivers/media/common/videobuf2/videobuf2-core.o:(__jump_table+0x28): undefined reference to `__tracepoint_vb2_buf_done'
drivers/media/common/videobuf2/videobuf2-core.o:(__jump_table+0x40): undefined reference to `__tracepoint_vb2_qbuf'
drivers/media/common/videobuf2/videobuf2-core.o:(__jump_table+0x58): undefined reference to `__tracepoint_vb2_dqbuf'
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Videobuf2 is now separate from V4L2 and can be now built without it, at
least in principle --- enabling videobuf2 in kernel configuration attempts
to compile videobuf2-v4l2.c but that will fail if CONFIG_VIDEO_V4L2 isn't
enabled.
Solve this by adding a separate Kconfig option for videobuf2-v4l2 and make
it a separate module as well. This means that drivers now need to choose
both the appropriate videobuf2 memory type
(VIDEOBUF2_{VMALLOC,DMA_CONTIG,DMA_SG}) and VIDEOBUF2_V4L2 if they need
both.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
When a irq vector is replaced, then the previous vector is normally
released when the first interrupt happens on the new vector. If the target
CPU of the previous vector is already offline when the new vector is
installed, then the previous vector is silently discarded, which leads to
accounting issues causing suspend failures and other problems.
Adjust the logic so that the previous vector is freed in the underlying
matrix allocator to ensure that the accounting stays correct.
Fixes: 69cde0004a ("x86/vector: Use matrix allocator for vector assignment")
Reported-by: Yuriy Vostrikov <delamonpansie@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Yuriy Vostrikov <delamonpansie@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180222112316.930791749@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Some versions of firmware will have a setting that can be configured
to disable the RFI flush, add support for it.
Fixes: 6e032b350c ("powerpc/powernv: Check device-tree for RFI flush settings")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Some versions of firmware will have a setting that can be configured
to disable the RFI flush, add support for it.
Fixes: 8989d56878 ("powerpc/pseries: Query hypervisor for RFI flush settings")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Memory addtion and removal by count and indexed-count methods
temporarily mark the LMBs that are being added/removed by a special
flag value DRMEM_LMB_RESERVED. Accessing flags value directly at a few
places without proper accessor method is causing two unexpected
side-effects:
- DRMEM_LMB_RESERVED bit is becoming part of the flags word of
drconf_cell_v2 entries in ibm,dynamic-memory-v2 DT property.
- This results in extra drconf_cell entries in ibm,dynamic-memory-v2.
For example if 1G memory is added, it leads to one entry for 3 LMBs
and 1 separate entry for the last LMB. All the 4 LMBs should be
defined by one entry here.
Fix this by always accessing the flags by its accessor method
drmem_lmb_flags().
Fixes: 2b31e3aec1 ("powerpc/drmem: Add support for ibm, dynamic-memory-v2 property")
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In radeon_device_init, set the need_dma32 flag for Cedar chips
(e.g. FirePro 2270). This fixes, or at least works around, a bug
on PowerPC exposed by last year's commits
8e3f1b1d82 (Russell Currey)
and
253fd51e2f (Alistair Popple)
which enabled the 64-bit DMA iommu bypass.
This caused the device to freeze, in some cases unrecoverably, and is
the subject of several bug reports internal to Red Hat.
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
drm/imx: ipu-v3 fixups and grayscale support
- Make const interrupt register arrays static, reduces object size.
- Fix device_node leaks in PRE/PRG phandle lookup functions.
- Add 8-bit and 16-bit grayscale buffer support to ipu_cpmem_set_image,
- add 10-bit and 12-bit grayscale media bus support to ipu-csi,
to be used by the imx-media driver.
* tag 'imx-drm-next-2018-02-22' of git://git.pengutronix.de/git/pza/linux:
gpu: ipu-csi: add 10/12-bit grayscale support to mbus_code_to_bus_cfg
gpu: ipu-cpmem: add 16-bit grayscale support to ipu_cpmem_set_image
gpu: ipu-v3: prg: fix device node leak in ipu_prg_lookup_by_phandle
gpu: ipu-v3: pre: fix device node leak in ipu_pre_lookup_by_phandle
gpu: ipu-cpmem: add 8-bit grayscale support to ipu_cpmem_set_image
gpu: ipu-v3: make const arrays int_reg static, shrinks object size
The MIPS %.its.S compiler command did not define __ASSEMBLY__, which meant
when compiler_types.h was added to kconfig.h, unexpected things appeared
(e.g. struct declarations) which should not have been present. As done in
the general %.S compiler command, __ASSEMBLY__ is now included here too.
The failure was:
Error: arch/mips/boot/vmlinux.gz.its:201.1-2 syntax error
FATAL ERROR: Unable to parse input tree
/usr/bin/mkimage: Can't read arch/mips/boot/vmlinux.gz.itb.tmp: Invalid argument
/usr/bin/mkimage Can't add hashes to FIT blob
Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 28128c61e0 ("kconfig.h: Include compiler types to avoid missed struct attributes")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull siginfo fix from Eric Biederman:
"This fixes a build error that only shows up on blackfin"
* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
fs/signalfd: fix build error for BUS_MCEERR_AR
Pull crypto fix from Herbert Xu:
"Fix an oops in the s5p-sss driver when used with ecb(aes)"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: s5p-sss - Fix kernel Oops in AES-ECB mode
I recently noticed a crash on arm64 when feeding a bogus index
into BPF tail call helper. The crash would not occur when the
interpreter is used, but only in case of JIT. Output looks as
follows:
[ 347.007486] Unable to handle kernel paging request at virtual address fffb850e96492510
[...]
[ 347.043065] [fffb850e96492510] address between user and kernel address ranges
[ 347.050205] Internal error: Oops: 96000004 [#1] SMP
[...]
[ 347.190829] x13: 0000000000000000 x12: 0000000000000000
[ 347.196128] x11: fffc047ebe782800 x10: ffff808fd7d0fd10
[ 347.201427] x9 : 0000000000000000 x8 : 0000000000000000
[ 347.206726] x7 : 0000000000000000 x6 : 001c991738000000
[ 347.212025] x5 : 0000000000000018 x4 : 000000000000ba5a
[ 347.217325] x3 : 00000000000329c4 x2 : ffff808fd7cf0500
[ 347.222625] x1 : ffff808fd7d0fc00 x0 : ffff808fd7cf0500
[ 347.227926] Process test_verifier (pid: 4548, stack limit = 0x000000007467fa61)
[ 347.235221] Call trace:
[ 347.237656] 0xffff000002f3a4fc
[ 347.240784] bpf_test_run+0x78/0xf8
[ 347.244260] bpf_prog_test_run_skb+0x148/0x230
[ 347.248694] SyS_bpf+0x77c/0x1110
[ 347.251999] el0_svc_naked+0x30/0x34
[ 347.255564] Code: 9100075a d280220a 8b0a002a d37df04b (f86b694b)
[...]
In this case the index used in BPF r3 is the same as in r1
at the time of the call, meaning we fed a pointer as index;
here, it had the value 0xffff808fd7cf0500 which sits in x2.
While I found tail calls to be working in general (also for
hitting the error cases), I noticed the following in the code
emission:
# bpftool p d j i 988
[...]
38: ldr w10, [x1,x10]
3c: cmp w2, w10
40: b.ge 0x000000000000007c <-- signed cmp
44: mov x10, #0x20 // #32
48: cmp x26, x10
4c: b.gt 0x000000000000007c
50: add x26, x26, #0x1
54: mov x10, #0x110 // #272
58: add x10, x1, x10
5c: lsl x11, x2, #3
60: ldr x11, [x10,x11] <-- faulting insn (f86b694b)
64: cbz x11, 0x000000000000007c
[...]
Meaning, the tests passed because commit ddb55992b0 ("arm64:
bpf: implement bpf_tail_call() helper") was using signed compares
instead of unsigned which as a result had the test wrongly passing.
Change this but also the tail call count test both into unsigned
and cap the index as u32. Latter we did as well in 90caccdd8c
("bpf: fix bpf_tail_call() x64 JIT") and is needed in addition here,
too. Tested on HiSilicon Hi1616.
Result after patch:
# bpftool p d j i 268
[...]
38: ldr w10, [x1,x10]
3c: add w2, w2, #0x0
40: cmp w2, w10
44: b.cs 0x0000000000000080
48: mov x10, #0x20 // #32
4c: cmp x26, x10
50: b.hi 0x0000000000000080
54: add x26, x26, #0x1
58: mov x10, #0x110 // #272
5c: add x10, x1, x10
60: lsl x11, x2, #3
64: ldr x11, [x10,x11]
68: cbz x11, 0x0000000000000080
[...]
Fixes: ddb55992b0 ("arm64: bpf: implement bpf_tail_call() helper")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Implement a retpoline [0] for the BPF tail call JIT'ing that converts
the indirect jump via jmp %rax that is used to make the long jump into
another JITed BPF image. Since this is subject to speculative execution,
we need to control the transient instruction sequence here as well
when CONFIG_RETPOLINE is set, and direct it into a pause + lfence loop.
The latter aligns also with what gcc / clang emits (e.g. [1]).
JIT dump after patch:
# bpftool p d x i 1
0: (18) r2 = map[id:1]
2: (b7) r3 = 0
3: (85) call bpf_tail_call#12
4: (b7) r0 = 2
5: (95) exit
With CONFIG_RETPOLINE:
# bpftool p d j i 1
[...]
33: cmp %edx,0x24(%rsi)
36: jbe 0x0000000000000072 |*
38: mov 0x24(%rbp),%eax
3e: cmp $0x20,%eax
41: ja 0x0000000000000072 |
43: add $0x1,%eax
46: mov %eax,0x24(%rbp)
4c: mov 0x90(%rsi,%rdx,8),%rax
54: test %rax,%rax
57: je 0x0000000000000072 |
59: mov 0x28(%rax),%rax
5d: add $0x25,%rax
61: callq 0x000000000000006d |+
66: pause |
68: lfence |
6b: jmp 0x0000000000000066 |
6d: mov %rax,(%rsp) |
71: retq |
72: mov $0x2,%eax
[...]
* relative fall-through jumps in error case
+ retpoline for indirect jump
Without CONFIG_RETPOLINE:
# bpftool p d j i 1
[...]
33: cmp %edx,0x24(%rsi)
36: jbe 0x0000000000000063 |*
38: mov 0x24(%rbp),%eax
3e: cmp $0x20,%eax
41: ja 0x0000000000000063 |
43: add $0x1,%eax
46: mov %eax,0x24(%rbp)
4c: mov 0x90(%rsi,%rdx,8),%rax
54: test %rax,%rax
57: je 0x0000000000000063 |
59: mov 0x28(%rax),%rax
5d: add $0x25,%rax
61: jmpq *%rax |-
63: mov $0x2,%eax
[...]
* relative fall-through jumps in error case
- plain indirect jump as before
[0] https://support.google.com/faqs/answer/7625886
[1] a31e654fa1
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Marty reported a memory leakage introduced by commit 3aaabbf1c3
("lib/dma-debug.c: fix incorrect pfn calculation"). Fix it
by checking the virtual address before allocating the entry.
This patch also use virt_addr_valid() instead of virt_to_page()
to check if a virtual address is linear.
Fixes: 3aaabbf1 ("lib/dma-debug.c: fix incorrect pfn calculation")
Reported-by: Marty Faltesek <mfaltesek@google.com>
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
During log recovery, the per-AG reservations aren't yet set up, so log
recovery has to reserve enough blocks to handle all possible btree
splits.
Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Apparently different gcc versions have competing and
incompatible notions of how to initialize at declaration,
so just give up and fall back to the time-tested memset().
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Improve the DTS files by removing all the leading "0x" and zeros to fix the
following dtc warnings:
Warning (unit_address_format): Node /XXX unit name should not have leading "0x"
and
Warning (unit_address_format): Node /XXX unit name should not have leading 0s
Converted using the following command:
find . -type f \( -iname *.dts -o -iname *.dtsi \) -exec sed -i -e "s/@\([0-9a-fA-FxX\.;:#]+\)\s*{/@\L\1 {/g" -e "s/@0x\(.*\) {/@\1 {/g" -e "s/@0+\(.*\) {/@\1 {/g" {} +^C
For simplicity, two sed expressions were used to solve each warnings separately.
To make the regex expression more robust a few other issues were resolved,
namely setting unit-address to lower case, and adding a whitespace before the
the opening curly brace:
https://elinux.org/Device_Tree_Linux#Linux_conventions
This will solve as a side effect warning:
Warning (simple_bus_reg): Node /XXX@<UPPER> simple-bus unit address format error, expected "<lower>"
This is a follow up to commit 4c9847b737 ("dt-bindings: Remove leading 0x from bindings notation")
Reported-by: David Daney <ddaney@caviumnetworks.com>
Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Acked-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
At CPU hotunplug the corresponding per cpu matrix allocator is shut down and
the allocated interrupt bits are discarded under the assumption that all
allocated bits have been either migrated away or shut down through the
managed interrupts mechanism.
This is not true because interrupts which are not started up might have a
vector allocated on the outgoing CPU. When the interrupt is started up
later or completely shutdown and freed then the allocated vector is handed
back, triggering warnings or causing accounting issues which result in
suspend failures and other issues.
Change the CPU hotplug mechanism of the matrix allocator so that the
remaining allocations at unplug time are preserved and global accounting at
hotplug is correctly readjusted to take the dormant vectors into account.
Fixes: 2f75d9e1c9 ("genirq: Implement bitmap matrix allocator")
Reported-by: Yuriy Vostrikov <delamonpansie@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Yuriy Vostrikov <delamonpansie@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180222112316.849980972@linutronix.de
Fix build error in fs/signalfd.c by using same method that is used in
kernel/signal.c: separate blocks for different signal si_code values.
./fs/signalfd.c: error: 'BUS_MCEERR_AR' undeclared (first use in this function)
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
All the kernel_sendmsg() calls in rxrpc_send_data_packet() need to send
both parts of the iov[] buffer, but one of them does not. Fix it so that
it does.
Without this, short IPv6 rxrpc DATA packets may be seen that have the rxrpc
header included, but no payload.
Fixes: 5a924b8951 ("rxrpc: Don't store the rxrpc header in the Tx queue sk_buffs")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should check "self->aq_hw" for allocation failure, and also we should
free it on the error paths.
Fixes: 23ee07ad3c ("net: aquantia: Cleanup pci functions module")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 9a3efb6b66 ("bpf: fix memory leak in lpm_trie map_free callback function")
fixed a memory leak and removed unnecessary locks in map_free callback function.
Unfortrunately, it introduced a lockdep warning. When lockdep checking is turned on,
running tools/testing/selftests/bpf/test_lpm_map will have:
[ 98.294321] =============================
[ 98.294807] WARNING: suspicious RCU usage
[ 98.295359] 4.16.0-rc2+ #193 Not tainted
[ 98.295907] -----------------------------
[ 98.296486] /home/yhs/work/bpf/kernel/bpf/lpm_trie.c:572 suspicious rcu_dereference_check() usage!
[ 98.297657]
[ 98.297657] other info that might help us debug this:
[ 98.297657]
[ 98.298663]
[ 98.298663] rcu_scheduler_active = 2, debug_locks = 1
[ 98.299536] 2 locks held by kworker/2:1/54:
[ 98.300152] #0: ((wq_completion)"events"){+.+.}, at: [<00000000196bc1f0>] process_one_work+0x157/0x5c0
[ 98.301381] #1: ((work_completion)(&map->work)){+.+.}, at: [<00000000196bc1f0>] process_one_work+0x157/0x5c0
Since actual trie tree removal happens only after no other
accesses to the tree are possible, replacing
rcu_dereference_protected(*slot, lockdep_is_held(&trie->lock))
with
rcu_dereference_protected(*slot, 1)
fixed the issue.
Fixes: 9a3efb6b66 ("bpf: fix memory leak in lpm_trie map_free callback function")
Reported-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
syszbot managed to trigger RCU detected stalls in
bpf_array_free_percpu()
It takes time to allocate a huge percpu map, but even more time to free
it.
Since we run in process context, use cond_resched() to yield cpu if
needed.
Fixes: a10423b87a ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Johannes Berg says:
====================
Various fixes across the tree, the shortlog basically says it all:
cfg80211: fix cfg80211_beacon_dup
-> old bug in this code
cfg80211: clear wep keys after disconnection
-> certain ways of disconnecting left the keys
mac80211: round IEEE80211_TX_STATUS_HEADROOM up to multiple of 4
-> alignment issues with using 14 bytes
mac80211: Do not disconnect on invalid operating class
-> if the AP has a bogus operating class, let it be
mac80211: Fix sending ADDBA response for an ongoing session
-> don't send the same frame twice
cfg80211: use only 1Mbps for basic rates in mesh
-> interop issue with old versions of our code
mac80211_hwsim: don't use WQ_MEM_RECLAIM
-> it causes splats because it flushes work on a non-reclaim WQ
regulatory: add NUL to request alpha2
-> nla_put_string() issue from Kees
mac80211: mesh: fix wrong mesh TTL offset calculation
-> protocol issue
mac80211: fix a possible leak of station stats
-> error path might leak memory
mac80211: fix calling sleeping function in atomic context
-> percpu allocations need to be made with gfp flags
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull USB fixes from Greg KH:
"Here are a number of USB fixes for 4.16-rc3
Nothing major, but a number of different fixes all over the place in
the USB stack for reported issues. Mostly gadget driver fixes,
although the typical set of xhci bugfixes are there, along with some
new quirks additions as well.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-4.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (39 commits)
Revert "usb: musb: host: don't start next rx urb if current one failed"
usb: musb: fix enumeration after resume
usb: cdc_acm: prevent race at write to acm while system resumes
Add delay-init quirk for Corsair K70 RGB keyboards
usb: ohci: Proper handling of ed_rm_list to handle race condition between usb_kill_urb() and finish_unlinks()
usb: host: ehci: always enable interrupt for qtd completion at test mode
usb: ldusb: add PIDs for new CASSY devices supported by this driver
usb: renesas_usbhs: missed the "running" flag in usb_dmac with rx path
usb: host: ehci: use correct device pointer for dma ops
usbip: keep usbip_device sockfd state in sync with tcp_socket
ohci-hcd: Fix race condition caused by ohci_urb_enqueue() and io_watchdog_func()
USB: serial: option: Add support for Quectel EP06
xhci: fix xhci debugfs errors in xhci_stop
xhci: xhci debugfs device nodes weren't removed after device plugged out
xhci: Fix xhci debugfs devices node disappearance after hibernation
xhci: Fix NULL pointer in xhci debugfs
xhci: Don't print a warning when setting link state for disabled ports
xhci: workaround for AMD Promontory disabled ports wakeup
usb: dwc3: core: Fix ULPI PHYs and prevent phy_get/ulpi_init during suspend/resume
USB: gadget: udc: Add missing platform_device_put() on error in bdc_pci_probe()
...
Pull staging/IIO fixes from Greg KH:
"Here are a small number of staging and iio driver fixes for 4.16-rc2.
The IIO fixes are all for reported things, and the android driver
fixes also resolve some reported problems. The remaining fsl-mc
Kconfig change resolves a build testing error that Arnd reported.
All of these have been in linux-next with no reported issues"
* tag 'staging-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: buffer: check if a buffer has been set up when poll is called
iio: adis_lib: Initialize trigger before requesting interrupt
staging: android: ion: Zero CMA allocated memory
staging: android: ashmem: Fix a race condition in pin ioctls
staging: fsl-mc: fix build testing on x86
iio: srf08: fix link error "devm_iio_triggered_buffer_setup" undefined
staging: iio: ad5933: switch buffer mode to software
iio: adc: stm32: fix stm32h7_adc_enable error handling
staging: iio: adc: ad7192: fix external frequency setting
iio: adc: aspeed: Fix error handling path
Pull char/misc driver fixes from Greg KH:
"Here are a handful of char/misc driver fixes for 4.16-rc3.
There are some binder driver fixes to resolve reported issues in
stress testing the recent binder changes, some extcon driver fixes,
and a few mei driver fixes and new device ids.
All of these, with the exception of the mei driver id additions, have
been in linux-next for a while. I forgot to push out the mei driver id
additions to kernel.org until today, but all build tests pass with
them enabled"
* tag 'char-misc-4.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
mei: me: add cannon point device ids for 4th device
mei: me: add cannon point device ids
mei: set device client to the disconnected state upon suspend.
ANDROID: binder: synchronize_rcu() when using POLLFREE.
binder: replace "%p" with "%pK"
ANDROID: binder: remove WARN() for redundant txn error
binder: check for binder_thread allocation failure in binder_poll()
extcon: int3496: process id-pin first so that we start with the right status
Revert "extcon: axp288: Redo charger type detection a couple of seconds after probe()"
extcon: axp288: Constify the axp288_pwr_up_down_info array
Similar to the ancient commit a5fe8e7695 ("regulatory: add NUL
to alpha2"), add another byte to alpha2 in the request struct so
that when we use nla_put_string(), we don't overrun anything.
Fixes: 73d54c9e74 ("cfg80211: add regulatory netlink multicast group")
Reported-by: Kees Cook <keescook@google.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Pull rdma fixes from Doug Ledford:
"Nothing in this is overly interesting, it's mostly your garden variety
fixes.
There was some work in this merge cycle around the new ioctl kABI, so
there are fixes in here related to that (probably with more to come).
We've also recently added new netlink support with a goal of moving
the primary means of configuring the entire subsystem to netlink
(eventually, this is a long term project), so there are fixes for
that.
Then a few bnxt_re driver fixes, and a few minor WARN_ON removals, and
that covers this pull request. There are already a few more fixes on
the list as of this morning, so there will certainly be more to come
in this rc cycle ;-)
Summary:
- Lots of fixes for the new IOCTL interface and general uverbs flow.
Found through testing and syzkaller
- Bugfixes for the new resource track netlink reporting
- Remove some unneeded WARN_ONs that were triggering for some users
in IPoIB
- Various fixes for the bnxt_re driver"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (27 commits)
RDMA/uverbs: Fix kernel panic while using XRC_TGT QP type
RDMA/bnxt_re: Avoid system hang during device un-reg
RDMA/bnxt_re: Fix system crash during load/unload
RDMA/bnxt_re: Synchronize destroy_qp with poll_cq
RDMA/bnxt_re: Unpin SQ and RQ memory if QP create fails
RDMA/bnxt_re: Disable atomic capability on bnxt_re adapters
RDMA/restrack: don't use uaccess_kernel()
RDMA/verbs: Check existence of function prior to accessing it
RDMA/vmw_pvrdma: Fix usage of user response structures in ABI file
RDMA/uverbs: Sanitize user entered port numbers prior to access it
RDMA/uverbs: Fix circular locking dependency
RDMA/uverbs: Fix bad unlock balance in ib_uverbs_close_xrcd
RDMA/restrack: Increment CQ restrack object before committing
RDMA/uverbs: Protect from command mask overflow
IB/uverbs: Fix unbalanced unlock on error path for rdma_explicit_destroy
IB/uverbs: Improve lockdep_check
RDMA/uverbs: Protect from races between lookup and destroy of uobjects
IB/uverbs: Hold the uobj write lock after allocate
IB/uverbs: Fix possible oops with duplicate ioctl attributes
IB/uverbs: Add ioctl support for 32bit processes
...
Pull RISC-V cleanups from Palmer Dabbelt:
"This contains a handful of small cleanups.
The only functional change is that IRQs are now enabled during
exception handling, which was found when some warnings triggered with
`CONFIG_DEBUG_ATOMIC_SLEEP=y`.
The remaining fixes should have no functional change: `sbi_save()` has
been renamed to `parse_dtb()` reflect what it actually does, and a
handful of unused Kconfig entries have been removed"
* tag 'riscv-for-linus-4.16-rc3-riscv_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
Rename sbi_save to parse_dtb to improve code readability
RISC-V: Enable IRQ during exception handling
riscv: Remove ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE select
riscv: kconfig: Remove RISCV_IRQ_INTC select
riscv: Remove ARCH_WANT_OPTIONAL_GPIOLIB select
The login buffer is released before the driver can perform
sanity checks between resources the driver requested and what
firmware will provide. Don't release the login buffer until
the sanity check is performed.
Fixes: 34f0f4e3f4 ("ibmvnic: Fix login buffer memory leaks")
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
AFAIK the only version of smc9194.c with Mac support is the one in the
linux-mac68k CVS repo, which never made it to the mainline.
Despite that, from v2.3.45, arch/m68k/config.in listed CONFIG_SMC9194
under CONFIG_MAC. This mistake got carried over into Kconfig in v2.5.55.
(See pre-git era "[PATCH] add m68k dependencies to net driver config".)
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The result of the skb flow dissect is copied from keys to hash_keys to
ensure only the intended data is hashed. The original L4 hash patch
overlooked setting the addr_type for this case; add it.
Fixes: bf4e0a3db9 ("net: ipv4: add support for ECMP hash policy choice")
Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
BBR uses tcp_tso_autosize() in an attempt to probe what would be the
burst sizes and to adjust cwnd in bbr_target_cwnd() with following
gold formula :
/* Allow enough full-sized skbs in flight to utilize end systems. */
cwnd += 3 * bbr->tso_segs_goal;
But GSO can be lacking or be constrained to very small
units (ip link set dev ... gso_max_segs 2)
What we really want is to have enough packets in flight so that both
GSO and GRO are efficient.
So in the case GSO is off or downgraded, we still want to have the same
number of packets in flight as if GSO/TSO was fully operational, so
that GRO can hopefully be working efficiently.
To fix this issue, we make tcp_tso_autosize() unaware of
sk->sk_gso_max_segs
Only tcp_tso_segs() has to enforce the gso_max_segs limit.
Tested:
ethtool -K eth0 tso off gso off
tc qd replace dev eth0 root pfifo_fast
Before patch:
for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done
691 (ss -temoi shows cwnd is stuck around 6 )
667
651
631
517
After patch :
# for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done
1733 (ss -temoi shows cwnd is around 386 )
1778
1746
1781
1718
Fixes: 0f8782ea14 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If an attempt is made to disable RX checksums, USB adapter is changed
but netdev->features is not, because smsc75xx_set_features() returns a
non zero value.
This throws errors from netdev_rx_csum_fault() :
<devname>: hw csum failure
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before, if cb->start() failed, the module reference would never be put,
because cb->cb_running is intentionally false at this point. Users are
generally annoyed by this because they can no longer unload modules that
leak references. Also, it may be possible to tediously wrap a reference
counter back to zero, especially since module.c still uses atomic_inc
instead of refcount_inc.
This patch expands the error path to simply call module_put if
cb->start() fails.
Fixes: 41c87425a1 ("netlink: do not set cb_running if dump's start() errs")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Fix seccomp GET_METADATA to deal with field sizes correctly (Tycho Andersen)
- Add selftest to make sure GET_METADATA doesn't regress (Tycho Andersen)
Merge misc fixes from Andrew Morton:
"16 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: don't defer struct page initialization for Xen pv guests
lib/Kconfig.debug: enable RUNTIME_TESTING_MENU
vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems
selftests/memfd: add run_fuse_test.sh to TEST_FILES
bug.h: work around GCC PR82365 in BUG()
mm/swap.c: make functions and their kernel-doc agree (again)
mm/zpool.c: zpool_evictable: fix mismatch in parameter name and kernel-doc
ida: do zeroing in ida_pre_get()
mm, swap, frontswap: fix THP swap if frontswap enabled
certs/blacklist_nohashes.c: fix const confusion in certs blacklist
kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE
mm, mlock, vmscan: no more skipping pagevecs
mm: memcontrol: fix NR_WRITEBACK leak in memcg and system stats
Kbuild: always define endianess in kconfig.h
include/linux/sched/mm.h: re-inline mmdrop()
tools: fix cross-compile var clobbering
Each read from a file in efivarfs results in two calls to EFI
(one to get the file size, another to get the actual data).
On X86 these EFI calls result in broadcast system management
interrupts (SMI) which affect performance of the whole system.
A malicious user can loop performing reads from efivarfs bringing
the system to its knees.
Linus suggested per-user rate limit to solve this.
So we add a ratelimit structure to "user_struct" and initialize
it for the root user for no limit. When allocating user_struct for
other users we set the limit to 100 per second. This could be used
for other places that want to limit the rate of some detrimental
user action.
In efivarfs if the limit is exceeded when reading, we take an
interruptible nap for 50ms and check the rate limit again.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We need to enable PM runtime on omap1 also as otherwise we
will get errors:
omap_timer omap_timer.1: omap_dm_timer_probe: pm_runtime_get_sync failed!
omap_timer: probe of omap_timer.1 failed with error -13
...
We are checking for OMAP_TIMER_NEEDS_RESET flag elsewhere so this is
safe to do.
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: Keerthy <j-keerthy@ti.com>
Cc: Ladislav Michl <ladis@linux-mips.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The header files for some structures could get included in such a way
that struct attributes (specifically __randomize_layout from path.h) would
be parsed as variable names instead of attributes. This could lead to
some instances of a structure being unrandomized, causing nasty GPFs, etc.
This patch makes sure the compiler_types.h header is included in
kconfig.h so that we've always got types and struct attributes defined,
since kconfig.h is included from the compiler command line.
Reported-by: Patrick McLean <chutzpah@gentoo.org>
Root-caused-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Fixes: 3859a271a0 ("randstruct: Mark various structs for randomization")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
samples/seccomp relies on the host setting which is not suitable for
crosscompilation and it actually fails when crosscompiling s390 and
powerpc all{yes,mod}config on x86_64 with
samples/seccomp/bpf-helper.h:135:2: error: #error __BITS_PER_LONG value unusable.
#error __BITS_PER_LONG value unusable.
^
In file included from samples/seccomp/bpf-fancy.c:13:0:
samples/seccomp/bpf-fancy.c: In function ‘main’:
samples/seccomp/bpf-fancy.c:38:11: error: ‘__NR_exit’ undeclared (first use in this function)
SYSCALL(__NR_exit, ALLOW),
and many others. I am doing these for compile testing and it's been
quite useful to catch issues. Crosscompiling sample code on the other
hand doesn't seem all that important so it seems like the easiest way to
simply disable samples/seccomp when crosscompiling.
Fixing this properly is not that easy as Kees explains:
: IIRC, one of the problems is with build ordering problems: the kernel
: headers used by the samples aren't available when cross compiling.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
The structure nlmclnt_fl_close_lock_ops s local to the source and does
not need to be in global scope, so make it static.
Cleans up sparse warning:
fs/nfs/nfs3proc.c:876:33: warning: symbol 'nlmclnt_fl_close_lock_ops' was not
declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
nfs4_update_server unconditionally releases the nfs_client for the
source server. If migration fails, this can cause the source server's
nfs_client struct to be left with a low reference count, resulting in
use-after-free. Also, adjust reference count handling for ELOOP.
NFS: state manager: migration failed on NFSv4 server nfsvmu10 with error 6
WARNING: CPU: 16 PID: 17960 at fs/nfs/client.c:281 nfs_put_client+0xfa/0x110 [nfs]()
nfs_put_client+0xfa/0x110 [nfs]
nfs4_run_state_manager+0x30/0x40 [nfsv4]
kthread+0xd8/0xf0
BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
nfs4_xdr_enc_write+0x6b/0x160 [nfsv4]
rpcauth_wrap_req+0xac/0xf0 [sunrpc]
call_transmit+0x18c/0x2c0 [sunrpc]
__rpc_execute+0xa6/0x490 [sunrpc]
rpc_async_schedule+0x15/0x20 [sunrpc]
process_one_work+0x160/0x470
worker_thread+0x112/0x540
? rescuer_thread+0x3f0/0x3f0
kthread+0xd8/0xf0
This bug was introduced by 32e62b7c ("NFS: Add nfs4_update_server"),
but the fix applies cleanly to 52442f9b ("NFS4: Avoid migration loops")
Reported-by: Helen Chao <helen.chao@oracle.com>
Fixes: 52442f9b11 ("NFS4: Avoid migration loops")
Signed-off-by: Bill Baker <bill.baker@oracle.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
On i386, there are 2 types of PLTs, PIC and non-PIC. PIE and shared
objects must use PIC PLT. To use PIC PLT, you need to load
_GLOBAL_OFFSET_TABLE_ into EBX first. There is no need for that on
x86-64 since x86-64 uses PC-relative PLT.
On x86-64, for 32-bit PC-relative branches, we can generate PLT32
relocation, instead of PC32 relocation, which can also be used as
a marker for 32-bit PC-relative branches. Linker can always reduce
PLT32 relocation to PC32 if function is defined locally. Local
functions should use PC32 relocation. As far as Linux kernel is
concerned, R_X86_64_PLT32 can be treated the same as R_X86_64_PC32
since Linux kernel doesn't use PLT.
R_X86_64_PLT32 for 32-bit PC-relative branches has been enabled in
binutils master branch which will become binutils 2.31.
[ hjl is working on having better documentation on this all, but a few
more notes from him:
"PLT32 relocation is used as marker for PC-relative branches. Because
of EBX, it looks odd to generate PLT32 relocation on i386 when EBX
doesn't have GOT.
As for symbol resolution, PLT32 and PC32 relocations are almost
interchangeable. But when linker sees PLT32 relocation against a
protected symbol, it can resolved locally at link-time since it is
used on a branch instruction. Linker can't do that for PC32
relocation"
but for the kernel use, the two are basically the same, and this
commit gets things building and working with the current binutils
master - Linus ]
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A section type mismatch warning shows up when building with LTO,
since orion_ge00_mvmdio_bus_name was put in __initconst but not marked
const itself:
include/linux/of.h: In function 'spear_setup_of_timer':
arch/arm/mach-spear/time.c:207:34: error: 'timer_of_match' causes a section type conflict with 'orion_ge00_mvmdio_bus_name'
static const struct of_device_id timer_of_match[] __initconst = {
^
arch/arm/plat-orion/common.c:475:32: note: 'orion_ge00_mvmdio_bus_name' was declared here
static __initconst const char *orion_ge00_mvmdio_bus_name = "orion-mii";
^
As pointed out by Andrew Lunn, it should in fact be 'const' but not
'__initconst' because the string is never copied but may be accessed
after the init sections are freed. To fix that, I get rid of the
extra symbol and rewrite the initialization in a simpler way that
assigns both the bus_id and modalias statically.
I spotted another theoretical bug in the same place, where d->netdev[i]
may be an out of bounds access, this can be fixed by moving the device
assignment into the loop.
Cc: stable@vger.kernel.org
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull "Rockchip dts64 fixes for 4.16" from Heiko Stübner:
Fixes of dwmmc tuning clocks that may make probing HS cards fail,
adding the grf-vio clock to the edp so that it can also be build
as module, correct pcie ep-gpio on the sapphire board and finally
a fix that makes the gmac work at gigabit speeds on the rk3328-rock64.
* tag 'v4.16-rockchip-dts64fixes-1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
arm64: dts: rockchip: Fix DWMMC clocks
arm64: dts: rockchip: introduce pclk_vio_grf in rk3399-eDP device node
arm64: dts: rockchip: correct ep-gpios for rk3399-sapphire
arm64: dts: rockchip: fix rock64 gmac2io stability issues
Pull "Rockchip dts32 fixes for 4.16" from Heiko Stübner:
Fix wrong dwmmc tuning clocks that may make probing HS cards fail to
probe and removal of special opps from the phycore boards that may
run the cpu outside the soc-vendor specs.
* tag 'v4.16-rockchip-dts32fixes-1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
ARM: dts: rockchip: Fix DWMMC clocks
ARM: dts: rockchip: Remove 1.8 GHz operation point from phycore som
Fixes for omaps for v4.16-rc cycle
This is mostly SoC related fixes for clocks, interconnect, and PM with few
board specifc dts related fixes:
- Fix quirk handling for ti-sysc to check all quirk flags instead of just
the first one
- Fix LogicPD boards for i2c1 muxing to avoid intermittent PMIC errors
- Fix debounce-interval use for omap5-uevm
- Fix debugfs_create_*() usage for omap1
- Fix sar_base initialization for HS omaps
- Fix omap3 prm wake interrupt for resume
- Fix kmemleak for omap_get_timer_dt()
- Enable optional clocks before main clock to prevent interconnect target
module from being stuck in transition
* tag 'omap-for-v4.16/fixes-signed' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
bus: ti-sysc: Fix checking of no-reset-on-init quirk
ARM: dts: LogicPD SOM-LV: Fix I2C1 pinmux
ARM: dts: LogicPD Torpedo: Fix I2C1 pinmux
ARM: dts: OMAP5: uevm: Fix "debounce-interval" property misspelling
ARM: OMAP1: clock: Fix debugfs_create_*() usage
ARM: OMAP2+: Fix sar_base inititalization for HS omaps
ARM: OMAP3: Fix prm wake interrupt for resume
ARM: OMAP2+: timer: fix a kmemleak caused in omap_get_timer_dt
ARM: OMAP2+: hwmod_core: enable optional clocks before main clock
Pull "mvebu fixes for 4.16 (part 1)" from Gregory CLEMENT:
- Updating my emails address (from free-electrons to bootlin)
- Adding back the selection of the PL310 Errata fix for the Cortex A9
based Armada SoCs (Armada 375 and 38x)
* tag 'mvebu-fixes-4.16-1' of git://git.infradead.org/linux-mvebu:
ARM: mvebu: Fix broken PL310_ERRATA_753970 selects
MAINTAINERS: update email address for Gregory CLEMENT
Building with LTO revealed that three spi_board_info arrays are marked
__initconst, but not const:
arch/arm/mach-davinci/board-dm365-evm.c: In function 'dm365_evm_init':
arch/arm/mach-davinci/board-dm365-evm.c:729:30: error: 'dm365_evm_spi_info' causes a section type conflict with 'dm646x_edma_device'
static struct spi_board_info dm365_evm_spi_info[] __initconst = {
^
arch/arm/mach-davinci/dm646x.c:603:42: note: 'dm646x_edma_device' was declared here
static const struct platform_device_info dm646x_edma_device __initconst = {
This marks them const as well, as was originally intended.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The array of string pointers is put in __initconst, and the strings themselves
are marke 'const' but the the pointers are not, which caused a warning when
built with LTO:
arch/arm/mach-clps711x/board-dt.c:72:20: error: 'clps711x_compat' causes a section type conflict with 'feroceon_ids'
static const char *clps711x_compat[] __initconst = {
This marks the array itself const as well, which was certainly the
intention originally.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull "AT91 SOC fixes for 4.16" from Alexandre Belloni:
- change my email address
* tag 'at91-ab-4.16-soc-fixes' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
MAINTAINERS: ARM: at91: update my email address
Improve the DTS files by removing all the leading "0x" and zeros to fix the
following dtc warnings:
Warning (unit_address_format): Node /XXX unit name should not have leading "0x"
and
Warning (unit_address_format): Node /XXX unit name should not have leading 0s
Converted using the following command:
find . -type f \( -iname *.dts -o -iname *.dtsi \) -exec sed -i -e "s/@\([0-9a-fA-FxX\.;:#]+\)\s*{/@\L\1 {/g" -e "s/@0x\(.*\) {/@\1 {/g" -e "s/@0+\(.*\) {/@\1 {/g" {} +^C
For simplicity, two sed expressions were used to solve each warnings separately.
To make the regex expression more robust a few other issues were resolved,
namely setting unit-address to lower case, and adding a whitespace before the
the opening curly brace:
https://elinux.org/Device_Tree_Linux#Linux_conventions
This will solve as a side effect warning:
Warning (simple_bus_reg): Node /XXX@<UPPER> simple-bus unit address format error, expected "<lower>"
This is a follow up to commit 4c9847b737 ("dt-bindings: Remove leading 0x from bindings notation")
Reported-by: David Daney <ddaney@caviumnetworks.com>
Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Improve the DTS files by removing all the leading "0x" and zeros to fix the
following dtc warnings:
Warning (unit_address_format): Node /XXX unit name should not have leading "0x"
and
Warning (unit_address_format): Node /XXX unit name should not have leading 0s
Converted using the following command:
find . -type f \( -iname *.dts -o -iname *.dtsi \) -exec sed -E -i -e "s/@0x([0-9a-fA-F\.]+)\s?\{/@\L\1 \{/g" -e "s/@0+([0-9a-fA-F\.]+)\s?\{/@\L\1 \{/g" {} +
For simplicity, two sed expressions were used to solve each warnings separately.
To make the regex expression more robust a few other issues were resolved,
namely setting unit-address to lower case, and adding a whitespace before the
the opening curly brace:
https://elinux.org/Device_Tree_Linux#Linux_conventions
This is a follow up to commit 4c9847b737 ("dt-bindings: Remove leading 0x from bindings notation")
Reported-by: David Daney <ddaney@caviumnetworks.com>
Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Acked-by: Matthias Brugger <matthias.bgg@gmail.com>
Acked-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
dtc recently added PCI bus checks. Fix these warnings:
arch/arm64/boot/dts/cavium/thunder2-99xx.dtb: Warning (pci_bridge): Node /pci missing bus-range for PCI bridge
arch/arm64/boot/dts/cavium/thunder2-99xx.dtb: Warning (unit_address_vs_reg): Node /pci has a reg or ranges property, but no unit name
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
kmalloc() can't always allocate large enough buffers for big_key to use for
crypto (1MB + some metadata) so we cannot use that to allocate the buffer.
Further, vmalloc'd pages can't be passed to sg_init_one() and the aead
crypto accessors cannot be called progressively and must be passed all the
data in one go (which means we can't pass the data in one block at a time).
Fix this by allocating the buffer pages individually and passing them
through a multientry scatterlist to the crypto layer. This has the bonus
advantage that we don't have to allocate a contiguous series of pages.
We then vmap() the page list and pass that through to the VFS read/write
routines.
This can trigger a warning:
WARNING: CPU: 0 PID: 60912 at mm/page_alloc.c:3883 __alloc_pages_nodemask+0xb7c/0x15f8
([<00000000002acbb6>] __alloc_pages_nodemask+0x1ee/0x15f8)
[<00000000002dd356>] kmalloc_order+0x46/0x90
[<00000000002dd3e0>] kmalloc_order_trace+0x40/0x1f8
[<0000000000326a10>] __kmalloc+0x430/0x4c0
[<00000000004343e4>] big_key_preparse+0x7c/0x210
[<000000000042c040>] key_create_or_update+0x128/0x420
[<000000000042e52c>] SyS_add_key+0x124/0x220
[<00000000007bba2c>] system_call+0xc4/0x2b0
from the keyctl/padd/useradd test of the keyutils testsuite on s390x.
Note that it might be better to shovel data through in page-sized lumps
instead as there's no particular need to use a monolithic buffer unless the
kernel itself wants to access the data.
Fixes: 13100a72f4 ("Security: Keys: Big keys stored encrypted")
Reported-by: Paul Bunyan <pbunyan@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Kirill Marinushkin <k.marinushkin@gmail.com>
The asymmetric key type allows an X.509 certificate to be added even if
its signature's hash algorithm is not available in the crypto API. In
that case 'payload.data[asym_auth]' will be NULL. But the key
restriction code failed to check for this case before trying to use the
signature, resulting in a NULL pointer dereference in
key_or_keyring_common() or in restrict_link_by_signature().
Fix this by returning -ENOPKG when the signature is unsupported.
Reproducer when all the CONFIG_CRYPTO_SHA512* options are disabled and
keyctl has support for the 'restrict_keyring' command:
keyctl new_session
keyctl restrict_keyring @s asymmetric builtin_trusted
openssl req -new -sha512 -x509 -batch -nodes -outform der \
| keyctl padd asymmetric desc @s
Fixes: a511e1af8b ("KEYS: Move the point of trust determination to __key_link()")
Cc: <stable@vger.kernel.org> # v4.7+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
The X.509 parser mishandles the case where the certificate's signature's
hash algorithm is not available in the crypto API. In this case,
x509_get_sig_params() doesn't allocate the cert->sig->digest buffer;
this part seems to be intentional. However,
public_key_verify_signature() is still called via
x509_check_for_self_signed(), which triggers the 'BUG_ON(!sig->digest)'.
Fix this by making public_key_verify_signature() return -ENOPKG if the
hash buffer has not been allocated.
Reproducer when all the CONFIG_CRYPTO_SHA512* options are disabled:
openssl req -new -sha512 -x509 -batch -nodes -outform der \
| keyctl padd asymmetric desc @s
Fixes: 6c2dc5ae4a ("X.509: Extract signature digest and make self-signed cert checks earlier")
Reported-by: Paolo Valente <paolo.valente@linaro.org>
Cc: Paolo Valente <paolo.valente@linaro.org>
Cc: <stable@vger.kernel.org> # v4.7+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
If none of the certificates in a SignerInfo's certificate chain match a
trusted key, nor is the last certificate signed by a trusted key, then
pkcs7_validate_trust_one() tries to check whether the SignerInfo's
signature was made directly by a trusted key. But, it actually fails to
set the 'sig' variable correctly, so it actually verifies the last
signature seen. That will only be the SignerInfo's signature if the
certificate chain is empty; otherwise it will actually be the last
certificate's signature.
This is not by itself a security problem, since verifying any of the
certificates in the chain should be sufficient to verify the SignerInfo.
Still, it's not working as intended so it should be fixed.
Fix it by setting 'sig' correctly for the direct verification case.
Fixes: 757932e6da ("PKCS#7: Handle PKCS#7 messages that contain no X.509 certs")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
If there is a blacklisted certificate in a SignerInfo's certificate
chain, then pkcs7_verify_sig_chain() sets sinfo->blacklisted and returns
0. But, pkcs7_verify() fails to handle this case appropriately, as it
actually continues on to the line 'actual_ret = 0;', indicating that the
SignerInfo has passed verification. Consequently, PKCS#7 signature
verification ignores the certificate blacklist.
Fix this by not considering blacklisted SignerInfos to have passed
verification.
Also fix the function comment with regards to when 0 is returned.
Fixes: 03bb79315d ("PKCS#7: Handle blacklisted certificates")
Cc: <stable@vger.kernel.org> # v4.12+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
When pkcs7_verify_sig_chain() is building the certificate chain for a
SignerInfo using the certificates in the PKCS#7 message, it is passing
the wrong arguments to public_key_verify_signature(). Consequently,
when the next certificate is supposed to be used to verify the previous
certificate, the next certificate is actually used to verify itself.
An attacker can use this bug to create a bogus certificate chain that
has no cryptographic relationship between the beginning and end.
Fortunately I couldn't quite find a way to use this to bypass the
overall signature verification, though it comes very close. Here's the
reasoning: due to the bug, every certificate in the chain beyond the
first actually has to be self-signed (where "self-signed" here refers to
the actual key and signature; an attacker might still manipulate the
certificate fields such that the self_signed flag doesn't actually get
set, and thus the chain doesn't end immediately). But to pass trust
validation (pkcs7_validate_trust()), either the SignerInfo or one of the
certificates has to actually be signed by a trusted key. Since only
self-signed certificates can be added to the chain, the only way for an
attacker to introduce a trusted signature is to include a self-signed
trusted certificate.
But, when pkcs7_validate_trust_one() reaches that certificate, instead
of trying to verify the signature on that certificate, it will actually
look up the corresponding trusted key, which will succeed, and then try
to verify the *previous* certificate, which will fail. Thus, disaster
is narrowly averted (as far as I could tell).
Fixes: 6c2dc5ae4a ("X.509: Extract signature digest and make self-signed cert checks earlier")
Cc: <stable@vger.kernel.org> # v4.7+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
test_maps contains a series of stress tests, and previously it will break the
rest tests when it failed to alloc memory.
-----------------------
Failed to create hashmap key=8 value=262144 'Cannot allocate memory'
Failed to create hashmap key=16 value=262144 'Cannot allocate memory'
Failed to create hashmap key=8 value=262144 'Cannot allocate memory'
Failed to create hashmap key=8 value=262144 'Cannot allocate memory'
test_maps: test_maps.c:955: run_parallel: Assertion `status == 0' failed.
Aborted
not ok 1..3 selftests: test_maps [FAIL]
-----------------------
after this patch, the rest tests will be continue when it occurs an ENOMEM failure
CC: Alexei Starovoitov <alexei.starovoitov@gmail.com>
CC: Philip Li <philip.li@intel.com>
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Li Zhijian <zhijianx.li@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The system call path can be interrupted before the switch back to the
standard branch prediction with BPENTER has been done. The critical
section cleanup code skips forward to .Lsysc_do_svc and bypasses the
BPENTER. In this case the kernel and all subsequent code will run with
the limited branch prediction.
Fixes: eacf67eb9b32 ("s390: run user space and KVM guests with modified branch prediction")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
ioremap_page_range doesn't honour break-before-make and attempts to put
down huge mappings (using p*d_set_huge) over the top of pre-existing
table entries. This leads to us leaking page table memory and also gives
rise to TLB conflicts and spurious aborts, which have been seen in
practice on Cortex-A75.
Until this has been resolved, refuse to put block mappings when the
existing entry is found to be present.
Fixes: 324420bf91 ("arm64: add support for ioremap() block mappings")
Reported-by: Hanjun Guo <hanjun.guo@linaro.org>
Reported-by: Lei Li <lious.lilei@hisilicon.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We were leaving them in the power on state (or the state the firmware
had set up for some client, if we were taking over from them). The
boot state was 30 core clocks, when we actually want to sample some
time after (to make sure that the new input bit has actually arrived).
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
The recent support for the multiple PCM devices allowed user to use
multiple HDMI/DP outputs, but at the same time, the PCM stream
assignment has been changed, too. Due to that, the former PCM#0
(there was only one stream in the past) is likely assigned to a
different one (e.g. PCM#2), and it ends up with the regression when
user sticks with the fixed configuration using the device#0.
Although the multiple monitor support shouldn't matter when user
deploys the backend like PulseAudio that checks the jack detection
state, the behavior change isn't always acceptable for some users.
As a mitigation, this patch introduces an option to switch the
behavior back to the old-good-days: when the new option,
single_port=1, is passed, the driver creates only a single PCM device,
and it's assigned to the first connected one, like the earlier
versions did. The option is turned off as default still to support
the multiple monitors.
Fixes: 8a2d6ae1f7 ("ALSA: x86: Register multiple PCM devices for the LPE audio card")
Reported-and-tested-by: Hubert Mantel <mantel@metadox.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
On lkml suggestions were made to split up such trivial typo fixes into per subsystem
patches:
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -439,7 +439,7 @@ setup_uga32(void **uga_handle, unsigned long size, u32 *width, u32 *height)
struct efi_uga_draw_protocol *uga = NULL, *first_uga;
efi_guid_t uga_proto = EFI_UGA_PROTOCOL_GUID;
unsigned long nr_ugas;
- u32 *handles = (u32 *)uga_handle;;
+ u32 *handles = (u32 *)uga_handle;
efi_status_t status = EFI_INVALID_PARAMETER;
int i;
This patch is the result of the following script:
$ sed -i 's/;;$/;/g' $(git grep -E ';;$' | grep "\.[ch]:" | grep -vwE 'for|ia64' | cut -d: -f1 | sort | uniq)
... followed by manual review to make sure it's all good.
Splitting this up is just crazy talk, let's get over with this and just do it.
Reported-by: Pavel Machek <pavel@ucw.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When we terminate driver I/O (because we need to stop using a certain
channel path) we also need to ensure that a timer (which may have been
set up using ccw_device_start_timeout) is cleared.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When a timeout occurs for users of ccw_device_start_timeout
we will stop the IO and call the drivers int handler with
the irb pointer set to ERR_PTR(-ETIMEDOUT). Sometimes
however we'd set the irb pointer to ERR_PTR(-EIO) which is
not intended. Just set the correct value in all codepaths.
Reported-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
There are cases a device driver can't start IO because the device is
currently in use by cio. In this case the device driver is notified
when the device is usable again.
Using ccw_device_start_timeout we would set the timeout (and change
an existing timeout) before we test for internal usage. Worst case
this could lead to an unexpected timer deletion.
Fix this by setting the timeout after we test for internal usage.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Commit f19fbd5ed6 ("s390: introduce execute-trampolines for
branches") introduces .cfi_* assembler directives. Instead of
using the directives directly, use the macros from asm/dwarf.h.
This also ensures that the dwarf debug information are created
in the .debug_frame section.
Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
blk_rq_bytes does the wrong thing for special payloads like discards and
might cause the driver to not set up a SGL.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
blk_rq_bytes does the wrong thing for special payloads like discards and
might cause the driver to not set up a SGL.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
THIS_MODULE evaluates to NULL when used from code built into the kernel,
thus breaking built-in transport modules. Remove the bogus check.
Fixes: 0de5cd36 ("nvme-fabrics: protect against module unload during create_ctrl")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
This patch finishes all outstanding SCSI IO commands (but not other commands,
e.g., task management) in the shutdown and unload paths.
It first waits for the commands to complete (this is done after setting
'ioc->remove_host = 1 ', which prevents new commands to be queued) then it
flushes commands that might still be running.
This avoids triggering error handling (e.g., abort command) for all commands
possibly completed by the adapter after interrupts disabled.
[mauricfo: introduced something in commit message.]
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Tested-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch adds checks for 'ioc->remove_host' in the SCSI error handlers, so
not to access pointers/resources potentially freed in the PCI shutdown/module
unload path. The error handlers may be invoked after shutdown/unload,
depending on other components.
This problem was observed with kexec on a system with a mpt3sas based adapter
and an infiniband adapter which takes long enough to shutdown:
The mpt3sas driver finished shutting down / disabled interrupt handling, thus
some commands have not finished and timed out.
Since the system was still running (waiting for the infiniband adapter to
shutdown), the scsi error handler for task abort of mpt3sas was invoked, and
hit an oops -- either in scsih_abort() because 'ioc->scsi_lookup' was NULL
without commit dbec4c9040 ("scsi: mpt3sas: lockless command submission"), or
later up in scsih_host_reset() (with or without that commit), because it
eventually called mpt3sas_base_get_iocstate().
After the above commit, the oops in scsih_abort() does not occur anymore
(_scsih_scsi_lookup_find_by_scmd() is no longer called), but that commit is
too big and out of the scope of linux-stable, where this patch might help, so
still go for the changes.
Also, this might help to prevent similar errors in the future, in case code
changes and possibly tries to access freed stuff.
Note the fix in scsih_host_reset() is still important anyway.
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
I am using SECCOMP to filter syscalls on a ppc32 platform, and noticed
that the JIT compiler was failing on the BPF even though the
interpreter was working fine.
The issue was that the compiler was missing one of the instructions
used by SECCOMP, so here is a patch to enable JIT for that
instruction.
Fixes: eb84bab0fb ("ppc: Kconfig: Enable BPF JIT on ppc32")
Signed-off-by: Mark Lord <mlord@pobox.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This reverts commit 02ef6dd810.
The earlier patch tried to enable support for a new property
"ibm,drc-info" on powerpc systems.
Unfortunately, some errors in the associated patch set break things
in some of the DLPAR operations. In particular when attempting to
hot-add a new CPU or set of CPUs, the original patch failed to
properly calculate the available resources, and aborted the operation.
In addition, the original set missed several opportunities to compress
and reuse common code.
As the associated patch set was meant to provide an optimization of
storage and performance of a set of device-tree properties for future
systems with large amounts of resources, reverting just restores
the previous behavior for existing systems. It seems unnecessary
to enable this feature and introduce the consequent problems in the
field that it will cause at this time, so please revert it for now
until testing of the corrections are finished properly.
Fixes: 02ef6dd810 ("powerpc: Enable support for ibm,drc-info devtree property")
Signed-off-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We had a mid-air collision between two new firmware features, DRMEM_V2
and DRC_INFO, and they ended up with the same value.
No one's actually reported any problems, presumably because the new
firmware that supports both properties is not widely available, and
the two properties tend to be enabled together.
Still if we ever had one enabled but not the other, the bugs that
could result are many and varied. So fix it.
Fixes: 3f38000eda ("powerpc/firmware: Add definitions for new drc-info firmware feature")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Update the algorithm in storvsc_do_io to look for a channel
starting with the current CPU + 1 and wrap around (within the
current NUMA node). This spreads VMbus interrupts more evenly
across CPUs. Previous code always started with first CPU in
the current NUMA node, skewing the interrupt load to that CPU.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If power domain information are missing in the device tree, no
power domains get initialized. However, imx_gpc_remove tries to
remove power domains always in the old DT binding case. Only
remove power domains when imx_gpc_probe initialized them in
first place.
Fixes: 721cabf6c6 ("soc: imx: move PGC handling to a new GPC driver")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Let's test that we get the flags correctly, and that we preserve the filter
index across the ptrace(PTRACE_SECCOMP_GET_METADATA) correctly.
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
CC: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Previously if users passed a small size for the input structure size, they
would get get odd behavior. It doesn't make sense to pass a structure
smaller than at least filter_off size, so let's just give -EINVAL in this
case.
This changes userspace visible behavior, but was only introduced in commit
26500475ac ("ptrace, seccomp: add support for retrieving seccomp
metadata") in 4.16-rc2, so should be safe to change if merged before then.
Reported-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
CC: Kees Cook <keescook@chromium.org>
CC: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Commit 26500475ac ("ptrace, seccomp: add support for retrieving seccomp
metadata") introduced `struct seccomp_metadata`, which contained unsigned
longs that should be arch independent. The type of the flags member was
chosen to match the corresponding argument to seccomp(), and so we need
something at least as big as unsigned long. My understanding is that __u64
should fit the bill, so let's switch both types to that.
While this is userspace facing, it was only introduced in 4.16-rc2, and so
should be safe assuming it goes in before then.
Reported-by: "Dmitry V. Levin" <ldv@altlinux.org>
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
CC: Kees Cook <keescook@chromium.org>
CC: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: "Dmitry V. Levin" <ldv@altlinux.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Both glibc and the kernel have in6_* macros definitions. Build fails
because it picks up wrong in6_* macro from the kernel header and not the
header from glibc.
Fixes build error below:
clang -I. -I./include/uapi -I../../../include/uapi
-Wno-compare-distinct-pointer-types \
-O2 -target bpf -emit-llvm -c test_tcpbpf_kern.c -o - | \
llc -march=bpf -mcpu=generic -filetype=obj
-o .../tools/testing/selftests/bpf/test_tcpbpf_kern.o
In file included from test_tcpbpf_kern.c:12:
.../netinet/in.h:101:5: error: expected identifier
IPPROTO_HOPOPTS = 0, /* IPv6 Hop-by-Hop options. */
^
.../linux/in6.h:131:26: note: expanded from macro 'IPPROTO_HOPOPTS'
^
In file included from test_tcpbpf_kern.c:12:
/usr/include/netinet/in.h:103:5: error: expected identifier
IPPROTO_ROUTING = 43, /* IPv6 routing header. */
^
.../linux/in6.h:132:26: note: expanded from macro 'IPPROTO_ROUTING'
^
In file included from test_tcpbpf_kern.c:12:
.../netinet/in.h:105:5: error: expected identifier
IPPROTO_FRAGMENT = 44, /* IPv6 fragmentation header. */
^
Since both glibc and the kernel have in6_* macros definitions, use the
one from glibc. Kernel headers will check for previous libc definitions
by including include/linux/libc-compat.h.
Reported-by: Daniel Díaz <daniel.diaz@linaro.org>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Tested-by: Daniel Díaz <daniel.diaz@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The only user of this variable is inside of an #ifdef, causing
a warning without CONFIG_INET:
net/core/filter.c: In function '____bpf_sock_ops_cb_flags_set':
net/core/filter.c:3382:6: error: unused variable 'val' [-Werror=unused-variable]
int val = argval & BPF_SOCK_OPS_ALL_CB_FLAGS;
This replaces the #ifdef with a nicer IS_ENABLED() check that
makes the code more readable and avoids the warning.
Fixes: b13d880721 ("bpf: Adds field bpf_sock_ops_cb_flags to tcp_sock")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Commit f7f99100d8 ("mm: stop zeroing memory during allocation in
vmemmap") broke Xen pv domains in some configurations, as the "Pinned"
information in struct page of early page tables could get lost.
This will lead to the kernel trying to write directly into the page
tables instead of asking the hypervisor to do so. The result is a crash
like the following:
BUG: unable to handle kernel paging request at ffff8801ead19008
IP: xen_set_pud+0x4e/0xd0
PGD 1c0a067 P4D 1c0a067 PUD 23a0067 PMD 1e9de0067 PTE 80100001ead19065
Oops: 0003 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-default+ #271
Hardware name: Dell Inc. Latitude E6440/0159N7, BIOS A07 06/26/2014
task: ffffffff81c10480 task.stack: ffffffff81c00000
RIP: e030:xen_set_pud+0x4e/0xd0
Call Trace:
__pmd_alloc+0x128/0x140
ioremap_page_range+0x3f4/0x410
__ioremap_caller+0x1c3/0x2e0
acpi_os_map_iomem+0x175/0x1b0
acpi_tb_acquire_table+0x39/0x66
acpi_tb_validate_table+0x44/0x7c
acpi_tb_verify_temp_table+0x45/0x304
acpi_reallocate_root_table+0x12d/0x141
acpi_early_init+0x4d/0x10a
start_kernel+0x3eb/0x4a1
xen_start_kernel+0x528/0x532
Code: 48 01 e8 48 0f 42 15 a2 fd be 00 48 01 d0 48 ba 00 00 00 00 00 ea ff ff 48 c1 e8 0c 48 c1 e0 06 48 01 d0 48 8b 00 f6 c4 02 75 5d <4c> 89 65 00 5b 5d 41 5c c3 65 8b 05 52 9f fe 7e 89 c0 48 0f a3
RIP: xen_set_pud+0x4e/0xd0 RSP: ffffffff81c03cd8
CR2: ffff8801ead19008
---[ end trace 38eca2e56f1b642e ]---
Avoid this problem by not deferring struct page initialization when
running as Xen pv guest.
Pavel said:
: This is unique for Xen, so this particular issue won't effect other
: configurations. I am going to investigate if there is a way to
: re-enable deferred page initialization on xen guests.
[akpm@linux-foundation.org: explicitly include xen.h]
Link: http://lkml.kernel.org/r/20180216154101.22865-1-jgross@suse.com
Fixes: f7f99100d8 ("mm: stop zeroing memory during allocation in vmemmap")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Bob Picco <bob.picco@oracle.com>
Cc: <stable@vger.kernel.org> [4.15.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kai Heng Feng has noticed that BUG_ON(PageHighMem(pg)) triggers in
drivers/media/common/saa7146/saa7146_core.c since 19809c2da2 ("mm,
vmalloc: use __GFP_HIGHMEM implicitly").
saa7146_vmalloc_build_pgtable uses vmalloc_32 and it is reasonable to
expect that the resulting page is not in highmem. The above commit
aimed to add __GFP_HIGHMEM only for those requests which do not specify
any zone modifier gfp flag. vmalloc_32 relies on GFP_VMALLOC32 which
should do the right thing. Except it has been missed that GFP_VMALLOC32
is an alias for GFP_KERNEL on 32b architectures. Thanks to Matthew to
notice this.
Fix the problem by unconditionally setting GFP_DMA32 in GFP_VMALLOC32
for !64b arches (as a bailout). This should do the right thing and use
ZONE_NORMAL which should be always below 4G on 32b systems.
Debugged by Matthew Wilcox.
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20180212095019.GX21609@dhcp22.suse.cz
Fixes: 19809c2da2 ("mm, vmalloc: use __GFP_HIGHMEM implicitly”)
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Kai Heng Feng <kai.heng.feng@canonical.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Looking at functions with large stack frames across all architectures
led me discovering that BUG() suffers from the same problem as
fortify_panic(), which I've added a workaround for already.
In short, variables that go out of scope by calling a noreturn function
or __builtin_unreachable() keep using stack space in functions
afterwards.
A workaround that was identified is to insert an empty assembler
statement just before calling the function that doesn't return. I'm
adding a macro "barrier_before_unreachable()" to document this, and
insert calls to that in all instances of BUG() that currently suffer
from this problem.
The files that saw the largest change from this had these frame sizes
before, and much less with my patch:
fs/ext4/inode.c:82:1: warning: the frame size of 1672 bytes is larger than 800 bytes [-Wframe-larger-than=]
fs/ext4/namei.c:434:1: warning: the frame size of 904 bytes is larger than 800 bytes [-Wframe-larger-than=]
fs/ext4/super.c:2279:1: warning: the frame size of 1160 bytes is larger than 800 bytes [-Wframe-larger-than=]
fs/ext4/xattr.c:146:1: warning: the frame size of 1168 bytes is larger than 800 bytes [-Wframe-larger-than=]
fs/f2fs/inode.c:152:1: warning: the frame size of 1424 bytes is larger than 800 bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_core.c:1195:1: warning: the frame size of 1068 bytes is larger than 800 bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_core.c:395:1: warning: the frame size of 1084 bytes is larger than 800 bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_ftp.c:298:1: warning: the frame size of 928 bytes is larger than 800 bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_ftp.c:418:1: warning: the frame size of 908 bytes is larger than 800 bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_lblcr.c:718:1: warning: the frame size of 960 bytes is larger than 800 bytes [-Wframe-larger-than=]
drivers/net/xen-netback/netback.c:1500:1: warning: the frame size of 1088 bytes is larger than 800 bytes [-Wframe-larger-than=]
In case of ARC and CRIS, it turns out that the BUG() implementation
actually does return (or at least the compiler thinks it does),
resulting in lots of warnings about uninitialized variable use and
leaving noreturn functions, such as:
block/cfq-iosched.c: In function 'cfq_async_queue_prio':
block/cfq-iosched.c:3804:1: error: control reaches end of non-void function [-Werror=return-type]
include/linux/dmaengine.h: In function 'dma_maxpq':
include/linux/dmaengine.h:1123:1: error: control reaches end of non-void function [-Werror=return-type]
This makes them call __builtin_trap() instead, which should normally
dump the stack and kill the current process, like some of the other
architectures already do.
I tried adding barrier_before_unreachable() to panic() and
fortify_panic() as well, but that had very little effect, so I'm not
submitting that patch.
Vineet said:
: For ARC, it is double win.
:
: 1. Fixes 3 -Wreturn-type warnings
:
: | ../net/core/ethtool.c:311:1: warning: control reaches end of non-void function
: [-Wreturn-type]
: | ../kernel/sched/core.c:3246:1: warning: control reaches end of non-void function
: [-Wreturn-type]
: | ../include/linux/sunrpc/svc_xprt.h:180:1: warning: control reaches end of
: non-void function [-Wreturn-type]
:
: 2. bloat-o-meter reports code size improvements as gcc elides the
: generated code for stack return.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365
Link: http://lkml.kernel.org/r/20171219114112.939391-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc]
Tested-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc]
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christopher Li <sparse@chrisli.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As far as I can tell, the only place the per-cpu ida_bitmap is populated
is in ida_pre_get. The pre-allocated element is stolen in two places in
ida_get_new_above, in both cases immediately followed by a memset(0).
Since ida_get_new_above is called with locks held, do the zeroing in
ida_pre_get, or rather let kmalloc() do it. Also, apparently gcc
generates ~44 bytes of code to do a memset(, 0, 128):
$ scripts/bloat-o-meter vmlinux.{0,1}
add/remove: 0/0 grow/shrink: 2/1 up/down: 5/-88 (-83)
Function old new delta
ida_pre_get 115 119 +4
vermagic 27 28 +1
ida_get_new_above 715 627 -88
Link: http://lkml.kernel.org/r/20180108225634.15340-1-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It was reported by Sergey Senozhatsky that if THP (Transparent Huge
Page) and frontswap (via zswap) are both enabled, when memory goes low
so that swap is triggered, segfault and memory corruption will occur in
random user space applications as follow,
kernel: urxvt[338]: segfault at 20 ip 00007fc08889ae0d sp 00007ffc73a7fc40 error 6 in libc-2.26.so[7fc08881a000+1ae000]
#0 0x00007fc08889ae0d _int_malloc (libc.so.6)
#1 0x00007fc08889c2f3 malloc (libc.so.6)
#2 0x0000560e6004bff7 _Z14rxvt_wcstoutf8PKwi (urxvt)
#3 0x0000560e6005e75c n/a (urxvt)
#4 0x0000560e6007d9f1 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez (urxvt)
#5 0x0000560e6003d988 _ZN9rxvt_term9cmd_parseEv (urxvt)
#6 0x0000560e60042804 _ZN9rxvt_term6pty_cbERN2ev2ioEi (urxvt)
#7 0x0000560e6005c10f _Z17ev_invoke_pendingv (urxvt)
#8 0x0000560e6005cb55 ev_run (urxvt)
#9 0x0000560e6003b9b9 main (urxvt)
#10 0x00007fc08883af4a __libc_start_main (libc.so.6)
#11 0x0000560e6003f9da _start (urxvt)
After bisection, it was found the first bad commit is bd4c82c22c ("mm,
THP, swap: delay splitting THP after swapped out").
The root cause is as follows:
When the pages are written to swap device during swapping out in
swap_writepage(), zswap (fontswap) is tried to compress the pages to
improve performance. But zswap (frontswap) will treat THP as a normal
page, so only the head page is saved. After swapping in, tail pages
will not be restored to their original contents, causing memory
corruption in the applications.
This is fixed by refusing to save page in the frontswap store functions
if the page is a THP. So that the THP will be swapped out to swap
device.
Another choice is to split THP if frontswap is enabled. But it is found
that the frontswap enabling isn't flexible. For example, if
CONFIG_ZSWAP=y (cannot be module), frontswap will be enabled even if
zswap itself isn't enabled.
Frontswap has multiple backends, to make it easy for one backend to
enable THP support, the THP checking is put in backend frontswap store
functions instead of the general interfaces.
Link: http://lkml.kernel.org/r/20180209084947.22749-1-ying.huang@intel.com
Fixes: bd4c82c22c ("mm, THP, swap: delay splitting THP after swapped out")
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Suggested-by: Minchan Kim <minchan@kernel.org> [put THP checking in backend]
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Shaohua Li <shli@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: <stable@vger.kernel.org> [4.14]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When a thread mlocks an address space backed either by file pages which
are currently not present in memory or swapped out anon pages (not in
swapcache), a new page is allocated and added to the local pagevec
(lru_add_pvec), I/O is triggered and the thread then sleeps on the page.
On I/O completion, the thread can wake on a different CPU, the mlock
syscall will then sets the PageMlocked() bit of the page but will not be
able to put that page in unevictable LRU as the page is on the pagevec
of a different CPU. Even on drain, that page will go to evictable LRU
because the PageMlocked() bit is not checked on pagevec drain.
The page will eventually go to right LRU on reclaim but the LRU stats
will remain skewed for a long time.
This patch puts all the pages, even unevictable, to the pagevecs and on
the drain, the pages will be added on their LRUs correctly by checking
their evictability. This resolves the mlocked pages on pagevec of other
CPUs issue because when those pagevecs will be drained, the mlocked file
pages will go to unevictable LRU. Also this makes the race with munlock
easier to resolve because the pagevec drains happen in LRU lock.
However there is still one place which makes a page evictable and does
PageLRU check on that page without LRU lock and needs special attention.
TestClearPageMlocked() and isolate_lru_page() in clear_page_mlock().
#0: __pagevec_lru_add_fn #1: clear_page_mlock
SetPageLRU() if (!TestClearPageMlocked())
return
smp_mb() // <--required
// inside does PageLRU
if (!PageMlocked()) if (isolate_lru_page())
move to evictable LRU putback_lru_page()
else
move to unevictable LRU
In '#1', TestClearPageMlocked() provides full memory barrier semantics
and thus the PageLRU check (inside isolate_lru_page) can not be
reordered before it.
In '#0', without explicit memory barrier, the PageMlocked() check can be
reordered before SetPageLRU(). If that happens, '#0' can put a page in
unevictable LRU and '#1' might have just cleared the Mlocked bit of that
page but fails to isolate as PageLRU fails as '#0' still hasn't set
PageLRU bit of that page. That page will be stranded on the unevictable
LRU.
There is one (good) side effect though. Without this patch, the pages
allocated for System V shared memory segment are added to evictable LRUs
even after shmctl(SHM_LOCK) on that segment. This patch will correctly
put such pages to unevictable LRU.
Link: http://lkml.kernel.org/r/20171121211241.18877-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Shaohua Li <shli@fb.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After commit a983b5ebee ("mm: memcontrol: fix excessive complexity in
memory.stat reporting"), we observed slowly upward creeping NR_WRITEBACK
counts over the course of several days, both the per-memcg stats as well
as the system counter in e.g. /proc/meminfo.
The conversion from full per-cpu stat counts to per-cpu cached atomic
stat counts introduced an irq-unsafe RMW operation into the updates.
Most stat updates come from process context, but one notable exception
is the NR_WRITEBACK counter. While writebacks are issued from process
context, they are retired from (soft)irq context.
When writeback completions interrupt the RMW counter updates of new
writebacks being issued, the decs from the completions are lost.
Since the global updates are routed through the joint lruvec API, both
the memcg counters as well as the system counters are affected.
This patch makes the joint stat and event API irq safe.
Link: http://lkml.kernel.org/r/20180203082353.17284-1-hannes@cmpxchg.org
Fixes: a983b5ebee ("mm: memcontrol: fix excessive complexity in memory.stat reporting")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Debugged-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Build testing with LTO found a couple of files that get compiled
differently depending on whether asm/byteorder.h gets included early
enough or not. In particular, include/asm-generic/qrwlock_types.h is
affected by this, but there are probably others as well.
The symptom is a series of LTO link time warnings, including these:
net/netlabel/netlabel_unlabeled.h:223: error: type of 'netlbl_unlhsh_add' does not match original declaration [-Werror=lto-type-mismatch]
int netlbl_unlhsh_add(struct net *net,
net/netlabel/netlabel_unlabeled.c:377: note: 'netlbl_unlhsh_add' was previously declared here
include/net/ipv6.h:360: error: type of 'ipv6_renew_options_kern' does not match original declaration [-Werror=lto-type-mismatch]
ipv6_renew_options_kern(struct sock *sk,
net/ipv6/exthdrs.c:1162: note: 'ipv6_renew_options_kern' was previously declared here
net/core/dev.c:761: note: 'dev_get_by_name_rcu' was previously declared here
struct net_device *dev_get_by_name_rcu(struct net *net, const char *name)
net/core/dev.c:761: note: code may be misoptimized unless -fno-strict-aliasing is used
drivers/gpu/drm/i915/i915_drv.h:3377: error: type of 'i915_gem_object_set_to_wc_domain' does not match original declaration [-Werror=lto-type-mismatch]
i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write);
drivers/gpu/drm/i915/i915_gem.c:3639: note: 'i915_gem_object_set_to_wc_domain' was previously declared here
include/linux/debugfs.h:92:9: error: type of 'debugfs_attr_read' does not match original declaration [-Werror=lto-type-mismatch]
ssize_t debugfs_attr_read(struct file *file, char __user *buf,
fs/debugfs/file.c:318: note: 'debugfs_attr_read' was previously declared here
include/linux/rwlock_api_smp.h:30: error: type of '_raw_read_unlock' does not match original declaration [-Werror=lto-type-mismatch]
void __lockfunc _raw_read_unlock(rwlock_t *lock) __releases(lock);
kernel/locking/spinlock.c:246:26: note: '_raw_read_unlock' was previously declared here
include/linux/fs.h:3308:5: error: type of 'simple_attr_open' does not match original declaration [-Werror=lto-type-mismatch]
int simple_attr_open(struct inode *inode, struct file *file,
fs/libfs.c:795: note: 'simple_attr_open' was previously declared here
All of the above are caused by include/asm-generic/qrwlock_types.h
failing to include asm/byteorder.h after commit e0d02285f1
("locking/qrwlock: Use 'struct qrwlock' instead of 'struct __qrwlock'")
in linux-4.15.
Similar bugs may or may not exist in older kernels as well, but there is
no easy way to test those with link-time optimizations, and kernels
before 4.14 are harder to fix because they don't have Babu's patch
series
We had similar issues with CONFIG_ symbols in the past and ended up
always including the configuration headers though linux/kconfig.h. This
works around the issue through that same file, defining either
__BIG_ENDIAN or __LITTLE_ENDIAN depending on CONFIG_CPU_BIG_ENDIAN,
which is now always set on all architectures since commit 4c97a0c8fe
("arch: define CPU_BIG_ENDIAN for all fixed big endian archs").
Link: http://lkml.kernel.org/r/20180202154104.1522809-2-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently a number of Makefiles break when used with toolchains that
pass extra flags in CC and other cross-compile related variables (such
as --sysroot).
Thus we get this error when we use a toolchain that puts --sysroot in
the CC var:
~/src/linux/tools$ make iio
[snip]
iio_event_monitor.c:18:10: fatal error: unistd.h: No such file or directory
#include <unistd.h>
^~~~~~~~~~
This occurs because we clobber several env vars related to
cross-compiling with lines like this:
CC = $(CROSS_COMPILE)gcc
Although this will point to a valid cross-compiler, we lose any extra
flags that might exist in the CC variable, which can break toolchains
that rely on them (for example, those that use --sysroot).
This easily shows up using a Yocto SDK:
$ . [snip]/sdk/environment-setup-cortexa8hf-neon-poky-linux-gnueabi
$ echo $CC
arm-poky-linux-gnueabi-gcc -march=armv7-a -mfpu=neon -mfloat-abi=hard
-mcpu=cortex-a8
--sysroot=[snip]/sdk/sysroots/cortexa8hf-neon-poky-linux-gnueabi
$ echo $CROSS_COMPILE
arm-poky-linux-gnueabi-
$ echo ${CROSS_COMPILE}gcc
krm-poky-linux-gnueabi-gcc
Although arm-poky-linux-gnueabi-gcc is a cross-compiler, we've lost the
--sysroot and other flags that enable us to find the right libraries to
link against, so we can't find unistd.h and other libraries and headers.
Normally with the --sysroot flag we would find unistd.h in the sdk
directory in the sysroot:
$ find [snip]/sdk/sysroots -path '*/usr/include/unistd.h'
[snip]/sdk/sysroots/cortexa8hf-neon-poky-linux-gnueabi/usr/include/unistd.h
The perf Makefile adds CC = $(CROSS_COMPILE)gcc if and only if CC is not
already set, and it compiles correctly with the above toolchain.
So, generalize the logic that perf uses in the common Makefile and
remove the manual CC = $(CROSS_COMPILE)gcc lines from each Makefile.
Note that this patch does not fix cross-compile for all the tools (some
have other bugs), but it does fix it for all except usb and acpi, which
still have other unrelated issues.
I tested both with and without the patch on native and cross-build and
there appear to be no regressions.
Link: http://lkml.kernel.org/r/20180107214028.23771-1-martin@martingkelly.com
Signed-off-by: Martin Kelly <martin@martingkelly.com>
Acked-by: Mark Brown <broonie@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Pali Rohar <pali.rohar@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Robert Moore <robert.moore@intel.com>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Valentina Manea <valentina.manea.m@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes for 4.16. I contains fixes for deadlock on runtime suspend on few
drivers, a memory leak on non-blocking commits, a crash on color-eviction.
The is also meson and edid fixes, plus a fix for a doc warning.
* tag 'drm-misc-fixes-2018-02-21' of git://anongit.freedesktop.org/drm/drm-misc:
drm/tve200: fix kernel-doc documentation comment include
drm/meson: fix vsync buffer update
drm: Handle unexpected holes in color-eviction
drm/edid: Add 6 bpc quirk for CPT panel in Asus UX303LA
drm/amdgpu: Fix deadlock on runtime suspend
drm/radeon: Fix deadlock on runtime suspend
drm/nouveau: Fix deadlock on runtime suspend
drm: Allow determining if current task is output poll worker
workqueue: Allow retrieval of current task's work struct
drm/atomic: Fix memleak on ERESTARTSYS during non-blocking commits
After resuming from suspend, the PCI device support must re-enable the
interrupt setting so that interrupts are actually delivered.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2018-02-20
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix a memory leak in LPM trie's map_free() callback function, where
the trie structure itself was not freed since initial implementation.
Also a synchronize_rcu() was needed in order to wait for outstanding
programs accessing the trie to complete, from Yonghong.
2) Fix sock_map_alloc()'s error path in order to correctly propagate
the -EINVAL error in case of too large allocation requests. This
was just recently introduced when fixing close hooks via ULP layer,
fix from Eric.
3) Do not use GFP_ATOMIC in __cpu_map_entry_alloc(). Reason is that this
will not work with the recent __ptr_ring_init_queue_alloc() conversion
to kvmalloc_array(), where in case of fallback to vmalloc() that GFP
flag is invalid, from Jason.
4) Fix two recent syzkaller warnings: i) fix bpf_prog_array_copy_to_user()
when a prog query with a big number of ids was performed where we'd
otherwise trigger a warning from allocator side, ii) fix a missing
mlock precharge on arraymaps, from Daniel.
5) Two fixes for bpftool in order to avoid breaking JSON output when used
in batch mode, from Quentin.
6) Move a pr_debug() in libbpf in order to avoid having an otherwise
uninitialized variable in bpf_program__reloc_text(), from Jeremy.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Dangaard Brouer says:
====================
virtio_net: several bugs in XDP code for driver virtio_net
The virtio_net driver actually violates the original memory model of
XDP causing hard to debug crashes. Per request of John Fastabend,
instead of removing the XDP feature I'm fixing as much as possible.
While testing virtio_net with XDP_REDIRECT I found 4 different bugs.
Patch-1: not enough tail-room for build_skb in receive_mergeable()
only option is to disable XDP_REDIRECT in receive_mergeable()
Patch-2: XDP in receive_small() basically never worked (check wrong flag)
Patch-3: fix memory leak for XDP_REDIRECT in error cases
Patch-4: avoid crash when ndo_xdp_xmit is called on dev not ready for XDP
In the longer run, we should consider introducing a separate receive
function when attaching an XDP program, and also change the memory
model to be compatible with XDP when attaching an XDP prog.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When a driver implements the ndo_xdp_xmit() function, there is
(currently) no generic way to determine whether it is safe to call.
It is e.g. unsafe to call the drivers ndo_xdp_xmit, if it have not
allocated the needed XDP TX queues yet. This is the case for
virtio_net, which first allocates the XDP TX queues once an XDP/bpf
prog is attached (in virtnet_xdp_set()).
Thus, a crash will occur for virtio_net when redirecting to another
virtio_net device's ndo_xdp_xmit, which have not attached a XDP prog.
The sample xdp_redirect_map tries to attach a dummy XDP prog to take
this into account, but it can also easily fail if the virtio_net (or
actually underlying vhost driver) have not allocated enough extra
queues for the device.
Allocating more queue this is currently a manual config.
Hint for libvirt XML add:
<driver name='vhost' queues='16'>
<host mrg_rxbuf='off'/>
<guest tso4='off' tso6='off' ecn='off' ufo='off'/>
</driver>
The solution in this patch is to check that the device have loaded an
XDP/bpf prog before proceeding. This is similar to the check
performed in driver ixgbe.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
XDP_REDIRECT calling xdp_do_redirect() can fail for multiple reasons
(which can be inspected by tracepoints). The current semantics is that
on failure the driver calling xdp_do_redirect() must handle freeing or
recycling the page associated with this frame. This can be seen as an
optimization, as drivers usually have an optimized XDP_DROP code path
for frame recycling in place already.
The virtio_net driver didn't handle when xdp_do_redirect() failed.
This caused a memory leak as the page refcnt wasn't decremented on
failures.
The function __virtnet_xdp_xmit() did handle one type of failure,
when the xmit queue virtqueue_add_outbuf() is full, which "hides"
releasing a refcnt on the page. Instead the function __virtnet_xdp_xmit()
must follow API of xdp_do_redirect(), which on errors leave it up to
the caller to free the page, of the failed send operation.
Fixes: 186b3c998c ("virtio-net: support XDP_REDIRECT")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When configuring virtio_net to use the code path 'receive_small()',
in-order to get correct XDP_REDIRECT support, I discovered TCP packets
would get silently dropped when loading an XDP program action XDP_PASS.
The bug seems to be that receive_small() when XDP is loaded check that
hdr->hdr.flags is zero, which seems wrong as hdr.flags contains the
flags VIRTIO_NET_HDR_F_* :
#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 /* Use csum_start, csum_offset */
#define VIRTIO_NET_HDR_F_DATA_VALID 2 /* Csum is valid */
TCP got dropped as it had the VIRTIO_NET_HDR_F_DATA_VALID flag set.
The flags that are relevant here are the VIRTIO_NET_HDR_GSO_* flags
stored in hdr->hdr.gso_type. Thus, the fix is just check that none of
the gso_type flags have been set.
Fixes: bb91accf27 ("virtio-net: XDP support for small buffers")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The virtio_net code have three different RX code-paths in receive_buf().
Two of these code paths can handle XDP, but one of them is broken for
at least XDP_REDIRECT.
Function(1): receive_big() does not support XDP.
Function(2): receive_small() support XDP fully and uses build_skb().
Function(3): receive_mergeable() broken XDP_REDIRECT uses napi_alloc_skb().
The simple explanation is that receive_mergeable() is broken because
it uses napi_alloc_skb(), which violates XDP given XDP assumes packet
header+data in single page and enough tail room for skb_shared_info.
The longer explaination is that receive_mergeable() tries to
work-around and satisfy these XDP requiresments e.g. by having a
function xdp_linearize_page() that allocates and memcpy RX buffers
around (in case packet is scattered across multiple rx buffers). This
does currently satisfy XDP_PASS, XDP_DROP and XDP_TX (but only because
we have not implemented bpf_xdp_adjust_tail yet).
The XDP_REDIRECT action combined with cpumap is broken, and cause hard
to debug crashes. The main issue is that the RX packet does not have
the needed tail-room (SKB_DATA_ALIGN(skb_shared_info)), causing
skb_shared_info to overlap the next packets head-room (in which cpumap
stores info).
Reproducing depend on the packet payload length and if RX-buffer size
happened to have tail-room for skb_shared_info or not. But to make
this even harder to troubleshoot, the RX-buffer size is runtime
dynamically change based on an Exponentially Weighted Moving Average
(EWMA) over the packet length, when refilling RX rings.
This patch only disable XDP_REDIRECT support in receive_mergeable()
case, because it can cause a real crash.
IMHO we should consider NOT supporting XDP in receive_mergeable() at
all, because the principles behind XDP are to gain speed by (1) code
simplicity, (2) sacrificing memory and (3) where possible moving
runtime checks to setup time. These principles are clearly being
violated in receive_mergeable(), that e.g. runtime track average
buffer size to save memory consumption.
In the longer run, we should consider introducing a separate receive
function when attaching an XDP program, and also change the memory
model to be compatible with XDP when attaching an XDP prog.
Fixes: 186b3c998c ("virtio-net: support XDP_REDIRECT")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2018-02-20
The following pull request includes some fixes for the mlx5 core and
netdevice driver.
Please pull and let me know if there's any issue.
-stable 4.10.y:
('net/mlx5e: Fix loopback self test when GRO is off')
-stable 4.12.y:
('net/mlx5e: Specify numa node when allocating drop rq')
-stable 4.13.y:
('net/mlx5e: Verify inline header size do not exceed SKB linear size')
-stable 4.15.y:
('net/mlx5e: Fix TCP checksum in LRO buffers')
('net/mlx5: Fix error handling when adding flow rules')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains large batch with Netfilter fixes for
your net tree, mostly due to syzbot report fixups and pr_err()
ratelimiting, more specifically, they are:
1) Get rid of superfluous unnecessary check in x_tables before vmalloc(),
we don't hit BUG there anymore, patch from Michal Hock, suggested by
Andrew Morton.
2) Race condition in proc file creation in ipt_CLUSTERIP, from Cong Wang.
3) Drop socket lock that results in circular locking dependency, patch
from Paolo Abeni.
4) Drop packet if case of malformed blob that makes backpointer jump
in x_tables, from Florian Westphal.
5) Fix refcount leak due to race in ipt_CLUSTERIP in
clusterip_config_find_get(), from Cong Wang.
6) Several patches to ratelimit pr_err() for x_tables since this can be
a problem where CAP_NET_ADMIN semantics can protect us in untrusted
namespace, from Florian Westphal.
7) Missing .gitignore update for new autogenerated asn1 state machine
for the SNMP NAT helper, from Zhu Lingshan.
8) Missing timer initialization in xt_LED, from Paolo Abeni.
9) Do not allow negative port range in NAT, also from Paolo.
10) Lock imbalance in the xt_hashlimit rate match mode, patch from
Eric Dumazet.
11) Initialize workqueue before timer in the idletimer match,
from Eric Dumazet.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
A domain cgroup isn't allowed to be turned threaded if its subtree is
populated or domain controllers are enabled. cgroup_enable_threaded()
depended on cgroup_can_be_thread_root() test to enforce this rule. A
parent which has populated domain descendants or have domain
controllers enabled can't become a thread root, so the above rules are
enforced automatically.
However, for the root cgroup which can host mixed domain and threaded
children, cgroup_can_be_thread_root() doesn't check any of those
conditions and thus first level cgroups ends up escaping those rules.
This patch fixes the bug by adding explicit checks for those rules in
cgroup_enable_threaded().
Reported-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 8cfd8147df ("cgroup: implement cgroup v2 thread support")
Cc: stable@vger.kernel.org # v4.14+
gcc warns about a possible overflow of the kmem_cache string, when adding
four characters to a string of the same length:
drivers/md/raid5.c: In function 'setup_conf':
drivers/md/raid5.c:2207:34: error: '-alt' directive writing 4 bytes into a region of size between 1 and 32 [-Werror=format-overflow=]
sprintf(conf->cache_name[1], "%s-alt", conf->cache_name[0]);
^~~~
drivers/md/raid5.c:2207:2: note: 'sprintf' output between 5 and 36 bytes into a destination of size 32
sprintf(conf->cache_name[1], "%s-alt", conf->cache_name[0]);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If I'm counting correctly, we need 11 characters for the fixed part
of the string and 18 characters for a 64-bit pointer (when no gendisk
is used), so that leaves three characters for conf->level, which should
always be sufficient.
This makes the code use snprintf() with the correct length, to
make the code more robust against changes, and to get the compiler
to shut up.
In commit f4be6b43f1 ("md/raid5: ensure we create a unique name for
kmem_cache when mddev has no gendisk") from 2010, Neil said that
the pointer could be removed "shortly" once devices without gendisk
are disallowed. I have no idea if that happened, but if it did, that
should probably be changed as well.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Shaohua Li <sh.li@alibaba-inc.com>
Add missing bio completion. Without this any flush request would hang.
Fixes: 1532d9e87e ("raid5-ppl: PPL support for disks with write-back cache enabled")
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Shaohua Li <sh.li@alibaba-inc.com>
After initmem has been freed, any jump labels in __init code are
prevented from being written to by the kernel_text_address() check in
__jump_label_update(). However, this check is quite broad. If
kernel_text_address() were to return false for any other reason, the
jump label write would fail silently with no warning.
For jump labels in module init code, entry->code is set to zero to
indicate that the entry is disabled. Do the same thing for core kernel
init code. This makes the behavior more consistent, and will also make
it more straightforward to detect non-init jump label write failures in
the next patch.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/c52825c73f3a174e8398b6898284ec20d4deb126.1519051220.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We can also move the CLD, SWAPGS, and the switch_to_thread_stack() call
to the interrupt_entry() helper function. As we do not want call depths
of two, convert switch_to_thread_stack() to a macro.
However, switch_to_thread_stack() has another user in entry_64_compat.S,
which currently expects it to be a function. To keep the code changes
in this patch minimal, create a wrapper function.
The switch to a macro means that there is some binary code duplication
if CONFIG_IA32_EMULATION=y is enabled. Therefore, the size reduction
differs whether CONFIG_IA32_EMULATION is enabled or not:
CONFIG_IA32_EMULATION=y (-0.13k):
text data bss dec hex filename
17158 0 0 17158 4306 entry_64.o-orig
17028 0 0 17028 4284 entry_64.o
CONFIG_IA32_EMULATION=n (-0.27k):
text data bss dec hex filename
17158 0 0 17158 4306 entry_64.o-orig
16882 0 0 16882 41f2 entry_64.o
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: dan.j.williams@intel.com
Link: http://lkml.kernel.org/r/20180220210113.6725-4-linux@dominikbrodowski.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Moving the switch to IRQ stack from the interrupt macro to the helper
function requires some trickery: All ENTER_IRQ_STACK really cares about
is where the "original" stack -- meaning the GP registers etc. -- is
stored. Therefore, we need to offset the stored RSP value by 8 whenever
ENTER_IRQ_STACK is called from within a function. In such cases, and
after switching to the IRQ stack, we need to push the "original" return
address (i.e. the return address from the call to the interrupt entry
function) to the IRQ stack.
This trickery allows us to carve another .85k from the text size (it
would be more except for the additional unwind hints):
text data bss dec hex filename
18006 0 0 18006 4656 entry_64.o-orig
17158 0 0 17158 4306 entry_64.o
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: dan.j.williams@intel.com
Link: http://lkml.kernel.org/r/20180220210113.6725-3-linux@dominikbrodowski.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
GCC-8 shows a warning for the x86 oprofile code that copies per-CPU
data from CPU 0 to all other CPUs, which when building a non-SMP
kernel turns into a memcpy() with identical source and destination
pointers:
arch/x86/oprofile/nmi_int.c: In function 'mux_clone':
arch/x86/oprofile/nmi_int.c:285:2: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
memcpy(per_cpu(cpu_msrs, cpu).multiplex,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
per_cpu(cpu_msrs, 0).multiplex,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sizeof(struct op_msr) * model->num_virt_counters);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/x86/oprofile/nmi_int.c: In function 'nmi_setup':
arch/x86/oprofile/nmi_int.c:466:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
arch/x86/oprofile/nmi_int.c:470:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
I have analyzed a number of such warnings now: some are valid and the
GCC warning is welcome. Others turned out to be false-positives, and
GCC was changed to not warn about those any more. This is a corner case
that is a false-positive but the GCC developers feel it's better to keep
warning about it.
In this case, it seems best to work around it by telling GCC
a little more clearly that this code path is never hit with
an IS_ENABLED() configuration check.
Cc:stable as we also want old kernels to build cleanly with GCC-8.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Sebor <msebor@gcc.gnu.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: oprofile-list@lists.sf.net
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20180220205826.2008875-1-arnd@arndb.de
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84095
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commits adding PCI IDs for Intel Braswell and Kaby Lake PCH-H lacked the
respective Kconfig and Documentation/i2c/busses/i2c-i801 change. Add
them now.
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
One I2C bus on my Atom E3845 board has been broken since 4.9.
It has two devices, both declared by ACPI and with built-in drivers.
There are two back-to-back transactions originating from the kernel, one
targeting each device. The first transaction works, the second one locks
up the I2C controller. The controller never recovers.
These kernel logs show up whenever an I2C transaction is attempted after
this failure.
i2c-designware-pci 0000:00:18.3: timeout in disabling adapter
i2c-designware-pci 0000:00:18.3: timeout waiting for bus ready
Waiting for the I2C controller status to indicate that it is enabled
before programming it fixes the issue.
I have tested this patch on 4.14 and 4.15.
Fixes: commit 2702ea7dbe ("i2c: designware: wait for disable/enable only if necessary")
Cc: linux-stable <stable@vger.kernel.org> #4.13+
Signed-off-by: Ben Gardner <gardner.ben@gmail.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
So far, if the filter was too large to fit in the allocated skb, the
kernel did not return any error and stopped dumping. Modify the dumper
so that it returns -EMSGSIZE when a filter fails to dump and it is the
first filter in the skb. If we are not first, we will get a next chance
with more room.
I understand this is pretty near to being an API change, but the
original design (silent truncation) can be considered a bug.
Note: The error case can happen pretty easily if you create a filter
with 32 actions and have 4kb pages. Also recent versions of iproute try
to be clever with their buffer allocation size, which in turn leads to
Signed-off-by: Roman Kapl <code@rkapl.cz>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This uses the EDID info from the Sony PlayStation VR headset,
when connected directly, to mark it as non-desktop.
Since the connection box (product id b403) defaults to HDMI
pass-through to the TV, it is not marked as non-desktop.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This uses the EDID info from Lenovo Explorer (LEN-b800), Acer AH100
(ACR-7fce), and Samsung Odyssey (SEC-144a) to mark them as non-desktop.
The other entries are for the HP Windows Mixed Reality Headset (HPN-3515),
the Fujitsu Windows Mixed Reality headset (FUJ-1970), the Dell Visor
(DEL-7fce), and the ASUS HC102 (AUS-c102). They are not tested with real
hardware, but listed as HMD monitors alongside the tested headsets in the
Microsoft HololensSensors driver package.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This uses the EDID info from Oculus Rift DK1 (OVR-0001), DK2 (OVR-0003),
and CV1 (OVR-0004) to mark them as non-desktop.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Fix some issues found by a static checker:
When allocating an AFU interrupt, if the driver cannot copy the output
parameters to userland, the errno value was not set to EFAULT
Remove a (now) useless cast.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The notify_resume() callback in eeh_ops is NULL on powernv, leading to
crashes:
NIP (null)
LR eeh_report_resume+0x218/0x220
Call Trace:
eeh_report_resume+0x1f0/0x220 (unreliable)
eeh_pe_dev_traverse+0x98/0x170
eeh_handle_normal_event+0x3f4/0x650
eeh_handle_event+0x54/0x380
eeh_event_handler+0x14c/0x210
kthread+0x168/0x1b0
ret_from_kernel_thread+0x5c/0xb4
Fix it by adding a check before calling it.
Fixes: 856e1eb9bd ("PCI/AER: Add uevents in AER and EEH error/resume")
Signed-off-by: Juan J. Alvarez <jjalvare@linux.vnet.ibm.com>
Reviewed-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Carol L. Soto <clsoto@us.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Tested-by: Mauro S. M. Rodrigues <maurosr@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
[mpe: Rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
After Laptop Mode Tools starts to use min_power for LPM, a user found
out Crucial BX100 SSD can't get mounted.
Crucial BX100 SSD 500GB drive don't work well with min_power. This also
happens to med_power_with_dipm.
So let's disable LPM for Crucial BX100 SSD 500GB drive.
BugLink: https://bugs.launchpad.net/bugs/1726930
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
- three fixeups
. it fixes potential issues[1] by using monotonic timestamp
instead of 'struct timeval'
. correct HDMI_I2S_PIN_SEL_1 definition and setting value.
. fix bit shift typo of FIMC register definition
- two cleanups
. remove unnecessary error messages
. remove exynos_drm_rotator.h file
[1] https://patchwork.kernel.org/patch/10170205/
* tag 'exynos-drm-fixes-for-v4.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
drm: exynos: Use proper macro definition for HDMI_I2S_PIN_SEL_1
drm/exynos: remove exynos_drm_rotator.h
drm/exynos: g2d: Delete an error message for a failed memory allocation in two functions
drm/exynos: fix comparison to bitshift when dealing with a mask
drm/exynos: g2d: use monotonic timestamps
If building match list or adding existing fg fails when
node is locked, function returned without unlocking it.
This happened if node version changed or adding existing fg
returned with EAGAIN after jumping to search_again_locked label.
Fixes: bd71b08ec2 ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
First use of drop counters happens in esw_apply_vport_conf function,
while they are allocated later in the flow. Fix that by moving
esw_vport_create_drop_counters function to be called before the first use.
Fixes: b8a0dbe3a9 ("net/mlx5e: E-switch, Add steering drop counters")
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
We can't allow only some of the rules sharing an FTE to ask for
header re-write, add it to the conflicting action checks.
Fixes: 0d235c3fab ('net/mlx5: Add hash table to search FTEs in a flow-group')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The adapter uses the cache_line_128byte setting to set the bounds for
end padding. On systems where the cacheline size is greater than 128B
use 128B instead of the default of 64B. This results in fewer partial
cacheline writes. There's a 50% chance it will pad to the end of a 256B
cache line vs only 25% when using 64B.
Fixes: f32f5bd2eb ("net/mlx5: Configure cache line size for start and end padding")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When allocating a drop rq, no numa node is explicitly set which means
allocations are done on node zero. This is not necessarily the nearest
numa node to the HCA, and even worse, might even be a memoryless numa
node.
Choose the numa_node given to us by the pci device in order to properly
allocate the coherent dma memory instead of assuming zero is valid.
Fixes: 556dd1b9c3 ("net/mlx5e: Set drop RQ's necessary parameters only")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This isn't supported when we emulate eswitch vlan push action which
is the current state of things.
Fixes: 8b32580df1 ('net/mlx5e: Add TC vlan action for SRIOV offloads')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Fix these gcc warnings on drivers/net/ethernet/mellanox/mlx5:
[..]/core/lib/clock.c:454:6: warning: no previous prototype for 'mlx5_init_clock' [-Wmissing-prototypes]
[..]/core/lib/clock.c:510:6: warning: no previous prototype for 'mlx5_cleanup_clock' [-Wmissing-prototypes]
[..]/core/en_main.c:3141:5: warning: no previous prototype for 'mlx5e_setup_tc' [-Wmissing-prototypes]
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Driver tries to copy at least MLX5E_MIN_INLINE bytes into the control
segment of the WQE. It assumes that the linear part contains at least
MLX5E_MIN_INLINE bytes, which can be wrong.
Cited commit verified that driver will not copy more bytes into the
inline header part that the actual size of the packet. Re-factor this
check to make sure we do not exceed the linear part as well.
This fix is aligned with the current driver's assumption that the entire
L2 will be present in the linear part of the SKB.
Fixes: 6aace17e64 ("net/mlx5e: Fix inline header size for small packets")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When GRO is off, the transport header pointer in sk_buff is
initialized to network's header.
To find the udp header, instead of using udp_hdr() which assumes
skb_network_header was set, manually calculate the udp header offset.
Fixes: 0952da791c ("net/mlx5e: Add support for loopback selftest")
Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When receiving an LRO packet, the checksum field is set by the hardware
to the checksum of the first coalesced packet. Obviously, this checksum
is not valid for the merged LRO packet and should be fixed. We can use
the CQE checksum which covers the checksum of the entire merged packet
TCP payload to help us calculate the checksum incrementally.
Tested by sending IPv4/6 traffic with LRO enabled, RX checksum disabled
and watching nstat checksum error counters (in addition to the obvious
bandwidth drop caused by checksum errors).
This bug is usually "hidden" since LRO packets would go through the
CHECKSUM_UNNECESSARY flow which does not validate the packet checksum.
It's important to note that previous to this patch, LRO packets provided
with CHECKSUM_UNNECESSARY are indeed packets with a correct validated
checksum (even though the checksum inside the TCP header is incorrect),
since the hardware LRO aggregation is terminated upon receiving a packet
with bad checksum.
Fixes: e586b3b0ba ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Right now, SET CLOCK called in the guest does not properly take care of
the epoch index, as the call goes via the old kvm_s390_set_tod_clock()
interface. So the epoch index is neither reset to 0, if required, nor
properly set to e.g. 0xff on negative values.
Fix this by providing a single kvm_s390_set_tod_clock() function. Move
Multiple-epoch facility handling into it.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180207114647.6220-3-david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 8fa1696ea7 ("KVM: s390: Multiple Epoch Facility support")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
For now, we don't take care of over/underflows. Especially underflows
are critical:
Assume the epoch is currently 0 and we get a sync request for delta=1,
meaning the TOD is moved forward by 1 and we have to fix it up by
subtracting 1 from the epoch. Right now, this will leave the epoch
index untouched, resulting in epoch=-1, epoch_idx=0, which is wrong.
We have to take care of over and underflows, also for the VSIE case. So
let's factor out calculation into a separate function.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180207114647.6220-5-david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 8fa1696ea7 ("KVM: s390: Multiple Epoch Facility support")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[use u8 for idx]
Missed when enabling the Multiple-epoch facility. If the facility is
installed and the control is set, a sign based comaprison has to be
performed.
Right now we would inject wrong interrupts and ignore interrupt
conditions. Also the sleep time is calculated in a wrong way.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180207114647.6220-2-david@redhat.com>
Fixes: 8fa1696ea7 ("KVM: s390: Multiple Epoch Facility support")
Cc: stable@vger.kernel.org
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[Description] For MST, DC already notify MST sink for MST mode, DC stll
check DP SINK DPCD register to see if MST enabled. DP RX firmware may
not handle this properly.
Signed-off-by: Hersen Wu <hersenxs.wu@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu_dm_display_resume is now called from dm_resume to
unify DAL resume call into a single function call
There is no more need to separately call 2 resume functions
for DM.
Initially they were separated to resume display state after
cursor is pinned. But because there is no longer any corruption
with the cursor - the calls can be merged into one function hook.
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes a GCC maybe-uninitialized warning introduced by 48cca7e44f.
"text" is only initialized inside the if statement so only print debug
info there.
Fixes: 48cca7e44f ("libbpf: add support for bpf_call")
Signed-off-by: Jeremy Cline <jeremy@jcline.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
After introduction of commit d0869c0071, there were some instances of
RX queue entries from a previous session (before the device was closed
and reopened) returned to the NAPI polling routine. Since the corresponding
socket buffers were freed, this resulted in a panic on reopen. Include
a check for a NULL skb here to avoid this.
Fixes: d0869c0071 ("ibmvnic: Clean RX pool buffers during device close")
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sbi_ prefix would seem to indicate an SBI interface, and save is not
very specific. After applying this patch, reading head.S makes more sense.
Signed-off-by: Michael Clark <michaeljclark@mac.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Interrupt is allowed during exception handling.
There are warning messages if the kernel enables the configuration
'CONFIG_DEBUG_ATOMIC_SLEEP=y'.
BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:23
in_atomic(): 0, irqs_disabled(): 1, pid: 43, name: ash
CPU: 0 PID: 43 Comm: ash Tainted: G W 4.15.0-rc8-00089-g89ffdae-dirty #17
Call Trace:
[<000000009abb1587>] walk_stackframe+0x0/0x7a
[<00000000d4f3d088>] ___might_sleep+0x102/0x11a
[<00000000b1fd792a>] down_read+0x18/0x28
[<000000000289ec01>] do_page_fault+0x86/0x2f6
[<00000000012441f6>] _do_fork+0x1b4/0x1e0
[<00000000f46c3e3b>] ret_from_syscall+0xa/0xe
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zong Li <zong@andestech.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
These three kconfig cleanups were found by ulfalyzer. They're all
things we were selecting that were undefined, either because they'd been
remove upstream or are part of a future RISC-V submission.
* ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE is obselete.
* RISCV_IRQ_INTC is the old name for our interrupt controller driver,
it'll be changed for the final submission and doesn't exist now.
* ARCH_WANT_OPTIONAL_GPIOLIB is obselete.
The RISCV_IRQ_INTC configuration symbol is undefined, but RISCV selects
it. Quoting Palmer Dabbelt:
It looks like this slipped through, the symbol has been renamed
RISCV_INTC.
No RISCV_INTC configuration symbol has been merged either. Just remove
the RISCV_IRQ_INTC select for now.
Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Pull LED maintainer update:
"LED update to MAINTAINERS, to admit the reality.
Message from Richard:
"I've been looking at some of the emails but not needed to be
involved for a while now, you're doing fine without me!" [0]
Many thanks to Richard for his work as a founder of the LED
subsystem!"
[0] https://lkml.org/lkml/2018/2/18/145
* tag 'leds_for-4.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
MAINTAINERS: Remove Richard Purdie from LED maintainers
BNXT_RE_FLAG_TASK_IN_PROG doesn't handle multiple work
requests posted together. Track schedule of multiple
workqueue items by maintaining a per device counter
and proceed with IB dereg only if this counter is zero.
flush_workqueue is no longer required from
NETDEV_UNREGISTER path.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
During driver unload, the driver proceeds with cleanup
without waiting for the scheduled events. So the device
pointers get freed up and driver crashes when the events
are scheduled later.
Flush the bnxt_re_task work queue before starting
device removal.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Avoid system crash when destroy_qp is invoked while
the driver is processing the poll_cq. Synchronize these
functions using the cq_lock.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Driver leaves the QP memory pinned if QP create command
fails from the FW. Avoids this scenario by adding a proper
exit path if the FW command fails.
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
More testing needs to be done before enabling this feature.
Disabling the feature temporarily
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
MIPS' struct compat_flock doesn't match the 32-bit struct flock, as it
has an extra short __unused before pad[4], which combined with alignment
increases the size to 40 bytes compared with struct flock's 36 bytes.
Since commit 8c6657cb50 ("Switch flock copyin/copyout primitives to
copy_{from,to}_user()"), put_compat_flock() writes the full compat_flock
struct to userland, which results in corruption of the userland word
after the struct flock when running 32-bit userlands on 64-bit kernels.
This was observed to cause a bus error exception when starting Firefox
on Debian 8 (Jessie).
Reported-by: Peter Mamonov <pmamonov@gmail.com>
Signed-off-by: James Hogan <jhogan@kernel.org>
Tested-by: Peter Mamonov <pmamonov@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.13+
Patchwork: https://patchwork.linux-mips.org/patch/18646/
The hcp->chmap_info must not be freed up in the hdmi_codec_remove()
function as it leads to kernel crash due ALSA core's
pcm_chmap_ctl_private_free() is trying to free it up again when the card
destroyed via snd_card_free.
Commit cd6111b262 ("ASoC: hdmi-codec: add channel mapping control")
should not have added the kfree(hcp->chmap_info); to the hdmi_codec_remove
function.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Jyri Sarha <jsarha@ti.com>
Tested-by: Jyri Sarha <jsarha@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
On dm3730 there are enumeration problems after resume.
Investigation led to the cause that the MUSB_POWER_SOFTCONN
bit is not set. If it was set before suspend (because it
was enabled via musb_pullup()), it is set in
musb_restore_context() so the pullup is enabled. But then
musb_start() is called which overwrites MUSB_POWER and
therefore disables MUSB_POWER_SOFTCONN, so no pullup is
enabled and the device is not enumerated.
So let's do a subset of what musb_start() does
in the same way as musb_suspend() does it. Platform-specific
stuff it still called as there might be some phy-related stuff
which needs to be enabled.
Also interrupts are enabled, as it was the original idea
of calling musb_start() in musb_resume() according to
Commit 6fc6f4b87c ("usb: musb: Disable interrupts on suspend,
enable them on resume")
Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When resuming from idle with the new suspend mode configuration support
we go through the resume callbacks with a state of PM_SUSPEND_TO_IDLE
which we don't have regulator constraints for, causing an error:
dpm_run_callback(): regulator_resume_early+0x0/0x64 returns -22
PM: Device regulator.0 failed to resume early: error -22
Avoid this and similar errors by treating missing constraints as a noop.
See also commit 57a0dd1879 ("regulator: Fix suspend to idle"),
which fixed the suspend part.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Mark Brown <broonie@kernel.org>
The ID_AA64DFR0_EL1.PMUVer field doesn't follow the usual ID registers
scheme. While value 0xf indicates a non-architected PMU is implemented,
values 0x1 to 0xe indicate an increasingly featureful architected PMU,
as if the field were unsigned.
For more details, see ARM DDI 0487C.a, D10.1.4, "Alternative ID scheme
used for the Performance Monitors Extension version".
Currently, we treat the field as signed, and erroneously bail out for
values 0x8 to 0xe. Let's correct that.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We can't request IRQs in atomic context, so for ACPI systems we'll have
to request them up-front, and later associate them with CPUs.
This patch reorganises the arm_pmu code to do so. As we no longer have
the arm_pmu structure at probe time, a number of prototypes need to be
adjusted, requiring changes to the common arm_pmu code and arm_pmu
platform code.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
To support ACPI systems, we need to request IRQs before we know the
associated PMU, and thus we need some percpu variable that the IRQ
handler can find the PMU from.
As we're going to request IRQs without the PMU, we can't rely on the
arm_pmu::active_irqs mask, and similarly need to track requested IRQs
with a percpu variable.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: made armpmu_count_irq_users static]
Signed-off-by: Will Deacon <will.deacon@arm.com>
To support ACPI systems, we need to request IRQs before CPUs are
hotplugged, and thus we need to request IRQs before we know their
associated PMU.
This is problematic if a PMU IRQ is pending out of reset, as it may be
taken before we know the PMU, and thus the IRQ handler won't be able to
handle it, leaving it screaming.
To avoid such problems, lets request all IRQs in a disabled state, and
explicitly enable/disable them at hotplug time, when we're sure the PMU
has been probed.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The arm_pmu platform code explicitly checks for mismatched PPIs at probe
time, while the ACPI code leaves this to the core code. Future
refactoring will make this difficult for the core code to check, so
let's have the ACPI code check this explicitly.
As before, upon a failure we'll continue on without an interrupt. Ho
hum.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In ACPI systems, we don't know the makeup of CPUs until we hotplug them
on, and thus have to allocate the PMU datastructures at hotplug time.
Thus, we must use GFP_ATOMIC allocations.
Let's add an armpmu_alloc_atomic() that we can use in this case.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The armpmu_{request,free}_irqs() helpers are only used by
arm_pmu_platform.c, so let's fold them in and make them static.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now that we have no platforms passing platform data to the arm_pmu code,
we can get rid of the platdata and associated hooks, paving the way for
rework of our IRQ handling.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ux500 PMU IRQ bouncer is getting in the way of some fundametnal
changes to the ARM PMU driver, and it's the only special case that
exists today. Let's remove it.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The plane buffer address/stride/height was incorrectly updated in the
plane_atomic_update operation instead of the vsync irq.
This patch delays this operation in the vsync irq along with the
other plane delayed setup.
This issue was masked using legacy framebuffer and X11 modesetting, but
is clearly visible using gbm rendering when buffer is submitted late after
vblank, like using software decoding and OpenGL rendering in Kodi.
With this patch, tearing and other artifacts disappears completely.
Cc: Michal Lazo <michal.lazo@gmail.com>
Fixes: bbbe775ec5 ("drm: Add support for Amlogic Meson Graphic Controller")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1518689976-23292-1-git-send-email-narmstrong@baylibre.com
Jonathan writes:
First round of IIO fixes for the 4.16 cycle.
One nasty very old crash around polling for buffers that aren't there
- though that can only cause effects on drivers that support events
but not buffers.
* buffer / kfifo handling in the core.
- Check there is a buffer and return 0 from poll directly if there
isn't. Poll doesn't make sense in this circumstances, but best to close
the hole.
* ad5933
- Change the marked buffer mode to a software buffer as the meaning of
the hardware buffer label has long since changed and this uses a front
end software buffer anyway.
* ad7192
- Fix the fact the external clock frequency was only set when using the
internal clock which was less than helpful.
* adis_lib
- Initialize the trigger before requesting the interrupt. Some newer
parts can power up with interrupt generation enabled so ordering now
matters.
* aspeed-adc
- Fix an errror handling path as labels and general ordering were wrong.
* srf08
- Fix a link error due to undefined devm_iio_triggered_buffer_setup.
* stm32-adc
- Fix error handling unwind squence in stm32h7_adc_enable.
Commit:
df3405245a ("x86/asm: Add suffix macro for GEN_*_RMWcc()")
... introduced "suffix" RMWcc operations, adding bogus clobber specifiers:
For one, on x86 there's no point explicitly clobbering "cc".
In fact, with GCC properly fixed, this results in an overlap being detected by
the compiler between outputs and clobbers.
Furthermore it seems bad practice to me to have clobber specification
and use of the clobbered register(s) disconnected - it should rather be
at the invocation place of that GEN_{UN,BIN}ARY_SUFFIXED_RMWcc() macros
that the clobber is specified which this particular invocation needs.
Drop the "cc" clobber altogether and move the "cx" one to refcount.h.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5A8AF1F802000078001A91E1@prv-mh.provo.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Chanwoo writes:
Update extcon for v4.16-rc3
This patch fixes issue of X-power extcon-axp288 and Intel extcon-int3496 driver.
- For extcon-int3496 driver,
Process id-pin first so that we start with the right status in order to fix
a race where the initial work might still be running while other drivers
were already calling extcon_get_state().
- For extcon-axp288 driver,
Revert the patch[1] which were applied to v4.16-rc1 because there are better
ways with usb-role-switch and constify the axp288_pwr_up_down_info array.
[1] 60ed999614 ("extcon: axp288: Redo charger type detection a couple of seconds after probe()")
On transport mode we forget to fetch the child dst_entry
before we continue the while loop, this leads to an infinite
loop. Fix this by fetching the child dst_entry before we
continue the while loop.
Fixes: 0f6c480f23 ("xfrm: Move dst->path into struct xfrm_dst")
Reported-by: syzbot+7d03c810e50aaedef98a@syzkaller.appspotmail.com
Tested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Bit field [2:0] of HDMI_I2S_PIN_SEL_1 corresponds to SDATA_0,
not SDATA_2. This patch removes redefinition of HDMI_I2S_SEL_DATA2
constant and adds missing HDMI_I2S_SEL_DATA0.
The value of bit field selecting SDATA_1 (pin_sel_3) is also changed,
so it is 3 as suggested in the Exynos TRMs.
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Since its inclusion in 2012 via commit bea8a429d9 ("drm/exynos: add rotator ipp driver")
this header is not used by any source files and is empty.
Lets just remove it.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Omit an extra message for a memory allocation failure in these functions.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
The exynos DRM driver uses real-time 'struct timeval' values
for exporting its timestamps to user space. This has multiple
problems:
1. signed seconds overflow in y2038
2. the 'struct timeval' definition is deprecated in the kernel
3. time may jump or go backwards after a 'settimeofday()' syscall
4. other DRM timestamps are in CLOCK_MONOTONIC domain, so they
can't be compared
5. exporting microseconds requires a division by 1000, which may
be slow on some architectures.
The code existed in two places before, but the IPP portion was
removed in 8ded59413c ("drm/exynos: ipp: Remove Exynos DRM
IPP subsystem"), so we no longer need to worry about it.
Ideally timestamps should just use 64-bit nanoseconds instead, but
of course we can't change that now. Instead, this tries to address
the first four points above by using monotonic 'timespec' values.
According to Tobias Jakobi, user space doesn't care about the
timestamp at the moment, so we can change the format. Even if
there is something looking at them, it will work just fine with
monotonic times as long as the application only looks at the
relative values between two events.
Link: https://patchwork.kernel.org/patch/10038593/
Cc: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Pull networking fixes from David Miller:
1) Prevent index integer overflow in ptr_ring, from Jason Wang.
2) Program mvpp2 multicast filter properly, from Mikulas Patocka.
3) The bridge brport attribute file is write only and doesn't have a
->show() method, don't blindly invoke it. From Xin Long.
4) Inverted mask used in genphy_setup_forced(), from Ingo van Lil.
5) Fix multiple definition issue with if_ether.h UAPI header, from
Hauke Mehrtens.
6) Fix GFP_KERNEL usage in atomic in RDS protocol code, from Sowmini
Varadhan.
7) Revert XDP redirect support from thunderx driver, it is not
implemented properly. From Jesper Dangaard Brouer.
8) Fix missing RTNL protection across some tipc operations, from Ying
Xue.
9) Return the correct IV bytes in the TLS getsockopt code, from Boris
Pismenny.
10) Take tclassid into consideration properly when doing FIB rule
matching. From Stefano Brivio.
11) cxgb4 device needs more PCI VPD quirks, from Casey Leedom.
12) TUN driver doesn't align frags properly, and we can end up doing
unaligned atomics on misaligned metadata. From Eric Dumazet.
13) Fix various crashes found using DEBUG_PREEMPT in rmnet driver, from
Subash Abhinov Kasiviswanathan.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits)
tg3: APE heartbeat changes
mlxsw: spectrum_router: Do not unconditionally clear route offload indication
net: qualcomm: rmnet: Fix possible null dereference in command processing
net: qualcomm: rmnet: Fix warning seen with 64 bit stats
net: qualcomm: rmnet: Fix crash on real dev unregistration
sctp: remove the left unnecessary check for chunk in sctp_renege_events
rxrpc: Work around usercopy check
tun: fix tun_napi_alloc_frags() frag allocator
udplite: fix partial checksum initialization
skbuff: Fix comment mis-spelling.
dn_getsockoptdecnet: move nf_{get/set}sockopt outside sock lock
PCI/cxgb4: Extend T3 PCI quirk to T4+ devices
cxgb4: fix trailing zero in CIM LA dump
cxgb4: free up resources of pf 0-3
fib_semantics: Don't match route with mismatching tclassid
NFC: llcp: Limit size of SDP URI
tls: getsockopt return record sequence number
tls: reset the crypto info if copy_from_user fails
tls: retrun the correct IV in getsockopt
docs: segmentation-offloads.txt: add SCTP info
...
Richard has been inactive on the linux-leds list for a long time.
After email discussion we agreed on removing him from
the LED maintainers, which will better reflect the actual status.
Acked-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
In ungraceful host shutdown or driver crash case BMC connectivity is
lost. APE firmware is missing the driver state in this
case to keep the BMC connectivity alive.
This patch has below change to address this issue.
Heartbeat mechanism with APE firmware. This heartbeat mechanism
is needed to notify the APE firmware about driver state.
This patch also has the change in wait time for APE event from
1ms to 20ms as there can be some delay in getting response.
v2: Drop inline keyword as per David suggestion.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Satish Baddipadige <satish.baddipadige@broadcom.com>
Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the case of 'recover', an r10bio with R10BIO_WriteError &
R10BIO_IsRecover will be progressed by handle_write_completed().
This function traverses all r10bio->devs[copies].
If devs[m].repl_bio != NULL, it thinks conf->mirrors[dev].replacement
is also not NULL. However, this is not always true.
When there is an rdev of raid10 has replacement, then each r10bio
->devs[m].repl_bio != NULL in conf->r10buf_pool. However, in 'recover',
even if corresponded replacement is NULL, it doesn't clear r10bio
->devs[m].repl_bio, resulting in replacement NULL deference.
This bug was introduced when replacement support for raid10 was
added in Linux 3.3.
As NeilBrown suggested:
Elsewhere the determination of "is this device part of the
resync/recovery" is made by resting bio->bi_end_io.
If this is end_sync_write, then we tried to write here.
If it is NULL, then we didn't try to write.
Fixes: 9ad1aefc8a ("md/raid10: Handle replacement devices during resync.")
Cc: stable (V3.3+)
Suggested-by: NeilBrown <neilb@suse.com>
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Shaohua Li <sh.li@alibaba-inc.com>
The locking protocols in md assume that a device will
never be removed from an array during resync/recovery/reshape.
When that isn't happening, rcu or reconfig_mutex is needed
to protect an rdev pointer while taking a refcount. When
it is happening, that protection isn't needed.
Unfortunately there are cases were remove_and_add_spares() is
called when recovery might be happening: is state_store(),
slot_store() and hot_remove_disk().
In each case, this is just an optimization, to try to expedite
removal from the personality so the device can be removed from
the array. If resync etc is happening, we just have to wait
for md_check_recover to find a suitable time to call
remove_and_add_spares().
This optimization and not essential so it doesn't
matter if it fails.
So change remove_and_add_spares() to abort early if
resync/recovery/reshape is happening, unless it is called
from md_check_recovery() as part of a newly started recovery.
The parameter "this" is only NULL when called from
md_check_recovery() so when it is NULL, there is no need to abort.
As this can result in a NULL dereference, the fix is suitable
for -stable.
cc: yuyufen <yuyufen@huawei.com>
Cc: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Fixes: 8430e7e0af ("md: disconnect device from personality before trying to remove it.")
Cc: stable@ver.kernel.org (v4.8+)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <sh.li@alibaba-inc.com>
__show_regs pretty prints PC and LR by attempting to map them to kernel
function names to improve the utility of crash reports. Unfortunately,
this mapping is applied even when the pt_regs corresponds to user mode,
resulting in a KASLR oracle.
Avoid this issue by only looking up the function symbols when the register
state indicates that we're actually running at EL1.
Cc: <stable@vger.kernel.org>
Reported-by: NCSC Security <security@ncsc.gov.uk>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Stop printing a (ratelimited) kernel message for each instance of an
unimplemented syscall being called. Userland making an unimplemented
syscall is not necessarily misbehaviour and to be expected with a
current userland running on an older kernel. Also, the current message
looks scary to users but does not actually indicate a real problem nor
help them narrow down the cause. Just rely on sys_ni_syscall() to return
-ENOSYS.
Cc: <stable@vger.kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Michael Weiser <michael.weiser@gmx.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Our field definitions for CTR_EL0 suffer from a number of problems:
- The IDC and DIC fields are missing, which causes us to enable CTR
trapping on CPUs with either of these returning non-zero values.
- The ERG is FTR_LOWER_SAFE, whereas it should be treated like CWG as
FTR_HIGHER_SAFE so that applications can use it to avoid false sharing.
- [nit] A RES1 field is described as "RAO"
This patch updates the CTR_EL0 field definitions to fix these issues.
Cc: <stable@vger.kernel.org>
Cc: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
isp5100_tco.c uses watchdog core functions (from watchdog_core.c) and, when
compiled without CONFIG_WATCHDOG_CORE being set, it produces the
following build error:
ERROR: "devm_watchdog_register_device" [drivers/watchdog/sp5100_tco.ko] undefined!
ERROR: "watchdog_init_timeout" [drivers/watchdog/sp5100_tco.ko] undefined!
Fix this by selecting CONFIG_WATCHDOG_CORE.
Fixes: 7cd9d5fff7 ("watchdog: sp5100_tco: Convert to use watchdog subsystem")
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
xen_wdt uses watchdog core functions (from watchdog_core.c) and, when
compiled without CONFIG_WATCHDOG_CORE being set, it produces the
following build error:
ERROR: "devm_watchdog_register_device" [drivers/watchdog/xen_wdt.ko] undefined!
ERROR: "watchdog_init_timeout" [drivers/watchdog/xen_wdt.ko] undefined!
Fix this by selecting CONFIG_WATCHDOG_CORE when CONFIG_XEN_WDT is set.
Fixes: 18cffd68e0 ("watchdog: xen_wdt: use the watchdog subsystem")
Signed-off-by: Radu Rendec <radu.rendec@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
i6300esb uses fuctions defined in watchdog_core.c, and when
CONFIG_WATCHDOG_CORE is not set we have this build error:
drivers/watchdog/i6300esb.o: In function `esb_remove':
i6300esb.c:(.text+0xcc): undefined reference to `watchdog_unregister_device'
drivers/watchdog/i6300esb.o: In function `esb_probe':
i6300esb.c:(.text+0x2a1): undefined reference to `watchdog_init_timeout'
i6300esb.c:(.text+0x388): undefined reference to `watchdog_register_device'
make: *** [Makefile:1029: vmlinux] Error 1
Fix this by selecting CONFIG_WATCHDOG_CORE when I6300ESB_WDT is set.
Fixes: 7af4ac8772 ("watchdog: i6300esb: use the watchdog subsystem")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
We can build this driver with or without NVMEM, but not built-in
when NVMEM is a loadable module:
drivers/watchdog/rave-sp-wdt.o: In function `rave_sp_wdt_probe':
rave-sp-wdt.c:(.text+0x27c): undefined reference to `nvmem_cell_get'
rave-sp-wdt.c:(.text+0x290): undefined reference to `nvmem_cell_read'
rave-sp-wdt.c:(.text+0x2c4): undefined reference to `nvmem_cell_put'
This adds a Kconfig dependency to enforce that.
Fixes: c3bb333457 ("watchdog: Add RAVE SP watchdog driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
When mlxsw replaces (or deletes) a route it removes the offload
indication from the replaced route. This is problematic for IPv4 routes,
as the offload indication is stored in the fib_info which is usually
shared between multiple routes.
Instead of unconditionally clearing the offload indication, only clear
it if no other route is using the fib_info.
Fixes: 3984d1a89f ("mlxsw: spectrum_router: Provide offload indication using nexthop flags")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
Tested-by: Alexander Petrovskiy <alexpe@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subash Abhinov Kasiviswanathan says:
====================
net: qualcomm: rmnet: Fix issues with CONFIG_DEBUG_PREEMPT enabled
Patch 1 and 2 fixes issues identified when CONFIG_DEBUG_PREEMPT was
enabled. These involve APIs which were called in invalid contexts.
Patch 3 is a null derefence fix identified by code inspection.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If a command packet with invalid mux id is received, the packet would
not have a valid endpoint. This invalid endpoint maybe dereferenced
leading to a crash. Identified by manual code inspection.
Fixes: 3352e6c457 ("net: qualcomm: rmnet: Convert the muxed endpoint to hlist")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
With CONFIG_DEBUG_PREEMPT enabled, a crash with the following call
stack was observed when removing a real dev which had rmnet devices
attached to it.
To fix this, remove the netdev_upper link APIs and instead use the
existing information in rmnet_port and rmnet_priv to get the
association between real and rmnet devs.
BUG: sleeping function called from invalid context
in_atomic(): 0, irqs_disabled(): 0, pid: 5762, name: ip
Preemption disabled at:
[<ffffff9d49043564>] debug_object_active_state+0xa4/0x16c
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
PC is at ___might_sleep+0x13c/0x180
LR is at ___might_sleep+0x17c/0x180
[<ffffff9d48ce0924>] ___might_sleep+0x13c/0x180
[<ffffff9d48ce09c0>] __might_sleep+0x58/0x8c
[<ffffff9d49d6253c>] mutex_lock+0x2c/0x48
[<ffffff9d48ed4840>] kernfs_remove_by_name_ns+0x48/0xa8
[<ffffff9d48ed6ec8>] sysfs_remove_link+0x30/0x58
[<ffffff9d49b05840>] __netdev_adjacent_dev_remove+0x14c/0x1e0
[<ffffff9d49b05914>] __netdev_adjacent_dev_unlink_lists+0x40/0x68
[<ffffff9d49b08820>] netdev_upper_dev_unlink+0xb4/0x1fc
[<ffffff9d494a29f0>] rmnet_dev_walk_unreg+0x6c/0xc8
[<ffffff9d49b00b40>] netdev_walk_all_lower_dev_rcu+0x58/0xb4
[<ffffff9d494a30fc>] rmnet_config_notify_cb+0xf4/0x134
[<ffffff9d48cd21b4>] raw_notifier_call_chain+0x58/0x78
[<ffffff9d49b028a4>] call_netdevice_notifiers_info+0x48/0x78
[<ffffff9d49b0b568>] rollback_registered_many+0x230/0x3c8
[<ffffff9d49b0b738>] unregister_netdevice_many+0x38/0x94
[<ffffff9d49b1e110>] rtnl_delete_link+0x58/0x88
[<ffffff9d49b201dc>] rtnl_dellink+0xbc/0x1cc
[<ffffff9d49b2355c>] rtnetlink_rcv_msg+0xb0/0x244
[<ffffff9d49b5230c>] netlink_rcv_skb+0xb4/0xdc
[<ffffff9d49b204f4>] rtnetlink_rcv+0x34/0x44
[<ffffff9d49b51af0>] netlink_unicast+0x1ec/0x294
[<ffffff9d49b51fdc>] netlink_sendmsg+0x320/0x390
[<ffffff9d49ae6858>] sock_sendmsg+0x54/0x60
[<ffffff9d49ae6f94>] ___sys_sendmsg+0x298/0x2b0
[<ffffff9d49ae98f8>] SyS_sendmsg+0xb4/0xf0
[<ffffff9d48c83770>] el0_svc_naked+0x24/0x28
Fixes: ceed73a2cf ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Fixes: 60d58f971c ("net: qualcomm: rmnet: Implement bridge mode")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 10/12-bit config used for bayer formats is used for grayscale as
well.
Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Add the missing offset calculation for 16-bit grayscale images. Since
the IPU only supports capturing greyscale in raw passthrough mode, it
is the same as 16-bit bayer formats.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Before returning, call of_node_put() for the device node returned by
of_parse_phandle().
Fixes: ea9c260514 ("gpu: ipu-v3: add driver for Prefetch Resolve Gasket")
Signed-off-by: Tobias Jordan <Tobias.Jordan@elektrobit.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Before returning, call of_node_put() for the device node returned by
of_parse_phandle().
Fixes: d2a3423258 ("gpu: ipu-v3: add driver for Prefetch Resolve Engine")
Signed-off-by: Tobias Jordan <Tobias.Jordan@elektrobit.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
In converting __range_ok() into a static inline, I inadvertently made
it more type-safe, but without considering the ordering of the relevant
conversions. This leads to quite a lot of Sparse noise about the fact
that we use __chk_user_ptr() after addr has already been converted from
a user pointer to an unsigned long.
Rather than just adding another cast for the sake of shutting Sparse up,
it seems reasonable to rework the types to make logical sense (although
the resulting codegen for __range_ok() remains identical). The only
callers this affects directly are our compat traps where the inferred
"user-pointer-ness" of a register value now warrants explicit casting.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In case an ADDBA request is received while there is already
an ongoing BA sessions with the same parameters, i.e., update
flow, an ADBBA response with decline status was sent twice. Fix it.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Some APs include a non global operating class in their extended channel
switch information element. In such a case, as the operating class is not
known, mac80211 would decide to disconnect.
However the specification states that the operating class needs to be
taken from Annex E, but it does not specify from which table it should be
taken, so it is valid for an AP to use a non global operating class.
To avoid possibly unneeded disconnection, in such a case ignore the
operating class and assume that the current band is used, and if the
resulting channel and band configuration is invalid disconnect.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When a low level driver calls cfg80211_disconnected(), wep keys are
not cleared. As a result, following connection requests will fail
since cfg80211 internal state shows a connection is still in progress.
Fix this by clearing the wep keys when disconnecting.
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
sta_info_alloc can be called from atomic paths (such as RX path)
so we need to call pcpu_alloc with the correct gfp.
Fixes: c9c5962b56 ("mac80211: enable collecting station statistics per-CPU")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
If sta_info_alloc fails after allocating the per CPU statistics,
they are not properly freed.
Fixes: c9c5962b56 ("mac80211: enable collecting station statistics per-CPU")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This ensures that mac80211 allocated management frames are properly
aligned, which makes copying them more efficient.
For instance, mt76 uses iowrite32_copy to copy beacon frames to beacon
template memory on the chip.
Misaligned 32-bit accesses cause CPU exceptions on MIPS and should be
avoided.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Since commit e1a50de378 (arm64: cputype: Silence Sparse warnings),
compilation of arm64 architecture is broken with the following error
messages:
AR arch/arm64/kernel/built-in.o
arch/arm64/kernel/head.S: Assembler messages:
arch/arm64/kernel/head.S:677: Error: found 'L', expected: ')'
arch/arm64/kernel/head.S:677: Error: found 'L', expected: ')'
arch/arm64/kernel/head.S:677: Error: found 'L', expected: ')'
arch/arm64/kernel/head.S:677: Error: junk at end of line, first
unrecognized character is `L'
arch/arm64/kernel/head.S:677: Error: unexpected characters following
instruction at operand 2 -- `movz x1,:abs_g1_s:0xff00ffffffUL'
arch/arm64/kernel/head.S:677: Error: unexpected characters following
instruction at operand 2 -- `movk x1,:abs_g0_nc:0xff00ffffffUL'
This patch fixes the same by using the UL() macro correctly for
assigning the MPIDR_HWID_BITMASK macro value.
Fixes: e1a50de378 ("arm64: cputype: Silence Sparse warnings")
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We're obviously not part of a memory reclaim path, so don't set the flag.
This also causes a warning in check_flush_dependency() since we end up
in a code path that flushes a non-reclaim workqueue, and we shouldn't do
that if we were really part of reclaim.
Reported-by: syzbot+41cdaf4232c50e658934@syzkaller.appspotmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
gcc-8 warns about some obviously incorrect code:
net/mac80211/cfg.c: In function 'cfg80211_beacon_dup':
net/mac80211/cfg.c:2896:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
From the context, I conclude that we want to copy from beacon into
new_beacon, as we do in the rest of the function.
Cc: stable@vger.kernel.org
Fixes: 73da7d5bab ("mac80211: add channel switch command and beacon callbacks")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The ALC5651 does not like multi-write accesses, avoid them. This fixes:
rt5651 i2c-10EC5651:00: Unable to sync registers 0x27-0x28. -121
Errors on resume (and all registers after the registers in the error not
being synced).
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
When support for the A31/A31s CCU was first added, the clock ops for
the CLK_OUT_* clocks was set to the wrong type. The clocks are MP-type,
but the ops was set for div (M) clocks. This went unnoticed until now.
This was because while they are different clocks, their data structures
aligned in a way that ccu_div_ops would access the second ccu_div_internal
and ccu_mux_internal structures, which were valid, if not incorrect.
Furthermore, the use of these CLK_OUT_* was for feeding a precise 32.768
kHz clock signal to the WiFi chip. This was achievable by using the parent
with the same clock rate and no divider. So the incorrect divider setting
did not affect this usage.
Commit 946797aa3f ("clk: sunxi-ng: Support fixed post-dividers on MP
style clocks") added a new field to the ccu_mp structure, which broke
the aforementioned alignment. Now the system crashes as div_ops tries
to look up a nonexistent table.
Reported-by: Philipp Rossak <embed3d@gmail.com>
Tested-by: Philipp Rossak <embed3d@gmail.com>
Fixes: c6e6c96d8f ("clk: sunxi-ng: Add A31/A31s clocks")
Cc: <stable@vger.kernel.org>
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
When xfrm_policy_get_afinfo returns NULL, it will not hold rcu
read lock. In this case, rcu_read_unlock should not be called
in xfrm_get_tos, just like other places where it's calling
xfrm_policy_get_afinfo.
Fixes: f5e2bb4f5b ("xfrm: policy: xfrm_get_tos cannot fail")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
I've accidentally stumbled upon the IS_ENABLED(EXPOLINE_*) lines, which
obviously always evaluate to false. Fix this.
Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Internal DASD device driver I/O such as query host access count or
path verification is started using the _sleep_on() function.
To mark a request as started or ended the callback_data is set to either
DASD_SLEEPON_START_TAG or DASD_SLEEPON_END_TAG.
In cases where the request has to be stopped unconditionally the status is
set to DASD_SLEEPON_END_TAG as well which leads to immediate clearing of
the request.
But the request might still be on a device request queue for normal
operation which might lead to a panic because of a BUG() statement in
__dasd_device_process_final_queue() or a list corruption of the device
request queue.
Fix by removing the setting of DASD_SLEEPON_END_TAG in the
dasd_cancel_req() and dasd_generic_requeue_all_requests() functions and
ensure that the request is not deleted in the requeue function.
Trigger the device tasklet in the requeue function and let the normal
processing cleanup the request.
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The rdev pointer kept in the local 'config' for each for
raid1, raid10, raid4/5/6 has non-obvious lifetime rules.
Sometimes RCU is needed, sometimes a lock, something nothing.
Add documentation to explain this.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <sh.li@alibaba-inc.com>
If no metadata devices are configured on raid1/4/5/6/10
(e.g. via dm-raid), md_write_start() unconditionally waits
for superblocks to be written thus deadlocking.
Fix introduces mddev->has_superblocks bool, defines it in md_run()
and checks for it in md_write_start() to conditionally avoid waiting.
Once on it, check for non-existing superblocks in md_super_write().
Link: https://bugzilla.kernel.org/show_bug.cgi?id=198647
Fixes: cc27b0c78c ("md: fix deadlock between mddev_suspend() and md_write_start()")
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Shaohua Li <sh.li@alibaba-inc.com>
The prior patch added support for passing gfp flags through to the
underlying allocators. This patch allows users to pass along gfp flags
(currently only __GFP_NORETRY and __GFP_NOWARN) to the underlying
allocators. This should allow users to decide if they are ok with
failing allocations recovering in a more graceful way.
Additionally, gfp passing was done as additional flags in the previous
patch. Instead, change this to caller passed semantics. GFP_KERNEL is
also removed as the default flag. It continues to be used for internally
caused underlying percpu allocations.
V2:
Removed gfp_percpu_mask in favor of doing it inline.
Removed GFP_KERNEL as a default flag for __alloc_percpu_gfp.
Signed-off-by: Dennis Zhou <dennisszhou@gmail.com>
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Percpu memory using the vmalloc area based chunk allocator lazily
populates chunks by first requesting the full virtual address space
required for the chunk and subsequently adding pages as allocations come
through. To ensure atomic allocations can succeed, a workqueue item is
used to maintain a minimum number of empty pages. In certain scenarios,
such as reported in [1], it is possible that physical memory becomes
quite scarce which can result in either a rather long time spent trying
to find free pages or worse, a kernel panic.
This patch adds support for __GFP_NORETRY and __GFP_NOWARN passing them
through to the underlying allocators. This should prevent any
unnecessary panics potentially caused by the workqueue item. The passing
of gfp around is as additional flags rather than a full set of flags.
The next patch will change these to caller passed semantics.
V2:
Added const modifier to gfp flags in the balance path.
Removed an extra whitespace.
[1] https://lkml.org/lkml/2018/2/12/551
Signed-off-by: Dennis Zhou <dennisszhou@gmail.com>
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Reported-by: syzbot+adb03f3f0bb57ce3acda@syzkaller.appspotmail.com
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
At some point the function declaration parameters got out of sync with
the function definitions in percpu-vm.c and percpu-km.c. This patch
makes them match again.
Signed-off-by: Dennis Zhou <dennisszhou@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Various people have reported the Crucial MX100 512GB model not working
with LPM set to min_power. I've now received a report that it also does
not work with the new med_power_with_dipm level.
It does work with medium_power, but that has no measurable power-savings
and given the amount of people being bitten by the other levels not
working, this commit just disables LPM altogether.
Note all reporters of this have either the 512GB model (max capacity), or
are not specifying their SSD's size. So for now this quirk assumes this is
a problem with the 512GB model only.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=89261
Buglink: https://github.com/linrunner/TLP/issues/84
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
To align with raid1's resync window, we need to
set the resync window of raid10 to 32M as well.
Fixes: 8db87912c9 ("md-cluster: Use a small window for raid10 resync")
Reported-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <sh.li@alibaba-inc.com>
A single character (closing square bracket) should be put into a sequence.
Thus use the corresponding function "seq_putc".
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Shaohua Li <sh.li@alibaba-inc.com>
The trailing semicolon is an empty statement that does no operation.
Removing it since it doesn't do anything.
Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Signed-off-by: Shaohua Li <sh.li@alibaba-inc.com>
Don't use shrinker.nr_deferred to check whether shrinker was
initialized or not. Now this check was integrated into
unregister_shrinker(), so it is safe to call it against
unregistered shrinker.
Signed-off-by: Aliaksei Karaliou <akaraliou.dev@gmail.com>
Signed-off-by: Shaohua Li <sh.li@alibaba-inc.com>
If no iio buffer has been set up and poll is called return 0.
Without this check there will be a null pointer dereference when
calling poll on a iio driver without an iio buffer.
Cc: stable@vger.kernel.org
Signed-off-by: Stefan Windfeldt-Prytz <stefan.windfeldt@axis.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
The adis_probe_trigger() creates a new IIO trigger and requests an
interrupt associated with the trigger. The interrupt uses the generic
iio_trigger_generic_data_rdy_poll() function as its interrupt handler.
Currently the driver initializes some fields of the trigger structure after
the interrupt has been requested. But an interrupt can fire as soon as it
has been requested. This opens up a race condition.
iio_trigger_generic_data_rdy_poll() will access the trigger data structure
and dereference the ops field. If the ops field is not yet initialized this
will result in a NULL pointer deref.
It is not expected that the device generates an interrupt at this point, so
typically this issue did not surface unless e.g. due to a hardware
misconfiguration (wrong interrupt number, wrong polarity, etc.).
But some newer devices from the ADIS family start to generate periodic
interrupts in their power-on reset configuration and unfortunately the
interrupt can not be masked in the device. This makes the race condition
much more visible and the following crash has been observed occasionally
when booting a system using the ADIS16460.
Unable to handle kernel NULL pointer dereference at virtual address 00000008
pgd = c0004000
[00000008] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-04126-gf9739f0-dirty #257
Hardware name: Xilinx Zynq Platform
task: ef04f640 task.stack: ef050000
PC is at iio_trigger_notify_done+0x30/0x68
LR is at iio_trigger_generic_data_rdy_poll+0x18/0x20
pc : [<c042d868>] lr : [<c042d924>] psr: 60000193
sp : ef051bb8 ip : 00000000 fp : ef106400
r10: c081d80a r9 : ef3bfa00 r8 : 00000087
r7 : ef051bec r6 : 00000000 r5 : ef3bfa00 r4 : ee92ab00
r3 : 00000000 r2 : 00000000 r1 : 00000000 r0 : ee97e400
Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
Control: 18c5387d Table: 0000404a DAC: 00000051
Process swapper/0 (pid: 1, stack limit = 0xef050210)
[<c042d868>] (iio_trigger_notify_done) from [<c0065b10>] (__handle_irq_event_percpu+0x88/0x118)
[<c0065b10>] (__handle_irq_event_percpu) from [<c0065bbc>] (handle_irq_event_percpu+0x1c/0x58)
[<c0065bbc>] (handle_irq_event_percpu) from [<c0065c30>] (handle_irq_event+0x38/0x5c)
[<c0065c30>] (handle_irq_event) from [<c0068e28>] (handle_level_irq+0xa4/0x130)
[<c0068e28>] (handle_level_irq) from [<c0064e74>] (generic_handle_irq+0x24/0x34)
[<c0064e74>] (generic_handle_irq) from [<c021ab7c>] (zynq_gpio_irqhandler+0xb8/0x13c)
[<c021ab7c>] (zynq_gpio_irqhandler) from [<c0064e74>] (generic_handle_irq+0x24/0x34)
[<c0064e74>] (generic_handle_irq) from [<c0065370>] (__handle_domain_irq+0x5c/0xb4)
[<c0065370>] (__handle_domain_irq) from [<c000940c>] (gic_handle_irq+0x48/0x8c)
[<c000940c>] (gic_handle_irq) from [<c0013e8c>] (__irq_svc+0x6c/0xa8)
To fix this make sure that the trigger is fully initialized before
requesting the interrupt.
Fixes: ccd2b52f4a ("staging:iio: Add common ADIS library")
Reported-by: Robin Getz <Robin.Getz@analog.com>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
The last expression in a statement expression need not be a bare
variable, quoting gcc docs
The last thing in the compound statement should be an expression
followed by a semicolon; the value of this subexpression serves as the
value of the entire construct.
and we already use that in e.g. the min/max macros which end with a
ternary expression.
This way, we can allow index to have const-qualified type, which will in
some cases avoid the need for introducing a local copy of index of
non-const qualified type. That, in turn, can prevent readers not
familiar with the internals of array_index_nospec from wondering about
the seemingly redundant extra variable, and I think that's worthwhile
considering how confusing the whole _nospec business is.
The expression _i&_mask has type unsigned long (since that is the type
of _mask, and the BUILD_BUG_ONs guarantee that _i will get promoted to
that), so in order not to change the type of the whole expression, add
a cast back to typeof(_i).
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arch@vger.kernel.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/151881604837.17395.10812767547837568328.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If a DMA buffer is allocated in high memory and kernel mapping is
required use dma_common_contiguous_remap to map buffer to the vmalloc
region and dma_common_free_remap to unmap it.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
amdgpu's ->runtime_suspend hook calls drm_kms_helper_poll_disable(),
which waits for the output poll worker to finish if it's running.
The output poll worker meanwhile calls pm_runtime_get_sync() in
amdgpu's ->detect hooks, which waits for the ongoing suspend to finish,
causing a deadlock.
Fix by not acquiring a runtime PM ref if the ->detect hooks are called
in the output poll worker's context. This is safe because the poll
worker is only enabled while runtime active and we know that
->runtime_suspend waits for it to finish.
Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Cc: stable@vger.kernel.org # v4.2+: 27d4ee0307: workqueue: Allow retrieval of current task's work struct
Cc: stable@vger.kernel.org # v4.2+: 25c058ccaf: drm: Allow determining if current task is output poll worker
Cc: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/4c9bf72aacae1eef062bd134cd112e0770a7f121.1518338789.git.lukas@wunner.de
radeon's ->runtime_suspend hook calls drm_kms_helper_poll_disable(),
which waits for the output poll worker to finish if it's running.
The output poll worker meanwhile calls pm_runtime_get_sync() in
radeon's ->detect hooks, which waits for the ongoing suspend to finish,
causing a deadlock.
Fix by not acquiring a runtime PM ref if the ->detect hooks are called
in the output poll worker's context. This is safe because the poll
worker is only enabled while runtime active and we know that
->runtime_suspend waits for it to finish.
Stack trace for posterity:
INFO: task kworker/0:3:31847 blocked for more than 120 seconds
Workqueue: events output_poll_execute [drm_kms_helper]
Call Trace:
schedule+0x3c/0x90
rpm_resume+0x1e2/0x690
__pm_runtime_resume+0x3f/0x60
radeon_lvds_detect+0x39/0xf0 [radeon]
output_poll_execute+0xda/0x1e0 [drm_kms_helper]
process_one_work+0x14b/0x440
worker_thread+0x48/0x4a0
INFO: task kworker/2:0:10493 blocked for more than 120 seconds.
Workqueue: pm pm_runtime_work
Call Trace:
schedule+0x3c/0x90
schedule_timeout+0x1b3/0x240
wait_for_common+0xc2/0x180
wait_for_completion+0x1d/0x20
flush_work+0xfc/0x1a0
__cancel_work_timer+0xa5/0x1d0
cancel_delayed_work_sync+0x13/0x20
drm_kms_helper_poll_disable+0x1f/0x30 [drm_kms_helper]
radeon_pmops_runtime_suspend+0x3d/0xa0 [radeon]
pci_pm_runtime_suspend+0x61/0x1a0
vga_switcheroo_runtime_suspend+0x21/0x70
__rpm_callback+0x32/0x70
rpm_callback+0x24/0x80
rpm_suspend+0x12b/0x640
pm_runtime_work+0x6f/0xb0
process_one_work+0x14b/0x440
worker_thread+0x48/0x4a0
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94147
Fixes: 10ebc0bc09 ("drm/radeon: add runtime PM support (v2)")
Cc: stable@vger.kernel.org # v3.13+: 27d4ee0307: workqueue: Allow retrieval of current task's work struct
Cc: stable@vger.kernel.org # v3.13+: 25c058ccaf: drm: Allow determining if current task is output poll worker
Cc: Ismo Toijala <ismo.toijala@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/64ea02c44f91dda19bc563902b97bbc699040392.1518338789.git.lukas@wunner.de
Commit fb23403536 ("sctp: remove the useless check in
sctp_renege_events") forgot to remove another check for
chunk in sctp_renege_events.
Dan found this when doing a static check.
This patch is to remove that check, and also to merge
two checks into one 'if statement'.
Fixes: fb23403536 ("sctp: remove the useless check in sctp_renege_events")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nouveau's ->runtime_suspend hook calls drm_kms_helper_poll_disable(),
which waits for the output poll worker to finish if it's running.
The output poll worker meanwhile calls pm_runtime_get_sync() in
nouveau_connector_detect() which waits for the ongoing suspend to finish,
causing a deadlock.
Fix by not acquiring a runtime PM ref if nouveau_connector_detect() is
called in the output poll worker's context. This is safe because
the poll worker is only enabled while runtime active and we know that
->runtime_suspend waits for it to finish.
Other contexts calling nouveau_connector_detect() do require a runtime
PM ref, these comprise:
status_store() drm sysfs interface
->fill_modes drm callback
drm_fb_helper_probe_connector_modes()
drm_mode_getconnector()
nouveau_connector_hotplug()
nouveau_display_hpd_work()
nv17_tv_set_property()
Stack trace for posterity:
INFO: task kworker/0:1:58 blocked for more than 120 seconds.
Workqueue: events output_poll_execute [drm_kms_helper]
Call Trace:
schedule+0x28/0x80
rpm_resume+0x107/0x6e0
__pm_runtime_resume+0x47/0x70
nouveau_connector_detect+0x7e/0x4a0 [nouveau]
nouveau_connector_detect_lvds+0x132/0x180 [nouveau]
drm_helper_probe_detect_ctx+0x85/0xd0 [drm_kms_helper]
output_poll_execute+0x11e/0x1c0 [drm_kms_helper]
process_one_work+0x184/0x380
worker_thread+0x2e/0x390
INFO: task kworker/0:2:252 blocked for more than 120 seconds.
Workqueue: pm pm_runtime_work
Call Trace:
schedule+0x28/0x80
schedule_timeout+0x1e3/0x370
wait_for_completion+0x123/0x190
flush_work+0x142/0x1c0
nouveau_pmops_runtime_suspend+0x7e/0xd0 [nouveau]
pci_pm_runtime_suspend+0x5c/0x180
vga_switcheroo_runtime_suspend+0x1e/0xa0
__rpm_callback+0xc1/0x200
rpm_callback+0x1f/0x70
rpm_suspend+0x13c/0x640
pm_runtime_work+0x6e/0x90
process_one_work+0x184/0x380
worker_thread+0x2e/0x390
Bugzilla: https://bugs.archlinux.org/task/53497
Bugzilla: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=870523
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70388#c33
Fixes: 5addcf0a5f ("nouveau: add runtime PM support (v0.9)")
Cc: stable@vger.kernel.org # v3.12+: 27d4ee0307: workqueue: Allow retrieval of current task's work struct
Cc: stable@vger.kernel.org # v3.12+: 25c058ccaf: drm: Allow determining if current task is output poll worker
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/b7d2cbb609a80f59ccabfdf479b9d5907c603ea1.1518338789.git.lukas@wunner.de
Introduce a helper to determine if the current task is an output poll
worker.
This allows us to fix a long-standing deadlock in several DRM drivers
wherein the ->runtime_suspend callback waits for the output poll worker
to finish and the worker in turn calls a ->detect callback which waits
for runtime suspend to finish. The ->detect callback is invoked from
multiple call sites and waiting for runtime suspend to finish is the
correct thing to do except if it's executing in the context of the
worker.
v2: Expand kerneldoc to specifically mention deadlock between
output poll worker and autosuspend worker as use case. (Lyude)
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/3549ce32e7f1467102e70d3e9cbf70c46bfe108e.1518593424.git.lukas@wunner.de
Due to a check recently added to copy_to_user(), it's now not permitted to
copy from slab-held data to userspace unless the slab is whitelisted. This
affects rxrpc_recvmsg() when it attempts to place an RXRPC_USER_CALL_ID
control message in the userspace control message buffer. A warning is
generated by usercopy_warn() because the source is the copy of the
user_call_ID retained in the rxrpc_call struct.
Work around the issue by copying the user_call_ID to a variable on the
stack and passing that to put_cmsg().
The warning generated looks like:
Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLUB object 'dmaengine-unmap-128' (offset 680, size 8)!
WARNING: CPU: 0 PID: 1401 at mm/usercopy.c:81 usercopy_warn+0x7e/0xa0
...
RIP: 0010:usercopy_warn+0x7e/0xa0
...
Call Trace:
__check_object_size+0x9c/0x1a0
put_cmsg+0x98/0x120
rxrpc_recvmsg+0x6fc/0x1010 [rxrpc]
? finish_wait+0x80/0x80
___sys_recvmsg+0xf8/0x240
? __clear_rsb+0x25/0x3d
? __clear_rsb+0x15/0x3d
? __clear_rsb+0x25/0x3d
? __clear_rsb+0x15/0x3d
? __clear_rsb+0x25/0x3d
? __clear_rsb+0x15/0x3d
? __clear_rsb+0x25/0x3d
? __clear_rsb+0x15/0x3d
? finish_task_switch+0xa6/0x2b0
? trace_hardirqs_on_caller+0xed/0x180
? _raw_spin_unlock_irq+0x29/0x40
? __sys_recvmsg+0x4e/0x90
__sys_recvmsg+0x4e/0x90
do_syscall_64+0x7a/0x220
entry_SYSCALL_64_after_hwframe+0x26/0x9b
Reported-by: Jonathan Billings <jsbillings@jsbillings.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Tested-by: Jonathan Billings <jsbillings@jsbillings.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
<Mark Rutland reported>
While fuzzing arm64 v4.16-rc1 with Syzkaller, I've been hitting a
misaligned atomic in __skb_clone:
atomic_inc(&(skb_shinfo(skb)->dataref));
where dataref doesn't have the required natural alignment, and the
atomic operation faults. e.g. i often see it aligned to a single
byte boundary rather than a four byte boundary.
AFAICT, the skb_shared_info is misaligned at the instant it's
allocated in __napi_alloc_skb() __napi_alloc_skb()
</end of report>
Problem is caused by tun_napi_alloc_frags() using
napi_alloc_frag() with user provided seg sizes,
leading to other users of this API getting unaligned
page fragments.
Since we would like to not necessarily add paddings or alignments to
the frags that tun_napi_alloc_frags() attaches to the skb, switch to
another page frag allocator.
As a bonus skb_page_frag_refill() can use GFP_KERNEL allocations,
meaning that we can not deplete memory reserves as easily.
Fixes: 90e33d4594 ("tun: enable napi_gro_frags() for TUN/TAP driver")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since UDP-Lite is always using checksum, the following path is
triggered when calculating pseudo header for it:
udp4_csum_init() or udp6_csum_init()
skb_checksum_init_zero_check()
__skb_checksum_validate_complete()
The problem can appear if skb->len is less than CHECKSUM_BREAK. In
this particular case __skb_checksum_validate_complete() also invokes
__skb_checksum_complete(skb). If UDP-Lite is using partial checksum
that covers only part of a packet, the function will return bad
checksum and the packet will be dropped.
It can be fixed if we skip skb_checksum_init_zero_check() and only
set the required pseudo header checksum for UDP-Lite with partial
checksum before udp4_csum_init()/udp6_csum_init() functions return.
Fixes: ed70fcfcee ("net: Call skb_checksum_init in IPv4")
Fixes: e4f45b7f40 ("net: Call skb_checksum_init in IPv6")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 3f34cfae12 ("netfilter: on sockopt() acquire sock lock
only in the required scope"), the caller of nf_{get/set}sockopt() must
not hold any lock, but, in such changeset, I forgot to cope with DECnet.
This commit addresses the issue moving the nf call outside the lock,
in the dn_{get,set}sockopt() with the same schema currently used by
ipv4 and ipv6. Also moves the unhandled sockopts of the end of the main
switch statements, to improve code readability.
Reported-by: Petr Vandrovec <petr@vandrovec.name>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=198791#c2
Fixes: 3f34cfae12 ("netfilter: on sockopt() acquire sock lock only in the required scope")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We've run into a problem where our device is attached
to a Virtual Machine and the use of the new pci_set_vpd_size()
API doesn't help. The VM kernel has been informed that
the accesses are okay, but all of the actual VPD Capability
Accesses are trapped down into the KVM Hypervisor where it
goes ahead and imposes the silent denials.
The right idea is to follow the kernel.org
commit 1c7de2b4ff ("PCI: Enable access to non-standard VPD for
Chelsio devices (cxgb3)") which Alexey Kardashevskiy authored
to establish a PCI Quirk for our T3-based adapters. This commit
extends that PCI Quirk to cover Chelsio T4 devices and later.
The advantage of this approach is that the VPD Size gets set early
in the Base OS/Hypervisor Boot and doesn't require that the cxgb4
driver even be available in the Base OS/Hypervisor. Thus PF4 can
be exported to a Virtual Machine and everything should work.
Fixes: 67e658794c ("cxgb4: Set VPD size so we can read both VPD structures")
Cc: <stable@vger.kernel.org> # v4.9+
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
free pf 0-3 resources, commit baf5086840 ("cxgb4:
restructure VF mgmt code") erroneously removed the
code which frees the pf 0-3 resources, causing the
probe of pf 0-3 to fail in case of driver reload.
Fixes: baf5086840 ("cxgb4: restructure VF mgmt code")
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In fib_nh_match(), if output interface or gateway are passed in
the FIB configuration, we don't have to check next hops of
multipath routes to conclude whether we have a match or not.
However, we might still have routes with different realms
matching the same output interface and gateway configuration,
and this needs to cause the match to fail. Otherwise the first
route inserted in the FIB will match, regardless of the realms:
# ip route add 1.1.1.1 dev eth0 table 1234 realms 1/2
# ip route append 1.1.1.1 dev eth0 table 1234 realms 3/4
# ip route list table 1234
1.1.1.1 dev eth0 scope link realms 1/2
1.1.1.1 dev eth0 scope link realms 3/4
# ip route del 1.1.1.1 dev ens3 table 1234 realms 3/4
# ip route list table 1234
1.1.1.1 dev ens3 scope link realms 3/4
whereas route with realms 3/4 should have been deleted instead.
Explicitly check for fc_flow passed in the FIB configuration
(this comes from RTA_FLOW extracted by rtm_to_fib_config()) and
fail matching if it differs from nh_tclassid.
The handling of RTA_FLOW for multipath routes later in
fib_nh_match() is still needed, as we can have multiple RTA_FLOW
attributes that need to be matched against the tclassid of each
next hop.
v2: Check that fc_flow is set before discarding the match, so
that the user can still select the first matching rule by
not specifying any realm, as suggested by David Ahern.
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tlv_len is u8, so we need to limit the size of the SDP URI. Enforce
this both in the NLA policy and in the code that performs the allocation
and copy, to avoid writing past the end of the allocated buffer.
Fixes: d9b8d8e19b ("NFC: llcp: Service Name Lookup netlink interface")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
uaccess_kernel() isn't sufficient to determine if an rdma resource is
user-mode or not. For example, resources allocated in the add_one()
function of an ib_client get falsely labeled as user mode, when they
are kernel mode allocations. EG: mad qps.
The result is that these qps are skipped over during a nldev query
because of an erroneous namespace mismatch.
So now we determine if the resource is user-mode by looking at the object
struct's uobject or similar pointer to know if it was allocated for user
mode applications.
Fixes: 02d8883f52 ("RDMA/restrack: Add general infrastructure to track RDMA resources")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Save 26 lines worth of Sparse complaints by fixing up this minor
mishap. The pointee lies in the __iomem space; the pointer does not.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Since commit 204f672255 ("staging: android: ion: Use CMA APIs directly")
the CMA API is now used directly and therefore the allocated memory is no
longer automatically zeroed.
Explicitly zero CMA allocated memory to ensure that no data is exposed to
userspace.
Fixes: 204f672255 ("staging: android: ion: Use CMA APIs directly")
Signed-off-by: Liam Mark <lmark@codeaurora.org>
Acked-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ashmem_pin_unpin() reads asma->file and asma->size before taking the
ashmem_mutex, so it can race with other operations that modify them.
Build-tested only.
Cc: stable@vger.kernel.org
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Selecting GENERIC_MSI_IRQ_DOMAIN on x86 causes a compile-time error in
some configurations:
drivers/base/platform-msi.c:37:19: error: field 'arg' has incomplete type
On the other architectures, we are fine, but here we should have an additional
dependency on X86_LOCAL_APIC so we can get the PCI_MSI_IRQ_DOMAIN symbol.
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Update all the flows to ensure that function pointer exists prior
to accessing it.
This is much safer than checking the uverbs_ex_mask variable, especially
since we know that test isn't working properly and will be removed
in -next.
This prevents a user triggereable oops.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Commit 8419caa727 ("ASoC: sgtl5000: Do not disable regulators in
SND_SOC_BIAS_OFF") causes the sgtl5000 to fail after a suspend/resume
sequence:
Playing WAVE '/media/a2002011001-e02.wav' : Signed 16 bit Little
Endian, Rate 44100 Hz, Stereo
aplay: pcm_write:2051: write error: Input/output error
The problem is caused by the fact that the aforementioned commit
dropped the cache handling, so re-introduce the register map
resync to fix the problem.
Suggested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: <stable@vger.kernel.org>
I would like helping maintaining and reviewing/testing sgtl5000
related patches.
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
In AP mode, when a new station associates, rs is initialized immediately
upon association completion, before the phy context is updated with the
association parameters, so the sta bandwidth might be wider than the phy
context allows.
To avoid this issue, always initialize rs with 20mhz bandwidth rate, and
after authorization, when the phy context is already up-to-date, re-init
rs with the correct bw.
Signed-off-by: Naftali Goldstein <naftali.goldstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Canceling the periodic timestamp work should be
done in the opposite flow to where it was started.
This also prevents from sending the MARKER command
during the mac_stop flow - causing a false queue hang
(FW is no longer there to send a response).
Fixes: 93b167c13a ("iwlwifi: runtime: sync FW and host clocks for logs")
Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
This change relaxes copy up on encode of merge dir with lower layer > 1
and handles the case of encoding a merge dir with lower layer 1, where an
ancestor is a non-indexed merge dir. In that case, decode of the lower
file handle will not have been possible if the non-indexed ancestor is
redirected before or after encode.
Before encoding a non-upper directory file handle from real layer N, we
need to check if it will be possible to reconnect an overlay dentry from
the real lower decoded dentry. This is done by following the overlay
ancestry up to a "layer N connected" ancestor and verifying that all
parents along the way are "layer N connectable". If an ancestor that is
NOT "layer N connectable" is found, we need to copy up an ancestor, which
is "layer N connectable", thus making that ancestor "layer N connected".
For example:
layer 1: /a
layer 2: /a/b/c
The overlay dentry /a is NOT "layer 2 connectable", because if dir /a is
copied up and renamed, upper dir /a will be indexed by lower dir /a from
layer 1. The dir /a from layer 2 will never be indexed, so the algorithm
in ovl_lookup_real_ancestor() (*) will not be able to lookup a connected
overlay dentry from the connected lower dentry /a/b/c.
To avoid this problem on decode time, we need to copy up an ancestor of
/a/b/c, which is "layer 2 connectable", on encode time. That ancestor is
/a/b. After copy up (and index) of /a/b, it will become "layer 2 connected"
and when the time comes to decode the file handle from lower dentry /a/b/c,
ovl_lookup_real_ancestor() will find the indexed ancestor /a/b and decoding
a connected overlay dentry will be accomplished.
(*) the algorithm in ovl_lookup_real_ancestor() can be improved to lookup
an entry /a in the lower layers above layer N and find the indexed dir /a
from layer 1. If that improvement is made, then the check for "layer N
connected" will need to verify there are no redirects in lower layers above
layer N. In the example above, /a will be "layer 2 connectable". However,
if layer 2 dir /a is a target of a layer 1 redirect, then /a will NOT be
"layer 2 connectable":
layer 1: /A (redirect = /a)
layer 2: /a/b/c
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Commit 31747eda41 ("ovl: hash directory inodes for fsnotify")
fixed an issue of inotify watch on directory that stops getting
events after dropping dentry caches.
A similar issue exists for non-dir non-upper files, for example:
$ mkdir -p lower upper work merged
$ touch lower/foo
$ mount -t overlay -o
lowerdir=lower,workdir=work,upperdir=upper none merged
$ inotifywait merged/foo &
$ echo 2 > /proc/sys/vm/drop_caches
$ cat merged/foo
inotifywait doesn't get the OPEN event, because ovl_lookup() called
from 'cat' allocates a new overlay inode and does not reuse the
watched inode.
Fix this by hashing non-dir overlay inodes by lower real inode in
the following cases that were not hashed before this change:
- A non-upper overlay mount
- A lower non-hardlink when index=off
A helper ovl_hash_bylower() was added to put all the logic and
documentation about which real inode an overlay inode is hashed by
into one place.
The issue dates back to initial version of overlayfs, but this
patch depends on ovl_inode code that was introduced in kernel v4.13.
Cc: <stable@vger.kernel.org> #v4.13
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
syszkaller found that rcu was not held in hashlimit_mt_common()
We only need to enable BH at this point.
Fixes: bea74641e3 ("netfilter: xt_hashlimit: add rate match mode")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add the missing offset calculation for grayscale images. Since the IPU
only supports capturing greyscale in raw passthrough mode, it is the
same as 8-bit bayer formats.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Don't populate the const read-only arrays int_reg on the stack but instead
make them static. Makes the object code smaller by over 80 bytes:
Before:
text data bss dec hex filename
28024 8936 192 37152 9120 drivers/gpu/ipu-v3/ipu-common.o
After:
text data bss dec hex filename
27794 9080 192 37066 90ca drivers/gpu/ipu-v3/ipu-common.o
(gcc version 7.2.0 x86_64)
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Our Transmit Frame Descriptor (TFD) is a DMA descriptor that
includes several pointers to be able to transmit a packet
which is not physically contiguous.
Depending on the hardware being use, we can have 20 or 25
pointers in a single TFD. In both cases, it is more than
enough and it is quite hard to hit this limit.
It has been reported that when using specific applications
(Ktorrent), we can actually use all the pointers and then
a long standing bug showed up.
When we free the TFD, we check its number of valid pointers
and make sure it doesn't exceed the number of pointers the
hardware support.
This check had an off by one bug: it is perfectly valid to
free the 20 pointers if the TFD has 20 pointers.
Fix that.
https://bugzilla.kernel.org/show_bug.cgi?id=197981
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In IBSS, the mac80211 sets the cab_queue to be invalid.
However, the multicast station uses it, so we need to override it.
A previous patch did it, but it was nested inside the if's and was
applied only for legacy FWs that don't support the new station type
API, instead of being applied for all paths.
In addition, add a missing NL80211_IFTYPE_ADHOC to the initialization
of the queues in iwl_mvm_mac_ctxt_init()
Fixes: ee48b72211 ("iwlwifi: mvm: support ibss in dqa mode")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
A previous patch allowed the same PN for packets originating from the
same AMSDU by copying PN only for the last packet in the series.
This however is bogus since we cannot assume the last frame will be
received on the same queue, and if it is received on a different ueue
we will end up not incrementing the PN and possibly let the next
packet to have the same PN and pass through.
Change the logic instead to driver explicitly indicate for the second
sub frame and on to be allowed to have the same PN as the first
subframe. Indicate it to mac80211 as well for the fallback queue.
Fixes: f1ae02b186 ("iwlwifi: mvm: allow same PN for de-aggregated AMSDU")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
This fixes regression introduced by
commit 8d52af6795 ("mei: speed up the power down flow")
In mei_cldev_disable during device power down flow, such as
suspend or system power off, it jumps over disconnecting function
to speed up the power down process, however, because the client is
unlinked from the file_list (mei_cl_unlink) mei_cl_set_disconnected
is not called from mei_cl_all_disconnect leaving resource leaking.
The most visible is reference counter on underlying HW module is
not decreased preventing to remove modules after suspend/resume cycles.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Fixes: 8d52af6795 ("mei: speed up the power down flow")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The format specifier "%p" can leak kernel addresses. Use
"%pK" instead. There were 4 remaining cases in binder.c.
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
binder_send_failed_reply() is called when a synchronous
transaction fails. It reports an error to the thread that
is waiting for the completion. Given that the transaction
is synchronous, there should never be more than 1 error
response to that thread -- this was being asserted with
a WARN().
However, when exercising the driver with syzbot tests, cases
were observed where multiple "synchronous" requests were
sent without waiting for responses, so it is possible that
multiple errors would be reported to the thread. This testing
was conducted with panic_on_warn set which forced the crash.
This is easily reproduced by sending back-to-back
"synchronous" transactions without checking for any
response (eg, set read_size to 0):
bwr.write_buffer = (uintptr_t)&bc1;
bwr.write_size = sizeof(bc1);
bwr.read_buffer = (uintptr_t)&br;
bwr.read_size = 0;
ioctl(fd, BINDER_WRITE_READ, &bwr);
sleep(1);
bwr2.write_buffer = (uintptr_t)&bc2;
bwr2.write_size = sizeof(bc2);
bwr2.read_buffer = (uintptr_t)&br;
bwr2.read_size = 0;
ioctl(fd, BINDER_WRITE_READ, &bwr2);
sleep(1);
The first transaction is sent to the servicemanager and the reply
fails because no VMA is set up by this client. After
binder_send_failed_reply() is called, the BINDER_WORK_RETURN_ERROR
is sitting on the thread's todo list since the read_size was 0 and
the client is not waiting for a response.
The 2nd transaction is sent and the BINDER_WORK_RETURN_ERROR has not
been consumed, so the thread's reply_error.cmd is still set (normally
cleared when the BINDER_WORK_RETURN_ERROR is handled). Therefore
when the servicemanager attempts to reply to the 2nd failed
transaction, the error is already set and it triggers this warning.
This is a user error since it is not waiting for the synchronous
transaction to complete. If it ever does check, it will see an
error.
Changed the WARN() to a pr_warn().
Signed-off-by: Todd Kjos <tkjos@android.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the kzalloc() in binder_get_thread() fails, binder_poll()
dereferences the resulting NULL pointer.
Fix it by returning POLLERR if the memory allocation failed.
This bug was found by syzkaller using fault injection.
Reported-by: syzbot <syzkaller@googlegroups.com>
Fixes: 457b9a6f09 ("Staging: android: add binder driver")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Trying to boot an RK3328 box with an HS200-capable eMMC, I see said eMMC
fail to initialise as it can't run its tuning procedure, because the
sample clock is missing. Upon closer inspection, whilst the clock is
present in the DT, its name is subtly incorrect per the binding, so
__of_clk_get_by_name() never finds it. By inspection, the drive clock
suffers from a similar problem, so has never worked properly either.
This error has propagated across the 32-bit DTs too, so fix those up.
Fixes: 187d7967a5 ("ARM: dts: rockchip: add the sdio/sdmmc node for rk3036")
Fixes: faea098e18 ("ARM: dts: rockchip: add core rk3036 dtsi")
Fixes: 9848ebeb95 ("ARM: dts: rockchip: add core rk3228 dtsi")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Trying to boot an RK3328 box with an HS200-capable eMMC, I see said eMMC
fail to initialise as it can't run its tuning procedure, because the
sample clock is missing. Upon closer inspection, whilst the clock is
present in the DT, its name is subtly incorrect per the binding, so
__of_clk_get_by_name() never finds it. By inspection, the drive clock
suffers from a similar problem, so has never worked properly either.
Fix up all instances of the incorrect clock names across the 64-bit DTs.
Fixes: d717f7352e ("arm64: dts: rockchip: add sdmmc/sdio/emmc nodes for RK3328 SoCs")
Fixes: b790c2cab5 ("arm64: dts: add Rockchip rk3368 core dtsi and board dts for the r88 board")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Felipe writes:
usb: fixes for v4.16-rc2
First set of fixes for current -rc cycle. Most of the changes are on
dwc3 this time around (59%) with some function changes (25%).
Out of the those, the most important fixes are:
- EP0 TRB counter fix on dwc3
- dwc3-omap stopped missing events during suspend/resume
- maxpacket size fix for ep0 in dwc3
- Descriptor processing fix for functionfs
Apart from these, your usual set of important-but-not-so-critical
fixes all over the place.
ACM driver may accept data to transmit while system is not fully
resumed. In this case ACM driver buffers data and prepare URBs
on usb anchor list.
There is a little chance that two tasks put a char and initiate
acm_tty_flush_chars(). In such a case, driver will put one URB
twice on usb anchor list.
This patch also reset length of data before resue of a buffer.
This not only prevent sending rubbish, but also lower risc of race.
Without this patch we hit following kernel panic in one of our
stabilty/stress tests.
[ 46.884442] *list_add double add*: new=ffff9b2ab7289330, prev=ffff9b2ab7289330, next=ffff9b2ab81e28e0.
[ 46.884476] Modules linked in: hci_uart btbcm bluetooth rfkill_gpio igb_avb(O) cfg80211 snd_soc_sst_bxt_tdf8532 snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_sst_acpi snd_soc_sst_match snd_hda_ext_core snd_hda_core trusty_timer trusty_wall trusty_log trusty_virtio trusty_ipc trusty_mem trusty_irq trusty virtio_ring virtio intel_ipu4_mmu_bxtB0 lib2600_mod_bxtB0 intel_ipu4_isys_mod_bxtB0 lib2600psys_mod_bxtB0 intel_ipu4_psys_mod_bxtB0 intel_ipu4_mod_bxtB0 intel_ipu4_wrapper_bxtB0 intel_ipu4_acpi videobuf2_dma_contig as3638 dw9714 lm3643 crlmodule smiapp smiapp_pll
[ 46.884480] CPU: 1 PID: 33 Comm: kworker/u8:1 Tainted: G U W O 4.9.56-quilt-2e5dc0ac-g618ed69ced6e-dirty #4
[ 46.884489] Workqueue: events_unbound flush_to_ldisc
[ 46.884494] ffffb98ac012bb08 ffffffffad3e82e5 ffffb98ac012bb58 0000000000000000
[ 46.884497] ffffb98ac012bb48 ffffffffad0a23d1 00000024ad6374dd ffff9b2ab7289330
[ 46.884500] ffff9b2ab81e28e0 ffff9b2ab7289330 0000000000000002 0000000000000000
[ 46.884501] Call Trace:
[ 46.884507] [<ffffffffad3e82e5>] dump_stack+0x67/0x92
[ 46.884511] [<ffffffffad0a23d1>] __warn+0xd1/0xf0
[ 46.884513] [<ffffffffad0a244f>] warn_slowpath_fmt+0x5f/0x80
[ 46.884516] [<ffffffffad407443>] __list_add+0xb3/0xc0
[ 46.884521] [<ffffffffad71133c>] *usb_anchor_urb*+0x4c/0xa0
[ 46.884524] [<ffffffffad782c6f>] *acm_tty_flush_chars*+0x8f/0xb0
[ 46.884527] [<ffffffffad782cd1>] *acm_tty_put_char*+0x41/0x100
[ 46.884530] [<ffffffffad4ced34>] tty_put_char+0x24/0x40
[ 46.884533] [<ffffffffad4d3bf5>] do_output_char+0xa5/0x200
[ 46.884535] [<ffffffffad4d3e98>] __process_echoes+0x148/0x290
[ 46.884538] [<ffffffffad4d654c>] n_tty_receive_buf_common+0x57c/0xb00
[ 46.884541] [<ffffffffad4d6ae4>] n_tty_receive_buf2+0x14/0x20
[ 46.884543] [<ffffffffad4d9662>] tty_ldisc_receive_buf+0x22/0x50
[ 46.884545] [<ffffffffad4d9c05>] flush_to_ldisc+0xc5/0xe0
[ 46.884549] [<ffffffffad0bcfe8>] process_one_work+0x148/0x440
[ 46.884551] [<ffffffffad0bdc19>] worker_thread+0x69/0x4a0
[ 46.884554] [<ffffffffad0bdbb0>] ? max_active_store+0x80/0x80
[ 46.884556] [<ffffffffad0c2e10>] kthread+0x110/0x130
[ 46.884559] [<ffffffffad0c2d00>] ? kthread_park+0x60/0x60
[ 46.884563] [<ffffffffadad9917>] ret_from_fork+0x27/0x40
[ 46.884566] ---[ end trace 3bd599058b8a9eb3 ]---
Signed-off-by: Dominik Bozek <dominikx.bozek@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In early time, when freeing a xdst, it would be inserted into
dst_garbage.list first. Then if it's refcnt was still held
somewhere, later it would be put into dst_busy_list in
dst_gc_task().
When one dev was being unregistered, the dev of these dsts in
dst_busy_list would be set with loopback_dev and put this dev.
So that this dev's removal wouldn't get blocked, and avoid the
kmsg warning:
kernel:unregister_netdevice: waiting for veth0 to become \
free. Usage count = 2
However after Commit 52df157f17 ("xfrm: take refcnt of dst
when creating struct xfrm_dst bundle"), the xdst will not be
freed with dst gc, and this warning happens.
To fix it, we need to find these xdsts that are still held by
others when removing the dev, and free xdst's dev and set it
with loopback_dev.
But unfortunately after flow_cache for xfrm was deleted, no
list tracks them anymore. So we need to save these xdsts
somewhere to release the xdst's dev later.
To make this easier, this patch is to reuse uncached_list to
track xdsts, so that the dev refcnt can be released in the
event NETDEV_UNREGISTER process of fib_netdev_notifier.
Thanks to Florian, we could move forward this fix quickly.
Fixes: 52df157f17 ("xfrm: take refcnt of dst when creating struct xfrm_dst bundle")
Reported-by: Jianlin Shi <jishi@redhat.com>
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Tested-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
syzkaller recently triggered OOM during percpu map allocation;
while there is work in progress by Dennis Zhou to add __GFP_NORETRY
semantics for percpu allocator under pressure, there seems also a
missing bpf_map_precharge_memlock() check in array map allocation.
Given today the actual bpf_map_charge_memlock() happens after the
find_and_alloc_map() in syscall path, the bpf_map_precharge_memlock()
is there to bail out early before we go and do the map setup work
when we find that we hit the limits anyway. Therefore add this for
array map as well.
Fixes: 6c90598174 ("bpf: pre-allocate hash map elements")
Fixes: a10423b87a ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
Reported-by: syzbot+adb03f3f0bb57ce3acda@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dennis Zhou <dennisszhou@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Problem Statement: Sending I/O through 32 bit descriptors to Ventura series of
controller results in IO timeout on certain conditions.
This error only occurs on systems with high I/O activity on Ventura series
controllers.
Changes in this patch will prevent driver from using 32 bit descriptor and use
64 bit Descriptors.
Cc: <stable@vger.kernel.org>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This ensures that we return the right structures back to userspace.
Otherwise, it looks like the reserved fields in the response structures
in userspace might have uninitialized data in them.
Fixes: 8b10ba783c ("RDMA/vmw_pvrdma: Add shared receive queue support")
Fixes: 29c8d9eba5 ("IB: Add vmw_pvrdma driver")
Suggested-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Bryan Tan <bryantan@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
There is no matching lock for this mutex. Git history suggests this is
just a missed remnant from an earlier version of the function before
this locking was moved into uverbs_free_xrcd.
Originally this lock was protecting the xrcd_table_delete()
=====================================
WARNING: bad unlock balance detected!
4.15.0+ #87 Not tainted
-------------------------------------
syzkaller223405/269 is trying to release lock (&uverbs_dev->xrcd_tree_mutex) at:
[<00000000b8703372>] ib_uverbs_close_xrcd+0x195/0x1f0
but there are no more locks to release!
other info that might help us debug this:
1 lock held by syzkaller223405/269:
#0: (&uverbs_dev->disassociate_srcu){....}, at: [<000000005af3b960>] ib_uverbs_write+0x265/0xef0
stack backtrace:
CPU: 0 PID: 269 Comm: syzkaller223405 Not tainted 4.15.0+ #87
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
dump_stack+0xde/0x164
? dma_virt_map_sg+0x22c/0x22c
? ib_uverbs_write+0x265/0xef0
? console_unlock+0x502/0xbd0
? ib_uverbs_close_xrcd+0x195/0x1f0
print_unlock_imbalance_bug+0x131/0x160
lock_release+0x59d/0x1100
? ib_uverbs_close_xrcd+0x195/0x1f0
? lock_acquire+0x440/0x440
? lock_acquire+0x440/0x440
__mutex_unlock_slowpath+0x88/0x670
? wait_for_completion+0x4c0/0x4c0
? rdma_lookup_get_uobject+0x145/0x2f0
ib_uverbs_close_xrcd+0x195/0x1f0
? ib_uverbs_open_xrcd+0xdd0/0xdd0
ib_uverbs_write+0x7f9/0xef0
? cyc2ns_read_end+0x10/0x10
? ib_uverbs_open_xrcd+0xdd0/0xdd0
? uverbs_devnode+0x110/0x110
? cyc2ns_read_end+0x10/0x10
? cyc2ns_read_end+0x10/0x10
? sched_clock_cpu+0x18/0x200
__vfs_write+0x10d/0x700
? uverbs_devnode+0x110/0x110
? kernel_read+0x170/0x170
? __fget+0x358/0x5d0
? security_file_permission+0x93/0x260
vfs_write+0x1b0/0x550
SyS_write+0xc7/0x1a0
? SyS_read+0x1a0/0x1a0
? trace_hardirqs_on_thunk+0x1a/0x1c
entry_SYSCALL_64_fastpath+0x1e/0x8b
RIP: 0033:0x4335c9
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # 4.11
Fixes: fd3c7904db ("IB/core: Change idr objects to use the new schema")
Reported-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Once the uobj is committed it is immediately possible another thread
could destroy it, which worst case, can result in a use-after-free
of the restrack objects.
Cc: syzkaller <syzkaller@googlegroups.com>
Fixes: 08f294a152 ("RDMA/core: Add resource tracking for create and destroy CQs")
Reported-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The command number is not bounds checked against the command mask before it
is shifted, resulting in an ubsan hit. This does not cause malfunction since
the command number is eventually bounds checked, but we can make this ubsan
clean by moving the bounds check to before the mask check.
================================================================================
UBSAN: Undefined behaviour in
drivers/infiniband/core/uverbs_main.c:647:21
shift exponent 207 is too large for 64-bit type 'long long unsigned int'
CPU: 0 PID: 446 Comm: syz-executor3 Not tainted 4.15.0-rc2+ #61
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
dump_stack+0xde/0x164
? dma_virt_map_sg+0x22c/0x22c
ubsan_epilogue+0xe/0x81
__ubsan_handle_shift_out_of_bounds+0x293/0x2f7
? debug_check_no_locks_freed+0x340/0x340
? __ubsan_handle_load_invalid_value+0x19b/0x19b
? lock_acquire+0x440/0x440
? lock_acquire+0x19d/0x440
? __might_fault+0xf4/0x240
? ib_uverbs_write+0x68d/0xe20
ib_uverbs_write+0x68d/0xe20
? __lock_acquire+0xcf7/0x3940
? uverbs_devnode+0x110/0x110
? cyc2ns_read_end+0x10/0x10
? sched_clock_cpu+0x18/0x200
? sched_clock_cpu+0x18/0x200
__vfs_write+0x10d/0x700
? uverbs_devnode+0x110/0x110
? kernel_read+0x170/0x170
? __fget+0x35b/0x5d0
? security_file_permission+0x93/0x260
vfs_write+0x1b0/0x550
SyS_write+0xc7/0x1a0
? SyS_read+0x1a0/0x1a0
? trace_hardirqs_on_thunk+0x1a/0x1c
entry_SYSCALL_64_fastpath+0x18/0x85
RIP: 0033:0x448e29
RSP: 002b:00007f033f567c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f033f5686bc RCX: 0000000000448e29
RDX: 0000000000000060 RSI: 0000000020001000 RDI: 0000000000000012
RBP: 000000000070bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000000056a0 R14: 00000000006e8740 R15: 0000000000000000
================================================================================
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # 4.5
Fixes: 2dbd5186a3 ("IB/core: IB/core: Allow legacy verbs through extended interfaces")
Reported-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
If remove_commit fails then the lock is left locked while the uobj still
exists. Eventually the kernel will deadlock.
lockdep detects this and says:
test/4221 is leaving the kernel with locks still held!
1 lock held by test/4221:
#0: (&ucontext->cleanup_rwsem){.+.+}, at: [<000000001e5c7523>] rdma_explicit_destroy+0x37/0x120 [ib_uverbs]
Fixes: 4da70da23e ("IB/core: Explicitly destroy an object while keeping uobject")
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This is really being used as an assert that the expected usecnt
is being held and implicitly that the usecnt is valid. Rename it to
assert_uverbs_usecnt and tighten the checks to only accept valid
values of usecnt (eg 0 and < -1 are invalid).
The tigher checkes make the assertion cover more cases and is more
likely to find bugs via syzkaller/etc.
Fixes: 3832125624 ("IB/core: Add support for idr types")
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The race is between lookup_get_idr_uobject and
uverbs_idr_remove_uobj -> uverbs_uobject_put.
We deliberately do not call sychronize_rcu after the idr_remove in
uverbs_idr_remove_uobj for performance reasons, instead we call
kfree_rcu() during uverbs_uobject_put.
However, this means we can obtain pointers to uobj's that have
already been released and must protect against krefing them
using kref_get_unless_zero.
==================================================================
BUG: KASAN: use-after-free in copy_ah_attr_from_uverbs.isra.2+0x860/0xa00
Read of size 4 at addr ffff88005fda1ac8 by task syz-executor2/441
CPU: 1 PID: 441 Comm: syz-executor2 Not tainted 4.15.0-rc2+ #56
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xd4
print_address_description+0x73/0x290
kasan_report+0x25c/0x370
? copy_ah_attr_from_uverbs.isra.2+0x860/0xa00
copy_ah_attr_from_uverbs.isra.2+0x860/0xa00
? uverbs_try_lock_object+0x68/0xc0
? modify_qp.isra.7+0xdc4/0x10e0
modify_qp.isra.7+0xdc4/0x10e0
ib_uverbs_modify_qp+0xfe/0x170
? ib_uverbs_query_qp+0x970/0x970
? __lock_acquire+0xa11/0x1da0
ib_uverbs_write+0x55a/0xad0
? ib_uverbs_query_qp+0x970/0x970
? ib_uverbs_query_qp+0x970/0x970
? ib_uverbs_open+0x760/0x760
? futex_wake+0x147/0x410
? sched_clock_cpu+0x18/0x180
? check_prev_add+0x1680/0x1680
? do_futex+0x3b6/0xa30
? sched_clock_cpu+0x18/0x180
__vfs_write+0xf7/0x5c0
? ib_uverbs_open+0x760/0x760
? kernel_read+0x110/0x110
? lock_acquire+0x370/0x370
? __fget+0x264/0x3b0
vfs_write+0x18a/0x460
SyS_write+0xc7/0x1a0
? SyS_read+0x1a0/0x1a0
? trace_hardirqs_on_thunk+0x1a/0x1c
entry_SYSCALL_64_fastpath+0x18/0x85
RIP: 0033:0x448e29
RSP: 002b:00007f443fee0c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f443fee16bc RCX: 0000000000448e29
RDX: 0000000000000078 RSI: 00000000209f8000 RDI: 0000000000000012
RBP: 000000000070bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000008e98 R14: 00000000006ebf38 R15: 0000000000000000
Allocated by task 1:
kmem_cache_alloc_trace+0x16c/0x2f0
mlx5_alloc_cmd_msg+0x12e/0x670
cmd_exec+0x419/0x1810
mlx5_cmd_exec+0x40/0x70
mlx5_core_mad_ifc+0x187/0x220
mlx5_MAD_IFC+0xd7/0x1b0
mlx5_query_mad_ifc_gids+0x1f3/0x650
mlx5_ib_query_gid+0xa4/0xc0
ib_query_gid+0x152/0x1a0
ib_query_port+0x21e/0x290
mlx5_port_immutable+0x30f/0x490
ib_register_device+0x5dd/0x1130
mlx5_ib_add+0x3e7/0x700
mlx5_add_device+0x124/0x510
mlx5_register_interface+0x11f/0x1c0
mlx5_ib_init+0x56/0x61
do_one_initcall+0xa3/0x250
kernel_init_freeable+0x309/0x3b8
kernel_init+0x14/0x180
ret_from_fork+0x24/0x30
Freed by task 1:
kfree+0xeb/0x2f0
mlx5_free_cmd_msg+0xcd/0x140
cmd_exec+0xeba/0x1810
mlx5_cmd_exec+0x40/0x70
mlx5_core_mad_ifc+0x187/0x220
mlx5_MAD_IFC+0xd7/0x1b0
mlx5_query_mad_ifc_gids+0x1f3/0x650
mlx5_ib_query_gid+0xa4/0xc0
ib_query_gid+0x152/0x1a0
ib_query_port+0x21e/0x290
mlx5_port_immutable+0x30f/0x490
ib_register_device+0x5dd/0x1130
mlx5_ib_add+0x3e7/0x700
mlx5_add_device+0x124/0x510
mlx5_register_interface+0x11f/0x1c0
mlx5_ib_init+0x56/0x61
do_one_initcall+0xa3/0x250
kernel_init_freeable+0x309/0x3b8
kernel_init+0x14/0x180
ret_from_fork+0x24/0x30
The buggy address belongs to the object at ffff88005fda1ab0
which belongs to the cache kmalloc-32 of size 32
The buggy address is located 24 bytes inside of
32-byte region [ffff88005fda1ab0, ffff88005fda1ad0)
The buggy address belongs to the page:
page:00000000d5655c19 count:1 mapcount:0 mapping: (null)
index:0xffff88005fda1fc0
flags: 0x4000000000000100(slab)
raw: 4000000000000100 0000000000000000 ffff88005fda1fc0 0000000180550008
raw: ffffea00017f6780 0000000400000004 ffff88006c803980 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88005fda1980: fc fc fb fb fb fb fc fc fb fb fb fb fc fc fb fb
ffff88005fda1a00: fb fb fc fc fb fb fb fb fc fc 00 00 00 00 fc fc
ffff88005fda1a80: fb fb fb fb fc fc fb fb fb fb fc fc fb fb fb fb
ffff88005fda1b00: fc fc 00 00 00 00 fc fc fb fb fb fb fc fc fb fb
ffff88005fda1b80: fb fb fc fc fb fb fb fb fc fc fb fb fb fb fc fc
==================================================================@
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # 4.11
Fixes: 3832125624 ("IB/core: Add support for idr types")
Reported-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This clarifies the design intention that time between allocate and
commit has the uobj exclusive to the caller. We already guarantee
this by delaying publishing the uobj pointer via idr_insert,
fd_install, list_add, etc.
Additionally holding the usecnt lock during this period provides
extra clarity and more protection against future mistakes.
Fixes: 3832125624 ("IB/core: Add support for idr types")
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
If the same attribute is listed twice by the user in the ioctl attribute
list then error unwind can cause the kernel to deref garbage.
This happens when an object with WRITE access is sent twice. The second
parse properly fails but corrupts the state required for the error unwind
it triggers.
Fixing this by making duplicates in the attribute list invalid. This is
not something we need to support.
The ioctl interface is currently recommended to be disabled in kConfig.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
32 bit processes running on a 64 bit kernel call compat_ioctl so that
implementations can revise any structure layout issues. Point compat_ioctl
at our normal ioctl because:
- All our structures are designed to be the same on 32 and 64 bit, ie we
use __aligned_u64 when required and are careful to manage padding.
- Any pointers are stored in u64's and userspace is expected
to prepare them properly.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This has no impact on the structure layout since these structs already
have their u64s already properly aligned, but it does document that we
have this requirement for 32 bit compatibility.
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Fix a bug in uverbs_ioctl_merge that looked at the object's iterator
number instead of the method's iterator number when merging methods.
While we're at it, make the uverbs_ioctl_merge code a bit more clear
and faster.
Fixes: 118620d368 ('IB/core: Add uverbs merge trees functionality')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The union approach will get the endianness wrong sometimes if the kernel's
pointer size is 32 bits resulting in EFAULTs when trying to copy to/from
user.
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The rule for the API is pointers less than 8 bytes are inlined into
the .data field of the attribute. Fix the creation of the driver udata
struct to follow this rule and point to the .data itself when the size
is less than 8 bytes.
Otherwise if the UHW struct is less than 8 bytes the driver will get
EFAULT during copy_from_user.
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This fixes several bugs around the copy_to/from user path:
- copy_to used the user provided size of the attribute
and could copy data beyond the end of the kernel buffer into
userspace.
- copy_from didn't know the size of the kernel buffer and
could have left kernel memory unexpectedly un-initialized.
- copy_from did not use the user length to determine if the
attribute data is inlined or not.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Resource tracking of XRCD objects is not implemented in current
version of restrack and hence can be removed.
Fixes: 02d8883f52 ("RDMA/restrack: Add general infrastructure to track RDMA resources")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
netdev_wait_allrefs() could rebroadcast NETDEV_UNREGISTER event
multiple times until all refs are gone, which will result in calling
ipoib_delete_debug_files multiple times and printing a warning.
Remove the WARN_ONCE since checks of NULL pointers before calling
debugfs_remove are not needed.
Fixes: 771a525840 ("IB/IPoIB: ibX: failed to create mcg debug file")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In banked-sr.c, we use a top-level '__asm__(".arch_extension virt")'
statement to allow compilation of a multi-CPU kernel for ARMv6
and older ARMv7-A that don't normally support access to the banked
registers.
This is considered to be a programming error by the gcc developers
and will no longer work in gcc-8, where we now get a build error:
/tmp/cc4Qy7GR.s:34: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_usr'
/tmp/cc4Qy7GR.s:41: Error: Banked registers are not available with this architecture. -- `mrs r3,ELR_hyp'
/tmp/cc4Qy7GR.s:55: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_svc'
/tmp/cc4Qy7GR.s:62: Error: Banked registers are not available with this architecture. -- `mrs r3,LR_svc'
/tmp/cc4Qy7GR.s:69: Error: Banked registers are not available with this architecture. -- `mrs r3,SPSR_svc'
/tmp/cc4Qy7GR.s:76: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_abt'
Passign the '-march-armv7ve' flag to gcc works, and is ok here, because
we know the functions won't ever be called on pre-ARMv7VE machines.
Unfortunately, older compiler versions (4.8 and earlier) do not understand
that flag, so we still need to keep the asm around.
Backporting to stable kernels (4.6+) is needed to allow those to be built
with future compilers as well.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84129
Fixes: 33280b4cd1 ("ARM: KVM: Add banked registers save/restore")
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
When introducing support for irqchip in userspace we needed a way to
mask the timer signal to prevent the guest continuously exiting due to a
screaming timer.
We did this by disabling the corresponding percpu interrupt on the
host interrupt controller, because we cannot rely on the host system
having a GIC, and therefore cannot make any assumptions about having an
active state to hide the timer signal.
Unfortunately, when introducing this feature, it became entirely
possible that a VCPU which belongs to a VM that has a userspace irqchip
can disable the vtimer irq on the host on some physical CPU, and then go
away without ever enabling the vtimer irq on that physical CPU again.
This means that using irqchips in userspace on a system that also
supports running VMs with an in-kernel GIC can prevent forward progress
from in-kernel GIC VMs.
Later on, when we started taking virtual timer interrupts in the arch
timer code, we would also leave this timer state active for userspace
irqchip VMs, because we leave it up to a VGIC-enabled guest to
deactivate the hardware IRQ using the HW bit in the LR.
Both issues are solved by only using the enable/disable trick on systems
that do not have a host GIC which supports the active state, because all
VMs on such systems must use irqchips in userspace. Systems that have a
working GIC with support for an active state use the active state to
mask the timer signal for both userspace and in-kernel irqchips.
Cc: Alexander Graf <agraf@suse.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: <stable@vger.kernel.org> # v4.12+
Fixes: d9e1397783 ("KVM: arm/arm64: Support arch timers with a userspace gic")
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Following on from this patch: https://lkml.org/lkml/2017/11/3/516,
Corsair K70 RGB keyboards also require the DELAY_INIT quirk to
start correctly at boot.
Device ids found here:
usb 3-3: New USB device found, idVendor=1b1c, idProduct=1b13
usb 3-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 3-3: Product: Corsair K70 RGB Gaming Keyboard
Signed-off-by: Jack Stocker <jackstocker.93@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Xtensa memory initialization code frees high memory pages without
checking whether they are in the reserved memory regions or not. That
results in invalid value of totalram_pages and duplicate page usage by
CMA and highmem. It produces a bunch of BUGs at startup looking like
this:
BUG: Bad page state in process swapper pfn:70800
page:be60c000 count:0 mapcount:-127 mapping: (null) index:0x1
flags: 0x80000000()
raw: 80000000 00000000 00000001 ffffff80 00000000 be60c014 be60c014 0000000a
page dumped because: nonzero mapcount
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Tainted: G B 4.16.0-rc1-00015-g7928b2cbe55b-dirty #23
Stack:
bd839d33 00000000 00000018 ba97b64c a106578c bd839d70 be60c000 00000000
a1378054 bd86a000 00000003 ba97b64c a1066166 bd839da0 be60c000 ffe00000
a1066b58 bd839dc0 be504000 00000000 000002f4 bd838000 00000000 0000001e
Call Trace:
[<a1065734>] bad_page+0xac/0xd0
[<a106578c>] free_pages_check_bad+0x34/0x4c
[<a1066166>] __free_pages_ok+0xae/0x14c
[<a1066b58>] __free_pages+0x30/0x64
[<a1365de5>] init_cma_reserved_pageblock+0x35/0x44
[<a13682dc>] cma_init_reserved_areas+0xf4/0x148
[<a10034b8>] do_one_initcall+0x80/0xf8
[<a1361c16>] kernel_init_freeable+0xda/0x13c
[<a125b59d>] kernel_init+0x9/0xd0
[<a1004304>] ret_from_kernel_thread+0xc/0x18
Only free high memory pages that are not reserved.
Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
There is a race condition between finish_unlinks->finish_urb() function
and usb_kill_urb() in ohci controller case. The finish_urb calls
spin_unlock(&ohci->lock) before usb_hcd_giveback_urb() function call,
then if during this time, usb_kill_urb is called for another endpoint,
then new ed will be added to ed_rm_list at beginning for unlink, and
ed_rm_list will point to newly added.
When finish_urb() is completed in finish_unlinks() and ed->td_list
becomes empty as in below code (in finish_unlinks() function):
if (list_empty(&ed->td_list)) {
*last = ed->ed_next;
ed->ed_next = NULL;
} else if (ohci->rh_state == OHCI_RH_RUNNING) {
*last = ed->ed_next;
ed->ed_next = NULL;
ed_schedule(ohci, ed);
}
The *last = ed->ed_next will make ed_rm_list to point to ed->ed_next
and previously added ed by usb_kill_urb will be left unreferenced by
ed_rm_list. This causes usb_kill_urb() hang forever waiting for
finish_unlink to remove added ed from ed_rm_list.
The main reason for hang in this race condtion is addition and removal
of ed from ed_rm_list in the beginning during usb_kill_urb and later
last* is modified in finish_unlinks().
As suggested by Alan Stern, the solution for proper handling of
ohci->ed_rm_list is to remove ed from the ed_rm_list before finishing
any URBs. Then at the end, we can add ed back to the list if necessary.
This properly handle the updated ohci->ed_rm_list in usb_kill_urb().
Fixes: 977dcfdc60 ("USB: OHCI: don't lose track of EDs when a controller dies")
Acked-by: Alan Stern <stern@rowland.harvard.edu>
CC: <stable@vger.kernel.org>
Signed-off-by: Aman Deep <aman.deep@samsung.com>
Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
At former code, the SETUP stage does not enable interrupt
for qtd completion, it relies on IAA watchdog to complete
interrupt, then the transcation would be considered timeout
if the flag need_io_watchdog is cleared by platform code.
In this commit, we always add enable interrupt for qtd completion,
then the qtd completion can be notified by hardware interrupt.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds support for new CASSY devices to the ldusb driver. The
PIDs are also added to the ignore list in hid-quirks.
Signed-off-by: Karsten Koop <kkoop@ld-didactic.de>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes an issue that a gadget driver (usb_f_fs) is possible to
stop rx transactions after the usb-dmac is used because the following
functions missed to set/check the "running" flag.
- usbhsf_dma_prepare_pop_with_usb_dmac()
- usbhsf_dma_pop_done_with_usb_dmac()
So, if next transaction uses pio, the usbhsf_prepare_pop() can not
start the transaction because the "running" flag is 0.
Fixes: 8355b2b308 ("usb: renesas_usbhs: fix the behavior of some usbhs_pkt_handle")
Cc: <stable@vger.kernel.org> # v3.19+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a8c06e407e ("usb: separate out sysdev pointer from usb_bus")
converted to use hcd->self.sysdev for DMA operations instead of
hcd->self.controller, but forgot to do it for hcd test mode. Replace
the correct one in this commit.
Fixes: a8c06e407e ("usb: separate out sysdev pointer from usb_bus")
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Running io_watchdog_func() while ohci_urb_enqueue() is running can
cause a race condition where ohci->prev_frame_no is corrupted and the
watchdog can mis-detect following error:
ohci-platform 664a0800.usb: frame counter not updating; disabled
ohci-platform 664a0800.usb: HC died; cleaning up
Specifically, following scenario causes a race condition:
1. ohci_urb_enqueue() calls spin_lock_irqsave(&ohci->lock, flags)
and enters the critical section
2. ohci_urb_enqueue() calls timer_pending(&ohci->io_watchdog) and it
returns false
3. ohci_urb_enqueue() sets ohci->prev_frame_no to a frame number
read by ohci_frame_no(ohci)
4. ohci_urb_enqueue() schedules io_watchdog_func() with mod_timer()
5. ohci_urb_enqueue() calls spin_unlock_irqrestore(&ohci->lock,
flags) and exits the critical section
6. Later, ohci_urb_enqueue() is called
7. ohci_urb_enqueue() calls spin_lock_irqsave(&ohci->lock, flags)
and enters the critical section
8. The timer scheduled on step 4 expires and io_watchdog_func() runs
9. io_watchdog_func() calls spin_lock_irqsave(&ohci->lock, flags)
and waits on it because ohci_urb_enqueue() is already in the
critical section on step 7
10. ohci_urb_enqueue() calls timer_pending(&ohci->io_watchdog) and it
returns false
11. ohci_urb_enqueue() sets ohci->prev_frame_no to new frame number
read by ohci_frame_no(ohci) because the frame number proceeded
between step 3 and 6
12. ohci_urb_enqueue() schedules io_watchdog_func() with mod_timer()
13. ohci_urb_enqueue() calls spin_unlock_irqrestore(&ohci->lock,
flags) and exits the critical section, then wake up
io_watchdog_func() which is waiting on step 9
14. io_watchdog_func() enters the critical section
15. io_watchdog_func() calls ohci_frame_no(ohci) and set frame_no
variable to the frame number
16. io_watchdog_func() compares frame_no and ohci->prev_frame_no
On step 16, because this calling of io_watchdog_func() is scheduled on
step 4, the frame number set in ohci->prev_frame_no is expected to the
number set on step 3. However, ohci->prev_frame_no is overwritten on
step 11. Because step 16 is executed soon after step 11, the frame
number might not proceed, so ohci->prev_frame_no must equals to
frame_no.
To address above scenario, this patch introduces a special sentinel
value IO_WATCHDOG_OFF and set this value to ohci->prev_frame_no when
the watchdog is not pending or running. When ohci_urb_enqueue()
schedules the watchdog (step 4 and 12 above), it compares
ohci->prev_frame_no to IO_WATCHDOG_OFF so that ohci->prev_frame_no is
not overwritten while io_watchdog_func() is running.
Signed-off-by: Shigeru Yoshida <Shigeru.Yoshida@windriver.com>
Signed-off-by: Haiqing Bai <Haiqing.Bai@windriver.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Quectel EP06 is a Cat. 6 LTE modem, and the interface mapping is as
follows:
0: Diag
1: NMEA
2: AT
3: Modem
Interface 4 is QMI and interface 5 is ADB, so they are blacklisted.
This patch should also be considered for -stable. The QMI-patch for this
modem is already in the -stable-queue.
v1->v2:
* Updated commit prefix (thanks Johan Hovold)
* Updated commit message slightly.
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Acked-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In function xhci_stop, xhci_debugfs_exit called before xhci_mem_cleanup.
xhci_debugfs_exit removed the xhci debugfs root nodes, xhci_mem_cleanup
called function xhci_free_virt_devices_depth_first which in turn called
function xhci_debugfs_remove_slot.
Function xhci_debugfs_remove_slot removed the nodes for devices, the nodes
folders are sub folder of xhci debugfs.
It is unreasonable to remove xhci debugfs root folder before
xhci debugfs sub folder. Function xhci_mem_cleanup should be called
before function xhci_debugfs_exit.
Fixes: 02b6fdc2a1 ("usb: xhci: Add debugfs interface for xHCI driver")
Cc: <stable@vger.kernel.org> # v4.15
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a bug after plugged out USB device, the device and its ep00
nodes are still kept, we need to remove the nodes in xhci_free_dev when
USB device is plugged out.
Fixes: 052f71e25a ("xhci: Fix xhci debugfs NULL pointer dereference in resume from hibernate")
Cc: <stable@vger.kernel.org> # v4.15
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During system resume from hibernation, xhci host is reset, all the
nodes in devices folder are removed in xhci_mem_cleanup function.
Later nodes in /sys/kernel/debug/usb/xhci/* are created again in
function xhci_run, but the nodes already exist, so the nodes still
keep the old ones, finally device nodes in xhci debugfs folder
/sys/kernel/debug/usb/xhci/*/devices/* are disappeared.
This fix removed xhci debugfs nodes before the nodes are re-created,
so all the nodes in xhci debugfs can be re-created successfully.
Fixes: 02b6fdc2a1 ("usb: xhci: Add debugfs interface for xHCI driver")
Cc: <stable@vger.kernel.org> # v4.15
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When disabling a USB3 port the hub driver will set the port link state to
U3 to prevent "ejected" or "safely removed" devices that are still
physically connected from immediately re-enumerating.
If the device was really unplugged, then error messages were printed
as the hub tries to set the U3 link state for a port that is no longer
enabled.
xhci-hcd ee000000.usb: Cannot set link state.
usb usb8-port1: cannot disable (err = -32)
Don't print error message in xhci-hub if hub tries to set port link state
for a disabled port. Return -ENODEV instead which also silences hub driver.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For AMD Promontory xHCI host, although you can disable USB ports in
BIOS settings, those ports will be enabled anyway after you remove a
device on that port and re-plug it in again. It's a known limitation of
the chip. As a workaround we can clear the PORT_WAKE_BITS.
[commit and code comment rephrasing -Mathias]
Signed-off-by: Joe Lee <asmt.swfae@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We are currently only checking for the first entry in the table while
we should check them all. Usual no-idle-on-init is together with
no-reset-on-init, so this has gone unnoticed.
Fixes: 566a9b05e1 ("bus: ti-sysc: Handle module quirks based dts
configuration")
Signed-off-by: Tony Lindgren <tony@atomide.com>
In order for ULPI PHYs to work, dwc3_phy_setup() and dwc3_ulpi_init()
must be doene before dwc3_core_get_phy().
commit 541768b08a ("usb: dwc3: core: Call dwc3_core_get_phy() before initializing phys")
broke this.
The other issue is that dwc3_core_get_phy() and dwc3_ulpi_init() should
be called only once during the life cycle of the driver. However,
as dwc3_core_init() is called during system suspend/resume it will
result in multiple calls to dwc3_core_get_phy() and dwc3_ulpi_init()
which is wrong.
Fix this by moving dwc3_ulpi_init() out of dwc3_phy_setup()
into dwc3_core_ulpi_init(). Use a flag 'ulpi_ready' to ensure that
dwc3_core_ulpi_init() is called only once from dwc3_core_init().
Use another flag 'phys_ready' to call dwc3_core_get_phy() only once from
dwc3_core_init().
Fixes: 541768b08a ("usb: dwc3: core: Call dwc3_core_get_phy() before initializing phys")
Fixes: f54edb539c ("usb: dwc3: core: initialize ULPI before trying to get the PHY")
Cc: linux-stable <stable@vger.kernel.org> # >= v4.13
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Add the missing platform_device_put() before return from bdc_pci_probe()
in the platform_device_add_resources() error handling case.
Fixes: efed421a94 ("usb: gadget: Add UDC driver for Broadcom USB3.0 device controller IP BDC")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Rockchip recommends to run the CPU cores only with operations points of
1.6 GHz or lower.
Removed the cpu0 node with too high operation points and use the default
values instead.
Fixes: 903d31e346 ("ARM: dts: rockchip: Add support for phyCORE-RK3288 SoM")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Schultz <d.schultz@phytec.de>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Free Electrons is now Bootlin, change my email address accordingly.
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
This patch fixes a bootproblem with the Bananapi M2 board. Since there
are some regulators missing we add them right now. Those values come
from the schematic, below you can find a small overview:
* reg_aldo1: 3,3V, powers the wifi
* reg_aldo2: 2,5V, powers the IO of the RTL8211E
* reg_aldo3: 3,3V, powers the audio
* reg_dldo1: 3,0V, powers the RTL8211E
* reg_dldo2: 2,8V, powers the analog part of the csi
* reg_dldo3: 3,3V, powers misc
* reg_eldo1: 1,8V, powers the csi
* reg_ldo_io1:1,8V, powers the gpio
* reg_dc5ldo: needs to be always on
This patch updates also the vmmc-supply properties on the mmc0 and mmc2
node to use the allready existent regulators.
We can now remove the sunxi-common-regulators.dtsi include since we
don't need it anymore.
Fixes: 7daa213700 ("ARM: dts: sunxi: Add regulators for Sinovoip BPI-M2")
Cc: <stable@vger.kernel.org>
Signed-off-by: Philipp Rossak <embed3d@gmail.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Quentin Monnet says:
====================
These are two minor fixes to avoid breaking JSON output in batch mode. The
first one makes bpftool output a "null" JSON object, as expected in batch
mode if nothing else is to be printed, when dumping program instructions
into an output file. The second one replaces a call to "perror()" with
something that does not break JSON when parsing input file for batch mode.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Before this patch, perror() function is used in some cases when bpftool
fails to parse its input file in batch mode. This function does not
integrate well with the rest of the output when JSON is used, so we
replace it by something that is compliant.
Most calls to perror() had already been replaced in a previous patch,
this one is a leftover.
Fixes: d319c8e101c5 ("tools: bpftool: preserve JSON output on errors on batch file parsing")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Print a "null" JSON object to standard output when bpftool is used to
print program instructions to a file, so as to avoid breaking JSON
output on batch mode.
This null object was added for most commands in a previous commit, but
this specific case had been omitted.
Fixes: 004b45c0e5 ("tools: bpftool: provide JSON output for all possible commands")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The eldoin is supplied from the dcdc1 regulator. The N_VBUSEN pin is
connected to an external power regulator (SY6280AAC).
With this commit we update the pmic binding properties to support
those features.
Fixes: 7daa213700 ("ARM: dts: sunxi: Add regulators for Sinovoip BPI-M2")
Cc: <stable@vger.kernel.org>
Signed-off-by: Philipp Rossak <embed3d@gmail.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
when build kernel with default configure, files:
generatenet/ipv4/netfilter/nf_nat_snmp_basic-asn1.c
net/ipv4/netfilter/nf_nat_snmp_basic-asn1.h
will be automatically generated by ASN.1 compiler, so
No need to track them in git, it's better to ignore them.
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
all of these print simple error message - use single pr_ratelimit call.
checkpatch complains about lines > 80 but this would require splitting
several "literals" over multiple lines which is worse.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ebt_among still uses pr_err -- these errors indicate ebtables tool bug,
not a usage error.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
also convert this to info for consistency.
These errors are informational message to user, given iptables doesn't
have netlink extack equivalent.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
switch this to info, since these aren't really errors.
We only use printk because we cannot report meaningful errors
in the xtables framework.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
checkpatch complains about line > 80 but this would require splitting
"literal" over two lines which is worse.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
most messages are converted to info, since they occur in response to
wrong usage.
Size mismatch however is a real error (xtables ABI bug) that should not
occur.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
remove several pr_info messages that cannot be triggered with iptables,
the check is only to ensure input is sane.
iptables(8) already prints error messages in these cases.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In clusterip_config_find_get() we hold RCU read lock so it could
run concurrently with clusterip_config_entry_put(), as a result,
the refcnt could go back to 1 from 0, which leads to a double
list_del()... Just replace refcount_inc() with
refcount_inc_not_zero(), as for c->refcount.
Fixes: d73f33b168 ("netfilter: CLUSTERIP: RCU conversion")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Return the TLS record sequence number in getsockopt.
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
copy_from_user could copy some partial information, as a result
TLS_CRYPTO_INFO_READY(crypto_info) could be true while crypto_info is
using uninitialzed data.
This patch resets crypto_info when copy_from_user fails.
fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current code returns four bytes of salt followed by four bytes of IV.
This patch returns all eight bytes of IV.
fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Axtens says:
====================
Updates to segmentation-offloads.txt
I've been trying to wrap my head around GSO for a while now. This is a
set of small changes to the docs that would probably have been helpful
when I was starting out.
I realise that GSO_DODGY is still a notable omission - I'm hesitant to
write too much on it just yet as I don't understand it well and I
think it's in the process of changing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The doc originally called it SKB_GSO_REMCSUM. Fix it.
Fixes: f7a6272bf3 ("Documentation: Add documentation for TSO and GSO features")
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
UFO is deprecated except for tuntap and packet per 0c19f846d5,
("net: accept UFO datagrams from tuntap and packet"). Update UFO
docs to reflect this.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rationale for removing the check is only correct for rulesets
generated by ip(6)tables.
In iptables, a jump can only occur to a user-defined chain, i.e.
because we size the stack based on number of user-defined chains we
cannot exceed stack size.
However, the underlying binary format has no such restriction,
and the validation step only ensures that the jump target is a
valid rule start point.
IOW, its possible to build a rule blob that has no user-defined
chains but does contain a jump.
If this happens, no jump stack gets allocated and crash occurs
because no jumpstack was allocated.
Fixes: 7814b6ec6d ("netfilter: xtables: don't save/restore jumpstack offset")
Reported-by: syzbot+e783f671527912cd9403@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Ying Xue says:
====================
tipc: Fix missing RTNL lock protection during setting link properties
At present it's unsafe to configure link properties through netlink
as the entire setting process is not under RTNL lock protection. Now
TIPC supports two different sets of netlink APIs at the same time, and
they share the same set of backend functions to configure bearer,
media and net properties. In order to solve the missing RTNL issue,
we have to make the whole __tipc_nl_compat_doit() protected by RTNL,
which means any function called within it cannot take RTNL any more.
So in the series we first introduce the following new functions which
doesn't hold RTNl lock:
- __tipc_nl_bearer_disable()
- __tipc_nl_bearer_enable()
- __tipc_nl_bearer_set()
- __tipc_nl_media_set()
- __tipc_nl_net_set()
Meanwhile, __tipc_nl_compat_doit() has been reconstructed to minimize
the time of holding RTNL lock.
Changes in v4:
- Per suggestion of Kirill Tkhai, divided original big one patch into
seven small ones so that they can be easily reviewed.
Changes in v3:
- Optimized return method of __tipc_nl_bearer_enable() regarding
the comments from David M and Kirill Tkhai
- Moved the allocations of memory in __tipc_nl_compat_doit() out
of RTNL lock to minimize the time of holding RTNL lock according
to the suggestion of Kirill Tkhai.
Changes in v2:
- The whole operation of setting bearer/media properties has been
protected under RTNL, as per feedback from David M.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently when user changes link properties, TIPC first checks if
user's command message contains media name or bearer name through
tipc_media_find() or tipc_bearer_find() which is protected by RTNL
lock. But when tipc_nl_compat_link_set() conducts the checking with
the two functions, it doesn't hold RTNL lock at all, as a result,
the following complaints were reported:
audit: type=1400 audit(1514679888.244:9): avc: denied { write } for
pid=3194 comm="syzkaller021477" path="socket:[11143]" dev="sockfs"
ino=11143 scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
tclass=netlink_generic_socket permissive=1
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
=============================
WARNING: suspicious RCU usage
4.15.0-rc5+ #152 Not tainted
-----------------------------
net/tipc/bearer.c:177 suspicious rcu_dereference_protected() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
2 locks held by syzkaller021477/3194:
#0: (cb_lock){++++}, at: [<00000000d20133ea>] genl_rcv+0x19/0x40
net/netlink/genetlink.c:634
#1: (genl_mutex){+.+.}, at: [<00000000fcc5d1bc>] genl_lock
net/netlink/genetlink.c:33 [inline]
#1: (genl_mutex){+.+.}, at: [<00000000fcc5d1bc>] genl_rcv_msg+0x115/0x140
net/netlink/genetlink.c:622
stack backtrace:
CPU: 1 PID: 3194 Comm: syzkaller021477 Not tainted 4.15.0-rc5+ #152
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4585
tipc_bearer_find+0x2b4/0x3b0 net/tipc/bearer.c:177
tipc_nl_compat_link_set+0x329/0x9f0 net/tipc/netlink_compat.c:729
__tipc_nl_compat_doit net/tipc/netlink_compat.c:288 [inline]
tipc_nl_compat_doit+0x15b/0x660 net/tipc/netlink_compat.c:335
tipc_nl_compat_handle net/tipc/netlink_compat.c:1119 [inline]
tipc_nl_compat_recv+0x112f/0x18f0 net/tipc/netlink_compat.c:1201
genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:599
genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:624
netlink_rcv_skb+0x21e/0x460 net/netlink/af_netlink.c:2408
genl_rcv+0x28/0x40 net/netlink/genetlink.c:635
netlink_unicast_kernel net/netlink/af_netlink.c:1275 [inline]
netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1301
netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1864
sock_sendmsg_nosec net/socket.c:636 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:646
sock_write_iter+0x31a/0x5d0 net/socket.c:915
call_write_iter include/linux/fs.h:1772 [inline]
new_sync_write fs/read_write.c:469 [inline]
__vfs_write+0x684/0x970 fs/read_write.c:482
vfs_write+0x189/0x510 fs/read_write.c:544
SYSC_write fs/read_write.c:589 [inline]
SyS_write+0xef/0x220 fs/read_write.c:581
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x54/0x63 arch/x86/entry/entry_64_compat.S:129
In order to correct the mistake, __tipc_nl_compat_doit() has been
protected by RTNL lock, which means the whole operation of setting
bearer/media properties is under RTNL protection.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reported-by: syzbot <syzbot+6345fd433db009b29413@syzkaller.appspotmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce __tipc_nl_bearer_set() which doesn't holding RTNL lock.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce __tipc_nl_bearer_enable() which doesn't hold RTNL lock.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce __tipc_nl_bearer_disable() which doesn't hold RTNL lock.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As preparation for adding RTNL to make (*cmd->transcode)() and
(*cmd->transcode)() constantly protected by RTNL lock, we move out of
memory allocations existing between them as many as possible so that
the time of holding RTNL can be minimized in __tipc_nl_compat_doit().
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Syzbot reported a possible deadlock in the netfilter area caused by
rtnl lock, xt lock and socket lock being acquired with a different order
on different code paths, leading to the following backtrace:
Reviewed-by: Xin Long <lucien.xin@gmail.com>
======================================================
WARNING: possible circular locking dependency detected
4.15.0+ #301 Not tainted
------------------------------------------------------
syzkaller233489/4179 is trying to acquire lock:
(rtnl_mutex){+.+.}, at: [<0000000048e996fd>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74
but task is already holding lock:
(&xt[i].mutex){+.+.}, at: [<00000000328553a2>]
xt_find_table_lock+0x3e/0x3e0 net/netfilter/x_tables.c:1041
which lock already depends on the new lock.
===
Since commit 3f34cfae1230 ("netfilter: on sockopt() acquire sock lock
only in the required scope"), we already acquire the socket lock in
the innermost scope, where needed. In such commit I forgot to remove
the outer-most socket lock from the getsockopt() path, this commit
addresses the issues dropping it now.
v1 -> v2: fix bad subj, added relavant 'fixes' tag
Fixes: 22265a5c3c ("netfilter: xt_TEE: resolve oif using netdevice notifiers")
Fixes: 202f59afd4 ("netfilter: ipt_CLUSTERIP: do not hold dev")
Fixes: 3f34cfae1230 ("netfilter: on sockopt() acquire sock lock only in the required scope")
Reported-by: syzbot+ddde1c7b7ff7442d7f2d@syzkaller.appspotmail.com
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Thomas Falcon says:
====================
ibmvnic: Fix memory leaks in the driver
This patch set is pretty self-explanatory. It includes
a number of patches that fix memory leaks found with
kmemleak in the ibmvnic driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
During device close or reset, there were some cases of outstanding
RX socket buffers not being freed. Include a function similar to the
one that already exists to clean TX socket buffers in this case.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a RX buffer is returned to the client driver with an error, free the
corresponding socket buffer before continuing.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This memory is allocated during initialization but never freed,
so do that now.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During device bringup, the driver exchanges login buffers with
firmware. These buffers contain information such number of TX
and RX queues alloted to the device, RX buffer size, etc. These
buffers weren't being properly freed on device reset or close.
We can free the buffer we send to firmware as soon as we get
a response. There is information in the response buffer that
the driver needs for normal operation so retain it until the
next reset or removal.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pushes back setting the carrier on until the end of the reset
code. This resolves a bug where a watchdog timer was detecting
that a TX queue had stalled before the adapter reset was complete.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit aa136d0c82.
As I previously[1] pointed out this implementation of XDP_REDIRECT is
wrong. XDP_REDIRECT is a facility that must work between different
NIC drivers. Another NIC driver can call ndo_xdp_xmit/nicvf_xdp_xmit,
but your driver patch assumes payload data (at top of page) will
contain a queue index and a DMA addr, this is not true and worse will
likely contain garbage.
Given you have not fixed this in due time (just reached v4.16-rc1),
the only option I see is a revert.
[1] http://lkml.kernel.org/r/20171211130902.482513d3@redhat.com
Cc: Sunil Goutham <sgoutham@cavium.com>
Cc: Christina Jacob <cjacob@caviumnetworks.com>
Cc: Aleksey Makarov <aleksey.makarov@cavium.com>
Fixes: aa136d0c82 ("net: thunderx: Add support for xdp redirect")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to fix the file comments in stream.c and
stream_interleave.c
v1->v2:
rephrase the comment for stream.c according to Neil's suggestion.
Fixes: a83863174a ("sctp: prepare asoc stream for stream reconf")
Fixes: 0c3f6f6554 ("sctp: implement make_datafrag for sctp_stream_interleave")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
netif_set_real_num_tx_queues() can be called when netdev is up.
That usually happens when user requests change of number of
channels/rings with ethtool -L. The procedure for changing
the number of queues involves resetting the qdiscs and setting
dev->num_tx_queues to the new value. When the new value is
lower than the old one, extra care has to be taken to ensure
ordering of accesses to the number of queues vs qdisc reset.
Currently the queues are reset before new dev->num_tx_queues
is assigned, leaving a window of time where packets can be
enqueued onto the queues going down, leading to a likely
crash in the drivers, since most drivers don't check if TX
skbs are assigned to an active queue.
Fixes: e6484930d7 ("net: allocate tx queues in register_netdevice")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
syzkaller tried to perform a prog query in perf_event_query_prog_array()
where struct perf_event_query_bpf had an ids_len of 1,073,741,353 and
thus causing a warning due to failed kcalloc() allocation out of the
bpf_prog_array_copy_to_user() helper. Given we cannot attach more than
64 programs to a perf event, there's no point in allowing huge ids_len.
Therefore, allow a buffer that would fix the maximum number of ids and
also add a __GFP_NOWARN to the temporary ids buffer.
Fixes: f371b304f1 ("bpf/tracing: allow user space to query prog array on the same tp")
Fixes: 0911287ce3 ("bpf: fix bpf_prog_array_copy_to_user() issues")
Reported-by: syzbot+cab5816b0edbabf598b3@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The pinmuxing was missing for I2C1 which was causing intermittent issues
with the PMIC which is connected to I2C1. The bootloader did not quite
configure the I2C1 either, so when running at 2.6MHz, it was generating
errors at times.
This correctly sets the I2C1 pinmuxing so it can operate at 2.6MHz
Fixes: ab8dd3aed0 ("ARM: DTS: Add minimal Support for Logic PD DM3730
SOM-LV")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The pinmuxing was missing for I2C1 which was causing intermittent issues
with the PMIC which is connected to I2C1. The bootloader did not quite
configure the I2C1 either, so when running at 2.6MHz, it was generating
errors at time.
This correctly sets the I2C1 pinmuxing so it can operate at 2.6MHz
Fixes: 687c276761 ("ARM: dts: Add minimal support for LogicPD Torpedo
DM3730 devkit")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
When exposing data access through debugfs, the correct
debugfs_create_*() functions must be used, depending on data type.
Remove all casts from data pointers passed to debugfs_create_*()
functions, as such casts prevent the compiler from flagging bugs.
Correct all wrong usage:
- clk.rate is unsigned long, not u32,
- clk.flags is u8, not u32, which exposed the successive
clk.rate_offset and clk.src_offset fields.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Tony Lindgren <tony@atomide.com>
HS omaps use irq_save_secure_context() instead of irq_save_context()
so sar_base will never get initialized and irq_sar_clear() gets called
with a wrong address for HS omaps from irq_restore_context().
Starting with commit f4b9f40ae9 ("ARM: OMAP4+: Initialize SAR RAM
base early for proper CPU1 reset for kexec") we have it available,
and this ideally would been fixed with that commit already.
Fixes: f4b9f40ae9 ("ARM: OMAP4+: Initialize SAR RAM base early for
proper CPU1 reset for kexec")
Cc: Andrew F. Davis <afd@ti.com>
Cc: Dave Gerlach <d-gerlach@ti.com>
Cc: Keerthy <j-keerthy@ti.com>
Cc: Santosh Shilimkar <ssantosh@kernel.org>
Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
For platform_suspend_ops, the finish call is too late to re-enable wake
irqs and we need re-enable wake irqs on wake call instead.
Otherwise noirq resume for devices has already happened. And then
dev_pm_disarm_wake_irq() has already disabled the dedicated wake irqs
when the interrupt triggers and the wake irq is never handled.
For devices that are already in PM runtime suspended state when we
enter suspend this means that a possible wake irq will never trigger.
And this can lead into a situation where a device has a pending padconf
wake irq, and the device will stay unresponsive to any further wake
irqs.
This issue can be easily reproduced by setting serial console log level
to zero, letting the serial console idle, and suspend the system from
an ssh terminal. Then try to wake up the system by typing to the serial
console.
Note that this affects only omap3 PRM interrupt as that's currently
the only omap variant that does anything in omap_pm_wake().
In general, for the wake irqs to work, the interrupt must have either
IRQF_NO_SUSPEND or IRQF_EARLY_RESUME set for it to trigger before
dev_pm_disarm_wake_irq() disables the wake irqs.
Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
This patch adds missing DT binding files to the Samsung ASoC
drivers entry.
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
When more than one GP timers are used as kernel system timers and the
corresponding nodes in device-tree are marked with the same "disabled"
property, then the "attr" field of the property will be initialized
more than once as the property being added to sys file system via
__of_add_property_sysfs().
In __of_add_property_sysfs(), the "name" field of pp->attr.attr is set
directly to the return value of safe_name(), without taking care of
whether it's already a valid pointer to a memory block. If it is, its
old value will always be overwritten by the new one and the memory block
allocated before will a "ghost", then a kmemleak happened.
That the same "disabled" property being added to different nodes of device
tree would cause that kind of kmemleak overhead, at least once.
To fix it, allocate the property dynamically, and delete static one.
Signed-off-by: Qi Hou <qi.hou@windriver.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
KVM: s390: Fixes and improvements for 4.16
- optimization for the exitless interrupt support that was merged
in 4.16-rc1
- improve the branch prediction blocking for nested KVM
- replace some jump tables with switch statements to improve
expoline performance
Current implementation mute codec in global way (DAC block).
That means when user routes sound not from I2S but from
AUX source (LINE_IN) it also will be muted by alsa core.
This should not happen.
Signed-off-by: Michal Oleszczyk <oleszczyk.m@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Instead of having huge jump tables for function selection,
let's use normal switch/case statements for the instruction
handlers in intercept.c We can now also get rid of
intercept_handler_t.
This allows the compiler to make the right decision depending
on the situation (e.g. avoid jump-tables for thunks).
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Instead of having huge jump tables for function selection,
let's use normal switch/case statements for the instruction
handlers in priv.c
This allows the compiler to make the right decision depending
on the situation (e.g. avoid jump-tables for thunks).
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
If the guest runs with bp isolation when doing a SIE instruction,
we must also run the nested guest with bp isolation when emulating
that SIE instruction.
This is done by activating BPBC in the lpar, which acts as an override
for lower level guests.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
If GISA is available, we do not have to kick CPUs out of SIE to deliver
interrupts. The hardware can deliver such interrupts while running.
Cc: Michael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
For interrupt injection of floating interrupts we queue the interrupt
either in the GISA or in the floating interrupt list. The first CPU
that looks at these data structures - either in KVM code or hardware
will then deliver that interrupt. To minimize latency we also:
-a: choose a VCPU to deliver that interrupt. We prefer idle CPUs
-b: we wake up the host thread that runs the VCPU
-c: set an I/O intervention bit for that CPU so that it exits guest
context as soon as the PSW I/O mask is enabled
This will make sure that this CPU will execute the interrupt delivery
code of KVM very soon.
We can now optimize the injection case if we have exitless interrupts.
The wakeup is still necessary in case the target CPU sleeps. We can
avoid the I/O intervention request bit though. Whenever this
intervention request would be handled, the hardware could also directly
inject the interrupt on that CPU, no need to go through the interrupt
injection loop of KVM.
Cc: Michael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
In case user program provides silly parameters, we want
a map_alloc() handler to return an error, not a NULL pointer,
otherwise we crash later in find_and_alloc_map()
Fixes: 1aa12bdf1b ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
There is a memory leak happening in lpm_trie map_free callback
function trie_free. The trie structure itself does not get freed.
Also, trie_free function did not do synchronize_rcu before freeing
various data structures. This is incorrect as some rcu_read_lock
region(s) for lookup, update, delete or get_next_key may not complete yet.
The fix is to add synchronize_rcu in the beginning of trie_free.
The useless spin_lock is removed from this function as well.
Fixes: b95a5c4db0 ("bpf: add a longest prefix match trie map implementation")
Reported-by: Mathieu Malaterre <malat@debian.org>
Reported-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When aacraid init fails with "AAC0: adapter self-test failed.", shutdown
leads to UBSAN warning and then oops:
[154316.118423] ================================================================================
[154316.118508] UBSAN: Undefined behaviour in drivers/scsi/scsi_lib.c:2328:27
[154316.118566] member access within null pointer of type 'struct Scsi_Host'
[154316.118631] CPU: 2 PID: 14530 Comm: reboot Tainted: G W 4.15.0-dirty #89
[154316.118701] Hardware name: Hewlett Packard HP NetServer/HP System Board, BIOS 4.06.46 PW 06/25/2003
[154316.118774] Call Trace:
[154316.118848] dump_stack+0x48/0x65
[154316.118916] ubsan_epilogue+0xe/0x40
[154316.118976] __ubsan_handle_type_mismatch+0xfb/0x180
[154316.119043] scsi_block_requests+0x20/0x30
[154316.119135] aac_shutdown+0x18/0x40 [aacraid]
[154316.119196] pci_device_shutdown+0x33/0x50
[154316.119269] device_shutdown+0x18a/0x390
[...]
[154316.123435] BUG: unable to handle kernel NULL pointer dereference at 000000f4
[154316.123515] IP: scsi_block_requests+0xa/0x30
This is because aac_shutdown() does
struct Scsi_Host *shost = pci_get_drvdata(dev);
scsi_block_requests(shost);
and that assumes shost has been assigned with pci_set_drvdata().
However, pci_set_drvdata(pdev, shost) is done in aac_probe_one() far
after bailing out with error from calling the init function
((*aac_drivers[index].init)(aac)), and when the init function fails, no
error is returned from aac_probe_one() so PCI layer assumes there is
driver attached, and tries to shut it down later.
Fix it by returning error from aac_probe_one() when card-specific init
function fails.
This fixes reboot on my HP NetRAID-4M with dead battery.
Signed-off-by: Meelis Roos <mroos@linux.ee>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The data in NVRAM is not guaranteed to be NUL terminated. Since
snprintf expects byte-stream to accommodate null byte, the CHAP secret
is truncated. Use sprintf instead of snprintf to fix the truncation of
CHAP name and secret.
Signed-off-by: Andrew Vasquez <andrew.vasquez@cavium.com>
Signed-off-by: Nilesh Javali <nilesh.javali@cavium.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Acked-by: Chris Leech <cleech@redhat.com>
Acked-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch is based on Max's original patch.
When the qla2xxx firmware is unavailable, eventually
qla2x00_sp_timeout() is reached, which calls the timeout function and
frees the srb_t instance.
The timeout function always resolves to qla2x00_async_iocb_timeout(),
which invokes another callback function called "done". All of these
qla2x00_*_sp_done() callbacks also free the srb_t instance; after
returning to qla2x00_sp_timeout(), it is freed again.
The fix is to remove the "sp->free(sp)" call from qla2x00_sp_timeout()
and add it to those code paths in qla2x00_async_iocb_timeout() which
do not already free the object.
This is how it looks like with KASAN:
BUG: KASAN: use-after-free in qla2x00_sp_timeout+0x228/0x250
Read of size 8 at addr ffff88278147a590 by task swapper/2/0
Allocated by task 1502:
save_stack+0x33/0xa0
kasan_kmalloc+0xa0/0xd0
kmem_cache_alloc+0xb8/0x1c0
mempool_alloc+0xd6/0x260
qla24xx_async_gnl+0x3c5/0x1100
Freed by task 0:
save_stack+0x33/0xa0
kasan_slab_free+0x72/0xc0
kmem_cache_free+0x75/0x200
qla24xx_async_gnl_sp_done+0x556/0x9e0
qla2x00_async_iocb_timeout+0x1c7/0x420
qla2x00_sp_timeout+0x16d/0x250
call_timer_fn+0x36/0x200
The buggy address belongs to the object at ffff88278147a440
which belongs to the cache qla2xxx_srbs of size 344
The buggy address is located 336 bytes inside of
344-byte region [ffff88278147a440, ffff88278147a598)
Reported-by: Max Kellermann <mk@cm4all.com>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Cc: Max Kellermann <mk@cm4all.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Increase cmd_per_lun to allow more I/Os in progress per device,
particularly for NVMe's. The Hyper-V host side can handle the higher
count with no issues.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some other drivers may be waiting for our extcon to show-up, exiting their
probe methods with -EPROBE_DEFER until we show up.
These drivers will typically get the cable state directly after getting
the extcon, this commit changes the int3496 code to wait for the initial
processing of the id-pin to complete before exiting probe() with 0, which
will cause devices waiting on the defered probe to get reprobed.
This fixes a race where the initial work might still be running while other
drivers were already calling extcon_get_state().
Fixes: 2f556bdb9f ("extcon: int3496: Add Intel INT3496 ACPI ... driver")
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
The Makefile lacks a couple of line continuation backslashes
in an `if' clause, which can make the subsequent rsync
command go awry over the whole filesystem (`rsync -a / /`).
/bin/sh: -c: line 5: syntax error: unexpected end of file
make[1]: [all] Error 1 (ignored)
TEST=$DIR"_test.sh"; \
if [ -e $DIR/$TEST ]; then
/bin/sh: -c: line 2: syntax error: unexpected end of file
make[1]: [all] Error 1 (ignored)
rsync -a $DIR/$TEST $BUILD_TARGET/;
[...a myriad of:]
[ rsync: readlink_stat("...") failed: Permission denied (13)]
[ skipping non-regular file "..."]
[ rsync: opendir "..." failed: Permission denied (13)]
[and many other errors...]
fi
make[1]: fi: Command not found
make[1]: [all] Error 127 (ignored)
done
make[1]: done: Command not found
make[1]: [all] Error 127 (ignored)
Signed-off-by: Daniel Díaz <daniel.diaz@linaro.org>
Acked-by: Pintu Agarwal <pintu.ping@gmail.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Commit ebeeb1ad9b ("rds: tcp: use rds_destroy_pending() to synchronize
netns/module teardown and rds connection/workq management")
adds an rcu read critical section to __rd_conn_create. The
memory allocations in that critcal section need to use
GFP_ATOMIC to avoid sleeping.
This patch was verified with syzkaller reproducer.
Reported-by: syzbot+a0564419941aaae3fe3c@syzkaller.appspotmail.com
Fixes: ebeeb1ad9b ("rds: tcp: use rds_destroy_pending() to synchronize
netns/module teardown and rds connection/workq management")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on patch: https://patchwork.kernel.org/patch/10042045/
arch64-linux-gnu-gcc -c sync.c -o sync/sync.o
sync.c:42:29: fatal error: linux/sync_file.h: No such file or directory
#include <linux/sync_file.h>
^
CFLAGS is not used during the compile step, so the system instead of
kernel headers are used. Fix this by adding CFLAGS to the OBJS compile
rule.
Reported-by: Lei Yang <Lei.Yang@windriver.com>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Daniel Díaz <daniel.diaz@linaro.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Jiri Pirko says:
====================
net: sched: couple of fixes
This patchset contains couple of fixes following-up the shared block
patchsets.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The offending commit wrongly assumes 1:1 mapping between block and q.
However, there are multiple blocks for a single q for classful qdiscs.
Since the obscure tc_u_common sharing mechanism expects it to be shared
among a qdisc, fix it by storing q pointer in case the block is not
shared.
Reported-by: Paweł Staszewski <pstaszewski@itcare.pl>
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Fixes: 7fa9d974f3 ("net: sched: cls_u32: use block instead of q in tc_u_common")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is pointless to set block->q for block which are shared among
multiple qdiscs. So remove the assignment in that case. Do a bit of code
reshuffle to make block->index initialized at that point so we can use
tcf_block_shared() helper.
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Fixes: 4861738775 ("net: sched: introduce shared filter blocks infrastructure")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since mlxsw_sp_fib_create() and mlxsw_sp_mr_table_create()
use ERR_PTR macro to propagate int err through return of a pointer,
the return value is not NULL in case of failure. So if one
of the calls fails, one of vr->fib4, vr->fib6 or vr->mr4_table
is not NULL and mlxsw_sp_vr_is_used wrongly assumes
that vr is in use which leads to crash like following one:
[ 1293.949291] BUG: unable to handle kernel NULL pointer dereference at 00000000000006c9
[ 1293.952729] IP: mlxsw_sp_mr_table_flush+0x15/0x70 [mlxsw_spectrum]
Fix this by using local variables to hold the pointers and set vr->*
only in case everything went fine.
Fixes: 76610ebbde ("mlxsw: spectrum_router: Refactor virtual router handling")
Fixes: a3d9bc506d ("mlxsw: spectrum_router: Extend virtual routers with IPv6 support")
Fixes: d42b0965b1 ("mlxsw: spectrum_router: Add multicast routes notification handling functionality")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a compile problem of some user space applications by not
including linux/libc-compat.h in uapi/if_ether.h.
linux/libc-compat.h checks which "features" the header files, included
from the libc, provide to make the Linux kernel uapi header files only
provide no conflicting structures and enums. If a user application mixes
kernel headers and libc headers it could happen that linux/libc-compat.h
gets included too early where not all other libc headers are included
yet. Then the linux/libc-compat.h would not prevent all the
redefinitions and we run into compile problems.
This patch removes the include of linux/libc-compat.h from
uapi/if_ether.h to fix the recently introduced case, but not all as this
is more or less impossible.
It is no problem to do the check directly in the if_ether.h file and not
in libc-compat.h as this does not need any fancy glibc header detection
as glibc never provided struct ethhdr and should define
__UAPI_DEF_ETHHDR by them self when they will provide this.
The following test program did not compile correctly any more:
#include <linux/if_ether.h>
#include <netinet/in.h>
#include <linux/in.h>
int main(void)
{
return 0;
}
Fixes: 6926e041a8 ("uapi/if_ether.h: prevent redefinition of struct ethhdr")
Reported-by: Guillaume Nault <g.nault@alphalink.fr>
Cc: <stable@vger.kernel.org> # 4.15
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Free Electrons is now Bootlin, change my email address accordingly.
Actually the free-electrons.com emails are still valid but as I don't
know for how many time, it's better to do the change now.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Dcoumentation has been added by parsing through git commit history and
reading code. This might be useful for scripting and tracking changes in
the ABI.
I do not have complete descriptions for the following 3 attributes; they
have been annotated with the comment [to be documented] -
/sys/class/scsi_host/hostX/ahci_port_cmd
/sys/class/scsi_host/hostX/ahci_host_caps
/sys/class/scsi_host/hostX/ahci_host_cap2
Signed-off-by: Aishwarya Pant <aishpant@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Clean-up the documentation of sysfs interfaces to be in the same format
as described in Documentation/ABI/README. This will be useful for
tracking changes in the ABI. Attributes are grouped by function (device,
link or port) and then by date added.
This patch also adds documentation for one attribute -
/sys/class/ata_port/ataX/port_no
Signed-off-by: Aishwarya Pant <aishpant@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The sanity test added in ecd7918745 can be bypassed, validation
only occurs if XFRM_STATE_ESN flag is set, but rest of code doesn't care
and just checks if the attribute itself is present.
So always validate. Alternative is to reject if we have the attribute
without the flag but that would change abi.
Reported-by: syzbot+0ab777c27d2bb7588f73@syzkaller.appspotmail.com
Cc: Mathias Krause <minipli@googlemail.com>
Fixes: ecd7918745 ("xfrm_user: ensure user supplied esn replay window is valid")
Fixes: d8647b79c3 ("xfrm: Add user interface for esn and big anti-replay windows")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Now that the flowcache is removed we need to generate
a new dummy bundle every time we check if the needed
SAs are in place because the dummy bundle is not cached
anymore. Fix it by passing the XFRM_LOOKUP_QUEUE flag
to xfrm_lookup(). This makes sure that we get a dummy
bundle in case the SAs are not yet in place.
Fixes: 3ca28286ea ("xfrm_policy: bypass flow_cache_lookup")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Some versions of gcc generate a warning that the variable "emulated"
may be used uninitialized in function kvmppc_handle_load128_by2x64().
It would be used uninitialized if kvmppc_handle_load128_by2x64 was
ever called with vcpu->arch.mmio_vmx_copy_nums == 0, but neither of
the callers ever do that, so there is no actual bug. When gcc
generates a warning, it causes the build to fail because arch/powerpc
is compiled with -Werror.
This silences the warning by initializing "emulated" to EMULATE_DONE.
Fixes: 09f984961c ("KVM: PPC: Book3S: Add MMIO emulation for VMX instructions")
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Commit accb757d79 ("KVM: Move vcpu_load to arch-specific
kvm_arch_vcpu_ioctl_run", 2017-12-04) added a "goto out"
statement and an "out:" label to kvm_arch_vcpu_ioctl_run().
Since the only "goto out" is inside a CONFIG_VSX block,
compiling with CONFIG_VSX=n gives a warning that label "out"
is defined but not used, and because arch/powerpc is compiled
with -Werror, that becomes a compile error that makes the kernel
build fail.
Merge commit 1ab03c072f ("Merge tag 'kvm-ppc-next-4.16-2' of
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc",
2018-02-09) added a similar block of code inside a #ifdef
CONFIG_ALTIVEC, with a "goto out" statement.
In order to make the build succeed, this adds a #ifdef around the
"out:" label. This is a minimal, ugly fix, to be replaced later
by a refactoring of the code. Since CONFIG_VSX depends on
CONFIG_ALTIVEC, it is sufficient to use #ifdef CONFIG_ALTIVEC here.
Fixes: accb757d79 ("KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_run")
Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Redoing the charger type detection to give the usb-role-switch code time
to properly set the role-switch is no good for mainline, since the
usb-role-switch code is not yet in mainline (my bad, sorry).
Also once we've that code there are better ways to fix this which are
not prone to racing as doing a retry after 2 seconds is.
This reverts commit 50082c17bb1455acacd376ae30dff92f2e1addbd.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Make the axp288_pwr_up_down_info array const char * const, this leads
to the following section size changes:
.text 0x674 -> 0x664
.data 0x148 -> 0x0f0
.rodata 0x0b4 -> 0x114
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Dennis rewrote the percpu area allocator some months ago, understands
most of the code base and has been responsive with the bug reports and
questions. Let's add him as a co-maintainer.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Christopher Lameter <cl@linux.com>
While adding cgroup2 interface for the cpu controller, 0d5936344f
("sched: Implement interface for cgroup unified hierarchy") forgot to
update input validation and left it to reject cpu.max config if any
descendant has set a higher value.
cgroup2 officially supports delegation and a descendant must not be
able to restrict what its ancestors can configure. For absolute
limits such as cpu.max and memory.max, this means that the config at
each level should only act as the upper limit at that level and
shouldn't interfere with what other cgroups can configure.
This patch updates config validation on cgroup2 so that the cpu
controller follows the same convention.
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 0d5936344f ("sched: Implement interface for cgroup unified hierarchy")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org # v4.15+
Because power of Salvator-X board is cut off in suspend,
it needs to reset SATA PHY state in resume.
Otherwise, SATA partition could not be accessed anymore.
Signed-off-by: Khiem Nguyen <khiem.nguyen.xt@rvc.renesas.com>
Signed-off-by: Hien Dang <hien.dang.eb@rvc.renesas.com>
[reinit phy in sata_rcar_resume() function on R-Car Gen3 only]
[factor out SATA module init sequence]
[fixed the prefix for the subject]
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
syzkaller hit a WARN() in ata_bmdma_qc_issue() when writing to /dev/sg0.
This happened because it issued an ATA pass-through command (ATA_16)
where the protocol field indicated that NCQ should be used -- but the
device did not support NCQ.
We could just remove the WARN() from libata-sff.c, but the real problem
seems to be that the SCSI -> ATA translation code passes through NCQ
commands without verifying that the device actually supports NCQ.
Fix this by adding the appropriate check to ata_scsi_pass_thru().
Here's reproducer that works in QEMU when /dev/sg0 refers to a disk of
the default type ("82371SB PIIX3 IDE"):
#include <fcntl.h>
#include <unistd.h>
int main()
{
char buf[53] = { 0 };
buf[36] = 0x85; /* ATA_16 */
buf[37] = (12 << 1); /* FPDMA */
buf[38] = 0x1; /* Has data */
buf[51] = 0xC8; /* ATA_CMD_READ */
write(open("/dev/sg0", O_RDWR), buf, sizeof(buf));
}
Fixes: ee7fb331c3 ("libata: add support for NCQ commands for SG interface")
Reported-by: syzbot+2f69ca28df61bdfc77cd36af2e789850355a221e@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org> # v4.4+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
syzkaller hit a WARN() in ata_qc_issue() when writing to /dev/sg0. This
happened because it issued a READ_6 command with no data buffer.
Just remove the WARN(), as it doesn't appear indicate a kernel bug. The
expected behavior is to fail the command, which the code does.
Here's a reproducer that works in QEMU when /dev/sg0 refers to a disk of
the default type ("82371SB PIIX3 IDE"):
#include <fcntl.h>
#include <unistd.h>
int main()
{
char buf[42] = { [36] = 0x8 /* READ_6 */ };
write(open("/dev/sg0", O_RDWR), buf, sizeof(buf));
}
Fixes: f92a26365a ("libata: change ATA_QCFLAG_DMAMAP semantics")
Reported-by: syzbot+f7b556d1766502a69d85071d2ff08bd87be53d0f@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org> # v2.6.25+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
syzkaller reported a crash in ata_bmdma_fill_sg() when writing to
/dev/sg1. The immediate cause was that the ATA command's scatterlist
was not DMA-mapped, which causes 'pi - 1' to underflow, resulting in a
write to 'qc->ap->bmdma_prd[0xffffffff]'.
Strangely though, the flag ATA_QCFLAG_DMAMAP was set in qc->flags. The
root cause is that when __ata_scsi_queuecmd() is preparing to relay a
SCSI command to an ATAPI device, it doesn't correctly validate the CDB
length before copying it into the 16-byte buffer 'cdb' in 'struct
ata_queued_cmd'. Namely, it validates the fixed CDB length expected
based on the SCSI opcode but not the actual CDB length, which can be
larger due to the use of the SG_NEXT_CMD_LEN ioctl. Since 'flags' is
the next member in ata_queued_cmd, a buffer overflow corrupts it.
Fix it by requiring that the actual CDB length be <= 16 (ATAPI_CDB_LEN).
[Really it seems the length should be required to be <= dev->cdb_len,
but the current behavior seems to have been intentionally introduced by
commit 607126c2a2 ("libata-scsi: be tolerant of 12-byte ATAPI commands
in 16-byte CDBs") to work around a userspace bug in mplayer. Probably
the workaround is no longer needed (mplayer was fixed in 2007), but
continuing to allow lengths to up 16 appears harmless for now.]
Here's a reproducer that works in QEMU when /dev/sg1 refers to the
CD-ROM drive that qemu-system-x86_64 creates by default:
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#define SG_NEXT_CMD_LEN 0x2283
int main()
{
char buf[53] = { [36] = 0x7e, [52] = 0x02 };
int fd = open("/dev/sg1", O_RDWR);
ioctl(fd, SG_NEXT_CMD_LEN, &(int){ 17 });
write(fd, buf, sizeof(buf));
}
The crash was:
BUG: unable to handle kernel paging request at ffff8cb97db37ffc
IP: ata_bmdma_fill_sg drivers/ata/libata-sff.c:2623 [inline]
IP: ata_bmdma_qc_prep+0xa4/0xc0 drivers/ata/libata-sff.c:2727
PGD fb6c067 P4D fb6c067 PUD 0
Oops: 0002 [#1] SMP
CPU: 1 PID: 150 Comm: syz_ata_bmdma_q Not tainted 4.15.0-next-20180202 #99
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
[...]
Call Trace:
ata_qc_issue+0x100/0x1d0 drivers/ata/libata-core.c:5421
ata_scsi_translate+0xc9/0x1a0 drivers/ata/libata-scsi.c:2024
__ata_scsi_queuecmd drivers/ata/libata-scsi.c:4326 [inline]
ata_scsi_queuecmd+0x8c/0x210 drivers/ata/libata-scsi.c:4375
scsi_dispatch_cmd+0xa2/0xe0 drivers/scsi/scsi_lib.c:1727
scsi_request_fn+0x24c/0x530 drivers/scsi/scsi_lib.c:1865
__blk_run_queue_uncond block/blk-core.c:412 [inline]
__blk_run_queue+0x3a/0x60 block/blk-core.c:432
blk_execute_rq_nowait+0x93/0xc0 block/blk-exec.c:78
sg_common_write.isra.7+0x272/0x5a0 drivers/scsi/sg.c:806
sg_write+0x1ef/0x340 drivers/scsi/sg.c:677
__vfs_write+0x31/0x160 fs/read_write.c:480
vfs_write+0xa7/0x160 fs/read_write.c:544
SYSC_write fs/read_write.c:589 [inline]
SyS_write+0x4d/0xc0 fs/read_write.c:581
do_syscall_64+0x5e/0x110 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x21/0x86
Fixes: 607126c2a2 ("libata-scsi: be tolerant of 12-byte ATAPI commands in 16-byte CDBs")
Reported-by: syzbot+1ff6f9fcc3c35f1c72a95e26528c8e7e3276e4da@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org> # v2.6.24+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Indent the numbered item with one space like all other items in the same
list.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Tejun Heo <tj@kernel.org>
Exit directly with ENODEV, if the AHCI controller is not available
anymore. Otherwise a delay of 500ms for each port is added to the remove
function while trying to issue a command on the non-existent controller.
Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
The control channel calls registered callbacks when control messages
such as XDomain protocol messages are received. The control channel
handling is done in a worker running on system workqueue which means the
networking driver can't run tear down flow which includes sending
disconnect request and waiting for a reply in the same worker. Otherwise
reply is never received (as the work is already running) and the
operation times out.
To fix this run disconnect ThunderboltIP flow asynchronously once
ThunderboltIP logout message is received.
Fixes: e69b6c02b4 ("net: Add support for networking over Thunderbolt cable")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
When suspending to mem or disk the Thunderbolt controller typically goes
down as well tearing down the connection automatically. However, when
suspend to idle is used this does not happen so we need to make sure the
connection is properly disconnected before it can be re-established
during resume.
Fixes: e69b6c02b4 ("net: Add support for networking over Thunderbolt cable")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixs the following comile warnings with ATA_DEBUG enabled,
which detected by Linaro GCC 5.2-2015.11:
drivers/ata/libata-scsi.c: In function 'ata_scsi_dump_cdb':
./include/linux/kern_levels.h:5:18: warning: format '%d' expects
argument of type 'int', but argument 6 has type 'u64 {aka long
long unsigned int}' [-Wformat=]
tj: Patch hand-applied and description trimmed.
Signed-off-by: Dong Bo <dongbo4@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Currently, if Wake-on-LAN is enabled, the SH-ETH device's module clock
is manually kept running during system suspend, to make sure the device
stays active.
Since commits 91c719f5ec ("soc: renesas: rcar-sysc: Keep wakeup
sources active during system suspend") and 744dddcae8 ("clk:
renesas: mstp: Keep wakeup sources active during system suspend"), this
workaround is no longer needed. Hence remove all explicit clock
handling to keep the device active.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, if Wake-on-LAN is enabled, the EtherAVB device's module clock
is manually kept running during system suspend, to make sure the device
stays active.
Since commit 91c719f5ec ("soc: renesas: rcar-sysc: Keep wakeup
sources active during system suspend") , this workaround is no longer
needed. Hence remove all explicit clock handling to keep the device
active.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When forcing a specific link mode, the PHY driver must clear the
existing speed and duplex bits in BMCR while preserving some other
control bits. This logic was accidentally inverted with the introduction
of phy_modify().
Fixes: fea23fb591 ("net: phy: convert read-modify-write to phy_modify()")
Signed-off-by: Ingo van Lil <inguin@gmx.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid SKB coalescing if eor bit is set in one of the relevant
SKBs.
Fixes: c134ecb878 ("tcp: Make use of MSG_EOR in tcp_sendmsg")
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the 'if (chunk)' check in sctp_renege_events for idata process,
as all renege commands are generated in sctp_eat_data and it can't be
NULL.
The same thing we already did for common data in sctp_ulpq_renege.
Fixes: 94014e8d87 ("sctp: implement renege_events for sctp_stream_interleave")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the support for SCTP_CID_I_DATA and SCTP_CID_I_FWD_TSN chunks,
the corresp conversion in sctp_cname should also be added. Otherwise,
in some places, pr_debug will print them as "unknown chunk".
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The pr_err in sctp_hash_transport was supposed to report a sctp bug
for using rhashtable/rhlist.
The err '-EEXIST' introduced in Commit cd2b708750 ("sctp: check
duplicate node before inserting a new transport") doesn't belong
to that case.
So just return -EEXIST back without pr_err any kmsg.
Fixes: cd2b708750 ("sctp: check duplicate node before inserting a new transport")
Reported-by: Wei Chen <weichen@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now br_sysfs_if file flush doesn't have attr show. To read it will
cause kernel panic after users chmod u+r this file.
Xiong found this issue when running the commands:
ip link add br0 type bridge
ip link add type veth
ip link set veth0 master br0
chmod u+r /sys/devices/virtual/net/veth0/brport/flush
timeout 3 cat /sys/devices/virtual/net/veth0/brport/flush
kernel crashed with NULL a pointer dereference call trace.
This patch is to fix it by return -EINVAL when brport_attr->show
is null, just the same as the check for brport_attr->store in
brport_store().
Fixes: 9cf637473c ("bridge: add sysfs hook to flush forwarding table")
Reported-by: Xiong Zhou <xzhou@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a follow up to the commit
00846a4425 ("auxdisplay: Move arm-charlcd.c to drivers/auxdisplay folder")
for Device Tree binding.
No functional change.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Simply adjust the pin group to _x _y _z style, as to
keep the consistency in DT with previous naming scheme.
Fixes: 83c566806a ("pinctrl: meson-axg: Add new pinctrl driver for Meson AXG SoC")
Signed-off-by: Yixun Lan <yixun.lan@amlogic.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
We should call dwc2_hsotg_enqueue_setup() after properly
setting lx_state. Because it may cause error-out from
dwc2_hsotg_enqueue_setup() due to wrong value in lx_state.
Issue can be reproduced by loading driver while connected
A-Connector (start in A-HOST mode) then disconnect A-Connector
to switch to B-DEVICE.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Vardan Mikayelyan <mvardan@synopsys.com>
Signed-off-by: Grigor Tovmasyan <tovmasya@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
STSPHSERCVD (status phase received) interrupt should be
handled when EP0 is in DWC2_EP0_DATA_OUT state.
Sometimes STSPHSERCVD interrupt asserted , when EP0
is not in DATA_OUT state. Spurios interrupt.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: Grigor Tovmasyan <tovmasya@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
In some cases device sending ZLP IN on non EP0 which
reassigning EP0 OUT descriptor pointer to that EP.
Dedicated for EP0 OUT descriptor multiple time re-used by
other EP while that descriptor already in use by EP0 OUT
for SETUP transaction. As result when SETUP packet received
BNA interrupt asserting.
In dwc2_hsotg_program_zlp() function dwc2_gadget_set_ep0_desc_chain()
must be called only for EP0.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: Grigor Tovmasyan <tovmasya@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Clang reports the following warning:
drivers/usb/gadget/udc/fsl_udc_core.c:1312:10: warning: address of array
'ep->name' will always evaluate to 'true' [-Wpointer-bool-conversion]
if (ep->name)
~~ ~~~~^~~~
It seems that the authors intention was to check if the ep has been
configured through struct_ep_setup. Check whether struct usb_ep name
pointer has been set instead.
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This fixes an issue that a gadget driver (usb_f_fs) is possible to
stop rx transactions after the usb-dmac is used because the following
functions missed to set/check the "running" flag.
- usbhsf_dma_prepare_pop_with_usb_dmac()
- usbhsf_dma_pop_done_with_usb_dmac()
So, if next transaction uses pio, the usbhsf_prepare_pop() can not
start the transaction because the "running" flag is 0.
Fixes: 8355b2b308 ("usb: renesas_usbhs: fix the behavior of some usbhs_pkt_handle")
Cc: <stable@vger.kernel.org> # v3.19+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The FIFO/Queue type values are incorrect. Correct them according to
DWC_usb3 programming guide section 1.2.27 (or DWC_usb31 section 1.2.25).
Additionally, this patch includes ProtocolStatusQ and AuxEventQ types.
Fixes: cf6d867d3b ("usb: dwc3: core: add fifo space helper")
Signed-off-by: Thinh Nguyen <thinhn@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The USB cable state can change during suspend/resume
so be sure to check and update the extcon state.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
In commit 2bfa0719ac ("usb: gadget: function: f_fs: pass
companion descriptor along") there is a pointer arithmetic
bug where the comp_desc is obtained as follows:
comp_desc = (struct usb_ss_ep_comp_descriptor *)(ds +
USB_DT_ENDPOINT_SIZE);
Since ds is a pointer to usb_endpoint_descriptor, adding
7 to it ends up going out of bounds (7 * sizeof(struct
usb_endpoint_descriptor), which is actually 7*9 bytes) past
the SS descriptor. As a result the maxburst value will be
read incorrectly, and the UDC driver will also get a garbage
comp_desc (assuming it uses it).
Since Felipe wrote, "Eventually, f_fs.c should be converted
to use config_ep_by_speed() like all other functions, though",
let's finally do it. This allows the other usb_ep fields to
be properly populated, such as maxpacket and mult. It also
eliminates the awkward speed-based descriptor lookup since
config_ep_by_speed() does that already using the ones found
in struct usb_function.
Fixes: 2bfa0719ac ("usb: gadget: function: f_fs: pass companion descriptor along")
Cc: stable@vger.kernel.org
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
During _ffs_func_bind(), the received descriptors are evaluated
to prepare for binding with the gadget in order to allocate
endpoints and optionally set up OS descriptors. However, the
high- and super-speed descriptors are only parsed based on
whether the gadget_is_dualspeed() and gadget_is_superspeed()
calls are true, respectively.
This is a problem in case a userspace program always provides
all of the {full,high,super,OS} descriptors when configuring a
function. Then, for example if a gadget device is not capable
of SuperSpeed, the call to ffs_do_descs() for the SS descriptors
is skipped, resulting in an incorrect offset calculation for
the vla_ptr when moving on to the OS descriptors that follow.
This causes ffs_do_os_descs() to fail as it is now looking at
the SS descriptors' offset within the raw_descs buffer instead.
_ffs_func_bind() should evaluate the descriptors unconditionally,
so remove the checks for gadget speed.
Fixes: f0175ab519 ("usb: gadget: f_fs: OS descriptors support")
Cc: stable@vger.kernel.org
Co-Developed-by: Mayank Rana <mrana@codeaurora.org>
Signed-off-by: Mayank Rana <mrana@codeaurora.org>
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Commit e93650994a ("usb: phy: mxs: add usb charger type detection")
causes the following kernel hang on i.MX28:
[ 2.207973] usbcore: registered new interface driver usb-storage
[ 2.235659] Unable to handle kernel NULL pointer dereference at virtual address 00000188
[ 2.244195] pgd = (ptrval)
[ 2.246994] [00000188] *pgd=00000000
[ 2.250676] Internal error: Oops: 5 [#1] ARM
[ 2.254979] Modules linked in:
[ 2.258089] CPU: 0 PID: 1 Comm: swapper Not tainted 4.15.0-rc8-next-20180117-00002-g75d5f21 #7
[ 2.266724] Hardware name: Freescale MXS (Device Tree)
[ 2.271921] PC is at regmap_read+0x0/0x5c
[ 2.275977] LR is at mxs_phy_charger_detect+0x34/0x1dc
mxs_phy_charger_detect() makes accesses to the anatop registers via regmap,
however i.MX23/28 do not have such registers, which causes a NULL pointer
dereference.
Fix the issue by doing a NULL check on the 'regmap' pointer.
Fixes: e93650994a ("usb: phy: mxs: add usb charger type detection")
Cc: <stable@vger.kernel.org> # v4.15
Reviewed-by: Li Jun <jun.li@nxp.com>
Acked-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Commit 689bf72c6e ("usb: dwc3: Don't reinitialize core during
host bus-suspend/resume") updated suspend/resume routines to not
power_off and reinit PHYs/core for host mode.
It broke platforms that rely on DWC3 core to power_off PHYs to
enter low power state on system suspend.
Perform dwc3_core_exit/init only during host mode system_suspend/
resume to addresses power regression from above mentioned patch
and also allow USB session to stay connected across
runtime_suspend/resume in host mode. While at it also replace
existing checks for HOST only dr_mode with current_dr_role to
have similar core driver behavior for both Host-only and DRD+Host
configurations.
Fixes: 689bf72c6e ("usb: dwc3: Don't reinitialize core during host bus-suspend/resume")
Reviewed-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Manu Gautam <mgautam@codeaurora.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
In this function, we init the USB2 and USB3 PHYs, but if soft reset
times out, we don't unwind this.
Noticed by inspection.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
If there are multiple functions associated with a configuration, then
the UAC2 interfaces may not start at zero. Set the correct first
interface number in the association descriptor so that the audio
interfaces are enumerated correctly in this case.
Reviewed-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
DWC3 tracks TRB counter for each ep0 direction separately. In control
read transfer completion handler, the driver needs to reset the TRB
enqueue counter for ep0 IN direction. Currently the driver only resets
the TRB counter for control OUT endpoint. Check for the data direction
and properly reset the TRB counter from correct control endpoint.
Cc: stable@vger.kernel.org
Fixes: c2da2ff006 ("usb: dwc3: ep0: don't use ep0in for transfers")
Signed-off-by: Thinh Nguyen <thinhn@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
There are 2 control endpoint structures for DWC3. However, the driver
only updates the OUT direction control endpoint structure during
ConnectDone event. DWC3 driver needs to update the endpoint max packet
size for control IN endpoint as well. If the max packet size is not
properly set, then the driver will incorrectly calculate the data
transfer size and fail to send ZLP for HS/FS 3-stage control read
transfer.
The fix is simply to update the max packet size for the ep0 IN direction
during ConnectDone event.
Cc: stable@vger.kernel.org
Fixes: 72246da40f ("usb: Introduce DesignWare USB3 DRD Driver")
Signed-off-by: Thinh Nguyen <thinhn@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This patch fixes an issue that the renesas_usb3_remove() causes
NULL pointer dereference because the usb3_to_dev() macro will use
the gadget instance and it will be deleted before.
Fixes: cf06df3fae ("usb: gadget: udc: renesas_usb3: move pm_runtime_{en,dis}able()")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
dwc3_of_simple_dev_pm_ops has never been used since commit a0d8c4cfdf
("usb: dwc3: of-simple: set dev_pm_ops"), but this commit has brought
and oops when unbind the device due this sequence:
dwc3_of_simple_remove
-> clk_disable ...
-> pm_runtime_put_sync
-> dwc3_of_simple_runtime_suspend
-> clk_disable (again)
This double call to clk_core_disable causes a kernel oops like this:
WARNING: CPU: 1 PID: 4022 at drivers/clk/clk.c:656 clk_core_disable+0x78/0x80
CPU: 1 PID: 4022 Comm: bash Not tainted 4.15.0-rc4+ #44
Hardware name: Google Kevin (DT)
pstate: 80000085 (Nzcv daIf -PAN -UAO)
pc : clk_core_disable+0x78/0x80
lr : clk_core_disable_lock+0x20/0x38
sp : ffff00000bbf3a90
...
Call trace:
clk_core_disable+0x78/0x80
clk_disable+0x1c/0x30
dwc3_of_simple_runtime_suspend+0x30/0x50
pm_generic_runtime_suspend+0x28/0x40
This patch fixes the unbalanced clk disable call by setting the num_clocks
variable to zero once the clocks were disabled.
Fixes: a0d8c4cfdf ("usb: dwc3: of-simple: set dev_pm_ops")
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The pclk_vio_grf supply power for VIO GRF IOs, if it is disabled,
driver would failed to operate the VIO GRF registers.
The clock is optional but one of the side effects of don't have this clk
is that the Samsung Chromebook Plus fails to recover display after a
suspend/resume with following errors:
rockchip-dp ff970000.edp: Input stream clock not detected.
rockchip-dp ff970000.edp: Timeout of video streamclk ok
rockchip-dp ff970000.edp: unable to config video
Signed-off-by: Yakir Yang <ykk@rock-chips.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
[this should also fix display failures when building rockchip-drm as module]
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
The endpoint control gpio for rk3399-sapphire boards is gpio2_a4,
so correct it now.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
This commit enables thresh dma mode as this forces to disable checksuming,
and chooses delay values which make the interface stable.
These changes are needed, because ROCK64 is faced with two problems:
1. tx checksuming does not work with packets larger than 1498,
2. the default delays for tx/rx are not stable when using 1Gbps connection.
Delays were found out with:
https://github.com/ayufan-rock64/linux-build/tree/master/recipes/gmac-delays-test
Signed-off-by: Kamil Trzciński <ayufan@ayufan.eu>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
IPv6 doesn't work on the MacchiatoBIN board. It is caused by broken
multicast address filter in the mvpp2 driver.
The driver loads doesn't load any multicast entries if "allmulti" is not
set. This condition should be reversed.
The condition !netdev_mc_empty(dev) is useless (because
netdev_for_each_mc_addr is nop if the list is empty).
This patch also fixes a possible overflow of the multicast list - if
mvpp2_prs_mac_da_accept fails, we set the allmulti flag and retry.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch to use dividing to prevent integer overflow when size is too
big to calculate allocation size properly.
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Fixes: 6e6e41c311 ("ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If matrix_keypad_stop() is executing and the keypad interrupt is triggered,
disable_row_irqs() may be called by both matrix_keypad_interrupt() and
matrix_keypad_stop() at the same time, causing interrupts to be disabled
twice and the keypad being "stuck" after resuming.
Take lock when setting keypad->stopped to ensure that ISR will not race
with matrix_keypad_stop() disabling interrupts.
Signed-off-by: Zhang Bo <zbsdta@126.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
stm32_vrefbuf_enable() wrongly checks VRR bit: 0 stands for not ready,
1 for ready. It currently checks the opposite.
This makes enable routine to exit immediately without waiting for ready
flag.
Fixes: 0cdbf481e9 ("regulator: Add support for stm32-vrefbuf")
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
There is a race condition between clusterip_config_entry_put()
and clusterip_config_init(), after we release the spinlock in
clusterip_config_entry_put(), a new proc file with a same IP could
be created immediately since it is already removed from the configs
list, therefore it triggers this warning:
------------[ cut here ]------------
proc_dir_entry 'ipt_CLUSTERIP/172.20.0.170' already registered
WARNING: CPU: 1 PID: 4152 at fs/proc/generic.c:330 proc_register+0x2a4/0x370 fs/proc/generic.c:329
Kernel panic - not syncing: panic_on_warn set ...
As a quick fix, just move the proc_remove() inside the spinlock.
Reported-by: <syzbot+03218bcdba6aa76441a3@syzkaller.appspotmail.com>
Fixes: 6c5d5cfbe3 ("netfilter: ipt_CLUSTERIP: check duplicate config when initializing")
Tested-by: Paolo Abeni <pabeni@redhat.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Using compatible strings without the <manufacturer> part for at24 is
deprecated since commit 6da28acf74 ("dt-bindings: at24: consistently
document the compatible property"). Use a correct 'atmel,<model>'
value.
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Acked-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Functions for triggered buffer support are needed by this module.
If they are not defined accidentally by another driver, there's an error
thrown out while linking.
Add a select of IIO_BUFFER and IIO_TRIGGERED_BUFFER in the Kconfig file.
Signed-off-by: Andreas Klinger <ak@it-klinger.de>
Fixes: a831959371 ("iio: srf08: add triggered buffer support")
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Replace the original license statement with the SPDX identifier.
Add also one line of description as recommended by the COPYING
file.
Signed-off-by: Andi Shyti <andi.shyti@samsung.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
The driver has been released with GNU Public License v2 as stated
in the header, but the module license information has been tagged
as "GPL" (GNU Public License v2 or later).
Fix the module license information so that it matches the one in
the header as "GPL v2".
Fixes: 07b8481d4a ("Input: add MELFAS mms114 touchscreen driver")
Reported-by: Marcus Folkesson <marcus.folkesson@gmail.com>
Signed-off-by: Andi Shyti <andi.shyti@samsung.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
We don't have a compat layer for xfrm, so userspace and kernel
structures have different sizes in this case. This results in
a broken configuration, so refuse to configure socket policies
when trying to insert from 32 bit userspace as we do it already
with policies inserted via netlink.
Reported-and-tested-by: syzbot+e1a1577ca8bcb47b769a@syzkaller.appspotmail.com
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
During a non-blocking commit, it is possible to return before the
commit_tail work is queued (-ERESTARTSYS, for example).
Since a reference on the crtc commit object is obtained for the pending
vblank event when preparing the commit, the above situation will leave
us with an extra reference.
Therefore, if the commit_tail worker has not consumed the event at the
end of a commit, release it's reference.
Changes since v1:
- Also check for state->event->base.completion being set, to
handle the case where stall_checks() fails in setup_crtc_commit().
Changes since v2:
- Add a flag to drm_crtc_commit, to prevent dereferencing a freed event.
i915 may unreference the state in a worker.
Fixes: 24835e442f ("drm: reference count event->completion")
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com> #v1
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117115108.29608-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Sean Paul <seanpaul@chromium.org>
mesh TTL offset in Mesh Channel Switch Parameters element depends on
not only Secondary Channel Offset element, but also affected by
HT Control field and Wide Bandwidth Channel Switch element.
So use element structure to manipulate mesh channel swich param IE
after removing its constant attribution to correct the miscalculation.
Signed-off-by: Peter Oh <peter.oh@bowerswilkins.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Mesh used to use the mandatory rates as basic rates, but we got
the calculation of mandatory rates wrong until some time ago.
Fix this this broke interoperability with older versions since
now more basic rates are required, and thus the MBSS isn't the
same and the network stops working.
Fix this by simply using only 1Mbps as the basic rate in 2.4GHz.
Since the changed mandatory rates only affected 2.4GHz, this is
all we need to make it work again.
Reported-and-tested-by: Matthias Schiffer <mschiffer@universe-factory.net>
Fixes: 1bd773c077 ("wireless: set correct mandatory rate flags")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Since commit 3a025e1d1c ("Add optional check for bad kernel-doc
comments") building with W=1 causes warnings to appear for issues in
kernel-doc headers. This patch avoids that the following warnings are
reported when building with W=1:
drivers/scsi/device_handler/scsi_dh_alua.c:867: warning: No description found for parameter 'pg'
drivers/scsi/device_handler/scsi_dh_alua.c:867: warning: No description found for parameter 'sdev'
drivers/scsi/device_handler/scsi_dh_alua.c:867: warning: No description found for parameter 'qdata'
drivers/scsi/device_handler/scsi_dh_alua.c:867: warning: No description found for parameter 'force'
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove line using non-existent files which were removed in
commit 642978beb4 ("[SCSI] remove m68k NCR53C9x based drivers")
[mkp: tweaked patch description]
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
aiclib.c is unused (and contains no code) since commit 1ff927306e
("[SCSI] aic7xxx: remove aiclib.c")
13 years later, finish the cleaning by removing it from tree.
[mkp: tweaked patch description]
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A left shift must shift less than the bit width of the left argument.
Avoid triggering undefined behavior if ha->mbx_count == 32.
This patch avoids that UBSAN reports the following complaint:
UBSAN: Undefined behaviour in drivers/scsi/qla2xxx/qla_isr.c:275:14
shift exponent 32 is too large for 32-bit type 'int'
Call Trace:
dump_stack+0x4e/0x6c
ubsan_epilogue+0xd/0x3b
__ubsan_handle_shift_out_of_bounds+0x112/0x14c
qla2x00_mbx_completion+0x1c5/0x25d [qla2xxx]
qla2300_intr_handler+0x1ea/0x3bb [qla2xxx]
qla2x00_mailbox_command+0x77b/0x139a [qla2xxx]
qla2x00_mbx_reg_test+0x83/0x114 [qla2xxx]
qla2x00_chip_diag+0x354/0x45f [qla2xxx]
qla2x00_initialize_adapter+0x2c2/0xa4e [qla2xxx]
qla2x00_probe_one+0x1681/0x392e [qla2xxx]
pci_device_probe+0x10b/0x1f1
driver_probe_device+0x21f/0x3a4
__driver_attach+0xa9/0xe1
bus_for_each_dev+0x6e/0xb5
driver_attach+0x22/0x3c
bus_add_driver+0x1d1/0x2ae
driver_register+0x78/0x130
__pci_register_driver+0x75/0xa8
qla2x00_module_init+0x21b/0x267 [qla2xxx]
do_one_initcall+0x5a/0x1e2
do_init_module+0x9d/0x285
load_module+0x20db/0x38e3
SYSC_finit_module+0xa8/0xbc
SyS_finit_module+0x9/0xb
do_syscall_64+0x77/0x271
entry_SYSCALL64_slow_path+0x25/0x25
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
My static checker complains about an out of bounds read:
drivers/message/fusion/mptctl.c:2786 mptctl_hp_targetinfo()
error: buffer overflow 'hd->sel_timeout' 255 <= u32max.
It's true that we probably should have a bounds check here.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We wanted to exit the loop with "div" set to zero, but instead, if we
don't hit the break then "div" is -1 when we finish the loop. It leads
to an array underflow a few lines later.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a request times out we set the io_req flag BNX2FC_FLAG_IO_COMPL so
that if a subsequent completion comes in on that task ID we will ignore
it. The issue is that in the check for this flag there is a missing
return so we will continue to process a request which may have already
been returned to the ownership of the SCSI layer. This can cause
unpredictable results.
Solution is to add in the missing return.
[mkp: typo plus title shortening]
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The pointer ln is assigned a value that is never read, it is re-assigned
a new value in the list_for_each loop hence the initialization is
redundant and can be removed.
Cleans up clang warning:
drivers/scsi/csiostor/csio_lnode.c:117:21: warning: Value stored to 'ln'
during its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
WRITE_SAME command is not supported by UFS. Enable a quirk for the upper
level drivers to not send WRITE SAME command.
[mkp: botched patch, applied by hand]
Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The fcp_rsp_info structure as defined in the FC spec has an initial 3
bytes reserved field. The ibmvfc driver mistakenly defined this field as
4 bytes resulting in the rsp_code field being defined in what should be
the start of the second reserved field and thus always being reported as
zero by the driver.
Ideally, we should wire ibmvfc up with libfc for the sake of code
deduplication, and ease of maintaining standardized structures in a
single place. However, for now simply fixup the definition in ibmvfc for
backporting to distros on older kernels. Wiring up with libfc will be
done in a followup patch.
Cc: <stable@vger.kernel.org>
Reported-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
cpu_msix_table is allocated to store online cpus, but pci_irq_get_affinity
may return cpu_possible_mask which is then used to access cpu_msix_table.
That causes bad user experience. Fix limits access to only online cpus,
I've also added an additional test to protect from an unlikely change in
cpu_online_mask.
[mkp: checkpatch]
Fixes: 1d55abc0e9 ("scsi: mpt3sas: switch to pci_alloc_irq_vectors")
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since commit 152a6a884a ("staging:iio:accel:sca3000 move
to hybrid hard / soft buffer design.")
the buffer mechanism has changed and the
INDIO_BUFFER_HARDWARE flag has been unused.
Since commit 2d6ca60f32 ("iio: Add a DMAengine framework
based buffer")
the INDIO_BUFFER_HARDWARE flag has been re-purposed for
DMA buffers.
This driver has lagged behind these changes, and
in order for buffers to work, the INDIO_BUFFER_SOFTWARE
needs to be used.
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Fixes: 2d6ca60f32 ("iio: Add a DMAengine framework based buffer")
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Error handling in stm32h7_adc_enable routine doesn't unwind enable
sequence correctly. ADEN can only be cleared by hardware (e.g. by
writing one to ADDIS).
It's also better to clear ADRDY just after it's been set by hardware.
Fixes: 95e339b6e8 ("iio: adc: stm32: add support for STM32H7")
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
The external clock frequency was set only when selecting
the internal clock, which is fixed at 4.9152 Mhz.
This is incorrect, since it should be set when any of
the external clock or crystal settings is selected.
Added range validation for the external (crystal/clock)
frequency setting.
Valid values are between 2.4576 and 5.12 Mhz.
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
The trailing semicolon is an empty statement that does no operation.
Removing it since it doesn't do anything.
Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The trailing semicolon is an empty statement that does no operation.
Removing it since it doesn't do anything.
Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
It's very convenient to have fatal signals enabled on developemnt
platform as this allows to catch problems that happen early in
user-space (like crashing init or dynamic loader).
Otherwise we may either enable it later from alive taregt console
by "echo 1 > /proc/sys/kernel/print-fatal-signals" but:
1. We might be unfortunate enough to not reach working console
2. Forget to enable fatal signals and miss something interesting
Given we're talking about development platforms here it shouldn't
be a problem if a bit more data gets printed to debug console.
Moreover this makes behavior of all our dev platforms predictable
as today some platforms already have it enabled and some don't -
which is way too inconvenient.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
As per PRM "kflag" instruction doesn't change state of
L-flag ("Zero-Overhead loop disabled") in STATUS32 register
so let's not act as if we can affect this bit.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
slc_entire_op with OP_FLUSH command also invalidates it.
This is a preventive fix as the current use of slc_entire_op is only
with OP_FLUSH_N_INV where the invalidate is required.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
[vgupta: fixed changelog]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The labels and branching order of the error path of 'aspeed_adc_probe()'
are broken.
Re-order the labels and goto statements.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
The optional clocks must be enabled before the main clock after the
transition to clkctrl controlled clocks is done. Otherwise the module
we attempt to enable might be stuck in transition.
Reported-by: Keerthy <j-keerthy@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2017-12-22 10:48:07 -08:00
1523 changed files with 15958 additions and 9658 deletions
@@ -185,8 +217,23 @@ static int __init dns323_read_mac_addr(void)
if(!mac_page)
return-ENOMEM;
if(!mac_pton((__forceconstchar*)mac_page,addr))
gotoerror_fail;
/* Sanity check the string we're looking at */
for(i=0;i<5;i++){
if(*(mac_page+(i*3)+2)!=':'){
gotoerror_fail;
}
}
for(i=0;i<6;i++){
intbyte;
byte=dns323_parse_hex_byte(mac_page+(i*3));
if(byte<0){
gotoerror_fail;
}
addr[i]=byte;
}
iounmap(mac_page);
printk("DNS-323: Found ethernet MAC address: %pM\n",addr);
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.